Recursive Object Traversal in Javascript (like PHP’s print_r…)

I’ve been banging my head against the wall for the past few weeks trying to debug Javascript issues in Internet Explorer. (I can hear you cringing already…) At least, with the advent of IE8, there is now an official script debugger included in the new Developer Tools. Unfortunately, it is severely lacking when compared to the debugging capabilities of Firefox with the Firebug plugin — and debugging with Firebug doesn’t guarantee that the script will work in any version of IE.

I use the following function, based on a script by Matt Hackett, in the console window of the IE8 Developer Tools to quickly inspect any object currently in scope in the script I’m debugging. I have this function declared in a Javascript file that I include into all of my pages while I’m developing them. The arguments are fairly straightforward:

  • x is the object being inspected
  • max is maximum depth of recursion into the object structure
  • sMax is the maximum display length for String properties
  • skipMethods skips displaying object properties that are functions (methods in OOP)
  • sep is the string value used to indent the levels of recursion in the display
  • l is the level of recursion for the current function call

All of the arguments, with the exception of x, have reasonable defaults. You can see the default values in the first five lines of the function. Here is the code:

function print_r(x, max, sMax, skipMethods, sep, l) {
    max = max || 10;
    sMax = sMax || 25;
    skipMethods = skipMethods || true;
    sep = sep || '  ';
    l = l || 0;
    if (l > max) return "[Stopped at " + l + " levels]\n";
    var i, r = '', t = typeof x, tab = '';
    if (x === null)
        r += "(null)\n";
    else if (t == 'object') {
        l++;
        for (i = 0; i < l; i++)
            tab += sep;
        if (x && x.length)
            t = 'array';
        r += '(' + t + ':' + l + ") :\n";
        for (i in x) {
            if (!(typeof x[i] == 'function' && skipMethods)) {
                try {
                    r += tab + '[' + i + '] : ' +
                         print_r(x[i], max, sMax, skipMethods, sep, l);
                }
                catch (e) {
                    return "[ERROR: " + e + "]\n";
                }
            }
        }
    }
    else {
        var sLen = '';
        if (t == 'string') {
            sLen = '[' + x.length + ']';
            if (x.length > sMax)
                x = x.substr(0, sMax) + '...';
        }
        if (!(t == 'function' && skipMethods)) {
            r += '(' + t + sLen + ') ' + x + "\n";
        }
    }
    return r;
}

This function prints all of the properties and, optionally, methods of the object passed as the x argument recursively to a depth of the max argument. It makes it quick and painless to inspect values while debugging. It also labels each level of recursion and shows the length of String properties, which is handy when they’ve been truncated by the sMax argument.

Hopefully, someone will find this useful. A text version of the source is available here.

Filed in Javascript, Technology No Responses yet

An Error Logging Decorator for the MySQLi Object (eMySQLi.php)

Okay, the first question I can hear people asking is:  Why? Why another database wrapper or Decorator object? Aren’t there enough of them out there already? Well, frankly, yes…  but I had a ton of this in my code and no easy pre-packaged solution I could find to resolve it:

$sql = 'SELECT * FROM table';
if ( $mysqli->query($sql) ) {
    <do something cool here...>
}
else {
    $errMsg = '[' . date('Y-m-d H:i:s') . '] ' . __FILE__ . ':' . __LINE__ . "\n";
    $errMsg .= 'Database Error: (' . $mysqli->errno . ') ';
    $errMsg .= $mysqli->error . "\n";
    error_log($errMsg, 3, '/var/log/apache/php-err.log');
}

I like knowing when things go wrong. However, I got sick of writing else statements to log errors whenever a MySQLi method returned false. So, I decided to wrap the MySQLi calls in an object that would take care of logging any errors automatically. I also wanted to improve on a generic escaping function I had written to handle user-submitted data. I only use MySQL, so I don’t need an entire database abstraction layer like PDO, PEAR DB, or others. They seem like overkill to me. A simple Decorator seemed the best solution.

Be advised I’m not making any claims of improving the MySQLi class. This is merely a Decorator to handle error logging and escaping, nothing more. Here’s an overview of the object I created:

class eMySQLi extends MySQLi
{
    public $errLog = '/path/to/php-err.log';
    public $errMail = 'you@yourmail.com';

    private $_lastErr;
    private $_lastErrNo;

    const HOST = '127.0.0.1';
    const USER = 'db_user';
    const PASS = 'db_pass';

    private static $_connections = array();

The class variables and constants are fairly straightforward. The public $errLog and $errMail variables tell the object where to log, and optionally send, error messages. The private $_lastErr and $_lastErrNo variables are placeholders for the most recent MySQLi error information. The HOST, USER, and PASS class constants contain the relevant database connection information. The private static $_connections array is a bit more interesting and will be explained with the next block of code.

    public static function getConnection($dbName,
                                         $host=self::HOST,
                                         $user=self::USER,
                                         $pass=self::PASS) {

        if ( !isset(self::$_connections[$dbName]) )
            self::$_connections[$dbName] = new eMySQLi($host, $user, $pass, $dbName);

        return self::$_connections[$dbName];
    }

I like the idea of using singletons for database connections. However, in my applications, I frequently need access to more than one database at a time (e.g. application data, session data, etc…). This is where the private static $_connections array comes into play.

The getConnection function accepts up to four parameters:  the name of the database and the host name, user name, and password for the database connection. The database connection information is optional and defaults to the previously defined class constants. The function first checks to see if a key of $dbName exists in the $_connections array. If not, it creates a new connection and stores it in the array. It then returns the connection to the caller. Since the function and the array are declared static, I can use the following snippet to easily share a connection anywhere in my code:

$mysqli = eMySQLi::getConnection('db_name');

Luckily, I am using PHP, so I don’t have to worry about multiple threads, concurrency, mutual exclusion, or other nasty issues that can cause subtle errors with singleton code. Since the connection isn’t created until I need it, this is an example of lazy instantiation. Since the method can return different connections based on the $dbName parameter, it’s a very simple example of the factory method pattern, too. I won’t go any deeper into those concepts here. Use the links if you are interested in more information on these subjects.

Next up is the constructor:

    private function __construct($host, $user, $pass, $db='') {
        parent::__construct($host, $user, $pass, $db);

        // $this->connect_error was broken until PHP 5.2.9 and 5.3.0,
        // so use the procedural call to check for errors
        if ( mysqli_connect_error() ) {
            $this->_lastErr   = mysqli_connect_error();
            $this->_lastErrNo = mysqli_connect_errno();

            // If we can't connect, treat this as fatal error
            $this->_handleError(true);
        }
    }

Notice that I declare the constructor as private, in keeping with the singleton pattern. This prevents an eMySQLi object from being directly instantiated.

// This will produce an error!
$mysqli = new eMySQLi('host', 'user', 'pass', 'db_name');

PHP Fatal error:  Call to private eMySQLi::__construct() from invalid context...

I can only instantiate an object through the getConnection method, which helps to ensure that I create only one shared connection per database. Notice that I use the procedural mysqli_connect_error function to test for errors. The object oriented connect_error and connect_errno methods were broken until PHP 5.2.9. If there is an error, I store the values in the $_lastErr and $_lastErrNo class variables, and call the private _handleError method.

Now, I can finally show you why I went through the trouble of creating this wrapper object!

    private function _handleError($fatal=false, $mailFlag=false) {
        $errString = '[' . date('Y-m-d H:i:s') .
                     "]\n(errno: $this->_lastErrNo) $this->_lastError\n";

        // Grab the backtrace and remove the first entry,
        // which is always the current function
        $trace = debug_backtrace();
        array_shift($trace);

        foreach ( $trace as $t ) {
            if ( isset($t['file']) )
                $errString .= basename($t['file']) . ':' . $t['line'] . '  ';
            else
                $errString .= 'PHP Compiled Code  ';

            if ( isset($t['class']) )
                $errString .= $t['class'] . $t['type'];

            $errString .= $t['function'];

            $args = array();
            if ( is_array($t['args']) ) {
                foreach ( $t['args'] as $a ) {
                    switch ( gettype($a) ) {
                        case 'integer':
                        case 'double':
                            $args[] = $a;
                            break;
                        case 'string':
                            $replace = array("\r\n", "\n", "\r");
                            $a = str_replace($replace, ' ', $a);
                            $a = strlen($a) > 64 ? substr($a, 0, 64) . '...' : $a;
                            $args[] = "\"$a\"";
                            break;
                        case 'array':
                            $args[] = 'Array(' . count($a) . ')';
                            break;
                        case 'object':
                            $args[] = 'Object(' . get_class($a) . ')';
                            break;
                        case 'resource':
                            $args[] = 'Resource(' . strstr($a, '#') . ')';
                            break;
                        case 'boolean':
                            $args[] = $a ? 'true' : 'false';
                            break;
                        case 'NULL':
                            $args[] = 'NULL';
                            break;
                        default:
                            $args[] = 'Unknown';
                    }
                }
            }

            $errString .= '(' . (empty($args) ? '' : implode(', ', $args)) . ")\n";
        }

        if ( $mailFlag ) {
            $subject = "MYSQLI ERROR: $this->_lastError";
            mail($this->errMail, $subject, $errString);
        }

        $r = error_log("$errString\n", 3, $this->errLog);

        if ( $fatal )
            exit();
        else
            return $r;
    }

The _handleError method is the raison d’etre (or the reason for existence, for those of you who didn’t take French in High School) for the eMySQLi object. It encapsulates the logging, and optionally the emailing, of any errors associated with the object. _handleError accepts two parameters:  $fatal and $mailFlag, which both default to false. If true, the $fatal parameter will cause script execution to stop after the error has been logged and the $mailFlag parameter will cause an email notification of the error to be sent to the $errMail address. Since I am using the built-in mail function, $errMail can be one or more email addresses separated by commas. Notice in the constructor above that I set $fatal to true if there is an error connecting to the database. Typically, if one of my scripts can’t connect to the database, there’s no reason to continue running it. As always, your mileage may vary (YMMV)…

The _handleError method uses the debug_backtrace function to gather information about the error. debug_backtrace returns an array that contains the file  name, line number, function name, arguments, as well as other information about each step in the call stack. _handleError uses that information to create a prettified error string. Here is an example of the _handleError output:

[2009-07-16 05:36:45]
(errno: 1054) Unknown column 'category_ids' in 'having clause'
SQL: SELECT COUNT(DISTINCT s.scene_id) as total FROM store st... <snip>
test-search.php:65
eMySQLi->query("SELECT COUNT(DISTINCT s.scene_id) as total FROM store st,scene s...")

This type of error reporting, with file names, line numbers, and function/method calls for the entire stack, makes it much easier to track down bugs. From here, it is just a matter of wrapping the other MySQLi methods where I want error reporting added. Any method I don’t override is still available to eMySQLi objects through the beauty of inheritance. Personally, I make extensive use of the query method and not much else, but I’ve wrapped a few others to be more complete. This class could easily be extended to add error logging to all the MySQLi methods, if one chose to do so. This same idea could also be extended to the MySQLi_STMT and MySQLi_Result classes, too. Here are the other methods of the MySQLi class  I have chosen to override:

    public function select_db($dbname, $fatal=true, $mailFlag=false) {
        if ( parent::select_db($dbname) )
            return true;
        else {
            $this->_lastError = $this->error;
            $this->_lastErrNo = $this->errno;
            $this->_handleError($fatal, $mailFlag);
        }
    }

    public function query($query,
                          $resultmode=MYSQLI_STORE_RESULT,
                          $fatal=false,
                          $mailFlag=false) {

        $res = parent::query($query, $resultmode);
        if ( $res === false ) {
            $this->_lastError = $this->error . "\nSQL: $query";
            $this->_lastErrNo = $this->errno;
            $this->_handleError($fatal, $mailFlag);
        }

        return $res;
    }

    public function multi_query($query, $fatal=false, $mailFlag=false) {
        if ( parent::multi_query($query) )
            return true;
        else {
            $this->_lastError = $this->error . "\nSQL: $query";
            $this->_lastErrNo = $this->errno;
            $this->_handleError($fatal, $mailFlag);
            return false;
        }
    }

    public function store_result() {
        $res = parent::store_result();
        if ( $res === false && ($this->errno !== 0 || $this->field_count > 0) ) {
            $this->_lastError = $this->error;
            $this->_lastErrNo = $this->errno;
            $this->_handleError();
        }

        return $res;
    }

Notice in each case that I merely delegate to the corresponding method of the parent class and then test for error conditions. If an error occurs, I call the _handleError method to take care of it. I also have a basic ability to control the execution of the script during severe errors. In the select_db method above I default the $fatal flag to true, just as I do with the initial connection. If a script can’t select the database it needs to use, there is no point for it to continue running after it has logged the error. However, since $fatal is passed as a parameter to the method, it’s easy to change the default behavior.

Now, I am able to simplify my original code and get rid of the extraneous else statements that did the error logging:

$mysqli = eMySQLi::getConnection('db_name');
$sql = 'SELECT * FROM table';
if ( $mysqli->query($sql) ) {
    <do something cool here...>
}
// No else clause needed, because eMySQLi automatically handles any errors

Short and simple, the way I like it!

Lastly, I also wanted to encapsulate a method for escaping client data submitted to the database. What better place for this type of method than my new eMySQLi class?

It’s a pain in the ass, but every single piece of data that originates from the client’s browser MUST be properly escaped to avoid SQL Injection attacks. There are two main ways to accomplish this:  using a function like mysql_real_escape_string on each piece of data or using parameterized queries that do the escaping for you. To me, there are tradeoffs between both methods. Escaping each piece of data can be error prone, if you are not careful. Using parameterized queries typically requires creating a Statement object and binding parameters to that Statement. For queries that are repeated several times in looping structures, using a Statement object can speed things up dramatically. However, for queries that are only executed once, I prefer skipping the extra typing involved in creating another object and binding parameters.

So, how do we escape different types of values correctly if we are not using parameterized queries? Enter the public escape method:

    public function escape($value, $type='none') {

        $value = trim($value);
        switch ( $type ) {
            case 'none':
            case 'escape':
                // Only perform the escaping functions
                $value = get_magic_quotes_gpc() ? stripslashes($value) : $value;
                $value = $this->real_escape_string($value);
                break;
            case 'string':
            case 'text':
                $value = get_magic_quotes_gpc() ? stripslashes($value) : $value;
                $value = "'" . $this->real_escape_string($value) . "'";
                break;
            case 'int':
            case 'long':
                $value = intval($value);
                break;
            case 'bigint':
                $value = ctype_digit($value) ? $value : 0;
                break;
            case 'float':
            case 'double':
                $value = floatval($value);
                break;
            case 'date':
            case 'datetime':
                if ( empty($value) )
                    $value = "'0000-00-00'";
                else {
                    $value = get_magic_quotes_gpc() ? stripslashes($value) : $value;
                    $format = ($type == 'date' ? 'Y-m-d' : 'Y-m-d H:i:s');
                    $ts = strtotime($value);
                    if ( $ts && $ts > -1 )
                        $value = "'" . date($format, $ts) . "'";
                    else
                        $value = "'0000-00-00'";
                }
                break;
        }

        return $value;
    }
} // End of eMySQLi class

The escape method accepts two parameters: $value and $type. The $value parameter is, obviously, the value that needs to be escaped. The $type parameter determines what kind of escaping takes place. By specifying the correct type, you can ensure that the data received from the client is either quoted and escaped (string) or cast to the correct value (int, float, double). The date and datetime types ensure $value is a valid date and then format it for the database. In practice, my typical implementation of the escape method would be:

$mysqli = eMySQLi::getConnection('db_name');
$sql = sprintf('SELECT * FROM table WHERE stringField = %s AND intField = %s',
               $mysqli->escape($_POST['myString'], 'string'),
               $mysqli->escape($_POST['myInt'], 'int'));
if ( $mysqli->query($sql) ) {
    <Do something cool...>
}

Here, I use the sprintf function and my escape method to accomplish the same task as the Statement object would with a parameterized query. The data from a client’s POST is sanitized and made safe for submitting to the database. This just seems simpler to me than the extra object creation and parameter binding that the Statement object requires. I will leave it up to you to decide which method is better suited to your needs.

Hopefully, this exercise has given you an idea of how easy it is to subclass and extend the functionality of even built-in PHP classes. A Decorator merely adds extra abilities to an existing object. In my case, I was tired of writing else statements to handle database error logging in the event a MySQLi method failed. Now, errors are always logged automatically without my having to worry about it and my code is shorter, cleaner and easier to read. I also didn’t have to resort to a large abstraction layer like PDO or PEAR DB, and add that extra overhead to my scripts, just to get some logging features. MySQLi is already object oriented and, as I said, MySQL is the only database I use. So, I didn’t feel another layer was necessary.

If you made it all the way through this post, thanks for reading! Feel free to let me know what you think. A full version of the eMySQLi.php class, with code commenting, is available here.

Filed in PHP, Technology No Responses yet

Recursive Directories Into Multi-dimensional Array

I was messing around the other day with creating a multi-dimensional array of file names from recursively searching a given file system path. I tried the RecursiveDirectoryIterator and the RecursiveIteratorIterator from the SPL. While they are both very useful and fairly simple to implement, I needed to keep the multi-dimensional nature of the file system tree intact rather than flatten it out as the SPL recursive iterators tend to do. I also needed a count of the files processed at each level of the array, so each directory would know how many files were in all of it’s subdirectories — without having to foreach through all the dimensions of the array and count() every subarray.

Anyway, the solution was pretty straightforward. Here’s my implementation:

$fileArray = array();
$filePath = '/var/www';
$filter = '#\.(gif|jpe?g|png)$#i';
$fileCount = getFileInfo($filePath, $fileArray, $filter);

function getFileInfo($filePath, &$fileArray, $filter='') {
    $leaves = 0;
    $d = new DirectoryIterator($filePath);
    foreach ( $d as $f ) {
        $name = $f->getFilename();
        if ( substr($name, 0, 1) == '.' )
            continue;
        elseif ( $f->isDir() ) {
            $fileArray[$name] = array();
            $sub_leaves = getFileInfo("$filePath/$name", $fileArray[$name], $filter);
            $fileArray[$name]['fileCount'] = $sub_leaves;
            $leaves += $sub_leaves;
        }
        elseif ( !$filter || preg_match($filter, $name) ) {
            $fileArray[] = $name;
            $leaves++;
        }
    }
    return $leaves;
}

The code above will create a multi-dimensional array ($fileArray) which contains all the directories/files below the given path ($filePath), optionally filtering the file names by the perl regex supplied ($filter). The getFileInfo() function calls itself recursively, creating a new subarray for each directory it finds. Each array in the structure also has a fileCount key which is a count of all the file names in every subarray below it. The function returns a total count of all file names in the structure.

For my application, I wanted to automatically filter any files/directories that started with a dot character and not just the normal ./ and ../ files (e.g. .svn/, .cvs/, various config files, etc…). Your mileage may vary…

Lastly, this function requires PHP 5+ with the Standard PHP Library (SPL) available. The prettified source is available here.

Filed in PHP, Technology 2 Comments so far

Change in CURLOPT_NOBODY option

Be careful when upgrading PHP, if you’re using cURL. Somewhere between PHP 5.2.6 and 5.2.9, the definition of the CURLOPT_NOBODY option was changed. After upgrading (under duress — but that’s a different story…), all my ecommerce scripts that handled posting customer data to our processing company were completely fuxored. My main ecommerce site was on it’s proverbial ass, and for no apparent reason.

As it turns out, the cURL developers decided to change CURLOPT_NOBODY and make it reset the request type of the cURL object to a HEAD request. In my case my processor expects a POST, with the data for the transaction in the body of the request. Changing my requests to HEAD obviously caused a problem and it was pretty difficult to find the culprit.

Don’t ask me why I was using CURLOPT_NOBODY. It was part of some example code that I copied when first experimenting with cURL a long time ago. It worked, at the time, and that’s all I was concerned about. Anyway, after much hand-wringing and gnashing of teeth, I finally figured it out and removed the offending option. I just wish they would announce changes like this a little more broadly. Making subtle changes behind the curtain that break existing code is not good programming practice — especially for a library as popular as cURL.

Filed in PHP, Technology No Responses yet

Problems with vi, vim and the paste command

Have you ever tried to cut and paste text into vi or vim through a terminal, only to find the formatting completely screwed up? Yeah, me too…  I finally came across an answer to this pesky problem and it’s so simple, I’m surprised it wasn’t easier to find. The next time you want to paste text with tabs into vi or vim, try the following:

Make sure you’re in command mode by pressing escape, then enter

:set paste

Don’t forget the colon at the beginning of the command. Now, when you enter edit mode, you’ll see

-- INSERT (paste) --

at the bottom of the screen. Also, anything you paste into the terminal will retain it’s formatting. Pretty simple, eh? To exit this mode, just type

:set nopaste

Once again, make sure you are in command mode before you type this and don’t forget the leading colon. Hopefully, this will save someone else the hassle of digging around for hours trying to find this simple fix…

Filed in Bash Scripting, Technology No Responses yet

Quick NIS passwd to LDAP script

I found this awk script while browsing around on the web. I was looking for a simple way to migrate an NIS+ passwd file to LDAP config and this is what I found. Unfortunately, when I went to cut and paste it into a script on my server, I realized it had a bunch of smart quotes in it. So, here is a cleaned up version:

#!/bin/bash

awk -F: '{
            print "dn: cn="$1",ou=People,dc=mydomain,dc=com"
            print "objectClass: top"
            print "objectClass: person"
            print "objectClass: organizationalPerson"
            print "objectClass: inetOrgPerson"
            print "objectClass: posixAccount"
            print "objectClass: inetLocalMailRecipient"
            print "objectClass: shadowAccount"
            print "uid: "$1
            gfields = split($5,gecos,",")
            namefield = split(gecos[1], fullname, " ")
            print "sn: " fullname[namefield]
            print "givenName: "fullname[1]
            print "cn: " $1
            print "userPassword: {crypt}"$2
            print "loginShell: "$7
            print "uidNumber: "$3
            print "gidNumber: "$4
            print "homeDirectory: "$6
            print "gecos: "$5
            print "mail: "$1"@mydomain.com"
            print "displayName: " gecos[1]
            print ""
        }'

Follow m0j0’s instructions for using it from the link above — and don’t forget to change the mydomain string, if necessary. I also found it helpful to capture my NIS passwd info into a file before running this script. It gave me a chance to remove old stuff I didn’t need anymore. To do that, just ypcat passwd > nis_passwd, make your edits, and then cat nis_passwd | nis2ldap > users.ldif.

Thanks, m0j0!

Filed in Bash Scripting, Technology One Response so far

New Blog

I’ve been wanting a place to put my ramblings for a while and, since I’m a programmer, I finally berated myself enough to install Wordpress. I guess it’s about time, since I’ve had this domain sitting here dormant for over ten years. (!!)

So, welcome…

Filed in Life No Responses yet