Reinventing the Wheel

15 04 2011

by Talsi1

Recently,  I experienced yet another “anti-PHP” interview which seems to be the latest trend in interview techniques. It was an interesting experience and I’d like to share it with you.

At the interview, after briefly exchanging pleasantries, the interviewer quickly proceeded to ask various technical questions.   One, in particular, seemed so easy that I wondered if I might have misheard it.  “Could you repeat the question?”, I asked.  “Sure,” Mr. Hiring Manager smiled, “How you would transform two indexed arrays, whose alphabetical values are unordered, such that the two arrays are merged into one and all the values appear sorted in ascending order?”

If you pay close attention to the question, you may realize that it actually contains its own answer, which is as follows:

$array =  array_merge($arr1, $arr2);
sort($array);

The above solution relies on two PHP built-in functions, namely array_merge() and sort().  Array_merge() will return an array which represents a merger of the two other arrays referred to in its parameters. The sorting function is based on the QuickSort (see http://us2.php.net/sort and http://en.wikipedia.org/wiki/Quicksort) algorithm and is very fast. The only caveat is that sort() doesn’t work well with an array of mixed values, such as one containing both numbers and strings, in which case you’d need to devise your own sorting algorithm. But, for yesterday’s technical question the caveat is irrelevant.

The interviewer responded by expressing dissatisfaction with my answer even though I had done it according to The PHP Way, . What’s that??? It means recognizing that PHP is not C nor Java and the way you would work with those languages is not necessarily prudent when working with PHP, owing to its unique and very flexible design.  For example, if  you have a choice between using PHP’s built-in functions or writing your own code to achieve the same results, unless you have a compelling reason, it is preferable to use the built-in functions. Written in the C programming language and already compiled, the built-in functions are easily comprehensible to the parser; PHP doesn’t need to waste time interpreting them. But for any code you write in PHP, the parser will need to interpret that code which makes it slower than using the built-in functions.  In other words, The PHP Way is pragmatic and discourages anyone from reinventing the wheel.

PHP has an abundance  of built-in functions (PHP core and its extensions) at your disposal. BTW, you can see what functions are available in your configuration by running the following snippet:

<?php
$arr = get_defined_functions(); 
$n=0;
foreach ($arr['internal'] as $key => $value){ 
     echo $value.'<br />';
	 $n++; 
} 
echo $n;

“Well, using functions has to slow things down,” the interviewer maintained.  “Solve the problem algorithmically without using PHP’s built-ins which will surely be faster.”   With respect to array_merge(), he may be right (see http://www.bitbybit.dk/carsten/blog/?p=203. So, here’s my algorithmic solution that can quickly be coded on a whiteboard:

// two disordered arrays
$arr1 = array('a','g','d','n','l','z','x');
$arr2 = array('b','h','e','o','m','w','y');

// merging without array_merge()
foreach ($arr2 as $e) {
    $arr1[] = $e;
}

// counting without count()
$count=0;
foreach ($arr1 as $element) {
      $count++;
}

// sorting without sort() 
for ($i=0; $i < $count; $i++)
{
      for ($j=$i; $j < $count; $j++) {
		$temp = '';
		if (  $arr1[$i] > $arr1[$j] ) {
		      $temp = $arr1[$i]; // save it
			  $arr1[$i] = $arr1[$j];
			  $arr1[$j] = $temp;
		}
	  }
}
var_dump($arr1);  // should be sorted in alphabetical order

You could replace the foreach loop for doing the merger with a for loop as follows:

// merging without array_merge()
for($i=0, $max = count($arr2); $i < $max; $i++) {
    $arr1[] = $arr2[$i];
}

I found the recommendation in a comment at http://www.bitbybit.dk/carsten/blog/?p=203. However, when I ran the code on the above tiny dataset, I found that the foreach actually was faster. But, the result might be different with an input of much greater magnitude.

Performance is a complex issue. Apparently a small input may run better with an inefficient algorithm whereas a large data input needs an efficient one (see comment by “fts_technology” at http://forums.devshed.com/php-development-5/php-built-in-sorting-algoritm-fast-357037.html. On the other hand, array size isn’t necessarily the only factor that may effect performance (see http://www.cforcoding.com/2009/07/plain-english-explanation-of-big-o.html.). For example, the memory usage of an efficient algorithm could be greater than that of an inefficient algorithm and thereby negatively impact performance.

Someone on StackOverflow( see http://stackoverflow.com/questions/2473989/list-of-big-o-for-php-functions suggests using the union operator ( + ), saying that it’s faster than using array_merge() noting that the behavior a little different which is truly an understatement. Let’s see what we get by modifying the code accordingly:

<?php
$arr1 = array('a','g','d','n','l','z','x');
$arr2 = array('b','h','e','o','m','w','y');
// merging without array_merge()
$start = microtime(false);
$arr1 += $arr2;

// counting minus count()
$count=0;
foreach ($arr1 as $element) {
      $count++;
}

// sorting without sort()
for ($i=0; $i < $count; $i++)
{
      for ($j=$i; $j < $count; $j++) {
		$temp = '';
		if (  $arr1[$i] > $arr1[$j] ) {
		      $temp = $arr1[$i]; // save it
			  $arr1[$i] = $arr1[$j];
			  $arr1[$j] = $temp;
		}
	  }	
}
$end = microtime(false);
$algorithmic = $end - $start;
var_dump($arr1);  // should be sorted in alphabetical order
var_dump($algorithmic);

Look at the benchmark result and you’re likely to be pleased. Then look at the actual result with respect to what happens to the data and you may experience a very different emotion. What happens when you run the code is a consequence of the array keys in the first array being identical with the keys in the second array. As a result the array elements of the first array are preserved while those in the second array are lost. If you run this code reversing the arrays, then the second array’s elements are preserved instead. The only way union is safe to use is if there are keys in the second array that are different from that of the first and you are okay about loosing elements in the second array whose keys are the same as those in the first array. And, if you reflect back to junior high math, that is the way a union should work, too!

The sorting code I wrote is essentially a bubble sort which is not exactly the fastest kid on the block to win a prize. But it sure is easy to code when you’re on the spot, like in front of a whiteboard. You could implement a QuickSort algorithm in PHP; it’s been done before in recursive and non-recursive formats (see http://istoyanov.blogspot.com/2007/11/quicksort.html.)

Of course, relying on either of the home-made PHP-based QuickSort implementations does beg the question, why reinvent the wheel?

This work is licensed under a Creative Commons License

Advertisements

Actions

Information

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




%d bloggers like this: