Shuffling and Random Thoughts

28 10 2010

A riffle shuffle being performed during a game...

A riffle shuffle being performed during a game of poker at a bar near Madison, Wisconsin (Photo credit: Wikipedia)

Sometimes you need to shuffle things when you work with PHP, that is to order a set of values where each appears in a randomized order. So, we can think of shuffling as anti-sorting as mentioned in a previous post.

Consider the recommendation of a user on Yahoo Answers who recommends PHP’s shuffle() to another user needing to randomize a playlist of mp3 titles. That is a good answer if you’re working with PHP5. But, back when PHP4 was the state of coolness, that function originally failed to work as expected and only later became a viable option.  So, what would you have done if you had been unable to use shuffle() to arrange a deck of virtual cards?  You probably would have looked for a shuffle algorithm or better yet find some one else’s PHP code based on such an algorithm.

There is an interesting read at http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle which discusses two algorithms, that of Fisher-Yates and Durstenfeld’s. The article points out that there are actually two key issues involved with creating the shuffle algorithm, namely working out its logic and using a good random number generator.

Here’s some code that I wrote to generate a random image way back in 2002:

$a = array("104","50","48","332","31",
"307","306","302","204","194","192","162");

$dex = rand(0,(count($a) - 1) ) // generate integer btw 0 and 11
$graphic = "./aceofspace/ace" . $a[$dex] . ".jpg";
echo '<img alt="" src="' . $graphic . '" />';

The code might look like it would generate a random graphic but in truth when I ran it numerous times in succession certain numbers came up with greater frequency than others, so its value as a random number generator seems to live up to the online manual’s description of it as a pseudo-random generator.

PHP’s mt_rand() is supposed to generate better random values so let’s give it a try, and replace rand() with it in the code below:

$a = array("104","50","48","332","31",
"307","306","302","204","194","192","162");

$dex = mt_rand(0,(count($a) - 1) )  // generates an integer between 0 and 11
$graphic = './aceofspace/ace' . $a[$dex] . '.jpg';
echo '<img alt="" src="' . $graphic . '" />';

mt_rand() also generates certain random numbers with more frequency than others, thus it seems similar to rand() in terms of randomness but according to the online manual, it is speedier than rand().

When attempting to devise  a shuffling algorithm, it helps if you can recall your grade school math lessons that dealt with probability. So, if you have 3 elements, how many different ways can they be arranged? Mathematically, we’re looking at 3 * 2 * 1 or 6 possibilities. So if  we have a set of numbers {1,2,3,4,5} we’re looking at 5! (five factorial) possibilities.   Each time a pair of  elements  in an array swap values there is one less swapping choice the next time we repeat this operation.

Back in the days prior to PHP4.3, you had understand the mathematical aspects of shuffling and write your own PHP algorithm or else be fortunate enough to have bought a copy of O’Reilly’s PHP Cookbook by David Sklar and Adam Trachtenberg.   I actually still retain my copy and recollect glossing over their algorithm. At the time, I came to the conclusion that their shuffling algorithm was inapplicable for my project because  I was only seeking one random value. But, now I wonder if  I had used the algorithm and then randomly selected an element from the shuffled arrangement of the jpeg titles in my array, would the result  have been better?  I’ll take up this question again a  little later.

So, here’s the PHP code from the O’Reilly book:

<?php
function pc_array_shuffle($array) {
$i = count($array);

while(--$i) {
$j = mt_rand(0, $i);

if ($i != $j) {
// swap elements
$tmp = $array[$j];
$array[$j] = $array[$i];
$array[$i] = $tmp;
}
}

return $array;
}

This is one way of implementing the Fisher-Yates algorithm. Here’s another way that is my version — note how I  use list() to do the swapping of values n one statement without resorting to a temporary variable. Now I read somewhere that someone thought it took longer with list() — I’ll leave the benchmarking for you to decide as I am fine with it for now. I also eliminated the test to see if $i and $j were the same because  performance is enhanced by eliminating the test and there’s no harm having an element swap its value with itself, strange as that may sound. So, I include below my code based also on Fisher-Yates. Also, I include some debug code so you can see the element swapping in action.

function my_array_shuffle($array) {

echo "The array to shuffle, ie sort in a randomized fashion:
";
var_dump($array);

    $start = count($array) - 1;       // last element in array
    for ($i = $start; $i > 0; $i--) { // swapping choices shrink
        $j = mt_rand(0, $i);

		echo 'Swapping randomly selected [ ';
        print_r($j);
		echo ' ] with [ ';
		print_r($i);
		echo ' ]
';

        // swapping elements ...
         list ( $array[$i], $array[$j] ) = array($array[$j], $array[$i]);
    }
    return $array;
}

$a = array(22,33,44,55);
var_dump(my_array_shuffle($a));

Now, to return to the issue I had before getting a random graphic to display on a web page, let’s see what happens if I shuffle the “deck”, i.e. the jpeg names and then randomly extract one of those images. Will the result be a little bit more random? Let’s find out. While I could use PHP’s shuffle() , I’m going to opt to using a slightly revised version of my_array_shuffle().

function my_array_shuffle($array) {
    $the_end = count($array) - 1;
    for ($i = $the_end; $i > 0; $i--) {
        $j = mt_rand(0, $i);

        // swapping element values ...
         list ( $array[$i], $array[$j] ) = array($array[$j], $array[$i]);
    }
    return array($array, $the_end);
}

$a = array("104","50","48","332",
            "31","307","306","302",
           "204","194","192","162");
list( $shuffled_jpegs, $the_end ) = my_array_shuffle( $a );
$dex = mt_rand( 0, $the_end );
$graphic = './aceofspace/ace' . $shuffled_jpegs[$dex] . '.jpg';
echo '<img alt="" src="' . $graphic . '" />';

The above code  improves the random generation of a graphic because the randomness seems more equally distributed. If you’re interested in issues creating a bias in a random generator, read http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#Fisher_and_Yates.27_original_method, namely where the author discusses “Modulo bias”.

This work is licensed under a Creative Commons License
.

Advertisements

Actions

Information

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




%d bloggers like this: