Posts tagged ‘code’


How to use variable variables in PHP

One of the biggest time-savers in PHP is the ability to use variable variables.  While often intimidating for newcomers to PHP, variable variables are extremely powerful once you get the hang of them.

Variable variables are just variables whose names can be programmatically set and accessed.  For example, the code below creates a variable called $hello and outputs the string “world”.  The double dollar sign declares that the value of $a should be used as the name of the newly defined variable.

<?php
$a = 'hello';
$$a = 'world';
echo $hello;
?>

When I started with PHP about 10 years ago, everyone was still running PHP with register_globals turned on.  That meant that anything you passed as a GET variable could be used as a local variable.  It was very convenient, but unfortunately not very secure.  For me, typing $HTTP_GET_VARS['count'] just wasn’t as fun as being able to use $count.  I found myself adding long declaration lists to the top of my files that did nothing but convert my GET/POST variables to local variables.  My code started to look like this:

<?php
$salutation = $HTTP_GET_VARS['salutation'];
$fname = $HTTP_GET_VARS['fname'];
$lname = $HTTP_GET_VARS['lname'];
$email = $HTTP_GET_VARS['email'];
...
?>

Do that for a couple dozen variables and you’ll start telling yourself there has to be a better way.  Nowadays you can use $_GET instead of $HTTP_GET_VARS, but the better solution is to use variable variables. Now my code looks more like this:

<?php
// create an array of all the GET/POST variables you want to use
$fields = array('salutation','fname','lname','email','company','job_title','addr1','addr2','city','state',
                'zip','country','phone','work_phone');

// convert each REQUEST variable (GET, POST or COOKIE) to a local variable
foreach($fields as $field)
    ${$field} = sanitize($_REQUEST[$field]);
?>

This has several benefits.  I reduced 14 lines of code down to 3.  I now have one place to sanitize all my external input. And if I ever decide to change a variable name, I have one less place in my code to fix.

The benefit of this technique increases as you use the $fields array throughout your code.  I now use the $fields array when saving my form data to the database.  I use it for loading existing user values from the database.  I use it for passing my form fields back to Smarty:

<?php
$form = array();
foreach($fields as $field)
    $form[$field] = $_REQUEST[$field];  // key by field name so the template can look values up
$smarty->assign('form',$form);
?>

Variable variables have become one of my favorite features of PHP. They’ve allowed me to tighten up a lot of my code and made it a lot more maintainable.

Have you done anything cool with variable variables?  What other PHP tricks have revolutionized the way you write code?


Managing code releases

Recently I decided to streamline my code release process. I use Subversion for my source control, which means I push code live by running svn up on each of our production servers. I’m lazy, so I wanted an easier way to do this all at once. The end result is a simple shell script that lets me run svn update commands on multiple servers at once. It shows me the status of svn on each server and gives me a chance to confirm that everything is okay before going ahead with the launch.

This example assumes you have two servers (app1 and app2) that are using public key authentication. Obviously, you’ll need to modify this script to work in your own environment. Make sure you replace “/var/www/” with your own document root and change appX.yourdomain.com to the hostname or IP address of each production server.

#!/bin/sh

# connect to each server and echo their current status
printf 'Connecting to app1...\n\n'
ssh app1.yourdomain.com 'cd /var/www/; svn status --show-updates; exit'
printf '\nConnecting to app2...\n\n'
ssh app2.yourdomain.com 'cd /var/www/; svn status --show-updates; exit'
# add additional servers here as needed
tput smso
# confirm the release before publishing
printf '\nDo you want to publish these changes to production? (y/n)\n\n'
tput rmso
read answer
if [ "$answer" = "y" ]; then
    # if "y", proceed with the release
    printf '\nPublishing to production...\n'
    printf '\nPublishing to app1...\n'
    ssh app1.yourdomain.com 'cd /var/www/; svn up; exit'
    printf '\nPublishing to app2...\n'
    ssh app2.yourdomain.com 'cd /var/www/; svn up; exit'
    # add additional servers here as needed
    printf '\nDone\n'
else
    # anything other than "y" cancels the release
    printf '\nCanceled\n'
    exit
fi

Permanent links to profile pictures on twitter

Twitter currently does not offer permanent links for its users’ profile pictures. This means that if you do any caching of Twitter profile pictures, you stand a good chance of the image being gone by the time you try to display it. We’ve been running into this problem a lot lately at EventVue, but until recently I hadn’t taken the time to try to fix it. Thankfully, someone else solved the problem for me.

Last week Pete Warden wrote about a project that Shannon Whitley started that provides a simple solution for the roaming profile picture. The SPIURL project is a small Python script that is designed to run on Google App Engine. It caches the profile URLs, but checks that the profile image still exists before returning the picture. The end result is a static URL that can be used to retrieve any profile picture from Twitter. For example, http://purl.org/net/spiurl/joshfraser returns my profile picture even if I upload a new one to Twitter.

It’s a great script as it is, but I made a few modifications of my own. The main thing I added was the ability to specify which size of picture you want — either the 48×48 thumbnail or the original. I also added a content-type header to make it easier to view the picture in a browser. You can download my modified version if you’d like.

Thanks Shannon and Pete for sharing! I hope this helps someone else as much as it helped me.

Update 3/13/09: Pete discovered that you may need to add authentication to avoid hitting the rate limits. You can keep up with the latest updates to this project over at Google Code.


How to use curl_multi() without blocking

You can find the latest version of this library on GitHub.

A more efficient implementation of curl_multi()
curl_multi is a great way to process multiple HTTP requests in parallel in PHP. curl_multi is particularly handy when working with large data sets (like fetching thousands of RSS feeds at one time). Unfortunately there is very little documentation on the best way to implement curl_multi. As a result, most of the examples around the web are either inefficient or fail entirely when asked to handle more than a few hundred requests.

The problem is that most implementations of curl_multi wait for each set of requests to complete before processing them. If there are too many requests to process at once, they usually get broken into groups that are then processed one at a time. The problem with this is that each group has to wait for the slowest request to download. In a group of 100 requests, all it takes is one slow one to delay the processing of 99 others. The larger the number of requests you are dealing with, the more noticeable this latency becomes.

The solution is to process each request as soon as it completes. This eliminates the wasted CPU cycles from busy waiting. I also created a queue of cURL requests to allow for maximum throughput. Each time a request is completed, I add a new one from the queue. By dynamically adding and removing links, we keep a constant number of links downloading at all times. This gives us a way to throttle the number of simultaneous requests we are sending. The result is a faster and more efficient way of processing large quantities of cURL requests in parallel.

function rolling_curl($urls, $callback, $custom_options = null) {

    // make sure the rolling window isn't greater than the # of urls
    $rolling_window = 5;
    $rolling_window = (sizeof($urls) < $rolling_window) ? sizeof($urls) : $rolling_window;

    $master = curl_multi_init();

    // add additional curl options here
    $std_options = array(CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_MAXREDIRS => 5);
    // custom options go first in the union so they override the defaults
    $options = ($custom_options) ? ($custom_options + $std_options) : $std_options;

    // start the first batch of requests
    for ($i = 0; $i < $rolling_window; $i++) {
        $ch = curl_init();
        $options[CURLOPT_URL] = $urls[$i];
        curl_setopt_array($ch, $options);
        curl_multi_add_handle($master, $ch);
    }

    do {
        while (($execrun = curl_multi_exec($master, $running)) == CURLM_CALL_MULTI_PERFORM);
        if ($execrun != CURLM_OK)
            break;
        // a request was just completed -- find out which one
        $processed = false;
        while ($done = curl_multi_info_read($master)) {
            $processed = true;
            $info = curl_getinfo($done['handle']);
            if ($info['http_code'] == 200) {
                $output = curl_multi_getcontent($done['handle']);

                // request successful.  process output using the callback function.
                $callback($output);
            } else {
                // request failed.  add error handling.
            }

            // start a new request if any urls are left in the queue
            // (it's important to do this before removing the old one)
            if ($i < sizeof($urls)) {
                $ch = curl_init();
                $options[CURLOPT_URL] = $urls[$i++];  // increment i
                curl_setopt_array($ch, $options);
                curl_multi_add_handle($master, $ch);
                $running = true;  // keep looping so the new request gets executed
            }

            // remove the curl handle that just completed, whether it succeeded or failed
            curl_multi_remove_handle($master, $done['handle']);
        }
        // nothing finished on this pass: wait for network activity instead of spinning
        if ($running && !$processed)
            curl_multi_select($master);
    } while ($running);

    curl_multi_close($master);
    return true;
}

Note: I set my max number of parallel requests ($rolling_window) to 5. Be sure to update this value according to the bandwidth available on your server / servers you are curling. Be nice and read this first.

Updated 3/6/09: Fixed a missing semi-colon. Thanks to Steve Gricci for catching the typo.

Updated 4/2/09: Made some changes to increase reusability. rolling_curl now expects a $callback parameter for a function that will process each response. It also accepts an array called $custom_options that lets you add custom curl options such as authentication, custom headers, etc.

Updated 4/8/09: Fixed a new bug that was introduced with the last update. Thanks to Damian Clement for alerting me to the problem.


How to start MAMP on port 80 without a password

I’m a big fan of MAMP. It’s the fastest way for anyone to get set up with a local PHP/MySQL development environment on a Mac. One of the small annoyances with MAMP is that it requires you to enter your password all the time if you want to run it on port 80 (which I do). To be fair, it’s got more to do with UNIX security than MAMP… but it’s still bloody annoying!

I tried Steve Stringer’s technique of using launch daemons, but I just couldn’t get it to work for me.

The trick to getting MAMP to start behind the scenes is knowing that all that pretty GUI does is call a couple shell scripts. Specifically, those scripts are /Applications/MAMP/bin/startApache.sh and /Applications/MAMP/bin/startMysql.sh (assuming you installed MAMP at the default location).

The second thing you should know is that startApache.sh must be run as root, but startMysql.sh must be run as the current user. I created a new shell script to call those scripts appropriately:

sudo /Applications/MAMP/bin/startApache.sh
/Applications/MAMP/bin/startMysql.sh
exit 0

I then added an exception for that script to my sudoers file so I didn’t need to enter a password when I used sudo. The easiest way to add this exception is to use the ‘visudo’ command as root.

Finally, I used Automator to wrap the whole thing up as an application I could add to my dock. It works! One less daily annoyance in my life!

Since writing this, Damian Gaweda has posted a more elegant solution that’s worth checking out.

How to detect the RSS feed for a blog

Ever wondered how to automatically figure out the RSS feed for a blog?

Generally speaking, it’s a simple task — just download the HTML for the given blog and use a fancy regular expression to find the associated RSS feed. In PHP, it looks something like this:

$bloghtml = file_get_contents($blogurl);
preg_match('/<link.*type\s*=\s*["\']*application\/rss\+xml["\']*.*href\s*=\s*["\']?([^\'" >]+)[\'" >]/i', $bloghtml, $match);
$rssurl = $match[1];

The main problem with this approach is that some blogs take a long time to load — and that often translates to your application being slow as well. On top of that, it’s frustrating to have to download and process an entire page of HTML just to extract one URL.

Recently Google came out with a better solution in the form of their AJAX Feed API. Using their API, detecting feeds is now easier, faster and more reliable:

$lookup_url = "http://ajax.googleapis.com/ajax/services/feed/lookup?v=1.0&q=".urlencode($blogurl);
$result = curl($lookup_url);

I’ve been using this API for about a month now and have really appreciated the improvements. If you need to detect feeds, give it a try. I think you’ll like it.


Auto detect a time zone with JavaScript

This blog post will attempt to explain how to automatically detect your user’s time zone using JavaScript. If you’re in a hurry, you can skip directly to the JavaScript timezone detection code on GitHub.

Previous attempts to solve this problem:

Server side:

Time is not included in an HTTP request. This means that there is no way to get your user’s time zone using a server side scripting language like PHP.

IP address geocoding:

Another method that people have used to address this problem is to geocode your visitor’s IP address. IP geocoding is what is used when you go to a website and are shown an ad to “meet other singles in Boulder”. Unfortunately, for simply detecting a timezone, IP geocoding is an expensive way to go. Just check out the prices for Maxmind and ip2location. There’s no way I’m paying for that. I did find a free provider called hostip, but it is worthless as it couldn’t decide whether I live in CA or NC.

With JavaScript:

The common JavaScript that is used to detect a visitor’s timezone is:

var myDate = new Date();
document.write(myDate.getTimezoneOffset());

As I started reading up on the getTimezoneOffset code I realized it was too buggy to be used in any critical application. The function returned inconsistent results in different browsers and it never seemed to account for daylight saving time correctly. It quickly became clear that I was going to have to write my own script if I wanted this to work.

How I ended up doing it:

There are basically two things needed to figure out a visitor’s time zone. First, we need to determine the time offset from Greenwich Mean Time (GMT). This can easily be done by creating two dates (one local, and one in GMT) and comparing the time difference between them:

var rightNow = new Date();
var jan1 = new Date(rightNow.getFullYear(), 0, 1, 0, 0, 0, 0);
var temp = jan1.toGMTString();
var jan2 = new Date(temp.substring(0, temp.lastIndexOf(" ")-1));
var std_time_offset = (jan1 - jan2) / (1000 * 60 * 60);

The second thing that you need to know is whether the location observes daylight saving time (DST) or not. Since DST is always observed during the summer, we can compare the time offset between two dates in January to the time offset between two dates in June. If the offsets are different, then we know that the location observes DST. If the offsets are the same, then we know that the location DOES NOT observe DST.

var june1 = new Date(rightNow.getFullYear(), 5, 1, 0, 0, 0, 0); // months are zero-based, so 5 is June
temp = june1.toGMTString();
var june2 = new Date(temp.substring(0, temp.lastIndexOf(" ")-1));
var daylight_time_offset = (june1 - june2) / (1000 * 60 * 60);
var dst;
if (std_time_offset == daylight_time_offset) {
    dst = "0"; // daylight savings time is NOT observed
} else {
    dst = "1"; // daylight savings time is observed
}
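
For reference, here is a minimal sketch that wraps both steps in one function. It is not the code from the library: instead of round-tripping through toGMTString() it compares local midnight with Date.UTC(), which should give the same sign convention (negative west of Greenwich). The names utcOffsetHours and detectTimezone are made up for this example.

```javascript
// Offset from UTC, in hours, at local midnight on the given date.
// Positive values are east of Greenwich, negative values are west.
function utcOffsetHours(year, monthIndex, day) {
    var localMidnight = new Date(year, monthIndex, day, 0, 0, 0, 0);
    var utcMidnight = Date.UTC(year, monthIndex, day, 0, 0, 0, 0);
    return (utcMidnight - localMidnight.getTime()) / (1000 * 60 * 60);
}

// Compare the January and June offsets, as described above.
function detectTimezone(year) {
    var januaryOffset = utcOffsetHours(year, 0, 1); // month 0 is January
    var juneOffset = utcOffsetHours(year, 5, 1);    // month 5 is June
    return {
        std_time_offset: januaryOffset,
        daylight_time_offset: juneOffset,
        dst: (januaryOffset === juneOffset) ? "0" : "1"
    };
}

var tz = detectTimezone(new Date().getFullYear());
console.log(tz.std_time_offset, tz.dst);
```

The result depends on the time zone of the machine running it, so the values printed will differ from one visitor to the next.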

Once I had this code written, the next step was to compile a list of the various time zones around the world along with their opinions on DST. I actually ended up using the list of time zones from Microsoft Windows. It was rather time consuming to compile this list, so I hope you can make use of my work to save yourself some time.

Please let me know if you have any comments, questions or problems with this code. As with anything that I post on this blog, feel free to use this code however you want. Just don’t blame me if it breaks.

Update (06/27/07):

My code wasn’t correctly detecting timezones in the southern hemisphere. I have added hemisphere detection for all our Aussie friends out there. I also fixed a bug in the convert() function that was leaving off the + sign at certain offsets. Thanks Val for pointing this out and helping me with the fix.
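
A rough sketch of how these two fixes could look is below. It is an illustration under my own assumptions, not the code from the library: classifyOffsets and formatOffset are hypothetical names (formatOffset is not the convert() function mentioned above), and the sketch assumes that DST always moves the clock forward, so the smaller of the two offsets is treated as standard time.

```javascript
// Hypothetical helper: classify a pair of UTC offsets (in hours),
// one measured in January and one measured mid-year.
function classifyOffsets(januaryOffset, midYearOffset) {
    if (januaryOffset === midYearOffset) {
        // same offset all year: no DST, so the hemisphere can't be inferred
        return { std_time_offset: januaryOffset, dst: "0", hemisphere: null };
    }
    return {
        // assumption: DST moves clocks forward, so the smaller offset is standard time
        std_time_offset: Math.min(januaryOffset, midYearOffset),
        dst: "1",
        // a larger January offset means DST is active during the southern summer
        hemisphere: (januaryOffset > midYearOffset) ? "S" : "N"
    };
}

// Hypothetical formatter: decimal hours to a signed "+HH:MM" string.
// The sign is always written, including for zero and positive offsets.
function formatOffset(hours) {
    var sign = (hours < 0) ? "-" : "+";
    var totalMinutes = Math.round(Math.abs(hours) * 60);
    var hh = Math.floor(totalMinutes / 60);
    var mm = totalMinutes % 60;
    return sign + (hh < 10 ? "0" : "") + hh + ":" + (mm < 10 ? "0" : "") + mm;
}

var sydney = classifyOffsets(11, 10);
console.log(formatOffset(sydney.std_time_offset), sydney.hemisphere); // prints "+10:00 S"
```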

Update (10/24/08):

Fixed the bug that Rama and Will pointed out in the comments.

Update (12/22/10):

Jon Nylander has taken my original code and written a more robust solution. Use his version instead.