
Analysis of URL

If your PHP software is going to be installed by other people without your personal involvement, some tricky problems arise. The software usually needs to work out how the site is being accessed, and that involves more than which domain name points at the web site. The site may be reached over http or https, possibly on a non-standard port, and the software may be installed in a subdirectory rather than in the document root.

We assume that the received URI has not been subjected to a SEF (search engine friendly) transformation, or that it has already been returned to its standard form. We also assume that entry to the software is always through index.php or a file such as index2.php (that is, 'index' followed by a single digit). This is often a sensible practice: it makes it easy to block direct execution of other PHP files through the web server configuration, rather than relying on blocking code in every PHP file.

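    // Split at the entry script: '/sub/index.php?x=1' gives '/sub' and '?x=1'
    $splituri = preg_split('#/index[0-9]?\.php#', $_SERVER['REQUEST_URI']);
    // If the script name matched, the first part is already the subdirectory
    // (dirname would strip one level too many); otherwise drop the last segment
    $subdirectory = (count($splituri) > 1) ? $splituri[0] : dirname($splituri[0]);
    if (1 == strlen($subdirectory)) $subdirectory = '';
    $config['subdirlength'] = strlen($subdirectory);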

At this point, we've worked out how much of the URI is subdirectory. A result of length one can only be a lone slash, meaning there is no subdirectory, so it is replaced by an empty string.
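
To see how the subdirectory detection behaves, here is a minimal sketch that wraps the same idea in a function so it can be tried against sample URIs without a web server. The function name `detectSubdirectory` is hypothetical. The sketch assumes that when the entry script is found in the URI, the text before it is taken as the subdirectory as-is, and that `dirname` is only needed when no entry script appears.

```php
<?php
// Hypothetical helper, not part of the original code: returns the subdirectory
// portion of a request URI, or '' when the software lives in the document root.
function detectSubdirectory(string $requestUri): string
{
    // Split at the entry script: '/shop/index.php?id=3' gives ['/shop', '?id=3'].
    $parts = preg_split('#/index[0-9]?\.php#', $requestUri);
    // With a match, the first part is already the directory; without one,
    // drop whatever follows the last slash.
    $subdirectory = (count($parts) > 1) ? $parts[0] : dirname($parts[0]);
    // A result of one character or less means there is no subdirectory.
    return (strlen($subdirectory) <= 1) ? '' : $subdirectory;
}

foreach (['/index.php', '/shop/index.php?id=3', '/a/b/index2.php/extra'] as $uri) {
    printf("%-24s => '%s'\n", $uri, detectSubdirectory($uri));
}
```

Passing the URI as a parameter, rather than reading `$_SERVER` inside the function, is only a convenience for experimenting from the command line.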

    // HTTPS may hold 'off' in any letter case for plain http, so normalise before comparing
    $scheme = isset($_SERVER['HTTP_SCHEME']) ?
        $_SERVER['HTTP_SCHEME'] : ((isset($_SERVER['HTTPS']) AND
                    strtolower($_SERVER['HTTPS']) != 'off') ? 'https' : 'http');
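
The scheme logic can be checked in isolation with a small sketch like the following. The function name `detectScheme` is hypothetical, and it takes an array shaped like `$_SERVER` so that different server setups can be simulated. As an extra assumption beyond the code above, an empty `HTTPS` value is treated as plain http, since the PHP manual describes `HTTPS` as non-empty for secure requests.

```php
<?php
// Hypothetical helper: works on any array shaped like $_SERVER, so it can be
// exercised without a web server.
function detectScheme(array $server): string
{
    // Some setups expose the scheme directly under this key.
    if (isset($server['HTTP_SCHEME'])) {
        return $server['HTTP_SCHEME'];
    }
    // HTTPS is normally absent or empty for plain http; IIS with ISAPI sets 'off'.
    $https = isset($server['HTTPS']) ? strtolower($server['HTTPS']) : 'off';
    // Assumption: an empty value also means plain http.
    return ($https !== '' && $https !== 'off') ? 'https' : 'http';
}

var_dump(detectScheme(['HTTPS' => 'on']));   // secure request
var_dump(detectScheme(['HTTPS' => 'OFF']));  // IIS-style plain request
var_dump(detectScheme([]));                  // nothing set at all
```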

Deriving the scheme was relatively easy, although it had to take account of the different ways in which the information is presented through the $_SERVER superglobal. Getting at the port is more difficult:

    if (isset($_SERVER['HTTP_HOST'])) {
        $withport = explode(':', $_SERVER['HTTP_HOST']);
        $servername = $withport[0];
        if (isset($withport[1])) $port = ':'.$withport[1];
    }
    elseif (isset($_SERVER['SERVER_NAME'])) {
        $servername = $_SERVER['SERVER_NAME'];
    }
    else trigger_error(T_('Impossible to determine the name of this server'),
        E_USER_ERROR);
    if (!isset($port) AND !empty($_SERVER['SERVER_PORT'])) {
        $port = ':'.$_SERVER['SERVER_PORT'];
    }
    if (isset($port)) {
        if (($scheme == 'http' AND $port == ':80') OR 
        ($scheme == 'https' AND $port == ':443')) $port = '';
    }
    else $port = '';
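
The same host and port logic can be sketched as a function for experimentation. The name `detectHostAndPort` is hypothetical, and the sketch throws a `RuntimeException` where the code above calls `trigger_error` with the translation function `T_()`, purely so that it runs on its own. Like the original, it splits the host on the first colon, so it is assumed that the host is a name or IPv4 address rather than a bracketed IPv6 literal.

```php
<?php
// Hypothetical helper: returns [server name, port suffix] from a $_SERVER-like array.
function detectHostAndPort(array $server, string $scheme): array
{
    $port = null;
    if (isset($server['HTTP_HOST'])) {
        // The Host header may carry a port, e.g. 'example.com:8080'.
        $withport = explode(':', $server['HTTP_HOST']);
        $servername = $withport[0];
        if (isset($withport[1])) {
            $port = ':' . $withport[1];
        }
    } elseif (isset($server['SERVER_NAME'])) {
        $servername = $server['SERVER_NAME'];
    } else {
        throw new RuntimeException('Impossible to determine the name of this server');
    }
    // Fall back to the port the web server reports.
    if ($port === null && !empty($server['SERVER_PORT'])) {
        $port = ':' . $server['SERVER_PORT'];
    }
    // Default ports are implied by the scheme, so leave them out of the URL.
    if ($port === null
        || ($scheme === 'http' && $port === ':80')
        || ($scheme === 'https' && $port === ':443')) {
        $port = '';
    }
    return [$servername, $port];
}

print_r(detectHostAndPort(['HTTP_HOST' => 'example.com:8080'], 'http'));
print_r(detectHostAndPort(['HTTP_HOST' => 'example.com', 'SERVER_PORT' => '443'], 'https'));
```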

Note that it may prove impossible to obtain the name of the server, in which case an error is raised. Otherwise, we now have all the pieces, which allows us to save the site URI as accessed and to record its 'secure' and plain http forms. The 'secure' form is simply the URI as accessed, so it is only truly secure when the request arrived over https:

    $afterscheme = '://'.$servername.$port.$subdirectory;
    $config['live_site'] = $config['secure_site'] = 
        $_SESSION['aliro_live_site'] = $scheme.$afterscheme;
    $config['unsecure_site'] = $_SESSION['aliro_unsecure_site'] 
        = 'http'.$afterscheme;
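
As an illustration of what the assembled values look like, here is a minimal sketch with the pieces passed in as parameters. The function name `buildSiteUrls` and the sample host `example.com` are assumptions for the example, and the session storage is left out so that it runs from the command line.

```php
<?php
// Hypothetical helper: assembles the site URLs from the pieces derived earlier.
function buildSiteUrls(string $scheme, string $servername, string $port, string $subdirectory): array
{
    $afterscheme = '://' . $servername . $port . $subdirectory;
    return [
        // The site exactly as it was accessed.
        'live_site'     => $scheme . $afterscheme,
        // Same as live_site, so only https when the request itself was https.
        'secure_site'   => $scheme . $afterscheme,
        // Plain http form; any non-default port is carried over unchanged.
        'unsecure_site' => 'http' . $afterscheme,
    ];
}

print_r(buildSiteUrls('https', 'example.com', '', '/shop'));
```

One thing the sketch makes visible: if https is served on a non-standard port, the plain http form keeps that same port, which may not be where the http site actually listens.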

The code is believed to work reliably in a variety of circumstances, but if you know better, please add your comment!

02/23/2011 3:23pm by Martin Brampton