Code release: preg_find() – A recursive file listing tool for PHP

Version 2.1

I originally wrote this a few years ago and never really promoted it beyond the realms of the #php IRC channel on EfNet.  However, it has managed to find its way into applications such as WordPress and many other PHP apps.  It is gratifying to know that others are finding it useful.

So what is preg_find() anyway? A short summary for those who have never encountered it: imagine a recursion-capable glob() with the ability to filter the results with a regex (PCRE), plus various arguments that modify the results to bring back additional data.
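To make the idea concrete, here is a minimal sketch of a recursive listing filtered by a regex. It is not the real preg_find(): the function name regex_find_sketch() is made up for illustration, it assumes the pattern is matched against the bare filename, and it leans on PHP's SPL directory iterators rather than whatever preg_find.php does internally.

```php
<?php
// Hypothetical helper, NOT the real preg_find(): recursively list the files
// under $dir whose filename matches the PCRE $pattern.
function regex_find_sketch($pattern, $dir)
{
    $found = array();
    // SKIP_DOTS leaves out the "." and ".." entries
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $path => $info) {
        // Assumption: match against the filename only, not the full path
        if ($info->isFile() && preg_match($pattern, $info->getFilename())) {
            $found[] = $path;
        }
    }
    sort($found); // stable, predictable order for the caller
    return $found;
}
```

Calling regex_find_sketch('/\.php$/D', '../code') would return the full paths of every .php file below ../code.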

Well, today I thought I would add one commonly requested feature: sorting.  Using the power of PHP’s anonymous (lambda-style) functions, preg_find() now creates a custom sort routine based on the arguments passed in. You can sort by filename, dir+filename, last modified, file size or disk usage (yes, those last two are different), in either ascending or descending order.

Download preg_find.phps
Download preg_find.php in plain text format

A simple example to get started – we’ll work on my PHP miscellaneous code directory:

Example 1: List the files (no directories):

Code:

include 'preg_find.php';
$files = preg_find('/./', '../code');
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here

Now let us look at a recursive search – this is easy, just pass in the PREG_FIND_RECURSIVE argument.
Example 2: List the files, recursively:

Code:

$files = preg_find('/./', '../code', PREG_FIND_RECURSIVE);
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here

Let’s go further: this time we don’t want to see any files, only a directory structure.
Example 3: List the directory tree:

Code:

$files = preg_find('/./', '../code', PREG_FIND_DIRONLY|PREG_FIND_RECURSIVE);
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here

It should be obvious by now that we are using constants as our modifier arguments. What might not be immediately obvious is that these constants are “bit” values (e.g. 1, 2, 4, 8, 16, …, 1024, etc.), so by using PHP’s bitwise OR operator “|” we can combine several modifiers into the single argument passed to the function.
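As a small illustration of how such bit values combine and are later picked apart, here is a sketch using made-up DEMO_* constants. The numbers are assumptions for this example only; the real PREG_FIND_* values are defined in preg_find.php and may differ.

```php
<?php
// Hypothetical flag values for illustration only
define('DEMO_RECURSIVE', 1);   // binary 001
define('DEMO_DIRONLY', 2);     // binary 010
define('DEMO_RETURNASSOC', 4); // binary 100

// Report which modifiers are switched on in a combined flags value.
function demo_active_flags($flags)
{
    $active = array();
    // Bitwise AND isolates a single bit: non-zero means the flag was passed
    if ($flags & DEMO_RECURSIVE)   { $active[] = 'recursive'; }
    if ($flags & DEMO_DIRONLY)     { $active[] = 'dironly'; }
    if ($flags & DEMO_RETURNASSOC) { $active[] = 'returnassoc'; }
    return $active;
}

$flags = DEMO_RECURSIVE | DEMO_RETURNASSOC; // 1 | 4 gives 5
echo implode(', ', demo_active_flags($flags)), "\n"; // recursive, returnassoc
```

Because each constant occupies its own bit, any combination of modifiers fits in one integer and the order in which they are ORed together does not matter.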

How about a regex? Files starting with str_ and ending in .php
Example 4: Using a regex on the same code as example 1:

Code:

$files = preg_find('/^str_.*?\.php$/D', '../code');
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here

What about that funky PREG_FIND_RETURNASSOC modifier?
This will change the output dramatically from a simple file/directory array to an associative array where the key is the filename, and the value is lots of information about that file.

Example 5: Use of PREG_FIND_RETURNASSOC

Code:

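$files = preg_find('/^str_.*?\.php$/D', '../code', PREG_FIND_RETURNASSOC);
// Each key is a filename; each value is an array of details about that file
foreach($files as $file => $stats) printf("<br>%s<pre>%s</pre>\n", $file, print_r($stats, true));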


You can see the result here

As I mentioned earlier, I added sorting capability to the results, so let us look at some examples of that.

Example 6. Sorting the results (of example 1)

Code:

$files = preg_find('/./', '../code', PREG_FIND_SORTKEYS);
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here

Example 7. And reverse sort.

Code:

$files = preg_find('/./', '../code', PREG_FIND_SORTKEYS|PREG_FIND_SORTDESC);
foreach($files as $file) printf("<br>%s\n", $file);


You can see the result here
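For the curious, the general technique of building a sort routine from the flags passed in can be sketched as follows. This is not the code inside preg_find(): the SKETCH_* constants, the sketch_make_sorter() name and the sample data are all invented for illustration, and it uses a closure with usort().

```php
<?php
// Hypothetical flag values for this sketch only
define('SKETCH_SORT_SIZE', 1);
define('SKETCH_SORT_MODIFIED', 2);
define('SKETCH_SORT_DESC', 4);

// Build a comparison function for usort() from the flags passed in.
function sketch_make_sorter($flags)
{
    if ($flags & SKETCH_SORT_SIZE) {
        $field = 'size';
    } elseif ($flags & SKETCH_SORT_MODIFIED) {
        $field = 'mtime';
    } else {
        $field = 'name';
    }
    // Descending order simply negates the comparison result
    $direction = ($flags & SKETCH_SORT_DESC) ? -1 : 1;

    return function ($a, $b) use ($field, $direction) {
        if ($a[$field] == $b[$field]) {
            return 0;
        }
        return $direction * ($a[$field] < $b[$field] ? -1 : 1);
    };
}

// Made-up sample data standing in for file details
$files = array(
    array('name' => 'b.php', 'size' => 300, 'mtime' => 20),
    array('name' => 'a.php', 'size' => 100, 'mtime' => 30),
    array('name' => 'c.php', 'size' => 200, 'mtime' => 10),
);
usort($files, sketch_make_sorter(SKETCH_SORT_SIZE | SKETCH_SORT_DESC));
foreach ($files as $file) {
    printf("%d %s\n", $file['size'], $file['name']);
}
```

The comparator is chosen once, before sorting starts, so the flag checks are not repeated for every pair of entries.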

OK, that’s all well and good, but what about something more interesting?

Example 8. Finding the largest 5 files in the tree, sorted by filesize, descending.

Code:

$files = preg_find('/./', '../code',
  PREG_FIND_RECURSIVE|PREG_FIND_RETURNASSOC|PREG_FIND_SORTFILESIZE|PREG_FIND_SORTDESC);
$i=1;
foreach($files as $file => $stats) {
  printf('<br>%d) %d %s', $i, $stats['stat']['size'], $file);
  $i++;
  if ($i > 5) break;
}


You can see the result here.

Or what about the 10 most recently modified files?

Example 9.

Code:

$files = preg_find('/./', '../code',
  PREG_FIND_RECURSIVE|PREG_FIND_RETURNASSOC|PREG_FIND_SORTMODIFIED|PREG_FIND_SORTDESC);
$i=1;
foreach($files as $file => $stats) {
  printf('<br>%d) %s - %d bytes - %s', $i,
    date('Y-m-d H:i:s', $stats['stat']['mtime']), $stats['stat']['size'], $file);
  $i++;
  if ($i > 10) break;
}


You can see the result here.

I am keen to receive feedback on what you think of this function.   If you have used it in some other application – great, I would love to know.  Suggestions, improvements, criticisms are also always welcome.

Compiling PHP, OCI8 on Sparc64 Solaris 10 with Oracle10g

This problem beat me about the head for most of yesterday until I worked out that PHP 5.0.5 doesn’t actually know about Oracle 10.  It knows about 8 and 9, sure thing, but for anything else it decides it is an older version (very silly).

The other problem is that when PHP tries to link to the Oracle client libraries, by default it attempts to link against the 64-bit libraries – which, with PHP being a 32-bit app, just isn’t going to fly.

So here I will attempt to guide you in all that is good with PHP and Oracle 10.

The first thing to do is ensure you have a working Solaris 10 install with Oracle 10g already installed.   As this was to be an actual server machine I installed the full database server including client libraries (which happens by default when you install the server).  However the purpose of this is not to help you install Oracle – there are plenty of guides out there for that.  This is to help you get PHP compiled in this environment – there are no guides for that.

So let’s unpack the PHP source:
#->tar xf php-5.0.5.tar
#->cd php-5.0.5
php-5.0.5-#->

Now, if you run a straight ./configure --with-oci8 it will most likely fail, being unable to find the Oracle install:
checking Oracle version… configure: error: Oracle (OCI8) required libraries not found

We need to tell it where to find the Oracle libraries.
./configure --with-oci8=/u01/app/oracle/product/10.2.0/Db_1
(assuming this is where your default database was installed to)

This will enable configure to complete.

Next, naturally, we try to make php – all should go well right up until the final link:
php-5.0.5-#->make
… [snip] …
ld: fatal: file /u01/app/oracle/product/10.2.0/Db_1/lib/libclntsh.so: wrong ELF class: ELFCLASS64
ld: fatal: File processing errors. No output written to sapi/cgi/php
collect2: ld returned 1 exit status
make: *** [sapi/cgi/php] Error 1

This fails because PHP has decided to link against lib/libclntsh.so when it should have linked against lib32/libclntsh.so
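If you want to confirm which copy of libclntsh.so is the 32-bit one before going any further, the ELF class can be read straight from the file header. The sketch below is an illustration only (elf_class_sketch() is a made-up name); it relies on the ELF format storing the class in the fifth byte of the file, where 1 means 32-bit and 2 means 64-bit.

```php
<?php
// Hypothetical helper: return 32 or 64 for an ELF file, or null if the file
// is unreadable or is not an ELF object.
function elf_class_sketch($path)
{
    if (!is_readable($path)) {
        return null;
    }
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return null;
    }
    $header = fread($fh, 5); // 4 magic bytes plus the class byte
    fclose($fh);
    if (strlen($header) < 5 || substr($header, 0, 4) !== "\x7fELF") {
        return null;
    }
    $class = ord($header[4]); // 1 = ELFCLASS32, 2 = ELFCLASS64
    if ($class === 1) {
        return 32;
    }
    return $class === 2 ? 64 : null;
}
```

Run against the two paths from the link error, lib/libclntsh.so should report 64 and lib32/libclntsh.so should report 32.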

No amount of adding --includedir= and --libdir= to the configure command will result in make doing the right thing and linking against the lib32 version.

The solution? We need to edit the configure script to tell it that lib isn’t the be-all and end-all of oracle libraries.  This is a pain, I know, but hopefully the PHP people will fix this for 5.0.6 and above.

At line 64660 in configure you will see the line:
  elif test -f $OCI8_DIR/lib/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then

Change /lib/ to /lib32/

And at line 69134 you’ll notice that it is missing any reference to Oracle 10.1, so we need to add it – add the following two lines just before the 9.0 line:
  elif test -f $ORACLE_DIR/lib32/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then
    ORACLE_VERSION=10.1

At line 64977 change:
  if test -z "$OCI8_DIR/lib" || echo "$OCI8_DIR/lib" | grep '^/' >/dev/null ; then
    ai_p=$OCI8_DIR/lib
to:
  if test -z "$OCI8_DIR/lib32" || echo "$OCI8_DIR/lib32" | grep '^/' >/dev/null ; then
    ai_p=$OCI8_DIR/lib32

Line 64368: add
  OCI8_SHARED_LIBADD="-L$OCI8_DIR/lib32"
  LIBS="$LIBS -L$OCI8_DIR/lib32"

Now run make clean, then cd to your database directory and temporarily rename the lib directory to lib.unused so that PHP cannot link against it. Leave the lib32 one as is.

Switch back to php dir. Run your configure command, make (which should now complete) and make install.

Go back and rename the lib.unused back to lib as other things will need this to exist.

Finally, make sure you add the lib32 path to your LD_LIBRARY_PATH variable before starting apache/php

LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/u01/app/oracle/product/10.2.0/Db_1/lib32"

Your PHP should now be working fine.
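A quick way to check the result from PHP itself is to ask whether the extension made it into the build. This is only a sketch: oci8_status_sketch() is a made-up helper name, and it reports the facts without attempting a database connection.

```php
<?php
// Hypothetical helper: summarise whether this PHP build can see OCI8.
function oci8_status_sketch()
{
    return array(
        // True when the oci8 extension is compiled in or loaded
        'extension_loaded' => extension_loaded('oci8'),
        // oci_connect() is the PHP 5 name of the connect function
        'oci_connect'      => function_exists('oci_connect'),
        // The runtime linker path this process was started with, if any
        'ld_library_path'  => getenv('LD_LIBRARY_PATH'),
    );
}

print_r(oci8_status_sketch());
```

If extension_loaded comes back false, the build did not pick up OCI8; if it is true but connections fail at runtime, the LD_LIBRARY_PATH entry is the first thing to check.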

Files to help: My "configure" command:
'./configure'
'--prefix=/usr/local/apache2'
'--includedir=/space/app/oracle/product/10.2.0/Db_1/rdbms/public'
'--oldincludedir=/space/app/oracle/product/10.2.0/Db_1/rdbms/public'
'--libdir=/space/app/oracle/product/10.2.0/Db_1/lib32'
'--with-apxs2=/usr/local/apache2/bin/apxs'
'--with-oci8=/u01/app/oracle/product/10.2.0/Db_1'

Diff of the configure script to the regular one supplied with PHP 5.0.5
#->diff php-5.0.5/configure php-5.0.5-working/configure
64367a64368,64369
>   OCI8_SHARED_LIBADD="-L$OCI8_DIR/lib32"
>   LIBS="$LIBS -L$OCI8_DIR/lib32"
64660c64662
<   elif test -f $OCI8_DIR/lib/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then
---
>   elif test -f $OCI8_DIR/lib32/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then
64977,64978c64979,64980
<   if test -z "$OCI8_DIR/lib" || echo "$OCI8_DIR/lib" | grep '^/' >/dev/null ; then
<     ai_p=$OCI8_DIR/lib
---
>   if test -z "$OCI8_DIR/lib32" || echo "$OCI8_DIR/lib32" | grep '^/' >/dev/null ; then
>     ai_p=$OCI8_DIR/lib32
69133a69136,69137
>   elif test -f $ORACLE_DIR/lib32/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then
>     ORACLE_VERSION=10.1

Note to PHP developers if they read this – this patch is not one that can be dropped into the regular build – it will only help people who have difficulty installing PHP with OCI8/Oracle10 on Solaris10.

I hope this proves useful to others – it took me >24 hours work to get to this point.

PHP: HTTP Authentication via PHP

This example combines sessions with HTTP Auth in order to maintain state. The difficulty surrounding HTTP Auth is that even after you "logout", the browser will continue to send the correct username and password with each request, thus immediately logging you back in again – unless you use the states to keep track carefully.

In this example we will use two session variables to maintain state, and we tell the page that we want to log in or log out via an argument in the query string, e.g. ?login or ?logout

The two state variables are:

    * LOGGEDIN – Very simple state – either you are logged in or not
    * LOGGEDOUT – Will be TRUE if we have logged out. Its primary purpose is to scupper the browser-provided password and prevent the authentication routines from running. It gets reset to FALSE when we want to log in

An additional benefit of this method is that we only need to authenticate once, upon login. Typical HTTP Auth routines implemented in code authenticate on every page request.
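The interplay of the two state variables can be sketched as a small function that takes the current state and the request, and returns the new state plus a flag saying whether to send a 401 challenge. This is a simplified illustration under stated assumptions, not the contents of auth.inc.php: the function name http_auth_step_sketch() and its parameters are invented, and the password check is passed in as a callback.

```php
<?php
// Hypothetical sketch of the LOGGEDIN / LOGGEDOUT state handling.
// $state   : array holding the two session variables
// $query   : the query string arguments (e.g. $_GET)
// $user    : browser-supplied username, or null if none was sent
// $pass    : browser-supplied password, or null if none was sent
// $checkpw : callback returning true when the credentials are valid
function http_auth_step_sketch(array $state, array $query, $user, $pass, $checkpw)
{
    $state += array('LOGGEDIN' => false, 'LOGGEDOUT' => false);

    if (array_key_exists('logout', $query)) {
        // Remember the logout so the browser's cached credentials are ignored
        $state['LOGGEDIN'] = false;
        $state['LOGGEDOUT'] = true;
    } elseif (array_key_exists('login', $query)) {
        // An explicit login request re-enables the authentication routine
        $state['LOGGEDOUT'] = false;
    }

    $challenge = false;
    if (!$state['LOGGEDIN'] && !$state['LOGGEDOUT']) {
        // Authenticate once; afterwards the session state is trusted
        if ($user !== null && call_user_func($checkpw, $user, $pass)) {
            $state['LOGGEDIN'] = true;
        } else {
            $challenge = true; // caller should send WWW-Authenticate and a 401
        }
    }
    return array($state, $challenge);
}
```

In a real page the state would live in $_SESSION, the credentials would come from $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'], and a true challenge flag would be answered with a WWW-Authenticate header and a 401 status.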

Source code: Example page protected with PHP HTTP Auth

Source code: PHP HTTP Auth include file

In order to use this to protect any page you need to copy the auth.inc.php file to your server and then simply include or require it in any page.
You may wish to set the variable $HTTP_AUTH_REALM to a string before including this as this will change the Basic Realm information in the auth dialog box to a string of your choice.

You should also look at the checkpw() function and replace it with something that will check your user credentials correctly. It takes a username and password as input and should return TRUE if the credentials supplied are OK, or FALSE if they are not.

Finally, on any page to effect a change of state from logged in to logged out or vice-versa, you simply have to make a link to a page with "login" or "logout" in the url’s Query String (that is the bit after the ?), e.g. page.php?login

A working example is provided over in my projects section, the default username is "paul" and password is "gregg"

I hope this code serves as useful learning material. Good luck.

String Case Conversion in PHP

Occasionally I read through some comments on the PHP Manual, sometimes to get ideas on different methods of doing things, other times just to try to keep current with some of the vast array of functions available.

Sometimes, I see things that really scare me – code that is written and published with the best will in the world from the author – but yet displays a lack of a deeper understanding of how to solve a problem.  One such case was the invert_case() and rand_case() functions which basically looped through each character in a string doing whatever it had to do to each character as it went.  Highly inefficient.

Remember, the only difference in ASCII between an uppercase letter and a lowercase letter is a single bit that is 0 for uppercase and 1 for lowercase.

This brief tutorial is based on code available at:
http://www.pgregg.com/projects/php/code/str_case.phps
and you can see example output at:
http://www.pgregg.com/projects/php/code/str_case.php

Surely it would be possible to write some code that would simply flip this bit in each character to the value you want:
– AND with 0 to force uppercase
– OR with 1 to force lowercase
– XOR with 1 to invert the case
– randomly set it to 1 or 0 to set random case.
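The operations in the list above can be tried on single characters using PHP's bitwise string operators. The helper names below are made up for illustration, and the trick only holds for the ASCII letters a-z and A-Z.

```php
<?php
// Illustrative helper names, not taken from str_case.phps.
// The case bit is the one set in a space character: decimal 32, binary 00100000.

// AND with 11011111 clears the case bit, forcing uppercase
function to_upper_char($c) { return $c & chr(0xDF); }

// OR with 00100000 sets the case bit, forcing lowercase
function to_lower_char($c) { return $c | ' '; }

// XOR with 00100000 flips the case bit, inverting the case
function flip_case_char($c) { return $c ^ ' '; }

echo to_upper_char('a'), to_lower_char('A'), flip_case_char('a'), flip_case_char('A'), "\n"; // AaAa
```

When both operands are strings, PHP applies &, | and ^ byte by byte, which is what makes whole-string masks possible.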

There are two methods of achieving this. The first makes a simple character mask and performs a bitwise operation on the string as a whole to change it as required.  This method is designed to help teach how this works.  The second method uses the power of the PCRE engine by using a regex to calculate the changes and apply them in one simple step.

Both solutions are, I believe, elegant and are presented here for you.

Solution #1:

Code:

// Code that will invert the case of every letter in $input
// The solution is to flip the value of the 3rd bit in each character
// if the character is a letter. This is done with XOR against a space (decimal 32, hex 20)
$stringmask = preg_replace("/[^a-z]/i", chr(0), $input); // replace non-letters with a NUL byte
$stringmask = preg_replace("/[a-z]/i", ' ', $stringmask); // replace letters with a space
return $input ^ $stringmask;


The method here is to generate a string mask, in two stages, that will act as a bitmask to XOR the 3rd bit of every letter in the string.  Stage 1 is to replace all non-letters with a NUL byte (all zeros) and Stage 2 is to replace all letters with a space (ASCII 32), which just happens to be a byte with only the 3rd bit set to 1, i.e. 00100000.
All we have to do then is XOR our input with the string mask and, as if by magic, the case of every letter in the entire string is flipped.
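To see the two stages on a concrete value, the snippet below builds the mask for the three-character input "aB1" and prints it in hex. The input is just an example chosen for this illustration.

```php
<?php
$input  = 'aB1';
$stage1 = preg_replace('/[^a-z]/i', chr(0), $input); // "aB" followed by a NUL byte
$mask   = preg_replace('/[a-z]/i', ' ', $stage1);    // two spaces followed by a NUL byte

echo bin2hex($mask), "\n"; // 202000
echo $input ^ $mask, "\n"; // Ab1
```

The digit is XORed with a NUL byte and so passes through unchanged, while each letter is XORed with 0x20 and swaps case.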

Solution #2:

 

Code:

// Note: the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0
return preg_replace('/[a-z]+/ie', "'$0' ^ str_pad('', strlen('$0'), ' ')", $input);


This is much more compact. It works by using a regex to look for letters, with the i (case insensitive) modifier and, most importantly, the e (evaluate) modifier so we can replace by executing PHP code.  In this case, we look for batches of letters and replace each batch with itself XORed with a string of spaces (of the same length).
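On PHP 7 and later the e modifier is no longer available, so the same idea has to be expressed with preg_replace_callback(). The following is a sketch of an equivalent; the function name invert_case_callback() is invented for this example.

```php
<?php
// Hypothetical modern equivalent of the /e version, using a callback.
function invert_case_callback($input)
{
    return preg_replace_callback('/[a-z]+/i', function ($match) {
        // XOR each run of letters with a run of spaces of the same length
        return $match[0] ^ str_repeat(' ', strlen($match[0]));
    }, $input);
}

echo invert_case_callback('Hello, World!'), "\n"; // hELLO, wORLD!
```

The callback receives the matched text directly, so there is no code string to build and no quoting to worry about.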

Similar principles apply to the random case example, but we complicate this slightly by adding an invert mask (to the solution 1 method). This invert mask is created by taking a random number of spaces (between 0 and the size of the input string). We then pad this out to the size of the original string with NUL bytes and finally randomise the order with str_shuffle().  We then bitwise AND the stringmask and the invertmask to create a new mask in which each letter position randomly holds either a space or a NUL.  We then XOR this with the original string as before, and before you know it you have a randomly capitalised string.
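The steps just described can be sketched as follows. This is written from the description above rather than copied from str_case.phps, so the function name rand_case_mask_sketch() and the use of mt_rand() are assumptions.

```php
<?php
// Hypothetical sketch of the random-case mask method (solution 1 style).
function rand_case_mask_sketch($input)
{
    $len = strlen($input);

    // String mask: a space under every letter, a NUL byte under everything else
    $stringmask = preg_replace('/[^a-z]/i', chr(0), $input);
    $stringmask = preg_replace('/[a-z]/i', ' ', $stringmask);

    // Invert mask: a random number of spaces, padded to full length with NUL
    // bytes, then shuffled so the spaces land in random positions
    $spaces = str_repeat(' ', mt_rand(0, $len));
    $invertmask = str_shuffle(str_pad($spaces, $len, chr(0)));

    // AND keeps a space only where both masks have one; XOR flips those letters
    return $input ^ ($stringmask & $invertmask);
}

echo rand_case_mask_sketch('Hello, World!'), "\n"; // output varies from run to run
```

Note that this flips the existing case of the chosen letters rather than forcing them to upper or lower, which for mixed-case input still gives a random-looking result.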
The Solution 2 version requires you to remove the + so that we only match a single letter at a time (or else our randomly chosen case would apply to whole words at a time), and we use a ternary to randomly decide between using a space or a NUL:

 

Code:

// Note: the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0
return preg_replace('/[a-z]/ie', "(rand(0,1) ? '$0' ^ ' ' : '$0')", $input);


I hope this has been a worthwhile read and I would certainly welcome feedback on this article.

PHP: Number base conversion in PHP.

I thought I might “publicise” some of the code buried within this site.  There are some useful things in here (even if I do say so myself) ;). And it would be a shame not to ‘blog’ from the rooftops.  Some of the code isn’t much use in the real world, but it serves its purpose of teaching both programming methods and how to do funky things in PHP.

First up is arbitrary number base conversion in PHP.  I know PHP has a built-in base_convert() function, plus several other specific base conversion functions, e.g. decbin(), but I wanted to a) show how to do it, and b) implement one which could do bases 2-62 (the default PHP one does 2-36).
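As a taste of the technique, here is a minimal sketch of conversion between bases 2 and 62. It is not the code from the linked page: the function name base_convert_sketch() and the digit order (0-9, then a-z, then A-Z) are assumptions, it requires PHP 7 for intdiv(), and it only handles non-negative values that fit in a native PHP integer.

```php
<?php
// Hypothetical sketch: convert a non-negative number string between bases 2-62.
function base_convert_sketch($number, $from, $to)
{
    // Assumed digit alphabet: value 0-9, then 10-35 lowercase, then 36-61 uppercase
    $digits = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    if ($from < 2 || $from > 62 || $to < 2 || $to > 62) {
        throw new InvalidArgumentException('Bases must be between 2 and 62');
    }
    $number = (string) $number;
    if ($number === '') {
        throw new InvalidArgumentException('Empty input');
    }

    // Step 1: read the input digits into a native integer
    $value = 0;
    foreach (str_split($number) as $char) {
        $pos = strpos($digits, $char);
        if ($pos === false || $pos >= $from) {
            throw new InvalidArgumentException("Invalid digit '$char' for base $from");
        }
        $value = $value * $from + $pos;
    }

    // Step 2: peel off digits in the target base, least significant first
    if ($value === 0) {
        return '0';
    }
    $out = '';
    while ($value > 0) {
        $out = $digits[$value % $to] . $out;
        $value = intdiv($value, $to);
    }
    return $out;
}

echo base_convert_sketch('255', 10, 16), "\n"; // ff
echo base_convert_sketch('61', 10, 62), "\n";  // Z
```

Because uppercase and lowercase letters are distinct digits here, input is case-sensitive even for bases of 36 and below, unlike PHP's built-in base_convert().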

http://www.pgregg.com/projects/php/base_conversion/base_conversion.php