Playing with Assembly

So I was reading Slashdot yesterday when I came across a link to this article, which is basically a crash course in writing super small programs in assembly language (NASM on Linux). I was completely in awe of the 45-byte executable the author ended up with and took it upon myself to learn the basics of assembly.

My first attempt was to write a hello world (the article writes a 45-byte executable that just returns the number 42). That worked super well and was as easy as I thought, so I decided to play around with a couple more system calls and write a program that sends hello world to a file in /tmp/. I came across a few errors, so I got some help from the friendly guys in #asm on freenode IRC and figured out the last part. Here is the code, and after it is an explanation of the two parts that tripped me up.

; tiny.asm
BITS 32
		org		0x08048000

	ehdr:                                       ; Elf32_Ehdr
		db			0x7F, "ELF", 1, 1, 1, 0         ;   e_ident
		times		8 db      0
		dw			2                               ;   e_type
		dw			3                               ;   e_machine
		dd			1                               ;   e_version
		dd			_start                          ;   e_entry
		dd			phdr - $$                       ;   e_phoff
		dd			0                               ;   e_shoff
		dd			0                               ;   e_flags
		dw			ehdrsize                        ;   e_ehsize
		dw			phdrsize                        ;   e_phentsize
		dw			1                               ;   e_phnum
		dw			0                               ;   e_shentsize
		dw			0                               ;   e_shnum
		dw			0                               ;   e_shstrndx
  
	ehdrsize		equ		$ - ehdr
  
	phdr:                                       ; Elf32_Phdr
		dd			1                               ;   p_type
		dd			0                               ;   p_offset
		dd			$$                              ;   p_vaddr
		dd			$$                              ;   p_paddr
		dd			filesize                        ;   p_filesz
		dd			filesize                        ;   p_memsz
		dd			5                               ;   p_flags
		dd			0x1000                          ;   p_align
  
	phdrsize		equ	$ - phdr

_data:
		msg		db		"Hello, World!", 0xa
		len		equ   $ - msg
		file		db		"/tmp/test.out", 0x0

_start:
		; create the file /tmp/test.out
		mov		eax, 5			; sys_open
		mov		ebx, file		; path to open
		mov		ecx, 66			; flags: O_CREAT | O_RDWR (64 | 2)
		int		0x80

		; save the file descriptor
		mov		ebx, eax

		; set the permissions on the file to 777 (ebx still holds the descriptor)
		mov		eax, 94			; sys_fchmod
		mov		ecx, 0x1ff		; octal 777
		int		0x80

		; write the string to the file
		mov		eax, 4
;		mov		ebx, 1			; uncomment this to print to stdout instead
		mov		ecx, msg
		mov		edx, len
		int		0x80

		; close the file
		mov		eax, 6
		int		0x80

		; exit
		mov		eax, 1
		mov		ebx, 0
		int		0x80

filesize			equ	$ - $$

You can build and execute this yourself straight from the Linux command line if you have nasm installed by saving it as tiny.asm and executing “nasm -f bin -o a tiny.asm && chmod +x a”. I have the ELF headers embedded directly into the assembly and we use ZERO external libraries, so there is no linking needed. In fact, since pretty much everything executed is a system call, you can then run strace on the resulting executable (named “a”) and it will give you a pretty nice dump of exactly what it does.

The first thing that kind of tripped me up was that the file is created with no permission flags set at all, presumably because the code never sets edx, which is where open expects the mode for a newly created file. To fix this, I added an fchmod call right after I opened the file. The value I pass as the second parameter (ecx) is just the hex value for 777 permissions (0x1ff). That set the permissions right, but originally when I went to write to the file it still wouldn’t write. Thanks to the IRC help I figured out that I was opening the file with only O_CREAT (64) when what I really wanted was O_CREAT | O_RDWR so that I could write to it as well, so I changed the flags parameter of the call to open to 66 instead of 64 et voila, it works!
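The numeric arguments in tiny.asm are easier to check when written out. Here is a small sketch that uses Python purely as a calculator. The constant values are the Linux x86 ones, typed in by hand as an assumption rather than read from the running system:

```python
# Linux x86 open(2) flag values, written out by hand (the headers give them in octal).
O_RDONLY = 0o0
O_RDWR = 0o2
O_CREAT = 0o100  # 64 in decimal

# The value loaded into ecx before the open call in tiny.asm.
open_flags = O_CREAT | O_RDWR

# The value loaded into ecx before the fchmod call: rwx for user, group and other.
mode_777 = 0o777

print(open_flags)     # 66
print(hex(mode_777))  # 0x1ff
```

O_CREAT on its own leaves the access mode at O_RDONLY (0), which is why the write failed until O_RDWR was added.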

Hope you enjoy and maybe it’ll help somebody else out who is trying to write to a file with assembly in Linux.

Parallelization in PHP

This is a simple example you can re-use for splitting up processing of data across processes for faster execution. Put all of the data into the $set and fill in function process with what you want to do on the data, and let ‘er loose! I’m personally using it for telnet scripts because the amount of time spent waiting for a single telnet session is horrible and I can run many sessions at once while I wait for the responses.

/**
 * Splits the given set into $count subsets that are of approximately equal size
 */
function array_split($set, $count) {
   // array_chunk() needs an integer chunk size of at least 1
   $subset_size = max(1, (int) ceil(count($set) / $count));
   return array_chunk($set, $subset_size);
}

/**
 * Forks into $process_count separate processes and executes the function
 * named in $job in each process to split up handling of the data in
 * $set across the processes.
 */
function fork_exec($set, $job, $process_count) {
   $subsets = array_split($set, $process_count);
   $children = array();

   // launch all of the children and store process list
   foreach ( $subsets as $a_set ) {
      $pid = pcntl_fork();

      if ( $pid == -1 ) die("Error forking");
      else if ( $pid == 0 ) { call_user_func($job, $a_set); exit(0); }
      else $children[] = $pid;
   }

   // wait for each process to end
   while ( count($children) > 0 ) {
      $pid = array_shift($children);
      pcntl_waitpid($pid, $status);
   }
}

// example set to work on
$set = array('a','b','c','d','e','f','g','h','i','j');

// Process the job with 3 processes and time it
$time = microtime(true);
fork_exec($set, 'process', 3);
$diff = microtime(true) - $time;
echo $diff . ' seconds for full run'."\n";

// This is the job to run on the set. Make sure it is multi-process safe!
function process($set) {
   foreach ( $set as $item ) {
      echo "Process [" . posix_getpid() . "] executing '" . $item . "'\n";
      sleep(1);
   }
}

Freestyle Nerds

<djahandarie> we ain’t here to do e-c-e
<djahandarie> we’re here to do c-s-e on the w-e-b
<djahandarie> listen to me spit these rhymes
<djahandarie> while i program lines
<djahandarie> and commit web accessibility crimes
<djahandarie> word, son
<http402> You talk like your big on these I-Net kicks,
<http402> But your shit flows slower than a two-eighty-six.
<http402> I’m tracking down hosts and nmap scans,
<http402> While Code Igniter’s got you wringing your hands.
<http402> Cut the crap rap,
<http402> Or I’ll run ettercap,
<http402> Grab your AIM chat,
<http402> N’ send a PC bitch-slap!
<http402> peace
<djahandarie> you’re talkin bout down hosts and nmap scans
<djahandarie> while i got other plans
<djahandarie> you’re at your new job, but you can’t even do it right
<djahandarie> you just create a plight with your http rewrites
<djahandarie> i’ve been on the web since the age of three
<djahandarie> you just got on directly off the bus from mississippi
<djahandarie> respect yo’ elders, bitch
<http402> You’ve been webbin’ since three, but still ain’t grown up,
<http402> Gotta update your config and send the brain a SIGHUP.
<http402> You say you’re that old? No wonder you’re slow!
<http402> You’re knocking at the door while I run this show!
<http402> Elders my ass, you’re shit’s still in school,
<http402> Hunt and pecking at the keyboard like a spaghetti-damned fool,
<http402> Rim-riffing your hard drive like a tool,
<http402> Face it. I rule.
<djahandarie> i erase my harddrives with magnets (bitch)
<djahandarie> all you can do is troll on the fagnets
<djahandarie> and son, my brain’s wrapped in a nohup
<djahandarie> it wont be hurt by the words you throwup
<djahandarie> dont mind me while i emerge my ownage
<djahandarie> while you’re still over there apt-getting your porridge
<djahandarie> you say i’m still in school
<djahandarie> but the fact is that i know the rule
<djahandarie> cuz you need to go back to grade three
<djahandarie> and you better plea, that they take sucky graduates from c-s-e
<http402> Time to bend over and apply a patch,
<http402> Your brain’s throwing static like a CD with a scratch.
<http402> Your connection got nuked and you’ve met your match.
<http402> You run a single process like a VAX with a batch.
<http402> I’d pass the torch to a real winner
<http402> But it’d just scorch a while-loop spinner
<http402> Caught in a loop that you cant escape,
<http402> I run clock cycles around your words and flows,
<http402> Cuz your rhyme is like a PS fan: it’ blows,
<http402> Your water-cooled lyrics leak and it shows,
<http402> Take your ass back to alt.paid.for.windows.
<djahandarie> Good god, I can’t even respond to that. 😛
<djahandarie> You win haha
* http402 takes a bow

FTCache – Generic PHP Cache System

I am excited to announce the release of the FTCache library. This is a generic caching library written in PHP for PHP applications to do whatever type of caching you need. It is designed to be completely modular so you can select the algorithm (Strategy) for managing the cache along with the storage mechanism (Container) on instantiation, and switch between them with no change in functionality. For example, if you want to test out different caching algorithms, you can focus completely on the algorithm and the system can handle the storage mechanism for you. It would be perfect for algorithm comparison as well.
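To make the Strategy/Container split concrete, here is a rough sketch of the idea in Python. It is not FTCache's API: every class and method name below (DictContainer, LruStrategy, Cache, touch, evict and so on) is hypothetical and only illustrates how an eviction algorithm and a storage mechanism can be chosen independently at instantiation.

```python
from collections import OrderedDict


class DictContainer:
    """Hypothetical storage mechanism: keeps entries in a plain dict."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class LruStrategy:
    """Hypothetical algorithm: evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._order = OrderedDict()

    def touch(self, key):
        # Mark the key as the most recently used one.
        self._order[key] = True
        self._order.move_to_end(key)

    def evict(self):
        # Return a key to drop if we are over capacity, otherwise None.
        if len(self._order) > self.capacity:
            key, _ = self._order.popitem(last=False)
            return key
        return None


class Cache:
    """Glue object: the strategy decides what to keep, the container stores it."""

    def __init__(self, strategy, container):
        self.strategy = strategy
        self.container = container

    def set(self, key, value):
        self.container.write(key, value)
        self.strategy.touch(key)
        victim = self.strategy.evict()
        if victim is not None:
            self.container.delete(victim)

    def get(self, key):
        value = self.container.read(key)
        if value is not None:
            self.strategy.touch(key)
        return value


cache = Cache(LruStrategy(capacity=2), DictContainer())
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" is now more recent than "b"
cache.set("c", 3)   # over capacity, so "b" is evicted
```

Swapping DictContainer for a file- or database-backed class with the same three methods would leave LruStrategy untouched, which is the point of the separation.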

The core of the system is heavily and thoroughly tested and should be rock solid, so you have a strong foundation to build upon. The library currently includes one strategy and two containers. More will come as I need them.

For anybody looking to integrate it into Code Igniter or another framework, it is completely object oriented, so it should be seamlessly easy. If you have problems, questions or requests, please email me at justin@fugitivethought.com. The official home page of the project is:

http://fugitivethought.com/projects/ftcache/ 

Code Igniter vs. Prado

Intro

I have been corresponding with one of our readers who has been interested to learn about my recommendations for Prado vs. Code Igniter or other frameworks. I thought it might be interesting to the rest of you to hear my recommendations as well, as there seems to be very little material out there regarding Prado. Below are the relevant portions of my initial recommendation:

Initial Recommendation:

Code Igniter and Prado are very different approaches to frameworks, so it really depends on what you are looking for. Prado has an everything-and-the-kitchen-sink approach and uses a Code Behind pattern for separating look and logic.

Code Igniter is a very lightweight framework, but much more easily extended, and it uses the Model-View-Controller pattern to separate look and logic.

Prado has some built in internationalization support (http://www.pradosoft.com/demos/quickstart/?page=Advanced.I18N), but if you decide on Code Igniter, this may help: http://codeigniter.com/wiki/Category:Internationalization::Internationalization_Views_i18n/

As for web services, both have some built in support for the more popular services, and I’m sure that no matter which service you are looking for, someone has implemented it for Code Igniter.

Overall, I am personally a big fan of the Code Igniter framework, mostly because it is the one I have used the most. Prado (as you can see from my benchmarking post) has some performance issues and is significantly more difficult to learn. Code Igniter is also more easily expandable. However, I have had issues with Code Igniter on very large applications.

In summary, I would recommend Code Igniter only for simple or medium-difficulty applications. If you are creating a blogging system or very basic shopping cart, or even a web forum, I would say Code Igniter is the way to go to make your life easier. If you are building more complex applications that involve large amounts of form processing, more complicated calculations or if you are already familiar with some form of code behind (ASP .Net for example) then Prado would be more to your liking.

On a final note, if you are interested in something with much more complex and fine-grained permission management, I have found that CakePHP has the cleanest and most intuitive access control list implementation of any framework, and it is well integrated with the rest of the framework.

Response

At this point the reader pointed out some anomalies in my recommendation, so I had to rethink my position somewhat. The most glaring issue being that the Prado Benchmarks post I made shows major performance issues with Prado, so why would I recommend it for larger applications?

Second Recommendation

With Code Igniter, I found a lot of issues with mapping URIs to controllers for more complicated URLs. This becomes a very big problem when you need to do advanced filtering. As I noted in my Fugitive Thought post on Pretty URLs, Code Igniter does not allow you to use the nice URL mapping together with GET parameters. That means that if I want to filter by more than one field, I either have to write a lot of extra code to figure out which parameters are provided and which are not and assign them to the proper variables, or I have to give up the pretty URLs in favor of pure $_GET values with all pages going to index.php. For example, if I have an advanced search with three different fields that I want people to be able to bookmark or link to, a normal search page would have a URI like this:

http://myserver/blogs/search?a=blah&b=bloh&c=bleh

With Code Igniter, I can easily map /blogs/search to a specific controller, but I cannot use the $_GET values at the same time. What I have to do is generate something that will use this instead:

http://myserver/blogs/search/blah/bloh/bleh

This is all well and good, but what happens when each of these fields is optional? There are some obvious solutions, but they all require implementing extra code simply because the framework is saying “no”. I ran into a few other aggravating issues with Code Igniter as well that felt like they were limiting me far too much on large applications that need more complicated features, including session management, permission handling, etc.
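One of those workarounds is to put key/value pairs into the path, so that /blogs/search/a/blah/c/bleh means "a=blah, c=bleh, b not given". Here is a minimal sketch of that extra parsing, written in Python rather than PHP purely for illustration. The function name parse_segments and the allowed-keys set are my own assumptions, not anything Code Igniter provides:

```python
def parse_segments(segments, allowed):
    """Turn ['a', 'blah', 'c', 'bleh'] into {'a': 'blah', 'c': 'bleh'}.

    Segments are read as key/value pairs. Keys that are not in `allowed`
    and a trailing key without a value are ignored.
    """
    params = {}
    for key, value in zip(segments[0::2], segments[1::2]):
        if key in allowed:
            params[key] = value
    return params


path = "/blogs/search/a/blah/c/bleh"
segments = path.strip("/").split("/")[2:]   # drop "blogs" and "search"
filters = parse_segments(segments, allowed={"a", "b", "c"})
print(filters)   # {'a': 'blah', 'c': 'bleh'}
```

It works, but it is exactly the kind of boilerplate that plain query strings would have given you for free.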

Don’t get me wrong, Code Igniter does a LOT of things right, and it is very lightweight. If you put in the proper add-ins like the Smarty templating engine and make use of its template compilation and caching, you can do a lot of advanced stuff and keep it very fast. The reason I recommended Prado for larger applications despite the benchmarks is that a lot of the Prado stuff seems to scale fairly decently. The benchmarks I have displayed are ludicrously slow for a small application, but I don’t think it gets too much worse as you grow into larger applications because that overhead is a result of loading pretty much the entire framework in the beginning. I’m not positive, but I’m reasonably sure that the overhead can be mitigated with caching. And if the application is very large, then you are going to have some loading overhead anyway, and the very large suite of features that Prado handles will make up for a lot of the overhead in a lower development time and easier to debug code.

Code Igniter is nice for adding in just about any existing library because of the way they implement the “libraries” feature; any existing class that does not rely on $_GET variables in the URL will pretty much just drop right in. This makes it wonderfully expandable, but it also adds a lot of redundancy when you need many libraries, since they are all disparate and not integrated and end up re-implementing a lot of the same features. Prado, by contrast, has a lot of the features already built in and integrated into the framework, so that all components can share features and code.

Akelos looks interesting, but you’re right about it being very young. One reason that both Prado and Code Igniter are high on my recommendation list is the amount of documentation. That really is priceless in making a framework usable. If you decide to go with something like Akelos, I recommend using an IDE with IntelliSense (http://en.wikipedia.org/wiki/IntelliSense) if you don’t already (Eclipse PDT is very nice for PHP intellisense! – http://www.eclipse.org/pdt/. NuSphere’s PHPed is also good, but it costs).

Flow of Control Database Design

Introduction

I have been working on a number of projects recently that have to do with passing resources between a number of different steps. Each step has a method for determining who has permission to see the resource at that step, and who has permission to modify certain portions of the resource at that step. I am also applying for some jobs that have to do with this type of process flow control on much larger scales, so I thought that it might be a good time to formalize some of the patterns that I have found useful in implementing these kinds of systems.

This entry is going to focus largely on the database design needed to support this system. If there is interest, I can write future entries describing more of the application level design that utilizes these databases.

Problem Statement

The easiest way to describe these patterns is through a practical use-case, so for this entry we are going to design a basic ticket management system. In large enterprises, especially in the customer support departments, ticket management systems are very popular. The ticket management system provides some flow of control so that when a customer first contacts the department, a ticket is created and some information is attached to it, usually a problem description. If the first person that the customer talks to can resolve the issue on their own, then the ticket will probably only pass through that one step. Often, however, the information has to be passed to some other person, who will perform some action, add a note about it to the ticket, and then pass it back to some other person who will communicate the changes to the customer and repeat the feedback process until the issue has been resolved.

For our very basic ticket management system, we will assume five general entities:

  1. Customer – a person who has a problem
  2. Call Center – people who speak directly with the customer. Can solve basic issues like answering common questions.
  3. IT – people who handle technical problems
  4. Management – people who handle human problems
  5. Auditors – people who can look at all tickets in the system so that they can review the processes

In our case, the resource is the ticket, we have a list of groups of people, and we have a good idea of who should be able to have access and when. The customer may be able to view the entirety of a ticket about them up to its current state. Auditors can view the entirety of any ticket. Call Center, IT and Management can view a ticket only when it is at a point where they would need to, and they can only add new entries to the ticket; they cannot modify or remove previous entries on the ticket. We will assume that figuring out what group a person belongs to is trivial.

The general database design for the ticket resource will be like this:

|---------------------|          |----------------------------|
| Tickets             |          | Notes                      |
|---------------------|          |----------------------------|
| id (int)            | <--|     | id (int)                   |
| customer (varchar)  |    |-----| ticket (int)               |
| create_date (date)  |          | author (varchar)           |
| problem (text)      |          | create_date (date)         |
|---------------------|          | message (text)             |
                                 |----------------------------|

The ticket is the main resource and is uniquely identified by its id (primary key). Each ticket will have zero or more notes attached to it, and each note can be uniquely identified by its own id. The note is associated with a ticket by the “ticket” field, which is a foreign key to Tickets.id. The original problem is stored in Tickets.problem, and any messages about the ticket are stored in Notes.message.
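For anyone who wants to poke at the design, here is a minimal sketch of those two tables using Python's built-in sqlite3 module. SQLite is just a convenient stand-in for whatever DBMS you actually use, the NOT NULL constraints are my own addition, and the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
    CREATE TABLE Tickets (
        id          INTEGER PRIMARY KEY,
        customer    VARCHAR(255) NOT NULL,
        create_date DATE NOT NULL,
        problem     TEXT NOT NULL
    );
    CREATE TABLE Notes (
        id          INTEGER PRIMARY KEY,
        ticket      INTEGER NOT NULL REFERENCES Tickets(id),
        author      VARCHAR(255) NOT NULL,
        create_date DATE NOT NULL,
        message     TEXT NOT NULL
    );
""")

# Invented sample data: one ticket with one note attached.
conn.execute(
    "INSERT INTO Tickets (id, customer, create_date, problem) VALUES (?, ?, ?, ?)",
    (1, "alice", "2008-01-15", "Cannot log in"),
)
conn.execute(
    "INSERT INTO Notes (ticket, author, create_date, message) VALUES (?, ?, ?, ?)",
    (1, "bob", "2008-01-15", "Reset the password"),
)

# A note that points at a ticket that does not exist is refused by the foreign key.
try:
    conn.execute(
        "INSERT INTO Notes (ticket, author, create_date, message) VALUES (?, ?, ?, ?)",
        (99, "bob", "2008-01-16", "orphan note"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

note_count = conn.execute("SELECT COUNT(*) FROM Notes").fetchone()[0]
```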

Status Pattern

One possible design solution I will call the Status Pattern. In this approach, we have a large table of all of the tickets in the system, and each ticket will have a “status” property. Whenever the ticket is passed to a new entity, the status is changed. In this case, we would add the “status” field to the Tickets table. The status field can take any of the following values:

    OPEN – ticket was just created, no notes attached yet
    CALL CENTER – the problem has been stated and the call center is currently working to solve it.
    IT – the problem has been assigned to IT to fix.
    MANAGEMENT – management has to do something to resolve the ticket.
    CLOSED – ticket has been resolved in some way, and is no longer a concern

Note that there is only one entry for “CALL CENTER”. Once IT is done with the ticket, they can pass it back to the call center, it will show up on a call center person’s list, at which point they can look at it and most likely call the customer back to tell them that the issue is supposedly resolved.

Advantages

This system has the advantage of being very simple. There is a single place to look at all tickets, and in order to generate the list of which tickets a person can see, we just look for all tickets with a status belonging to that person’s group.
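As a rough illustration of how little code this takes, here is a sketch with Python's sqlite3 module. The CHECK constraint, the helper names tickets_for and pass_ticket, and the sample rows are assumptions of mine for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Tickets (
        id          INTEGER PRIMARY KEY,
        customer    VARCHAR(255),
        create_date DATE,
        problem     TEXT,
        status      VARCHAR(16) NOT NULL DEFAULT 'OPEN'
            CHECK (status IN ('OPEN', 'CALL CENTER', 'IT', 'MANAGEMENT', 'CLOSED'))
    )
""")
conn.executemany(
    "INSERT INTO Tickets (customer, create_date, problem, status) VALUES (?, ?, ?, ?)",
    [
        ("alice", "2008-01-15", "Cannot log in", "CALL CENTER"),
        ("bob", "2008-01-16", "Printer is down", "IT"),
        ("carol", "2008-01-10", "Billing question", "CLOSED"),
    ],
)


def tickets_for(status):
    """List the ids of every ticket currently sitting with one group."""
    rows = conn.execute(
        "SELECT id FROM Tickets WHERE status = ? ORDER BY id", (status,)
    )
    return [row[0] for row in rows]


def pass_ticket(ticket_id, new_status):
    """Hand a ticket to another group by changing its status."""
    with conn:  # commits on success, rolls back if the UPDATE fails
        conn.execute(
            "UPDATE Tickets SET status = ? WHERE id = ?", (new_status, ticket_id)
        )


it_before = tickets_for("IT")            # [2]
pass_ticket(2, "CALL CENTER")            # IT is done, back to the call center
call_center_after = tickets_for("CALL CENTER")
```

Adding a new group only means allowing one more status value.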

Another nice feature of the system is that it is easily expandable. If we have a new group that may need to view the tickets, all we have to do is add a new option for status values.

Disadvantages

While this system is quite simple, there can be problems with it. First of all, the database will end up cluttered with CLOSED tickets over time, which will decrease the access speed for tickets that are not yet resolved. This means that the database will get slower and slower as time goes on, which is not a good thing. Archiving old tickets is a matter of going through and finding all tickets with a CLOSED status, and storing them somewhere else, but this means that we no longer have the one central location to look at all tickets both current and past.

If you are very security minded (perhaps these tickets contain incredibly sensitive information), then there is another problem with this design: we cannot easily separate permissions across ticket statuses. A lot of database security specialists will tell you that creating a different SQL account for each user, with only the permissions that user needs, is essential to keeping your information secure. This is discussed in more detail in the next section where we resolve this problem.

Queuing Pattern

This solution takes a lesson from queuing theory and is the type of approach used in areas that require more severe separation of concerns. Instead of having all of the tickets in one centralized table, we will create multiple tables. For example, we will have a table that holds all of the CALL CENTER tickets, a table that holds all of the IT tickets, a table that holds all of the MANAGEMENT tickets, and a table that holds all of the CLOSED tickets. Each of these tables can have the exact same set of fields, and will essentially be duplicates of the original Tickets table. Since the tickets and the notes go hand in hand, this actually requires duplication of both tables. As a basic example, we would have tables like this:

|---------------------|          |----------------------------|
| Call_Center_Tickets |          | Call_Center_Notes          |
|---------------------|          |----------------------------|
| id (int)            | <--|     | id (int)                   |
| customer (varchar)  |    |-----| ticket (int)               |
| create_date (date)  |          | author (varchar)           |
| problem (text)      |          | create_date (date)         |
|---------------------|          | message (text)             |
                                 |----------------------------|

|---------------------|          |----------------------------|
| IT_Tickets          |          | IT_Notes                   |
|---------------------|          |----------------------------|
| id (int)            | <--|     | id (int)                   |
| customer (varchar)  |    |-----| ticket (int)               |
| create_date (date)  |          | author (varchar)           |
| problem (text)      |          | create_date (date)         |
|---------------------|          | message (text)             |
                                 |----------------------------|

(etc.)

With this type of database schema, the operation for passing tickets from one group to another more closely resembles its real world function. When the call center wants to pass a ticket to the IT department, the data in the Call_Center_Tickets and Call_Center_Notes tables will be duplicated into the IT_Tickets and IT_Notes tables, and removed from the Call_Center tables. Essentially, we are physically moving the ticket.
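Here is a minimal sketch of that move, again using Python's sqlite3 module as a stand-in DBMS. The helper names move_ticket and count are mine, and the sketch assumes ticket and note ids are unique across all queues (for example, handed out from one shared counter), which it does not itself enforce:

```python
import sqlite3

QUEUES = ("Call_Center", "IT")  # table-name prefixes; a fixed list, never user input

conn = sqlite3.connect(":memory:")
for queue in QUEUES:
    conn.executescript(f"""
        CREATE TABLE {queue}_Tickets (
            id INTEGER PRIMARY KEY, customer VARCHAR(255),
            create_date DATE, problem TEXT
        );
        CREATE TABLE {queue}_Notes (
            id INTEGER PRIMARY KEY, ticket INTEGER NOT NULL,
            author VARCHAR(255), create_date DATE, message TEXT
        );
    """)


def move_ticket(ticket_id, src, dst):
    """Move one ticket and its notes from queue `src` to queue `dst`."""
    if src not in QUEUES or dst not in QUEUES:
        raise ValueError("unknown queue")
    with conn:  # one transaction: commit on success, roll back on any error
        conn.execute(
            f"INSERT INTO {dst}_Tickets SELECT * FROM {src}_Tickets WHERE id = ?",
            (ticket_id,),
        )
        conn.execute(
            f"INSERT INTO {dst}_Notes SELECT * FROM {src}_Notes WHERE ticket = ?",
            (ticket_id,),
        )
        conn.execute(f"DELETE FROM {src}_Notes WHERE ticket = ?", (ticket_id,))
        conn.execute(f"DELETE FROM {src}_Tickets WHERE id = ?", (ticket_id,))


def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Invented sample data: one call center ticket with one note.
conn.execute(
    "INSERT INTO Call_Center_Tickets VALUES (1, 'alice', '2008-01-15', 'Cannot log in')"
)
conn.execute(
    "INSERT INTO Call_Center_Notes VALUES (1, 1, 'bob', '2008-01-15', 'Needs an account reset')"
)

move_ticket(1, "Call_Center", "IT")
```

Because the copy and the delete share one transaction, a failure halfway through leaves the ticket where it started rather than in both queues or in neither.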

Advantages

One advantage of this pattern is that it is easy to generate the most common reports. For example, when a Call Center employee views a list of all of the tickets they need to handle, it is simply a matter of selecting everything from the Call_Center_Tickets table, joined with the appropriate Call_Center_Notes. The only exception to this rule is for an Auditor. The Auditor will have to view all of the tickets, which may require a more complex UNION across multiple sets of tables. However, since there are generally very few Audit reports generated compared to the other groups, this is normally not a problem.
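The Auditor's query might look something like the sketch below, built with Python's sqlite3 module. The three queue names, the extra "queue" column in the result and the sample rows are assumptions for the example:

```python
import sqlite3

QUEUES = ("Call_Center", "IT", "Closed")  # fixed table-name prefixes

conn = sqlite3.connect(":memory:")
for queue in QUEUES:
    conn.execute(
        f"CREATE TABLE {queue}_Tickets ("
        "id INTEGER PRIMARY KEY, customer VARCHAR(255), create_date DATE, problem TEXT)"
    )

# Invented sample data, one ticket per queue.
conn.execute("INSERT INTO Call_Center_Tickets VALUES (1, 'alice', '2008-01-15', 'Cannot log in')")
conn.execute("INSERT INTO IT_Tickets VALUES (2, 'bob', '2008-01-16', 'Printer is down')")
conn.execute("INSERT INTO Closed_Tickets VALUES (3, 'carol', '2008-01-10', 'Billing question')")

# One SELECT per queue, glued together with UNION ALL. The literal first
# column records which queue each ticket currently lives in.
audit_sql = " UNION ALL ".join(
    f"SELECT '{queue}' AS queue, id, customer, problem FROM {queue}_Tickets"
    for queue in QUEUES
) + " ORDER BY id"

audit_rows = conn.execute(audit_sql).fetchall()
for row in audit_rows:
    print(row)
```

Every new queue adds another SELECT to the UNION, so this is the one report that grows with the number of groups.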

Secondly, this system will remain fast over time, since only the tickets that the users care about are stored in the table that they are accessing rather than cluttering up a single table with lots of CLOSED tickets.

Third, archival of old tickets is very easy, since all of them will be stored in the Closed_Tickets and Closed_Notes tables.

A fourth and very important advantage of this pattern is security. As mentioned in the Status Pattern, a very secure system should require individual permission levels inside of the database itself. If you do not understand this concept, read this decent guide to SQL injection, and pay close attention to the segregate users section. For this database schema, we would have a tickets_call_center user, a tickets_it user, a tickets_management user, etc. Each user will have only the permissions that they need to do their job. For example, if a client logs in to look at their ticket, the system would connect to the database with the “tickets_client” user. The tickets_client user will only have read permission on tables, so that even if the user is able to find an application vulnerability in the system, they would not be able to change any of the tickets. A call center employee would be connected to the database using the tickets_call_center user, and would have INSERT and SELECT access to the call_center_tickets and call_center_notes tables so that they could create tickets and notes, and then DELETE access to the call_center_* tables and INSERT access to other tables so that they could pass the ticket on to other groups. In the worst case scenario, if the call center employee turns malicious and finds an application level vulnerability, they would only be able to affect call center tickets, and these they would only be able to delete. This helps to contain the damage they can do.

Disadvantages

However, this extra speed and security does not come without some sort of price paid in overhead. As mentioned before, the Auditor user will have to use more complicated queries to view all of the tickets together. This is not usually a major concern. Secondly, the system is significantly more complex. If it is coded properly with the correct use of transactions, there will not be any lost tickets when moving tickets between statuses, but this requires good design and a lot of testing to make sure it is correct. Also, a lot of structure is duplicated here, which formal models for normalization may not take kindly to.

Pointer Pattern

As with any system, there are many possible solutions, and I cannot cover all possibilities; however, this last pattern is another alternative to the previous two, and is in a way a hybrid of the two. With this pattern, we would store all of the tickets in a centralized pair of tables called simply “Tickets” and “Notes”. However, instead of having a status field to manage the permissions, we can have a set of individual tables for each type of user. One could be named “call_center”, and would simply be a list of all of the Tickets.id values that correspond to tickets that are currently assigned to the call center. This pattern would be duplicated by tables such as “it”, “management”, etc. This allows us to emulate the real world process again because we can pass a ticket from one group to the next by removing it from one list and adding it to another. However, this pattern does not provide any advantage over the previous two since we cannot secure permissions between entities, and we are required to do JOINs across tables for any sort of request. I am mentioning it only because I have seen it used.
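For completeness, here is a small sketch of the pointer tables using Python's sqlite3 module. The column name "ticket" in the list tables, the helper names reassign and tickets_for, and the sample row are assumptions of mine:

```python
import sqlite3

GROUPS = ("call_center", "it")  # fixed list-table names, never user input

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tickets (
        id INTEGER PRIMARY KEY, customer VARCHAR(255),
        create_date DATE, problem TEXT
    );
    -- Each list table holds nothing but pointers to Tickets.id.
    CREATE TABLE call_center (ticket INTEGER PRIMARY KEY REFERENCES Tickets(id));
    CREATE TABLE it          (ticket INTEGER PRIMARY KEY REFERENCES Tickets(id));

    INSERT INTO Tickets VALUES (1, 'alice', '2008-01-15', 'Cannot log in');
    INSERT INTO call_center VALUES (1);
""")


def reassign(ticket_id, src, dst):
    """Pass a ticket on by removing it from one list and adding it to another."""
    if src not in GROUPS or dst not in GROUPS:
        raise ValueError("unknown group")
    with conn:  # both statements succeed or neither does
        conn.execute(f"DELETE FROM {src} WHERE ticket = ?", (ticket_id,))
        conn.execute(f"INSERT INTO {dst} (ticket) VALUES (?)", (ticket_id,))


def tickets_for(group):
    """Every listing needs a JOIN back to the central Tickets table."""
    if group not in GROUPS:
        raise ValueError("unknown group")
    rows = conn.execute(
        f"SELECT t.id, t.problem FROM Tickets t "
        f"JOIN {group} g ON g.ticket = t.id ORDER BY t.id"
    )
    return rows.fetchall()


reassign(1, "call_center", "it")
```

The ticket row itself never moves, which is why anyone who can read Tickets can still read every ticket regardless of which list it is on.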

Conclusion

Keep in mind that these examples are simplified. In the real world, you will need to take into account other factors, such as making sure that call center employees can still view older tickets in case the person calls back a long way down the line. Your choice of pattern depends a lot on personal preference, on the abilities of your DBMS, and on the application where it is being used. For example, some DBMSs may allow you to have more complex permission schemes, so that you can properly secure data in one table rather than having to separate it across multiple ones. When you take this into consideration, the Queuing pattern may not have as many advantages. Any professional application development requires some judgement on the part of the developer, and there is no silver bullet.

Pretty URLs – htaccess friendly url-to-action mapping

Thanks to djahandarie for pointing out a code error: PHP_SELF has been replaced with REQUEST_URI.

While I may not be a huge fan of Code Igniter, their approach to URLs is absolutely incredible. I honestly never knew before I started using it that it was even possible to append a bunch of stuff after a filename in the URL without using the question mark. For those of you who don’t know what I’m talking about, these two URLs will load the same file:

http://www.fugitivethought.com/index.php

http://www.fugitivethought.com/index.php/foo/bar

If you don’t believe me, try it out right now! What’s so great about this? Well first of all, it (at least mildly) looks like you are browsing folders. There are no ampersands or question marks mucking up the URL. The second cool thing is that a very simple mod_rewrite rule allows you to remove the index.php from there, to create a very pretty URL that would look like this:

http://www.fugitivethought.com/foo/bar

The .htaccess file that will allow this is taken directly from the Code Igniter user guide (http://codeigniter.com/user_guide/general/urls.html):

RewriteEngine on
RewriteCond $1 !^(index\.php|images|robots\.txt)
RewriteRule ^(.*)$ /index.php/$1 [L]

If you don’t get the importance of pretty, human-readable URLs then you can read more about it here: http://www.aardvarkmedia.co.uk/about/articles/007.html

Now in order to make this approach to URLs useful we need a parser inside of index.php with a very simple map that will allow us to tell it what class and function to call based on the different parts of the URL. While looking to implement the best parts of Code Igniter for a personal project, I wrote a script file that does exactly this.

Basically, you keep a set of “Controller” classes inside of a directory named Controllers. You can change the name of this directory by editing the require statement near the end of the code below. The $map variable is simply a mapping of what class and function to call based on the first part of the URL (in our above entries, this would be the “foo” portion). If there is no entry in the map to match this first part, then it will use the entry labeled “default”. When an entry matches, the script pops the first entry off the stack of ‘/’ separated portions of the URL and passes the rest of them to the function. So if we visit http://hostname/foo/bar/cat, the script will figure out which function to call for “foo”, and then call it, passing an array containing “bar” and “cat”. Any further redirecting of the URL can be done inside the function that is called.

// Define the path root for the application.
// Example: A site rooted at http://foo.com/bar/index.php would put bar/index.php here
define('_PAGE_ROOT_', 'bar/index.php');
// Parse the URL
$uri = substr($_SERVER['REQUEST_URI'], strlen(_PAGE_ROOT_) + 2);
$pgs = explode('/', $uri);
// Describe the URI map, further mapping can be done at each controller
// inside of the map method.
// default - load views.php and the Views class and call the home() method
$map['default']    = array('views',   'home');
$map['year']       = array('views',   'year');
$map['month']      = array('views',   'month');
// note the use of "map" here
$map['control']    = array('control', 'map');
$map['request']    = array('actions', 'request');
$map['csv']        = array('feeds',   'csv');
// Map the first section of the URI to a controller and action
if ( isset($map[$pgs[0]]) ) {
    $action = $map[$pgs[0]];
    array_shift($pgs);
} else {
    $action = $map['default'];
}
// Cleaner format
$controller_class  =  ucfirst($action[0]);
$controller_action =  $action[1];
$controller_params =& $pgs;
// Load the controller
require 'Controllers/'.$action[0].'.php';
// Instantiate the controller
$controller = new $controller_class;
// Call the action
$controller->$controller_action($controller_params);

or you can click here to download a local copy for perusal.

In the example code in the provided file, the map calls a function named “map” in the “control” class if the first string is “control”. The map function in the control class looks at the next entry in the stack of URL nibblets and calls a particular function for that. For example, it could call the “bar” method of the control class and pass it whatever it deems necessary.
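
A second-level map method of this kind might look roughly like the sketch below. The downloadable file is not reproduced here, so the method names (index, bar) and the whitelist are assumptions made purely for illustration.

```php
<?php
// Hypothetical controller: the method names below are illustrative
// and are not taken from the downloadable file.
class Control {
    // Called by the router with the URL segments that follow "control".
    function map($params) {
        // Pop the next segment off the front; use "index" when none is left.
        $next = count($params) > 0 ? array_shift($params) : 'index';

        // Only dispatch to methods listed here, so arbitrary URL text
        // can never pick an unintended method.
        $allowed = array('index', 'bar');
        if (!in_array($next, $allowed, true)) {
            $next = 'index';
        }
        return $this->$next($params);
    }

    function index($params) {
        return 'control index';
    }

    function bar($params) {
        return 'bar called with: ' . implode(', ', $params);
    }
}

// Simulates a visit to http://hostname/control/bar/cat/dog
$controller = new Control;
echo $controller->map(array('bar', 'cat', 'dog')), "\n";
```

Keeping an explicit list of allowed methods is a design choice of this sketch; it trades a little convenience for not letting the URL call any public method on the class.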

One major advantage of using this over the Code Igniter code is that I am not artificially blocking using GET in the URL. If you want to have a URL like:

http://hostname/foo/bar/?blah=moo&moreblah=moremoo

then you are perfectly welcome to do so. The controller functions will always be called and can use PHP’s built-in $_GET global just as any normal PHP application would. And you still get to have pretty URLs.
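
One detail to watch when mixing the two styles: $_SERVER['REQUEST_URI'] includes the query string, so a router that simply splits it on ‘/’ will see ?blah=moo as part of the last segment unless it is removed first. The helper below is a hypothetical sketch (the name path_segments and its parameters are mine, not part of the original script) of stripping the query string and the application root before splitting.

```php
<?php
// Hypothetical helper: split a request URI into path segments,
// leaving any query string for PHP's own $_GET handling.
function path_segments($request_uri, $page_root) {
    // Drop everything from the first "?" onwards.
    $query_start = strpos($request_uri, '?');
    $path = ($query_start === false)
        ? $request_uri
        : substr($request_uri, 0, $query_start);

    // Remove the application root (for example "bar/index.php") from the front.
    $path = ltrim($path, '/');
    if ($page_root !== '' && strpos($path, $page_root) === 0) {
        $path = (string) substr($path, strlen($page_root));
    }

    // Split on "/" and discard empty pieces left by leading or trailing slashes.
    $segments = array();
    foreach (explode('/', $path) as $piece) {
        if ($piece !== '') {
            $segments[] = $piece;
        }
    }
    return $segments;
}

// Yields the two segments "foo" and "bar"; the query string is ignored.
print_r(path_segments('/bar/index.php/foo/bar/?blah=moo&moreblah=moremoo', 'bar/index.php'));
```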

Prado Benchmark

As you may see from previous posts, I have been spending a lot of time lately in search of the ultimate framework for PHP. I have been involved with a team developing an open source component-oriented framework since the beginning of last summer, because the only equivalent framework we could find was Prado, and the team lead was very dissatisfied with its load and performance times. He tested it when version 3 had just been released, so we were unsure if performance-improving patches would be applied later. The tests were run on RedHat Enterprise Server and Suse, both on dual Xeon 2.0GHz servers with 4GB RAM and 15k RPM SCSI drives, and the load time for the blog app provided with Prado (not a very complicated app) was between .3 and .4 seconds per page. This is not horrible, but scaled to a large user base it becomes impractical.

Since this benchmark was a couple of years out of date, I decided to re-benchmark it using the latest Prado version 3.1.1r2290. This benchmark is run in Windows Vista on a dual core Intel Centrino with 3GB of RAM. Using the same application, the load time was on average .5 seconds (ranging between .48 and .53). Compare this with the FugitiveThought home page load time of .02 seconds (without caching), and we see a massive discrepancy in speeds.

With Prado, the index.php file that handles the loading of the framework and instantiation of the application has a run time of .085 seconds on average (between .08 and .10 seconds). This means that even before we get to the application itself, Prado takes up more time than a directly written page. While these times can be smoothed over by the proper use of caching, this only applies to some pages. Often, the rich, interactive applications that you would be using Prado and its components for will not be cacheable.
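
The post does not say how these timings were taken. As a rough sketch of one common approach, the snippet below wraps a piece of work in microtime(true) calls and averages several runs. The names time_call and fake_page_load are hypothetical, and the usleep call merely stands in for loading a framework and rendering a page.

```php
<?php
// Hypothetical helper: time a callable and return the elapsed seconds.
function time_call($callable) {
    $start = microtime(true);   // current time as a float, in seconds
    call_user_func($callable);
    return microtime(true) - $start;
}

// Stand-in for "load the framework and build the page".
function fake_page_load() {
    usleep(20000); // pretend the page takes roughly 0.02 seconds
}

$runs = 5;
$total = 0.0;
for ($i = 0; $i < $runs; $i++) {
    $total += time_call('fake_page_load');
}
printf("average: %.3f seconds over %d runs\n", $total / $runs, $runs);
```

Averaging over several runs, and reporting the range as the post does, helps separate the framework's cost from one-off noise on the machine.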

Smarty Caching in Code Igniter

My review of Code Igniter drew in a lot of email requests for the Smarty caching integration code, so I’ve made an article specifically to address that issue. First of all, my claim of a factor of 15 on the performance gained is not something pulled out of thin air. I monitored Apache’s use of the processor during page load requests. Each page load was for a single Code Igniter driven page which ran a single MySQL query. The request bumped Apache up from 0% processor usage to 3.0% usage (as monitored using top from the console). With caching enabled, a cache miss still only went up to 3.0% processor usage, so there was only negligible overhead for the cache checking, and the cache generation. A cache hit on the same page bumped Apache up to only 0.2% processor usage. Doing the math of 3.0 / 0.2 gives me the factor of 15 that I claimed in the original article.

In order to make the caching the most effective, we want the cache hit to reduce PHP run time to a bare minimum. To this end, I have the cache mechanism running off of a pre_system hook in Code Igniter. My management class is named CacheManager, so my hooks.php file in /system/application/config has in it:

$hook['pre_system'] = array(
    'class' => 'CacheManager',
    'function' => 'do_cache',
    'filename' => 'CacheManager.php',
    'filepath' => 'hooks',
    'params' => array()
);

The template for the cache manager itself is shown below, and is saved inside of CacheManager.php in the /system/application/hooks directory.

class CacheManager {
    function do_cache() { }
}

do_cache is the method that gets called at the beginning when the system is loaded. So far this is pretty basic and can be taken directly from the Code Igniter manual entry on hooks. The issue we have here is that since Code Igniter itself is not yet created or prepared, we cannot use any of its handy library loading functions to load Smarty and check for the existence of a cached version of the requested page. In fact, we can’t even use Code Igniter to parse the URL and figure out which page we are checking the cache for. If I had written a generic cache manager that could apply to all web sites automatically, I could have just provided the file and there would be no need for this explanation. Unfortunately, you have to read this tutorial and then build your own cache manager for your particular website.

For this example, we are going to assume a basic blog application. We have a front page accessible through http://myserver/, we have individual blog entries accessible through http://myserver/story/The_Title_Of_The_Article, and we have an admin section where we post and edit stuff accessible from http://myserver/admin. We want the blog entries that other people can see to be cached, so that if we have a popular story, it does not bring our web server to its knees. The admin section will not be cached, because (a) we want only ourselves to see it when we log in, and (b) it is not high traffic enough to warrant the complexity of caching.

The first thing we need to do is load up the Smarty library itself. Since we cannot use Code Igniter’s methods, we have to load it directly. If you followed the instructions from here for using Smarty with Code Igniter, then the following code will work to load Smarty. It may need tweaking if you have a different method for using Smarty.

$file = dirname(__FILE__).'/../libraries/Smarty-2.6.18/libs/Smarty.class.php';
require $file;
if ( !defined('BASEPATH') )
    define('BASEPATH', dirname(__FILE__).'/../../');
class CacheManager {
    function do_cache() {
        // Turn on Smarty
        $s = new Smarty();
        $s->caching = true;
        $s->cache_dir = BASEPATH . 'cache/smcache/';
        $s->template_dir = BASEPATH . 'application/views/';
        $s->compile_dir = BASEPATH . 'cache/';
    }
}

Code Igniter already has a directory for some caching, but it is limited to compiled PHP and Smarty templates and such, so I created a directory inside it to hold the results of static HTML caching. So we now have a web server writable directory in /system/cache/smcache/. The above code loads Smarty and sets up all of the proper paths (the defaults used by the previously mentioned Smarty library for Code Igniter). It also turns on caching. Next we need to parse the URL. Since all of this work will be done behind the rewrite rules, we are guaranteed to have an index.php somewhere in the URL, and we can use this to locate ourselves in the URL. We add code to do_cache to make it look like this:

$file = dirname(__FILE__).'/../libraries/Smarty-2.6.18/libs/Smarty.class.php';
require $file;
if ( !defined('BASEPATH') )
    define('BASEPATH', dirname(__FILE__).'/../../');
class CacheManager {
    function do_cache() {
        // Turn on Smarty
        $s = new Smarty();
        $s->caching = true;
        $s->cache_dir = BASEPATH . 'cache/smcache/';
        $s->template_dir = BASEPATH . 'application/views/';
        $s->compile_dir = BASEPATH . 'cache/';
        // Parse the URL into sections.
        $fields = explode('/', $_SERVER['PHP_SELF']);
        $base = array_search('index.php', $fields) + 1;
    }
}

This breaks the URL down into sections, finds index.php and then starts looking one step ahead of it. If our base URL is http://myserver/index.php, we will then have whatever comes after index.php be the base. For the home page (/), the rest of the array will be empty. For an individual story, we will have two more entries, one with “story” and one with the title of the story. For the admin page, we will have one or more entries, with the first entry being “admin”. By our schema, if we are looking at “admin”, then we are not caching at all. If we are looking at an empty entry, then we know we are on the front page and it is a basic cache. If we have “story” as the entry, we are caching each story individually and the unique ID for the cache entry will be the title itself. With this as the basis, we will have the following code:

$file = dirname(__FILE__).'/../libraries/Smarty-2.6.18/libs/Smarty.class.php';
require $file;
if ( !defined('BASEPATH') )
    define('BASEPATH', dirname(__FILE__).'/../../');
class CacheManager {
    function do_cache() {
        // Turn on Smarty
        $s = new Smarty();
        $s->caching = true;
        $s->cache_dir = BASEPATH . 'cache/smcache/';
        $s->template_dir = BASEPATH . 'application/views/';
        $s->compile_dir = BASEPATH . 'cache/';
        // Parse the URL into sections.
        $fields = explode('/', $_SERVER['PHP_SELF']);
        $base = array_search('index.php', $fields) + 1;
        // Base case assuming that nothing beyond index.php is provided
        if ( count($fields) <= $base )
            $fields[$base] = '';
        // Ignore these entries, since they are not cached
        if ( in_array($fields[$base], array('admin')) )
            return;
        // Start with no template, so unrecognised URLs fall through to Code Igniter
        $page = NULL;
        $id = NULL;
        // If we have nothing, then this entry is cached with no cache id
        if ( $fields[$base] == '' ) {
            $page = 'home.tpl';
        }
        // If we are looking at a story, then entry is cached with a cache id
        if ( $fields[$base] == 'story' && isset($fields[$base + 1]) ) {
            $page = 'story.tpl';
            $id = $fields[$base + 1];
        }
        // Check if we have the entry cached
        if ( $page && $s->is_cached($page, $id) ) {
            $s->display($page, $id);
            exit;
        }
    }
}

From this code we basically have the entirety of the schema. It figures out which template and cache id to use and then checks if it is already cached. If it is, then it displays the cache and exits immediately. If it is not cached, or we do not try to cache it, then it simply returns and Code Igniter continues loading and operating as normal. There are two more parts to make the caching work. We need to make sure caching is enabled in the Smarty object that Code Igniter will use, and we need to generate the cache files. Cache file generation is exactly the same as it normally is with Smarty; the cache is generated when you call $smarty->display(‘page.tpl’, ‘cache_id’), and the id of the cache entry is optional. For a more in depth read on this last part, you can read the Smarty documentation. As for turning on caching in Smarty itself, we need to modify the Mysmarty library that is included from Code Igniter. If you followed the instructions from http://devcha.blogspot.com/2007/12/smarty-as-template-engine-in-code.html for installing Smarty with Code Igniter, then open up /system/application/libraries/Mysmarty.php and below the $this->compile_dir declarations, add in the following lines:

$this->caching = false;
$this->cache_dir = (!empty($config['smarty_cache_dir']) ?
                           $config['smarty_cache_dir']
                         : BASEPATH . 'cache/smcache/');

This will configure the caching to use the same directories as described above. If you have used a different method for combining Smarty with Code Igniter, then you are basically on your own, although the same changes will apply.

Code Igniter Review

Code Igniter is a framework for developing PHP applications. As a framework, it is designed to take a lot of the monotony and repetition out of coding by providing pre-packaged solutions to a lot of common problems. Code Igniter is designed around the MVC (Model-View-Controller) design pattern. The first step in learning Code Igniter is to realize that requests no longer go directly to the PHP file you write. Instead they go to the framework’s loading file (index.php), which parses the rest of the URL, figures out which controller is supposed to handle the request, loads the controller PHP file that you wrote, and passes it all of the information needed to handle the request.

The default URLs for a Code Igniter driven application are of the format http://your-server/the-site/index.php/foo/bar. Through a trick of mod_rewrite, you can use the cleaner URL of http://your-server/the-site/foo/bar. This is one point that is a definite plus for Code Igniter; the URLs it produces are extremely clean and pretty. There are some difficulties with this practice, however: I have had issues when trying to have some controllers get redirected using the internal _remap function while leaving others to their defaults. Another major issue I had was that Code Igniter does not allow the use of GET parameter passing. Not only does it prohibit the use of a question mark in a URL (it errors out and displays a friendly message saying that your URL has invalid characters in it if you try), but it also calls unset on $_GET inside the framework before it gets to your controller.

There is a config option to re-enable the use of GET, but for some reason this automatically disables all of those pretty URLs and forces you into using index.php?c=foo&m=bar instead of the original /foo/bar. This is terribly aggravating for anything that you actually need to use GET for when you still hope to have pretty URLs. I used Code Igniter on a fairly large project which had six different parameters to filter pages by, each of them optional. Using GET variables this is fairly easy since the presence or absence of the variable indicates whether the parameter is being used. Rewriting this into a URL schema without GET is much more difficult. I understand that disabling GET is a design decision to help keep the URLs pretty and enforce good practices by not allowing people to actually upload data through GET, but this particular instance was a perfectly valid use of GET.
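
One way to carry optional filters without GET is to encode them as name/value pairs in the URL segments. The sketch below is not what I actually used on that project; the function name segments_to_filters and the filter names are hypothetical, and it assumes the segments arrive as a plain array, as they would from a controller’s parameters.

```php
<?php
// Hypothetical helper: turn "/name/value/name/value" style segments into an
// associative array of filters. The filter names are invented for illustration.
function segments_to_filters($segments, $allowed) {
    $filters = array();
    // Walk the segments two at a time: a name followed by its value.
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $name = $segments[$i];
        if (in_array($name, $allowed, true)) {
            $filters[$name] = urldecode($segments[$i + 1]);
        }
    }
    return $filters;
}

$allowed = array('year', 'month', 'category');
// Simulates the segments after the controller and method in a URL
// ending in .../year/2007/category/music
$filters = segments_to_filters(array('year', '2007', 'category', 'music'), $allowed);
print_r($filters);
```

As with GET, the presence or absence of a name indicates whether that filter is in use, so all six parameters can stay optional. The cost is that every link has to be built in this pair format.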

Code Igniter provides a method inside of routes.php that allows you to use regular expressions to redirect requests in the URLs to a specific controller and method. They don’t provide much help to explain how to do it, so I assume that using it is a fairly non-standard practice. Some of the rewrites that should have been incredibly easy were not. For example, a pair of rules like

$route['year'] = "events/year";
$route['year/(.*)'] = "events/year/$1";

is needed to handle a URL like /year if there can be anything else coming after year. If you have only the first rule and you go to http://your-server/the-site/year/2007, then the server will have the default controller and default method handle the request instead of events->year(). If you have only the second rule, it will have http://your-server/the-site/year/ be handled by the default controller/method. This last issue does not make sense to me at all, since the .* should mean that it will handle something OR nothing.
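
A possible explanation, which I have not confirmed against the framework source, is that the router anchors each route key as a whole-string pattern and matches it against the URI with the leading and trailing slashes already removed. Under that assumption the behaviour can be reproduced with plain preg_match; the helper name route_matches is hypothetical.

```php
<?php
// Assumption (not confirmed by the article): the router anchors each route
// key as a whole-string pattern and matches it against the URI with the
// leading and trailing slashes removed.
function route_matches($route_key, $uri) {
    $uri = trim($uri, '/');
    return preg_match('#^' . $route_key . '$#', $uri) === 1;
}

var_dump(route_matches('year/(.*)', '/year/2007')); // bool(true)
var_dump(route_matches('year/(.*)', '/year/'));     // bool(false): only "year" is left, with no "/" to match
var_dump(route_matches('year', '/year/'));          // bool(true)
var_dump(route_matches('year', '/year/2007'));      // bool(false)
```

If that is what happens, the .* is indeed free to match nothing, but the literal slash in front of it is not, which would be why both rules are needed.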

Code Igniter tries to be very lightweight, and to some degree it does succeed. But given a lot of the other features that have been included, I was frankly amazed that they did not integrate a decent templating engine. Thankfully, some nice third parties have created Smarty libraries that you can add to your Code Igniter project. I highly recommend using one, especially if you work now or intend to work with a separate designer in the future.

Another nice feature of Code Igniter is their use of Active-Record style database queries. You can write your own complete queries, or use the abstraction that they include so that your application can be reasonably database independent and porting to other database servers in the future will not be difficult. Their abstraction system has almost all of the features that I would want in an abstraction layer, and the way it handles results is quite clean. More details on that can be found in the Database help in the Code Igniter user manual.

Speaking of the User Manual, this is one area where Code Igniter is pretty much unbeatable. The user manual itself is very well written and organized. I am a very big fan of the Table Of Contents drop down on every page, and they have a plethora of examples on almost every topic. The wiki helps out some too, although it is much harder to navigate.

Code Igniter is supposed to be expandable, and it provides hooks into the different sections of the framework while it runs to allow you to perform actions at different times. This is very useful, but the first time I had occasion to use a hook, I found that it did not have a hook at the point I needed, and I ended up having to make an ugly workaround to handle it. I was trying to integrate Smarty’s caching system into the rest of the application, since Code Igniter’s own caching is poorly documented and does not seem to be as customizable. Since the cache needs to know which page is going to be displayed before it can check for that cache file, I needed a hook after the controller and method had been selected and the controller instantiated, but before the method was called. Since this was not provided, I ended up having to write a hook into the earliest part of the framework (before the framework itself is even loaded). In retrospect, this probably ended up better after all, since the caching mechanism kicked in faster and significantly reduced the PHP run time and the server load (by a factor of 15!).

Overall, I give Code Igniter a 7.5 out of 10. It has a lot of good features, it documents them well, and it does fairly well at enforcing good practices. For small projects and websites with fairly simple to organize data, you can have a full blown website with administration sections and database driven page generation in very little time. For more complicated data, Code Igniter seems to break down and I ended up having a little too much bloat. As Code Igniter gets more use, I am sure a lot of these wrinkles will get ironed out. I would rate it lower, but the fact that the internal code is clean and that the documentation is so good are both signs that it is a well organized project and as such, improvements are virtually guaranteed to come.