Storing PHP Sessions in CouchDB

One of the more common architectural tasks when designing a web-based system that you expect to scale horizontally is deciding how to handle and store sessions. By default, each front-end server runs its own PHP install and stores session data on its local disk instead of sharing it between servers. This creates a split-brain problem: a session started on one server is invisible to the others.

Common solutions include forcing users to always be routed to the same web server (“sticky sessions”), so that they end up back on the server that initiated and is storing their session data, and using a common storage area for the servers’ session data. Sticky sessions can be a pain to set up, decrease the fault tolerance of your front-line web servers, cause a bad user experience when a user gets routed to the “wrong” server, and don’t take full advantage of any load balancing you’re doing.

There are some load balancers that will store sessions at their layer of the architecture, but we’ll assume you’ve got a fairly small architecture or don’t want to keep bolting on complexity.

Since your application likely already has a common persistence layer (a la database) that you put thought into making fault tolerant, and you obviously chose CouchDB, we’re going to look at storing your session data there. Here’s a quick breakdown of how we’re going to do it:

  • We’re going to use PHP’s session_set_save_handler(), which lets us provide callbacks (functions) for opening, closing, reading, writing, destroying, and garbage collecting sessions.
  • Connecting to CouchDB will be done with Sag v0.4. Our session CRUD functions map nicely to HTTP’s verbs, which Sag directly exposes to us.
  • Each session will be stored in its own document, using PHP’s session ID as the document’s _id. It will also store when the session was last written to for garbage collection purposes. We would be worried about index and database sizes when using large IDs, but garbage collection of expired sessions will constrain the file sizes nicely. And more sessions should mean more users, and therefore cash and fame, so that’s a nifty problem to have.
  • The database name will be the session name (from session_name()). That means we can have different applications, or sub-applications, using this methodology more easily by having them use their own session names.
  • Because a session can see a lot of I/O, we don’t want to go back to the database every time, especially since CouchDB’s MVCC architecture requires us to retrieve a document from the server (to learn its current revision) before updating it. To protect against this we’re going to use Sag’s MemoryCache, which stores the document’s object in memory during the script’s execution. Sag’s caching uses ETags to cache the docs locally, much like your web browser does with web pages.
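The callbacks described in the list above can be sketched as follows. This is a minimal illustration, not the CouchSessionStore code: an in-memory array stands in for CouchDB, and the class name ArraySessionStore and the document fields "data" and "writtenAt" are assumptions made for the example. Each callback notes the HTTP verb it would roughly correspond to against a real CouchDB server.

```php
<?php
// Sketch only: an in-memory array stands in for the CouchDB database.
// ArraySessionStore and the "data"/"writtenAt" fields are hypothetical names.
class ArraySessionStore
{
  // session ID => document, playing the role of the database
  private static $docs = array();

  public static function open($savePath, $sessionName)
  {
    return true; // a real store would connect and select the database here
  }

  public static function close()
  {
    return true;
  }

  // GET: return the serialized session, or '' if there is none
  public static function read($id)
  {
    return isset(self::$docs[$id]) ? self::$docs[$id]['data'] : '';
  }

  // PUT: one document per session, keyed by the session ID
  public static function write($id, $data)
  {
    self::$docs[$id] = array(
      '_id' => $id,
      'data' => $data,
      'writtenAt' => time() // timestamp consulted by garbage collection
    );
    return true;
  }

  // DELETE: drop the session's document
  public static function destroy($id)
  {
    unset(self::$docs[$id]);
    return true;
  }

  // Remove every document last written more than $maxLifetime seconds ago
  public static function gc($maxLifetime)
  {
    $cutoff = time() - $maxLifetime;
    foreach (self::$docs as $id => $doc) {
      if ($doc['writtenAt'] < $cutoff) {
        unset(self::$docs[$id]);
      }
    }
    return true;
  }
}
```

To put callbacks like these to work, they would be handed to session_set_save_handler() in the order open, close, read, write, destroy, gc, before session_start() is called.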

The Code: class CouchSessionStore

Refer to the code here: http://gul.ly/zr (GitHub)

CouchSessionStore is set up to have little impact on your application’s code, exposing a series of static functions that act as PHP’s session CRUD callbacks. These callbacks are provided to PHP with session_set_save_handler() at the bottom of the file, after the class definition. One great place for improvement of this class would be to use the factory design pattern to set up CouchSessionStore and call session_set_save_handler(), moving this work out of the global scope.

There’s an additional hook at CouchSessionStore::setSag($sag) that accepts an initialized Sag object. This means you can specify a different SagCache implementation, use your own server info and credentials, etc., overwriting the default configuration. If you pass NULL to CouchSessionStore::setSag() it will revert to its default Sag configuration. The only thing that you cannot change through this hook is the database name: CouchSessionStore will always set this to the PHP session name in lower case, decreasing the risk of bugs.

If you really want to use a different naming scheme you can extend CouchSessionStore and re-implement setSag($sag), like this:

<?php
require_once 'CouchSessionStore.php';

class SuperCouchSessionStore extends CouchSessionStore
{
  public static function setSag($sag)
  {
    //let CouchSessionStore set everything up; $this is not available in a
    //static method, so work with the Sag object the parent returns
    //(this assumes parent::setSag() returns it, as our own return implies)
    $store = parent::setSag($sag);

    //overwrite the baked in database naming, creating it in Couch if it doesn't exist
    $store->setDatabase('super-database-name', true);

    //obey our parent class's definitions
    return $store;
  }
}
?>

Design Document Creation

One of the really neat things that we do is check whether our design document, the index that maps creation times to documents for easy garbage collection, exists when we open the session, creating it if it does not. This is a great example of the power a schemaless database gives you – we do not have to worry about deploying new schemas or too much about what the data will look like before developing our application.

This also allows us to roll out new design document code as we develop our application, baking our “schema” and querying into our application’s versioning. For example, instead of just checking whether the design document exists, we could retrieve it and compare its map/reduce code to ours, sending the new code if it does not match. You could also store your application’s version in the design document and compare against that if you are worried about earlier code versions overwriting your newer versions’ code.
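The compare-and-update idea can be sketched as below. It is an illustration under assumptions, not what CouchSessionStore does: the store is a plain PHP array rather than CouchDB, and the function names, the design document ID "_design/sessions", the view name "byTime", and the "writtenAt" and "appVersion" fields are all hypothetical. Only the overall layout (an ID starting with "_design/" and map functions listed under "views") is CouchDB's.

```php
<?php
// Sketch only: $db is a plain array standing in for a CouchDB database.
// wantedDesignDoc, ensureDesignDoc, "_design/sessions", "byTime",
// "writtenAt" and "appVersion" are hypothetical names for this example.

// The design document this release of the application expects to find
function wantedDesignDoc()
{
  return array(
    '_id' => '_design/sessions',
    'appVersion' => 2,
    'views' => array(
      'byTime' => array(
        'map' => 'function(doc) { emit(doc.writtenAt, null); }'
      )
    )
  );
}

// Returns 'created', 'updated', or 'unchanged'
function ensureDesignDoc(array &$db, array $wanted)
{
  $id = $wanted['_id'];

  // no design document yet: send ours
  if (!isset($db[$id])) {
    $db[$id] = $wanted;
    return 'created';
  }

  $current = $db[$id];

  // never let an older release overwrite a newer release's views
  if ($current['appVersion'] > $wanted['appVersion']) {
    return 'unchanged';
  }

  // same or older version stored: replace it only if something differs
  if ($current['views'] != $wanted['views']
      || $current['appVersion'] != $wanted['appVersion']) {
    $db[$id] = $wanted;
    return 'updated';
  }

  return 'unchanged';
}
```

Against a real server the update would also have to carry the stored document's current _rev, which the retrieval step provides.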
