Achieving highly available HTTP database session persistence spanning multiple cells with Websphere v6.1

By 01

Sunday, August 24, 2008

This article represents the first in a series of thoughts I'm hoping to put together in a pseudo-organized fashion to create a new Websphere and J2EE blog/website.  I tend to sprinkle my articles with a combination of the technical content and opinions I am attempting to convey as well as a running commentary on the state of the universe.  For those who know me, it may cut down on the extra details I throw into general conversation; we'll see if that happens.

For the past eighteen months or so, I have been working with IBM Websphere v6.1.  I've completely left this world out of any articles that I have written during this time; of course, there haven't been many.  A project that I worked on for some time required significant high availability from its J2EE application server environment--namely, Websphere.  Previously, my J2EE experience both as a developer and administrator was with Weblogic (v7.0 & v8.1).  I had always known Websphere, through stories and folklore, to be somewhat unwieldy compared to Weblogic.  So, when an opportunity presented itself to get my hands dirty, I was skeptical, but ready to learn.  Since then, I would characterize Websphere as a more complicated beast than Weblogic, but it can also do some very interesting things.

Technology companies tend to come in two flavors: companies started by engineers and companies started by marketing people.  The first provides documentation for all of its products online without requiring an account, blood sample, or dime.  The latter will have a very neat website void of all technical detail--until you sign an expensive support contract.  In a similar fashion, software products come in two forms: software written by developers and software written by administrators (or, at least, people who have been administrators).  The first tends to be an engineering marvel the likes of which the world has never seen (and may never see again), which exudes complexity; the latter, in my experience, tends to accomplish the required functionality in the cleanest way possible.  This is how I would describe Websphere and Weblogic, respectively.  Again, while Websphere is more complicated, one can do many things with it.

IBM documentation describes a number of Websphere v6.1 features that are required to provide a highly available Websphere installation: Proxy Plugin (for iPlanet, Apache, IIS, etc.), HTTP Session Replication, LTPA to propagate user credentials, etc.  We are going to talk about HTTP Session Persistence in Websphere v6.1 today.  There are two ways you can accomplish this: Memory-to-Memory Persistence or Database Persistence.

HTTP Session JDBC Persistence is a feature of Websphere v6.1; it is referred to as database session persistence in the Websphere documentation.  In its purest form, this feature can be used to persist HTTP Session data (from here on referred to simply as session data) so that, in the event of a container failure, another container in the same cluster can seamlessly take over for the first.  How future requests are routed away from the failed container to a healthy container is outside this discussion's scope.  More interestingly, the database session persistence feature can also accommodate failover scenarios where requests are routed to containers in a different cluster (same cell) or even containers in a different cluster in a different cell.

The best approach to building a highly available Websphere environment is beyond the scope of this discussion, but how to configure session database persistence to work with containers running the same code in different clusters in different cells is what we are here to discuss.  I chose this scenario because it is, in my opinion, the most complex and most interesting from a feature standpoint.

Before we go into the more complex setup, let's look at how database session persistence would work within one cluster in one cell.  As with all things, there will be some assumptions--to keep things simple.  This discussion is based upon the Websphere v6.1 Managing HTTP Sessions documentation.  For our discussion, Session Management entails session tracking, session recovery, and session clustering[1].  Let's assume:


  1. There is one cell.
  2. This cell has a cluster with two containers.
  3. These two containers reside in different nodes on different physical machines.
  4. The cell has three nodes: Deployment Manager node, and two application nodes.
  5. There is an application, which we'll call /tester1, that initializes an HTTP Session.


What I have written here is not a step-by-step tutorial.  The IBM documentation generally provides that.  This article suggests an advanced configuration that the documentation doesn't necessarily cover.

So, our goal is to have data stored in HTTP Session on container1 in Node1 available to a future request that may be directed to container2 in Node2.  This is accomplished by configuring the Websphere Web Container to store session data in a database table.  In theory, any SQL database that has a supported JDBC provider could be used to store HTTP Session information.  In practice, Oracle and DB2 are probably the most common--IBM has an understandable propensity towards DB2.

In whatever database you choose to use, a table called SESSIONS and an index need to be created, as defined in the SESSIONS table definition[2].  That reference only defines the table for DB2, but with a little know-how it can be modified to work with any SQL database; otherwise, open a PMR with IBM to get a table definition for your database.  It is possible to rename this table; however, in early versions of 6.1.x.x, it wasn't possible to change the table name when using Oracle.

I recommend configuring the JDBC Provider and Datasource at the server scope, so that each individual server will have its own set of database connections.  This is more work up front, but it eliminates the possibility of one container consuming all of the database connections that would otherwise be shared by all servers in the cluster.  See the IBM instructions for configuring a new JDBC Provider[3] and Datasource[4].  Make sure the datasource has access to the SESSIONS table described in the previous paragraph.  Also, make sure that the datasource is configured to use a non-XA driver.

The use of an HTTP Session Persistence store is configured for each individual web container.  For instructions regarding how to do this, see the IBM documentation on configuring for database session persistence[5].  The details of doing this are also outside of the scope of this discussion.

I prefer to track HTTP Sessions with cookies rather than the other available methods, and I try to track sessions in each cluster with a unique JSESSIONID cookie.  You can, however, use any of the available methods.

You also need to choose how session information will be written to the database.  There is a trade-off between reliability and performance.  Websphere can write every value stored in session to the database at the conclusion of every request (most reliable failover, highest performance penalty), or it can queue modified parameters for a batch update to the database later (not a perfect failover, but performs well).  There are also scenarios between these two extremes, and for most installations one of them would probably be sufficient.
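
To make the trade-off concrete, here is a minimal toy model in Java.  It is not Websphere code, and the class, enum, and method names are all made up for illustration; the two policies simply stand in for "write at the end of every request" and "queue changes for a later batch update".  It counts database round trips and shows what another container could recover if the first one failed at a given moment.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names): contrasts writing session data at the
// end of every request with deferring writes to a later batch flush.
class WriteFrequencySketch {
    enum Policy { END_OF_REQUEST, BATCHED }

    private final Policy policy;
    // Stand-in for the SESSIONS table.
    private final Map<String, String> database = new HashMap<>();
    // Attribute changes that have not been written to the database yet.
    private final Map<String, String> pending = new HashMap<>();
    private int databaseWrites = 0;

    WriteFrequencySketch(Policy policy) {
        this.policy = policy;
    }

    // Simulates one request that sets a single session attribute.
    void handleRequest(String name, String value) {
        pending.put(name, value);
        if (policy == Policy.END_OF_REQUEST) {
            flush();
        }
    }

    // Writes all pending changes in one database round trip.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        database.putAll(pending);
        pending.clear();
        databaseWrites++;
    }

    // What another container could recover if this one failed right now.
    Map<String, String> recoverableAfterFailure() {
        return new HashMap<>(database);
    }

    int databaseWrites() {
        return databaseWrites;
    }
}
```

In this model, three requests under END_OF_REQUEST cost three database writes and everything is recoverable at any point.  Under BATCHED, the same three requests cost nothing until flush() runs, but a failure before the flush loses all three changes.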

At this point, you should be able to access the /tester1 application, have a JSESSIONID cookie set to track HTTP Sessions for each user, reliably persist session data to a database table, and recover session data from the table in another container in a failover scenario.  All of this happens among a cluster's member containers (what IBM refers to as session clustering).  Now, we are going to extend this concept beyond the local, cluster level to containers that are not only in a different cluster, but also in a different cell.

This brings us to another list of assumptions.  Let's assume:

  1. There are now two cells that satisfy the assumptions listed in scenario one.
  2. Each cell's containers have a Datasource pointing at the same database schema.
  3. The /tester1 EAR is deployed to the cluster in each cell.
  4. Each server is configured to use the same JSESSIONID cookie name.
  5. A homogeneous configuration exists across each server's session persistence configuration.
  6. The same Virtual Host is configured and used by /tester1 in each cell.
  7. The same context root is used by the application in each cell, cluster, and container.
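
Several of the assumptions in the list above boil down to "these values must match in both cells."  As a rough sketch, that checklist could be expressed in Java like this.  The class and field names are hypothetical and do not correspond to any Websphere API or configuration file; the values would have to be collected by hand or by your own scripts.  It also covers only a few of the settings, not the full session persistence configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checklist helper; the field names are illustrative only and
// are not Websphere settings or API names.
class CellSessionSettings {
    final String cellName;
    final String cookieName;
    final String virtualHost;
    final String contextRoot;
    final String sessionSchema;

    CellSessionSettings(String cellName, String cookieName, String virtualHost,
                        String contextRoot, String sessionSchema) {
        this.cellName = cellName;
        this.cookieName = cookieName;
        this.virtualHost = virtualHost;
        this.contextRoot = contextRoot;
        this.sessionSchema = sessionSchema;
    }

    // Returns the names of the settings that differ between two cells.
    // The cell name is expected to differ, so it is not compared.
    static List<String> mismatches(CellSessionSettings a, CellSessionSettings b) {
        List<String> out = new ArrayList<>();
        if (!a.cookieName.equals(b.cookieName)) out.add("cookieName");
        if (!a.virtualHost.equals(b.virtualHost)) out.add("virtualHost");
        if (!a.contextRoot.equals(b.contextRoot)) out.add("contextRoot");
        if (!a.sessionSchema.equals(b.sessionSchema)) out.add("sessionSchema");
        return out;
    }
}
```

An empty list from mismatches() means the two cells agree on the values this sketch knows about; anything else names a setting to fix before relying on cross-cell recovery.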

The two-cell assumption creates a situation where there are essentially two copies of the same cell running in an environment.  The only items that may be different are hostnames, IP addresses, and ports.  Using the same naming conventions, the mirror containers in each environment should have the same names--I've found that keeping track of what is happening is easier this way.  Just remember to refer to a cell name when discussing containers.

Likewise, there will have to be some kind of load balancer that can redirect requests from the primary cell's containers to the secondary cell's containers.  Such functionality exists in several of the hardware load balancer product offerings currently available on the market.  This is outside the scope of the current discussion, but would be an interesting topic to explore in the future.  For our purposes, assume that this is possible.

Once a failover event has occurred, future requests for a user's session will be routed to a new container--in the secondary cell.  In order to provide a seamless user experience, the new container must be able to find the user's session information in the SESSIONS table.  A side effect of how IBM implemented database session persistence makes it possible for this to occur.

The SESSIONS table definition includes the following three columns:

    ID
    PROPID
    APPNAME

These columns could be used to create a primary key, if you wanted to create an index this way instead of using the original IBM schema.  The ID column represents the user's session tracking identifier--this is part of the value stored in the JSESSIONID cookie.  The PROPID column is a property name.  If the application code places a name:value pair in HTTP Session with a name of property1, this field would contain the value 'property1'.  The APPNAME field is the concatenation of the Virtual Host and context root used by the /tester1 application.  For a generic URL, the context root can be defined as follows:

    http://host:port/context-root/URI

The context-root is a configurable parameter in Websphere and most J2EE Containers--it generally defaults to the name of the WAR file.
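
As a small illustration, here is a Java sketch that pulls the context root out of a URL and builds an APPNAME-style value from it.  Two things here are assumptions on my part: that the context root is a single leading path segment (it can be configured otherwise), and that the concatenation uses no separator beyond the context root's own leading slash.  The class and method names are made up, and the exact stored format should be checked against a real SESSIONS table.

```java
import java.net.URI;

// Illustrative helper (hypothetical names); assumes a single-segment context root.
class ContextRootSketch {

    // Returns the first path segment with its leading slash, e.g. "/tester1".
    // Falls back to "/" when the URL has no path.
    static String contextRoot(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty() || path.equals("/")) {
            return "/";
        }
        int next = path.indexOf('/', 1);
        return next < 0 ? path : path.substring(0, next);
    }

    // Concatenates the Virtual Host name and the context root.
    // The exact format Websphere stores is an assumption here.
    static String appName(String virtualHost, String contextRoot) {
        return virtualHost + contextRoot;
    }
}
```

For example, contextRoot("http://host:9080/tester1/index.jsp") returns "/tester1", and appName("default_host", "/tester1") returns "default_host/tester1".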

As long as the assumptions above hold and the JSESSIONID cookie value remains unmodified between the old container and the new one (which it should), the secondary cell should be able to read a user's HTTP Session information from the shared database session persistence store.  This achieves our goal of having HTTP Session information available to containers in our secondary cell.
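
The reason this works is that nothing identifying the cell or cluster is part of the lookup.  The following toy Java model illustrates the idea under that assumption: a shared map stands in for the SESSIONS table, keyed only by the three columns discussed above.  None of this is Websphere code, and the class names are hypothetical.

```java
import java.util.Map;
import java.util.Objects;

// Toy model of the shared-table lookup; not Websphere code.
class SharedSessionStoreSketch {

    // Stand-in for one row key in the SESSIONS table: (ID, PROPID, APPNAME).
    static final class RowKey {
        final String id;
        final String propId;
        final String appName;

        RowKey(String id, String propId, String appName) {
            this.id = id;
            this.propId = propId;
            this.appName = appName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RowKey)) return false;
            RowKey other = (RowKey) o;
            return id.equals(other.id)
                    && propId.equals(other.propId)
                    && appName.equals(other.appName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, propId, appName);
        }
    }

    // Stand-in for a container; note the cell name never enters the row key.
    static final class Container {
        final String cellName;
        final String appName;
        final Map<RowKey, String> sharedTable;

        Container(String cellName, String virtualHost, String contextRoot,
                  Map<RowKey, String> sharedTable) {
            this.cellName = cellName;
            this.appName = virtualHost + contextRoot;
            this.sharedTable = sharedTable;
        }

        void put(String sessionId, String name, String value) {
            sharedTable.put(new RowKey(sessionId, name, appName), value);
        }

        // Returns null when no row matches this container's key.
        String get(String sessionId, String name) {
            return sharedTable.get(new RowKey(sessionId, name, appName));
        }
    }
}
```

If a container in one cell stores property1 for a session, a container in another cell built with the same Virtual Host and context root reads it back from the same map, while a container built with a different Virtual Host gets null for the same session ID.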


The same idea could be used with containers in another cluster in the same cell--the same Virtual Host needs to be used and the EAR deployed to both clusters.

References:

[1]Managing HTTP Sessions

[2]SESSIONS Table Definition

[3]Configuring a JDBC Provider

[4]Configuring a DataSource

[5]Configuring for database session Persistence

[6]http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/topic/com.ibm.websphere.zseries.doc/info/zseries/ae/rprs_sest.html

[7]http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp

©2008 www.thinkmiddleware.com

All copyrights & trademarks belong to their respective owners.

The comments and opinions herein are that of the author.

Last Modified: Sunday, 09-Nov-2008 10:48:28 MST