Saturday, October 6, 2012

Web Session Synchronization patterns in AWS

Statelessness is an important property that every application needs to possess in order to achieve scalability and availability in the AWS cloud. A web application usually maintains user web sessions on the server side to authorize users at every request. In some designs, sessions are also used to map user requests to user data. When developing a new PHP/Java application, a common approach is to store session data in memory. Unfortunately, this approach doesn't scale well for large-scale distributed systems. The moment your application scales out beyond a single web/app server, the session state must be shared between multiple web/app servers. The common solution is to store session state in a separate data store and share it with all the web/app servers. Though this approach is a little more complex than in-memory sessions, it helps you avoid a single point of failure and achieve scalability.

Let us explore some popular patterns for web session synchronization in the AWS cloud.

Pattern 1: Synchronization using JGroups

Since AWS infrastructure currently does not support multicast, the application layer software has to synchronize session data between instances directly using unicast TCP. Java-based application servers can use JGroups in unicast TCP mode to synchronize session data with the other Java web/app EC2 instances. This pattern will suffice if there are ~5-10 EC2 web/app servers in the infrastructure. Also, since this pattern follows a distributed approach to web session synchronization, there is no SPOF. On the other hand, if your web application layer has a few hundred EC2 instances, unicast TCP will be inefficient for the following reasons: there will be a lot of sync traffic among the Java web/app EC2 instances, the EC2 instances will spend a considerable amount of CPU cycles on synchronization rather than serving web requests, and the memory of the web/app EC2 instances will be filled with redundant session data.
Note: vCider and VPNCubed appear to support multicast on AWS. vCider has been acquired by Cisco and has currently stopped new downloads.
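
To put rough numbers on the scaling concern in this pattern, here is a back-of-the-envelope sketch in Python. It assumes full replication (every session update is pushed to every other node over unicast) and uses hypothetical figures for session count and size. It does not model JGroups itself.

```python
def full_replication_cost(nodes, sessions_per_node, session_kb):
    """Estimate the cost of replicating every session to every node over unicast.

    Assumption: each node pushes each session update to all other nodes.
    """
    # One unicast message per peer for each session update.
    messages_per_update = nodes - 1
    # Every node ends up holding the sessions of the whole cluster.
    total_sessions = nodes * sessions_per_node
    memory_per_node_mb = total_sessions * session_kb / 1024
    return messages_per_update, memory_per_node_mb


# Hypothetical workload: 1000 sessions created per node, 4 KB per session.
for n in (5, 10, 200):
    msgs, mem = full_replication_cost(n, sessions_per_node=1000, session_kb=4)
    print(f"{n} nodes: {msgs} messages per update, {mem:.0f} MB of sessions per node")
```

Both the number of messages per update and the session memory held on each node grow linearly with the node count, which is why this pattern suits small farms better than large ones.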

Pattern 2: Synchronization using Terracotta Web Sessions

Terracotta (TC) Web Sessions can be used to synchronize session data in Java-based applications. The product comes with a client plugin (the Terracotta driver) that needs to be attached to the web/app JVM on the EC2 instance, and a Terracotta server that can be installed on a separate EC2 instance. The web session data is optimally serialized over TCP from the Java application server to the centralized Terracotta EC2 server instance. In case a web request earlier served by App EC2 A reaches App EC2 B, because of the round-robin algorithm or a failure of App EC2 A, the latter will check with the centralized TC EC2 server instance for session authenticity and proceed with processing the request without causing problems for the end user. This way the application servers can be designed to behave statelessly and pick up any web request from any load balancer. In addition to web sessions, Terracotta can be extended with its other product lines, which can help us store much more information on the TC server EC2 instance efficiently. The Terracotta server EC2 instances can also be set up in HA mode to avoid a SPOF in your application. Terracotta also follows an optimized serialization protocol whereby only the changed piece of data is sent over the wire and not the entire serialized content. This saves a lot of data flowing over the Amazon EC2 network and proves very handy and effective in scalable deployments. In our experience, Terracotta Web Sessions are mainly suitable for Java-based application servers. It is better to run the Terracotta EC2 server instance on the High I/O SSD EC2 instance type or other larger-memory EC2 instance types in the AWS cloud, because the TC server performs really well when it is given lots of memory and network bandwidth. A TC server on high-memory EC2 instances can use ephemeral storage with TC HA mode, or RAID 0 + EBS-optimized + PIOPS, as the underlying storage on AWS for better performance.
We have observed that this centralized architecture pattern using a TC server is suitable for app server farms of ~20-30 servers with heavy traffic in AWS. It is surely not suitable for large web applications running a couple of hundred app servers in their web/app farms. Since the numbers mentioned depend primarily on connections/sec, volume of data and the nature of application traffic, I would strongly suggest that teams using this model run their own benchmarks before concluding.
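
The idea of sending only the changed piece of session data can be illustrated with a small Python sketch. This is not Terracotta's actual wire protocol. The function names and the session layout are hypothetical, and the sketch only shows why a delta is smaller than the full session.

```python
_MISSING = object()  # sentinel so that a stored value of None is handled correctly


def session_delta(old, new):
    """Return only the attributes that differ between two session snapshots."""
    changed = {k: v for k, v in new.items() if old.get(k, _MISSING) != v}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}


def apply_delta(session, delta):
    """Apply a delta on the receiving side (the central server in this pattern)."""
    updated = dict(session)
    updated.update(delta["changed"])
    for key in delta["removed"]:
        updated.pop(key, None)
    return updated


before = {"user": "alice", "cart": [1, 2], "theme": "dark"}
after = {"user": "alice", "cart": [1, 2, 3]}

delta = session_delta(before, after)
print(delta)  # only the cart change and the removed key travel over the wire
print(apply_delta(before, delta) == after)
```

The unchanged "user" attribute never leaves the app server, which is the saving the pattern relies on when sessions are large and updates are small.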

Pattern 3: Synchronization using Distributed Memcached / Amazon ElastiCache

This is one of the most widely used architecture patterns for web session synchronization in highly scalable applications. Applications written in Java, .NET, PHP, Python etc. can use this model to synchronize their sessions. Web sessions from the application layer are synchronized (PUT) to the distributed Memcached / Amazon ElastiCache nodes using a Memcached client (for example spymemcached for Java, and the respective clients for other languages). Web/app EC2 instances can GET the web session from Memcached and authorize the request. Though the session data is distributed, the Memcached client has the logic to locate and get data from the correct Memcached EC2 server. Clients can use TCP or UDP, in binary or text (ASCII) format, to communicate with the Memcached server.
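
As a minimal sketch of how a client can locate the correct node for a session, the Python below hashes the session ID and maps it to one of the configured nodes. Plain dictionaries stand in for Memcached servers, and the class and node names are hypothetical. Simple modulo hashing is an assumption made for brevity; real clients often use consistent hashing so that adding or removing a node remaps fewer keys.

```python
import hashlib


class ShardedSessionCache:
    """Client-side sharding: the client, not the servers, decides where a key lives."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Each dict stands in for one Memcached / ElastiCache node.
        self.nodes = {name: {} for name in self.node_names}

    def node_for(self, session_id):
        # Hash the session ID and map it onto the node list.
        digest = hashlib.md5(session_id.encode("utf-8")).hexdigest()
        return self.node_names[int(digest, 16) % len(self.node_names)]

    def put(self, session_id, data):
        self.nodes[self.node_for(session_id)][session_id] = data

    def get(self, session_id):
        return self.nodes[self.node_for(session_id)].get(session_id)


cache = ShardedSessionCache(["cache-a", "cache-b", "cache-c"])
cache.put("sess-1001", {"user": "alice"})
print(cache.node_for("sess-1001"), cache.get("sess-1001"))
```

Because every web/app instance runs the same hashing logic over the same node list, any instance can GET a session that another instance PUT.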
Amazon ElastiCache is AWS's managed Memcached offering, with additional features and control. Memcached can be installed on any Amazon EC2 instance, whereas ElastiCache is an Amazon service that exists in a separate tier of your tech stack. It can be accessed through the same Memcached protocol over TCP. As a best practice, multiple Memcached / ElastiCache nodes should be created in multiple AZs in AWS for storing sessions and other user data. Since the session data is distributed across multiple cache nodes, the failure of a single Memcached node will affect only a fraction of the sessions and will not impact the overall user experience much. In case session data redundancy is needed, you can write the session data to multiple cache nodes (2 or more PUTs) and read it from a single node. This offers better HA on the distributed cache layer for cache-dependent sites, but comes with a small latency hit during the PUT operation. Memcached performs better on High Memory instances with better I/O in AWS. Depending upon the cache infrastructure architecture, we can either keep a separate layer of high-memory EC2 instances for the Memcached / ElastiCache tier or share the unused memory of web/app/other EC2 servers using Memcached installations on those instances. Both work well for Memcached-dependent applications. Since session authorization is an important step, TCP with binary or text mode is recommended for session synchronization over the Memcached protocol. We have seen that this architecture can easily handle tens of thousands of concurrent requests/sec in highly scalable applications.
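
The redundant-write idea (2 or more PUTs, read from a single node) can be sketched in Python as follows. Dictionaries stand in for cache nodes, `None` marks a failed node, and the function names and replica placement rule are assumptions made for illustration rather than the behavior of any particular Memcached client.

```python
import hashlib


def replica_nodes(session_id, node_names, copies=2):
    """Pick `copies` distinct nodes for a session, starting from its hashed position."""
    digest = hashlib.md5(session_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(node_names)
    copies = min(copies, len(node_names))
    return [node_names[(start + i) % len(node_names)] for i in range(copies)]


def put_session(nodes, session_id, data, copies=2):
    # Write to every replica that is currently reachable (the extra PUTs add latency).
    for name in replica_nodes(session_id, sorted(nodes), copies):
        if nodes[name] is not None:
            nodes[name][session_id] = data


def get_session(nodes, session_id, copies=2):
    # Read from the first reachable replica that has the session.
    for name in replica_nodes(session_id, sorted(nodes), copies):
        node = nodes[name]
        if node is not None and session_id in node:
            return node[session_id]
    return None


nodes = {"cache-a": {}, "cache-b": {}, "cache-c": {}}
put_session(nodes, "sess-1001", {"user": "alice"})

# Simulate the failure of the first replica; the session is still readable.
primary = replica_nodes("sess-1001", sorted(nodes))[0]
nodes[primary] = None
print(get_session(nodes, "sess-1001"))
```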

Pattern 4: Synchronization with RDS Master (MySQL, Oracle, SQL Server)

This is one of the most used architecture patterns for web session synchronization in entry-level or poorly designed applications. Web sessions from the application layer are synchronized in the centralized database master. Though applications written in Java, .NET, PHP, Python etc. can use this model to synchronize their sessions, it is not recommended for applications that must handle heavy traffic and scale. The database master will be unnecessarily pounded with session requests and will be too overloaded to perform normal transactions/queries. It is strongly recommended to offload this traffic from the master using the other patterns suggested in this article. Example: the m1.large RDS MySQL instance type comes with a default max connections of 600. If session handling for 250 concurrent users takes up 250 connections, only 350 connections remain for everything else to perform transactions. In case your DB is overloaded, you can observe a heavy drop in transaction throughput as well as in session authorization requests in this architecture.
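
The connection arithmetic in the example can be worked through in a few lines of Python. The sketch assumes, as the example does, that each concurrent user's session handling holds one database connection; the function name is hypothetical.

```python
def connections_left(max_connections, session_connections):
    """Connections remaining for regular transactions after session traffic."""
    return max(max_connections - session_connections, 0)


MAX_CONNECTIONS = 600  # default quoted in the text for an m1.large RDS MySQL instance

for users in (250, 400, 600):
    left = connections_left(MAX_CONNECTIONS, users)
    print(f"{users} session connections leave {left} connections for transactions")
```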

Pattern 5: Synchronization with RDS Master + Multiple Read Replicas

This is an extended version of Pattern 4. The web session is created and written in the RDS Master, and subsequent session reads are done from the Read Replicas. In a web application, a session is accessed multiple times during its lifetime, and this model offloads that read traffic from the RDS Master. One core point to understand in this pattern is that the RDS Master replicates data to the Read Replicas asynchronously, so in case your DB is pounded with heavy queries/transactions there could be a replication lag between the Master and the Read Replicas. This means that when you create a session and immediately try to read it from a Read Replica, it may not have been replicated there yet. This case has to be handled at the application code level. This pattern is suggested for applications that are designed entirely with the DB as the core component and cannot easily migrate to other patterns in the short term. In the long term, we recommend migrating to one of the other patterns in AWS.
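
One way to handle the replication-lag case in application code is to fall back to the master when a read replica does not have the session yet. The Python sketch below simulates this with dictionaries standing in for the RDS Master and a Read Replica. The class and method names are hypothetical, and falling back to the master is only one possible strategy (a short retry against the replica is another).

```python
class ReplicatedSessionStore:
    """Writes go to the master; reads prefer the replica and fall back on a miss."""

    def __init__(self):
        self.master = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica (replication lag)

    def create(self, session_id, data):
        self.master[session_id] = data
        self._pending.append((session_id, data))

    def replicate(self):
        """Simulate asynchronous replication catching up."""
        for session_id, data in self._pending:
            self.replica[session_id] = data
        self._pending.clear()

    def read(self, session_id):
        data = self.replica.get(session_id)
        if data is None:
            # Replica miss: the session may not have replicated yet, so ask the master.
            data = self.master.get(session_id)
        return data


store = ReplicatedSessionStore()
store.create("sess-1001", {"user": "alice"})
print(store.read("sess-1001"))  # served by the master while the replica lags
store.replicate()
print(store.read("sess-1001"))  # now served by the replica
```

The fallback keeps the first request after login from failing, at the cost of sending some reads back to the master while the replica lags.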

Pattern 6: Synchronization with Amazon DynamoDB

Amazon DynamoDB is a NoSQL database that can handle massive concurrent reads and writes. In the Amazon DynamoDB console you can configure how many reads/writes you want per second, and Amazon DynamoDB will provision the required infrastructure at the backend accordingly. The infrastructure provisioning operations that happen in the backend are completely abstracted from the end user. The biggest advantage of Amazon DynamoDB is its predictable performance and low latency with seamless scalability. Internally, all data items are stored on Solid State Drives (SSDs) and are automatically replicated across three Availability Zones in a Region to provide built-in high availability and data durability. After the launch of DynamoDB, we have architected solutions for some customers using it as a web session data store. It was used for highly scalable applications that need to handle tens of thousands of concurrent requests/sec with persistent session data and specific cache data requirements. Since web session data is usually in the range of a few bytes to a few KB, we were easily able to accommodate it in the DynamoDB structure. It is recommended to use "Strongly Consistent Reads" while transacting with session data in DynamoDB. Amazon DynamoDB also provides APIs and session state extensions for a variety of languages like Java, .NET etc.
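
As a rough illustration of a strongly consistent session read, the Python sketch below builds the parameters for a DynamoDB GetItem request with `ConsistentRead` set to true. The table name `web_sessions` and the key attribute `session_id` are hypothetical. The sketch only builds the request and does not call AWS.

```python
def build_get_session_request(session_id, table_name="web_sessions"):
    """Build GetItem parameters for reading one session with a strongly consistent read."""
    return {
        "TableName": table_name,                   # hypothetical table name
        "Key": {"session_id": {"S": session_id}},  # hypothetical string hash key
        "ConsistentRead": True,                    # request a strongly consistent read
    }


request = build_get_session_request("sess-1001")
print(request)
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as `boto3.client("dynamodb").get_item(**request)`. Without `ConsistentRead`, GetItem defaults to an eventually consistent read, which might not return a session that was written a moment ago.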

Related Articles
Part 1: Understanding Amazon ElastiCache Internals : Connection overhead
Part 2: Understanding Amazon ElastiCache Internals : Elasticity Implication and Solutions
Part 3: Understanding Amazon ElastiCache Internals : Auto Discovery
Part 4: Understanding Amazon ElastiCache Internals : Economics of Choosing Cache Node Type
Launching Amazon ElastiCache in 3 Easy Steps
Caching architectures using Memcached & Amazon ElastiCache

1 comment:

Raj Shekar said...

Good article Harish... A few more Microsoft specific session handling is also available viz. asp .net state service, SQL server, Couchbase (third party) etc.

