Imagine you have an online application with the following system components:
- 25+ Java-based web/app servers deployed on Amazon EC2 instances in Availability Zone A
- 5+ Amazon ElastiCache nodes in a single cluster in Availability Zone A
- Web/app servers use a Java-based memcached client (for example, the spymemcached client) to connect to the Amazon ElastiCache (Memcached) nodes
Now suppose you are planning an online sales promotion and decide to add a few ElastiCache nodes to the cluster to meet your growing memory and performance requirements. The Java-based memcached clients cannot immediately identify and recognize the new ElastiCache nodes. You would have to add the cache endpoints manually to the client configuration and restart the web/application server process (in this case, on 25+ web/app EC2 instances). This re-initialization can temporarily disrupt some services and can even cause downtime on some architectures.
Amazon ElastiCache recently introduced an Auto Discovery feature for the Java-based spymemcached client to eliminate this complexity. You can download this patch from GitHub. AWS plans to make this feature available for all popular memcached clients soon.
With the Amazon ElastiCache Auto Discovery feature, customer applications transparently adapt to the addition or deletion of cache nodes in a cache cluster, reacting quickly to cluster changes without downtime. Amazon ElastiCache clusters now include a unique Configuration Endpoint (a DNS record). This record contains the DNS names of each of the cache nodes that belong to the cluster. The Amazon ElastiCache service ensures that the Configuration Endpoint always points to at least one such “target” cache node. A query to the target cache node then returns the endpoints of all the nodes in the cluster. The AWS team has implemented this config command (which queries the target node) as an extension to the Memcached ASCII protocol. Since ElastiCache remains 100% Memcached-compatible, you can keep using your existing Memcached client libraries with new and existing clusters, but to take advantage of Auto Discovery you must use an Auto Discovery-capable client provided by AWS.
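To make the config command more concrete, here is a minimal Java sketch that parses the kind of reply a target node returns. It assumes the command is sent as "config get cluster" and that the reply carries a configuration version followed by space-separated hostname|ip|port entries; that layout is my understanding of the AWS documentation, not something stated above. The class name ClusterConfigParser, the sample hostnames, and the byte count in the sample are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of any AWS library) that parses a cluster config reply. */
final class ClusterConfigParser {

    /** One cache node as listed in the reply: hostname|ip|port. */
    static final class Node {
        final String host;
        final String ip;
        final int port;

        Node(String host, String ip, int port) {
            this.host = host;
            this.ip = ip;
            this.port = port;
        }
    }

    /** Parsed reply: the configuration version plus the node list. */
    static final class ClusterConfig {
        final long version;
        final List<Node> nodes;

        ClusterConfig(long version, List<Node> nodes) {
            this.version = version;
            this.nodes = nodes;
        }
    }

    /**
     * Assumed reply layout:
     *   CONFIG cluster 0 <byte-count>\r\n
     *   <version>\n
     *   host|ip|port host|ip|port ...\n
     *   \r\n
     *   END\r\n
     * The byte count is not validated in this sketch.
     */
    static ClusterConfig parse(String response) {
        String[] lines = response.split("\\r?\\n");
        if (lines.length < 3 || !lines[0].startsWith("CONFIG cluster")) {
            throw new IllegalArgumentException("Unexpected reply: " + response);
        }
        long version = Long.parseLong(lines[1].trim());
        List<Node> nodes = new ArrayList<>();
        for (String entry : lines[2].trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue; // tolerate an empty node line
            }
            String[] parts = entry.split("\\|");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Bad node entry: " + entry);
            }
            nodes.add(new Node(parts[0], parts[1], Integer.parseInt(parts[2])));
        }
        return new ClusterConfig(version, nodes);
    }

    public static void main(String[] args) {
        // Made-up sample reply; hostnames, IPs, and byte count are illustrative only.
        String sample = "CONFIG cluster 0 74\r\n3\n"
                + "node1.example.internal|10.0.0.1|11211 "
                + "node2.example.internal|10.0.0.2|11211\n\r\nEND\r\n";
        ClusterConfig config = parse(sample);
        for (Node node : config.nodes) {
            System.out.println(node.host + " -> " + node.ip + ":" + node.port);
        }
    }
}
```

An Auto Discovery-capable client does this work internally, so application code normally never needs to parse the reply itself; the sketch only illustrates what information travels over the wire.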
How Auto Discovery (AD) works
- First, the web/app AD client resolves the configuration endpoint's DNS name. Since the configuration endpoint maintains CNAME entries for all of the cache nodes in the cluster, the DNS name resolves to one of the nodes.
- Second, the web/app AD client connects to that node and requests the configuration information for all of the other nodes. Since each node maintains configuration information for every node in the cluster, any node can pass this information to the web/app AD client on request.
- Third, the web/app AD client receives the current list of cache node hostnames and IP addresses. The Auto Discovery library then connects asynchronously to all of the other nodes in the cache cluster.
- Since cache nodes can be added or removed, consistent hashing (ketama) is built into the Auto Discovery client. To learn why consistent hashing is needed, refer to this article: http://harish11g.blogspot.in/2012/12/amazon-elasticache-memcached-consistent.html
- Auto Discovery polls once per minute by default. Cache nodes are usually not added or removed within minutes or hours in production, so this frequency is more than enough for most production cases. Before shortening the polling interval (that is, polling more often), test how it affects the web/app process and its compute cycles.
- Since there can be a gap of ~60 seconds between a cache node's removal and its detection by Auto Discovery polling, design web/app clients with exception handling and the ability to fall back to the back-end data store.
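The fallback recommended in the last point above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not how the ElastiCache client itself behaves: FallbackReader and its nested Cache interface are hypothetical names, and in a real application Cache would wrap whichever memcached client you use.

```java
import java.util.function.Function;

/** Hypothetical read-through helper that survives cache node failures. */
final class FallbackReader {

    /** Hypothetical cache abstraction; a real app would wrap its memcached client here. */
    interface Cache {
        String get(String key) throws Exception;
        void set(String key, String value) throws Exception;
    }

    private final Cache cache;
    private final Function<String, String> backingStore;

    FallbackReader(Cache cache, Function<String, String> backingStore) {
        this.cache = cache;
        this.backingStore = backingStore;
    }

    /** Returns the cached value if available, otherwise loads it from the backing store. */
    String read(String key) {
        try {
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
        } catch (Exception e) {
            // The node may have been removed but not yet noticed by polling:
            // treat the failure as a cache miss instead of failing the request.
        }
        String value = backingStore.apply(key);
        if (value != null) {
            try {
                cache.set(key, value); // best-effort repopulation
            } catch (Exception ignored) {
                // The cache is still unavailable; the caller gets the value anyway.
            }
        }
        return value;
    }
}
```

Treating a cache error as a miss keeps requests working during the detection gap, at the cost of extra load on the back-end data store, so it is worth checking that the store can absorb that load for about a minute.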
Benefits of using Auto Discovery
- It avoids manual intervention and downtime.
- When you scale up a cache cluster, the new nodes register themselves with the configuration endpoint and with all of the other nodes. When you scale down the cache cluster, the departing nodes de-register themselves. In both cases, all of the other nodes in the cluster are updated with the latest cache node metadata.
- Client programs poll the cluster at an adjustable interval (the default is once per minute). If the cluster configuration has changed, for example because nodes were added or deleted, the client receives an updated list of metadata and acts accordingly by connecting to the new nodes or disconnecting from the removed ones.
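The polling behaviour described in the last point can be sketched in Java as follows. This is an illustration only, assuming the node list arrives as a set of endpoint strings; ClusterPoller, its Diff type, and the fetchNodes supplier are hypothetical names and do not reflect the internals of the AWS client.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical poller that reports which nodes joined or left the cluster. */
final class ClusterPoller {

    /** Nodes to connect to (added) and to disconnect from (removed). */
    static final class Diff {
        final Set<String> added;
        final Set<String> removed;

        Diff(Set<String> added, Set<String> removed) {
            this.added = added;
            this.removed = removed;
        }

        boolean isEmpty() {
            return added.isEmpty() && removed.isEmpty();
        }
    }

    /** Compares the node list we already know with the list just fetched. */
    static Diff diff(Set<String> known, Set<String> latest) {
        Set<String> added = new LinkedHashSet<>(latest);
        added.removeAll(known);
        Set<String> removed = new LinkedHashSet<>(known);
        removed.removeAll(latest);
        return new Diff(added, removed);
    }

    /** Polls fetchNodes at a fixed delay and calls onChange only when the list changes. */
    static ScheduledExecutorService start(Supplier<Set<String>> fetchNodes,
                                          Consumer<Diff> onChange,
                                          long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Only touched from the single scheduler thread, so no extra locking is needed.
        final Set<String> known = new LinkedHashSet<>();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                Set<String> latest = fetchNodes.get();
                Diff change = diff(known, latest);
                if (!change.isEmpty()) {
                    known.clear();
                    known.addAll(latest);
                    onChange.accept(change);
                }
            } catch (RuntimeException e) {
                // Swallow and retry on the next tick; an uncaught exception
                // would cancel all further scheduled runs.
            }
        }, 0, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A caller would pass 60 as intervalSeconds to mirror the default interval, open connections for change.added and close them for change.removed inside onChange, and call shutdown() on the returned scheduler when the application stops.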
Getting started with Auto Discovery:
To get started, download the Amazon ElastiCache Cluster Client by clicking the “Download ElastiCache Cluster Client” link on the Amazon ElastiCache console. Before you can download the client, you must have an Amazon ElastiCache account; if you do not already have one, you can sign up from the Amazon ElastiCache detail page. After you download the client, you can begin setting up and activating your Amazon ElastiCache cluster from the Amazon ElastiCache console. More details can be found here.