Wednesday, December 26, 2012

Part 4: Understanding Amazon ElastiCache Internals: Economics of Choosing Cache Node Type


Amazon ElastiCache offers a variety of node types, and users often end up choosing the cheapest node type and scaling out whenever cache memory needs increase. But “cheap may not always be the best.” The cheapest node type may not be optimal for your requirements in terms of performance and cost, and as a customer you may end up paying more because of a bad choice.
In this article I take a sample cache memory requirement and a set of application characteristics and, based on them, explore the economics of choosing a sensible cache node type in Amazon ElastiCache.
The accompanying table illustrates the sample cost of running Amazon ElastiCache for a cache memory size of 100 GB. Multiply this cost by roughly 10 if your cache memory requirement is in the 1 TB range. The table lists the cost of building a 100 GB Amazon ElastiCache cluster from each of the cache node types.
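The arithmetic behind such a comparison can be sketched in a few lines. This is only an illustration: the node names, memory sizes, and hourly prices below are hypothetical placeholders (not AWS list prices), and `HOURS_PER_MONTH` is an assumed average.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def cluster_cost(required_gb, node_memory_gb, price_per_hour):
    """Return (node_count, monthly_cost) for a cluster that holds required_gb."""
    # Round up: a partially used node still has to be paid for in full.
    nodes = math.ceil(required_gb / node_memory_gb)
    return nodes, nodes * price_per_hour * HOURS_PER_MONTH


# Hypothetical node types: (memory in GB, price per hour in USD).
# Replace these with the real figures for the node types you are comparing.
example_nodes = {
    "small-node": (1.3, 0.10),
    "large-node": (30.0, 1.50),
}

for name, (memory_gb, price) in example_nodes.items():
    count, monthly = cluster_cost(100, memory_gb, price)
    print(f"{name}: {count} nodes, ${monthly:,.2f} per month")
```

With these placeholder numbers the node with the lower hourly price ends up as the more expensive cluster, because many more nodes are needed to reach 100 GB.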



Category 1 Cache Node types for Heavy Utilization
Network IO: These node types come with high network IO capacity. They suit cache-dependent sites with heavy cache utilization in terms of requests and data transfer.
Compute utilization: Amazon ElastiCache internally uses the memcached engine. Memcached is built on the libevent event-driven model with multiple worker threads, which allows it to scale across multiple cores and effectively means better throughput. For applications that need higher concurrency and request throughput, it is recommended to plan capacity around cache node types with multiple cores. The cache nodes under “category-1” have 4-8 cores, which makes them well suited to applications that use the cache heavily.
Elasticity: Any growing web application has elastic caching needs. Cache nodes are added or removed on a daily, weekly, or monthly basis depending on demand. This activity affects object remapping, node discovery, and the cache warming process. The number of existing cache nodes, the proportion of nodes added or removed, the cache node type, and the frequency of addition or removal are some of the parameters that need to be carefully considered and planned for. Consistent hashing and Auto Discovery are among the techniques that should be adopted to minimize the complexity of this action. In general, it is better to build a strategy combining the points below:
  • Select a proper base node type based on current and projected needs, and consolidate nodes as the cache memory size grows
  • Distribute the data across multiple nodes (more is better than few)
  • Add capacity in smaller proportions, with better cache warming procedures
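To make the remapping point concrete, here is a minimal consistent-hashing sketch. It is not the algorithm of any particular memcached client: the `HashRing` class, the `replicas` count, and the `node-N` and `key-N` names are all assumptions made for illustration. It measures what fraction of keys change owner when one node is added to a small cluster versus a larger one.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual points per node (illustrative)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed on the ring at `replicas` pseudo-random positions.
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def lookup(self, key):
        # The owner is the first ring position after the key's hash (wrapping).
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def moved_fraction(node_count, keys):
    """Fraction of keys whose owner changes when one node is added."""
    before = HashRing([f"node-{i}" for i in range(node_count)])
    after = HashRing([f"node-{i}" for i in range(node_count + 1)])
    moved = sum(1 for k in keys if before.lookup(k) != after.lookup(k))
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10000)]
for n in (2, 15):
    print(f"{n} -> {n + 1} nodes: {moved_fraction(n, keys):.1%} of keys remapped")
```

With consistent hashing, only the keys that land on the new node move, so the remapped share is roughly 1/(n+1). Adding a node to a small cluster therefore invalidates a much larger slice of the cache than adding one to a larger cluster, which is why adding in smaller proportions keeps cache warming manageable.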


Category 2 Cache Node types for Moderate Utilization
Network IO: Same as Category 1: high IO, though the capacity depends on (and is regulated by) the node size.
Compute utilization: Suitable for moderate workloads. Since the cache node types in this category have only 2 cores, they are not suitable for heavily concurrent, compute-driven cache access workloads.
Elasticity: All points discussed under Category 1 apply here.

Category 3 Cache Node types for Low Utilization
Most of them are costlier, some of them perform poorly (network IO and compute), and they offer less value for money for larger cache memory requirements.

Economics of choosing Cache Node Types:
From the table we can observe the following points about the economics of choosing a cache node type:
  • For a cache memory requirement of 100 GB, cache.m3.2xlarge, cache.m1.xlarge, cache.m2.2xlarge, and cache.m2.4xlarge are good candidates. Amazon ElastiCache clusters built with these node types are a very good package and offer better overall value in terms of elasticity, proportions, price, and performance.
  • Smaller cache node types (cache.m1.small) with a lower per-hour price are not necessarily cost efficient overall once your cache memory needs grow (~100 GB or more).
  • The High-CPU node type (cache.c1.xlarge) will be the costliest if your cache memory needs grow into the hundreds of GB or the TB range. This cache node type is not a good candidate for most cache-heavy (in size) use cases. The only use case I can think of for cache.c1.xlarge is a cache memory requirement of ~32 GB that is not elastic and will be heavily utilized. Anything above 32 GB of cache memory will bleed money if you use this node type.
  • Moderate IO cache node types (like cache.m3.xlarge, cache.m1.medium, cache.m1.small) are costlier and less efficient than their High IO peers. If your utilization is heavy and your cache cluster is composed of Moderate IO cache node types, you end up paying more.
  • Choosing the right cache node type for your cache memory size matters. Planning the node type according to your current and future requirements helps you save costs. I also recommend consolidating your cache node types periodically until you reach a size of a few hundred GB. Beyond 500 GB, building your cache clusters with High Memory cache node types is usually cost effective. Example: for a 100 GB cache, 2 x cache.m2.4xlarge are sufficient, but choosing this node type for this cache memory size is not ideal because of the distribution and churn problems (in case you are going to expand the memory size frequently). On the other hand, if the cache memory size is, or will grow to, 1 TB, then building the cache cluster from 15 x cache.m2.4xlarge makes a lot of sense in terms of proportion and cost.
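The proportion argument in the last point can be checked with a quick calculation. This is a rough sketch: `NODE_MEMORY_GB = 68` is my assumption for the approximate memory of one cache.m2.4xlarge node, and the `plan` helper is purely illustrative.

```python
import math

# Assumption: approximate memory of one cache.m2.4xlarge node, in GB.
NODE_MEMORY_GB = 68


def plan(required_gb, node_memory_gb=NODE_MEMORY_GB):
    """Return (node_count, share_per_node) for the required cache size."""
    nodes = math.ceil(required_gb / node_memory_gb)
    # Share of the cached data held by a single node, assuming even distribution.
    share_per_node = 1 / nodes
    return nodes, share_per_node


for size_gb in (100, 1000):
    nodes, share = plan(size_gb)
    print(f"{size_gb} GB: {nodes} nodes, each holding about {share:.0%} of the cache")
```

Under this assumption the 100 GB case needs 2 nodes and the 1 TB case needs 15, matching the counts quoted above. With 2 nodes, adding, removing, or losing a single node touches about half of the cached data; with 15 nodes the same event touches only a small slice.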

Choosing the Cache Node types: 

Category 1 Cache node types: Heavy Utilization nodes
Application characteristics:
  • Cache dependent application with heavy cache data usage
  • Distributed Cache size: 100 GB -> 1 TB & above
  • Varied message sizes
  • High Concurrency of cache requests
  • Cache nodes will be added or removed on daily/weekly/monthly basis
  • Better performance and value for money
Cache node types:
  • cache.m3.2xlarge
  • cache.m1.xlarge
  • cache.m2.2xlarge
  • cache.m2.4xlarge

Category 2 Cache node types: Moderate Utilization nodes
Application characteristics:
  • Moderate Cache dependency & usage
  • Distributed Cache size: 50-100 GB
  • Varied message sizes
  • Moderate concurrency requirements
  • Not very Elastic needs
  • Moderate performance and cost efficient
Cache node types:
  • cache.m2.xlarge
  • cache.m1.large

Category 3 Cache node types: Low Utilization nodes
Application characteristics:
  • Low Cache dependency & usage
  • Distributed Cache size: ~50 GB and above
  • Small message sizes
  • Low concurrency requirements
  • Not very Elastic needs
  • Moderate performance and I am ready for $$$ leakage when memory needs grow
Cache node types:
  • cache.m3.xlarge
  • cache.m1.medium
  • cache.m1.small
  • cache.c1.xlarge
