Distributed Key-Value Cache Design

In this entry we will discuss the design of a distributed key-value cache. The application will use this cache as intermediate storage: on a cache miss, data is read from the database into the cache and from the cache to the application. The application is not supposed to hit the database directly for reads.

There are three kinds of caching strategies.
  1. Write-through cache: writes go through the cache and succeed only when the write to BOTH the cache and the database succeeds. This is useful for applications that write and quickly re-read information, but write latency is higher because every write hits two separate systems.
  2. Write-around cache: writes go directly to the database; the cache reads from the database on a cache miss. This keeps write load off the cache and makes writes faster, but it is a poor fit for applications that write data and re-read it quickly.
  3. Write-back cache: writes go directly to the cache, and the write is confirmed as soon as the cache write completes. The cache asynchronously syncs the written data to the database. This gives low write latency and high throughput, but there is a risk of losing data if the caching layer dies. Replicating the cache data across multiple cache nodes can reduce this risk.
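To make the first strategy concrete, here is a minimal sketch of a write-through cache. The in-memory `cache` and `database` maps are stand-ins of our own; a real system would talk to a cache server and a database over the network.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a write-through cache (names and stand-in stores are illustrative).
class WriteThroughCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>();

    // A write succeeds only when BOTH the cache and the database accept it.
    void put(String key, String value) {
        cache.put(key, value);
        database.put(key, value);   // in practice this is the slow, remote write
    }

    // Reads are served from the cache; on a miss, fall back to the database
    // and populate the cache (the read path described above).
    String get(String key) {
        String v = cache.get(key);
        if (v == null) {
            v = database.get(key);
            if (v != null) cache.put(key, v);
        }
        return v;
    }
}
```

A write-around version would only change `put` (skip the cache write), and a write-back version would return after the cache write and flush to the database from a background thread.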

Before going into the details of the design, we will first list the features our system is going to support.

List of Features:

1. Amount of data to cache: 30 TB. Evict one or more entries when memory is not available.

2. Queries per second: 10 million/sec. A caching system should be inherently low latency, so we need to keep all cached data in main memory. A typical production machine has 72 GB or 144 GB of main memory. Minimum number of machines required for 30 TB of data = 30 TB / 72 GB ≈ 420 machines.

3. Important metrics: low latency, high availability, and eventual consistency.

Design Strategy:

Let's start with a single-machine caching system and then extend it to a distributed caching system. For a single-machine, single-threaded caching system, a HashMap and a doubly linked list are sufficient to implement a cache. The doubly linked list (DLL) stores the data, and the HashMap stores each key and the address of its node in the DLL as a key-value map. The most recently accessed entry is moved to the front of the DLL (so the least recently used entry falls to the back, ready for eviction), and its address in the HashMap is updated accordingly.
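The HashMap-plus-DLL scheme above can be sketched as follows (a single-threaded sketch; the class and method names are our own, not from this entry):

```java
import java.util.HashMap;
import java.util.Map;

// LRU cache: HashMap for O(1) key lookup, doubly linked list ordered
// from most recently used (head side) to least recently used (tail side).
class LruCache<K, V> {
    private static class Node<K, V> {
        K key; V value; Node<K, V> prev, next;
        Node(K k, V v) { key = k; value = v; }
    }

    private final int capacity;
    private final Map<K, Node<K, V>> map = new HashMap<>();
    private final Node<K, V> head = new Node<>(null, null); // sentinel, MRU side
    private final Node<K, V> tail = new Node<>(null, null); // sentinel, LRU side

    LruCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    V get(K key) {
        Node<K, V> n = map.get(key);
        if (n == null) return null;
        unlink(n);
        linkFirst(n);            // accessed entry moves to the front
        return n.value;
    }

    void put(K key, V value) {
        Node<K, V> n = map.get(key);
        if (n != null) {
            n.value = value;
            unlink(n);
        } else {
            if (map.size() == capacity) {    // evict least recently used
                Node<K, V> lru = tail.prev;
                unlink(lru);
                map.remove(lru.key);
            }
            n = new Node<>(key, value);
            map.put(key, n);
        }
        linkFirst(n);
    }

    private void unlink(Node<K, V> n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void linkFirst(Node<K, V> n) {
        n.next = head.next;
        n.prev = head;
        head.next.prev = n;
        head.next = n;
    }
}
```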

For a single-machine, multi-threaded system we can use the java.util.concurrent data structures (e.g. ConcurrentHashMap and ConcurrentLinkedDeque) for the hash map and deque.

In Java we can implement an LRU cache using LinkedHashMap by overriding its removeEldestEntry method:

  @Override
  protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
      return size() > this.capacity;
  }

Implement LRU cache using LinkedHashMap
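A complete, runnable version of that snippet could look like this (the class name and capacity handling are our own; `removeEldestEntry` and the access-order constructor are standard LinkedHashMap features):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache built on LinkedHashMap. Passing accessOrder = true makes
// the map reorder entries on get(), so the "eldest" entry is the
// least recently used one.
class LruLinkedHashMap extends LinkedHashMap<Integer, String> {
    private final int capacity;

    LruLinkedHashMap(int capacity) {
        super(16, 0.75f, true);   // true = access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
        return size() > this.capacity;   // evict once we exceed capacity
    }
}
```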

To cache 30 TB (internet-scale data) we cannot store all the data on one machine; we need to distribute it across multiple machines. To do that, we will shard (partition) the data and store the shards on different machines. To read about distributed sharding of data, please see the entry "Consistent hashing".
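As a preview of that entry, here is a minimal consistent-hashing sketch for mapping keys to cache machines. The virtual-node count and the use of `String.hashCode` are simplifying assumptions of ours; a production ring would use a stronger hash such as MD5 or MurmurHash.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of consistent hashing for sharding keys across cache machines.
// Each machine occupies several points ("virtual nodes") on a hash ring;
// a key is served by the first machine clockwise from the key's hash.
class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private static final int VIRTUAL_NODES = 100;

    void addMachine(String machine) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(machine + "#" + i), machine);
        }
    }

    void removeMachine(String machine) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(machine + "#" + i));
        }
    }

    String machineFor(String key) {
        // First ring position at or after the key's hash, wrapping around.
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        return s.hashCode() & 0x7fffffff;   // non-negative; weak, illustrative
    }
}
```

Because only the keys between a new machine and its ring neighbors move, adding or removing one of our ~420 machines remaps roughly 1/420th of the keys instead of reshuffling everything.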


Comments

  1. I was hyped to read this, and then the article stopped before getting into the distributed parts.

