You remember our exciting startup FADU bakery shop. We have been working with great Desi brain :P. Recently our engineering team organised a DESIGNathon. It was amazing. Let's talk about what we have designed.
"A shard scheme for KEY-VALUE storage system"
Lets list out the features that our design would supports.
1. Amount of data to be stored - 100TB. Data will keep growing with the rate of 1TB/day.
2.Machine storage capacity - 72GB RAM and 10TB Secondary storage. Bare Minimum machine needed to store 100TB data would be 10.
3. We will begin with 20 Machines and machines will be added in future.
4. All key value data will be independent and query will be based on key.
Observations:
The data grows at 1TB/day. This implies we are genrating data that would feel the storage of one machine(10TB) in 10 days. We try to keep storage utilization <=80. Under this assumption we would add new machine every 8 days.
One way for shard is that the data within the shard will fit on single machine only. In our case. data is growing with faster pace. In this case if we have fix number of shards, data within the shard will keep growing and exceed our machine capacity ie 10TB mark. So in our case shards will have to increase with time.
Sharding strategy :
Lets begin with fix number of shard S. For every key calculate the Hash value of the key using some hashing mechanism and store the data to shard corresponding to shard number H%S. Where H is the computed hash value of the data key and S is the number of shards. But in our case data is growing faster and we will have to increase number of shard as data grows. So in this case the new Shards count becomes S+1. The H%(S+1) changes for every key causing us to relocate each key data in our data store. Duh! this gonna be extremely expensive and undesirable :(.
Do we have better strategy to avoid this undesirable behavior ?? How we handle the case when a shard goes down and we need to redistribute the data on remaining shards???
Yes, Consistent Hashing will rescue us. So, What the heck is the consistent hashing?
Consistent hashing maps data to same shard, as far as possible. It means when a new shard is added, Its takes its share of object from all other shards and when it removed its objects are shared among remaining shard.
Consistent Hashing based strategy:
It also hash the object key but instead of geting the mapped value for each object each shard is assigned a range of hash values to store the objects. Now we will map each shard to range of hash values.
Range of hash value for each shard:( Below is just example, Hashvalue are generally bigger)
Mapping of objects to different shards.
Now if I add additional shard the only change, each shard will get a new range of Hash values. Any object whose hash value is within the range of current shard will remain in same shard. For objects whose hash value is not within the range of its current shard will be mapped to another shard. But the number of such objects are very few using consistent hashing algorithm compare to basic hashing algorithm.
Lets add another shard. The below diagram re-illustrate the point.
"A shard scheme for KEY-VALUE storage system"
Lets list out the features that our design would supports.
1. Amount of data to be stored - 100TB. Data will keep growing with the rate of 1TB/day.
2.Machine storage capacity - 72GB RAM and 10TB Secondary storage. Bare Minimum machine needed to store 100TB data would be 10.
3. We will begin with 20 Machines and machines will be added in future.
4. All key value data will be independent and query will be based on key.
Observations:
The data grows at 1TB/day. This implies we are genrating data that would feel the storage of one machine(10TB) in 10 days. We try to keep storage utilization <=80. Under this assumption we would add new machine every 8 days.
One way for shard is that the data within the shard will fit on single machine only. In our case. data is growing with faster pace. In this case if we have fix number of shards, data within the shard will keep growing and exceed our machine capacity ie 10TB mark. So in our case shards will have to increase with time.
Sharding strategy :
Lets begin with fix number of shard S. For every key calculate the Hash value of the key using some hashing mechanism and store the data to shard corresponding to shard number H%S. Where H is the computed hash value of the data key and S is the number of shards. But in our case data is growing faster and we will have to increase number of shard as data grows. So in this case the new Shards count becomes S+1. The H%(S+1) changes for every key causing us to relocate each key data in our data store. Duh! this gonna be extremely expensive and undesirable :(.
Do we have better strategy to avoid this undesirable behavior ?? How we handle the case when a shard goes down and we need to redistribute the data on remaining shards???
Yes, Consistent Hashing will rescue us. So, What the heck is the consistent hashing?
Consistent hashing maps data to same shard, as far as possible. It means when a new shard is added, Its takes its share of object from all other shards and when it removed its objects are shared among remaining shard.
Consistent Hashing based strategy:
It also hash the object key but instead of geting the mapped value for each object each shard is assigned a range of hash values to store the objects. Now we will map each shard to range of hash values.
Range of hash value for each shard:( Below is just example, Hashvalue are generally bigger)
Shard
|
Range of Hash Values
|
Shard 0
|
0000…~3fff…
|
Shard 1
|
3fff… ~7fff…
|
Shard 2
|
7fff…~bffd…
|
Shard 3
|
Bffd… ~ffff
|
Mapping of objects to different shards.
Object
|
Hash Value
|
Shard Mapped To
|
Music 1
|
4b4s…
|
Shard 1
|
Music 2
|
Ecb2…
|
Shard 3
|
Music 3
|
5082…
|
Shard 1
|
Image 1
|
B5e7…
|
Shard 2
|
Image 2
|
9433…
|
Shard 2
|
Image 3
|
1213…
|
Shard 0
|
Movie 1
|
69db…
|
Shard 1
|
Movie 2
|
C4ab…
|
Shard 3
|
Diagrammatically we can see this as follow:
Now if I add additional shard the only change, each shard will get a new range of Hash values. Any object whose hash value is within the range of current shard will remain in same shard. For objects whose hash value is not within the range of its current shard will be mapped to another shard. But the number of such objects are very few using consistent hashing algorithm compare to basic hashing algorithm.
Lets add another shard. The below diagram re-illustrate the point.
Notice How only Movie 2 and music 2 are mapped to new Shard4 and image 1 need to mapped to Shard 3. If we used basic hash mapping then we would most likely have to recalculate the mapping for all objects and remap them accordingly. Imagine how expensive that is for million of objects.
On my next entry, I will talk about some improvement over this design. Stay tune ... :)
No comments:
Post a Comment