aditi5
12/2/2019 - 5:02 AM

System Design

  1. Requirements for cloud storage:
    a.) Availability: Data is available anywhere, anytime, from any OS and platform, including smartphones.
    b.) Reliability and durability: Target 100% reliability and durability of data by keeping multiple copies of the data on servers in different geographic locations.
    c.) Scalability: Storage is effectively unlimited as long as you can pay for it.

Requirements and Goals of the System:

  1. Users should be able to upload/download their photos from any device.
  2. Users should be able to share their data with other users.
  3. Data sync between devices.
  4. Support for storage of large files.
  5. Maintain ACID properties.
  6. Support for offline CRUD operations (once the device comes back online, all changes should be synced to the user's other devices; devices that are still offline catch up when they reconnect).
  7. Support for snapshotting of data, so that a user can go back to any version at any time.
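
The snapshotting requirement can be pictured with a minimal sketch. It assumes the simplest possible model, where every save keeps a full copy of the content. The class `VersionedFile` and its methods are hypothetical names for illustration, not part of any real storage API.

```python
from __future__ import annotations

from typing import Optional


class VersionedFile:
    """Hypothetical file object that keeps every saved version."""

    def __init__(self) -> None:
        self._versions: list[bytes] = []

    def save(self, content: bytes) -> int:
        """Store a new snapshot and return its 1-based version number."""
        self._versions.append(content)
        return len(self._versions)

    def read(self, version: Optional[int] = None) -> bytes:
        """Return the given version, or the latest one if none is given."""
        if not self._versions:
            raise LookupError("no versions saved yet")
        if version is None:
            return self._versions[-1]
        if not 1 <= version <= len(self._versions):
            raise LookupError(f"unknown version {version}")
        return self._versions[version - 1]


photo = VersionedFile()
photo.save(b"first draft")
photo.save(b"second draft")
print(photo.read(1))  # the user goes back to the first version
```

Keeping full copies is wasteful for large files. A real design would more likely store each version as a list of chunk references.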

Design considerations:

  1. Huge read and write volumes.
  2. The read-to-write ratio is expected to be nearly the same. (Open question: why?)
  3. Files should be stored in small parts, or chunks. Benefit: if an upload fails, only the failed chunk has to be retried, not the whole file.
  4. The amount of data exchanged can be reduced by transferring only the updated chunks. (Needs clarification.)
  5. Duplicate chunks can be removed to save storage and bandwidth.
  6. Keeping a local copy of the metadata on the client side saves round trips to the storage server.
  7. For small changes, the client can upload only the diff.
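
The points about chunking, sending only the updated chunks, and removing duplicates can be sketched together. This is a rough illustration under stated assumptions: chunks have a fixed size, each chunk is identified by its SHA-256 hash, and `ChunkStore` is a hypothetical in-memory stand-in for the storage server. The 4-byte chunk size is deliberately tiny so the example is easy to follow.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for the demo


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Hypothetical stand-in for the storage server, keyed by chunk hash."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def upload(self, data: bytes) -> tuple[list[str], int]:
        """Store only the chunks the server does not have yet.

        Returns the file's ordered list of chunk hashes (its metadata)
        and the number of chunks that actually had to be transferred.
        """
        manifest: list[str] = []
        sent = 0
        for chunk in split_into_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # duplicate chunks are skipped
                self.chunks[digest] = chunk
                sent += 1
            manifest.append(digest)
        return manifest, sent

    def download(self, manifest: list[str]) -> bytes:
        """Rebuild a file from its ordered chunk hashes."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()
_, sent_first = store.upload(b"aaaabbbbcccc")   # new file: every chunk is sent
_, sent_second = store.upload(b"aaaaXXXXcccc")  # edit: only the changed chunk is sent
print(sent_first, sent_second)  # 3 1
```

With fixed-size chunks, inserting bytes in the middle of a file shifts every later chunk, so all of them look changed. The sketch only covers in-place edits.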

Capacity estimation and constraints: Let's assume:

  1. We have 500M total users and 100M daily active users (DAU).
  2. Each user connects from 3 different devices.
  3. Each user has 200 files/photos, which gives 500M × 200 = 100 billion files in total.
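
The arithmetic behind these assumptions can be checked with a few lines. The input numbers are the ones assumed above; the device count for active users is derived here for illustration and is not stated in the original notes.

```python
TOTAL_USERS = 500_000_000         # 500M total users
DAILY_ACTIVE_USERS = 100_000_000  # 100M DAU
DEVICES_PER_USER = 3
FILES_PER_USER = 200

# The total file count is based on all users, not only the active ones.
total_files = TOTAL_USERS * FILES_PER_USER

# Derived figure: devices that may need syncing on a given day.
active_devices = DAILY_ACTIVE_USERS * DEVICES_PER_USER

print(f"{total_files:,}")     # 100,000,000,000
print(f"{active_devices:,}")  # 300,000,000
```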
  • The CAP theorem states that it is impossible for a distributed software system to simultaneously provide more than two of the following three guarantees: Consistency, Availability, and Partition tolerance. When we design a distributed system, the trade-off among them is almost the first thing we want to consider. The three guarantees are:
  • Consistency: All nodes see the same data at the same time. Consistency is achieved by updating several nodes before allowing further reads.
  • Availability: Every request receives a response indicating success or failure. Availability is achieved by replicating the data across different servers.
  • Partition tolerance: The system continues to work despite message loss or partial failure. A system that is partition-tolerant can sustain any amount of network failure that doesn’t result in a failure of the entire network. Data is sufficiently replicated across combinations of nodes and networks to keep the system up through intermittent outages.