Tips for System Design Interview

Jan 16, 2023

Tips to crack the system design interview at a Tech Company.

If we are dealing with a read-heavy system, it’s good to consider using a Cache.

A read-heavy workload is one in which read requests far outnumber write requests. A write-heavy workload is the opposite: writes far outnumber reads.
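
To make the caching tip concrete, here is a minimal sketch of a small in-memory LRU cache sitting in front of a slow data source. The names `CachedReader`, `loader`, and the `database` dictionary are hypothetical and exist only for illustration; a real system would more likely use something like Redis or Memcached.

```python
from collections import OrderedDict


class CachedReader:
    """Tiny LRU cache in front of a slow loader (illustrative only)."""

    def __init__(self, loader, capacity=128):
        self.loader = loader          # function that reads from the backing store
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = self.loader(key)              # slow path: read from the store
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used key
        return value

    def invalidate(self, key):
        """Call this after a write so that stale data is not served."""
        self.entries.pop(key, None)


# A dictionary stands in for the database here (assumption).
database = {"user:1": "Ada", "user:2": "Linus"}
cache = CachedReader(loader=database.__getitem__, capacity=2)

for _ in range(5):
    cache.get("user:1")
print(cache.hits, cache.misses)  # 4 hits, 1 miss
```

The more reads there are per write, the more often a request is served from memory instead of the database, which is why a cache pays off in read-heavy systems.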

If we need low latency in the system, it’s good to consider using a Cache & CDN.
If we are dealing with a write-heavy system, it’s good to consider using a Message Queue for Async processing.
If we need the system to be ACID compliant, we should go for an RDBMS or SQL Database.
If data is unstructured & doesn’t require ACID properties, we should go for NoSQL Database.
If the system has complex data in the form of videos, images, files, etc. we should go for Blob/Object storage.
If the system requires complex precomputation like a news feed, we should consider using a Message Queue & Cache.
If the system requires searching data in high volume, we should consider using a search index, tries, or a search engine like Elasticsearch.
If the system requires scaling SQL Database, we should consider using Database Sharding.
If the system requires High Availability, Performance, and Throughput, we should consider using a Load Balancer.
If the system requires faster data delivery globally, reliability, high availability, and performance, we should consider using a CDN.
If the system has data with nodes, edges, and relationships like friend lists, and road connections, we should consider using a graph Database.
If the system needs scaling of various components like servers, databases, etc. we should consider using Horizontal Scaling.
If the system requires high-performing database queries, we should consider using Database Indexes.
If the system requires bulk job processing, we should consider using Batch Processing & Message Queues.
If the system requires reducing server load and mitigating DoS attacks, we should consider using a Rate Limiter.
If the system has microservices, we should consider using an API Gateway (authentication, SSL termination, routing, etc.).
If the system has a single point of failure, we should implement Redundancy in that component, so that if one instance fails, another can take over. This can include adding redundant servers, storage systems, or network devices.
If the system needs to be fault-tolerant and durable, we should implement Data Replication (creating multiple copies of data on different servers).

Redundancy vs. Replication
Redundancy and replication are similar in that they both involve creating multiple copies of data or components to improve availability and fault tolerance. However, they have some key differences:
Redundancy: This refers to the practice of adding extra components or systems that can take over if the primary component or system fails. For example, having a redundant power supply in a server can ensure that the server stays up and running even if one power supply fails.
Replication: This refers to the practice of creating multiple copies of data or systems that can be used in case of a failure. For example, replicating a database across multiple servers can ensure that the data is still accessible even if one server fails. Replication comes in several forms, such as synchronous replication, asynchronous replication, and multi-master replication.
In summary, redundancy is about adding extra components to take over if the primary one fails, while replication is about creating multiple copies of data or systems to ensure availability in case of a failure.

If the system needs fast, bi-directional communication between users (for example, chat), we should consider using WebSockets.
If the system needs the ability to detect failures in distributed systems, we should consider implementing Heartbeat.
If the system needs to ensure data integrity, we should consider implementing a Checksum Algorithm.

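A checksum algorithm is a function that takes an input (or message) and returns a small fixed-size value, often called a “digest”. Two different inputs can produce the same digest, but a good algorithm makes an accidental match very unlikely. The purpose of a checksum is to detect errors or changes in the input data, such as transmission errors or tampering. Common choices include the cyclic redundancy check (CRC), which is designed to catch accidental errors, and cryptographic hash functions such as MD5 (Message-Digest algorithm 5) and SHA-1 (Secure Hash Algorithm 1). MD5 and SHA-1 are no longer considered safe against deliberate tampering, so a newer hash such as SHA-256 is preferable when an attacker may modify the data.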
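
As a minimal sketch, the idea can be tried with Python's standard library. The helper names `crc32_of`, `sha256_of`, and `verify` are made up for this example, and SHA-256 is used here in place of the MD5 and SHA-1 algorithms mentioned above.

```python
import hashlib
import zlib


def crc32_of(data: bytes) -> int:
    """32-bit CRC: cheap, good at catching accidental corruption."""
    return zlib.crc32(data) & 0xFFFFFFFF


def sha256_of(data: bytes) -> str:
    """Cryptographic digest as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_sha256: str) -> bool:
    """True if the data still matches the digest computed earlier."""
    return sha256_of(data) == expected_sha256


payload = b"transfer 100 coins to alice"
digest = sha256_of(payload)           # sender stores or sends this with the data

corrupted = b"transfer 900 coins to alice"
print(verify(payload, digest))        # True: data is unchanged
print(verify(corrupted, digest))      # False: a single byte differs
print(crc32_of(payload) == crc32_of(corrupted))  # False: CRC also notices
```

The receiver recomputes the digest over the bytes it got and compares it with the one that was sent; any mismatch means the data changed somewhere along the way.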

If the system needs to transfer data between various servers in a decentralized way, we should go for Gossip Protocol.

A gossip protocol is a method for distributing information or state updates in a decentralized network. In a gossip-based system, each node periodically sends a small piece of information (or “gossip”) to a randomly selected subset of its neighbors. Over time, this information spreads throughout the entire network, eventually reaching every node. Gossip protocols are often used in peer-to-peer systems and distributed databases to keep all the nodes informed of the state of the system. Because no single node is responsible for delivery, they also tolerate node failures and temporary network partitions well.
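
The spreading behaviour described above can be sketched with a small simulation. This is a toy model, not a real protocol implementation: the function `rounds_to_spread` and its parameters (`fanout`, `seed`, `max_rounds`) are assumptions made for the example, and every node is assumed to be able to reach every other node.

```python
import random


def rounds_to_spread(num_nodes, fanout=3, seed=0, max_rounds=1000):
    """Count gossip rounds until every node has heard a rumor.

    In each round, every informed node pushes the rumor to `fanout`
    peers picked at random. Node 0 is the only one that knows it at first.
    """
    rng = random.Random(seed)             # seeded so that runs are repeatable
    informed = {0}
    picks = min(fanout, num_nodes)
    for round_no in range(1, max_rounds + 1):
        heard = set()
        for _node in informed:
            heard.update(rng.sample(range(num_nodes), picks))
        informed |= heard
        if len(informed) == num_nodes:
            return round_no
    raise RuntimeError("rumor did not reach every node")


for size in (10, 100, 1000):
    print(size, "nodes:", rounds_to_spread(size), "rounds")
```

In this model the number of rounds grows far more slowly than the number of nodes, which is why gossip scales well to large clusters.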

If the system needs to add or remove server nodes efficiently, without creating hotspots, we should implement Consistent Hashing.

Consistent hashing is a technique used to distribute requests or data items across a cluster of servers, such as a distributed hash table (DHT) or a content delivery network (CDN). The basic idea behind consistent hashing is to assign each server and each data item to a point on a virtual ring, based on the hash of their identifier. To determine which server is responsible for a given data item, the algorithm simply finds the first server clockwise from the data item on the ring.
One of the main advantages of consistent hashing is that it allows the cluster to be resized dynamically, without the need to re-assign all of the data items to new servers. When a new server is added to the cluster, only a small subset of the data items need to be moved, and when a server is removed, only a small subset of the data items are affected.
Another advantage of consistent hashing is that it helps to distribute the load evenly across the cluster, especially when each server is placed on the ring at several points (often called virtual nodes).
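The ring described above can be sketched in a few lines. This is a minimal illustration under some assumptions: the class name `HashRing`, the `vnodes` parameter, and the server names are invented for the example, and SHA-256 is used only as a convenient way to place names on the ring.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a server or key name to a point on the ring."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # points per server, to even out the load
        self._points = []      # sorted positions on the ring
        self._owner = {}       # position -> server name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = ring_position(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        for i in range(self.vnodes):
            point = ring_position(f"{node}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def lookup(self, key):
        """Return the first server clockwise from the key's position."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, ring_position(key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("server-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Only the keys that now belong to `server-d` change owner. With a plain `hash(key) % number_of_servers` scheme, most keys would be reassigned whenever the server count changes.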

If the system deals with location data, such as maps or finding nearby resources, we should consider using a Quadtree, Geohash, etc.

Quadtree
A quadtree is a tree data structure in which each internal node has exactly four children. It is used to partition a two-dimensional space into rectangles, or “cells,” and is commonly used for spatial indexing, collision detection, and image compression.
The root of the quadtree represents the entire space, and each internal node represents a cell that is divided into four smaller cells. Each leaf node represents a cell that contains a single data point, or a group of data points that are close together.
Quadtrees can be used to efficiently search for data points within a given rectangular region, by traversing the tree and testing only the cells that intersect the region. They can also be used for collision detection by partitioning a space into smaller cells and then checking for collisions within each cell.
Quadtrees are also used in image compression, in which the image is recursively divided into smaller sub-images and each cell is represented by a single color if all the pixels in the cell are of the same color. This representation of the image can be encoded more efficiently than the original image.
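A minimal point quadtree with a rectangular range search might look like the sketch below. The class name `QuadTree` and the `capacity` and `max_depth` parameters are assumptions made for this example, not part of any particular library.

```python
class QuadTree:
    """Point quadtree covering the rectangle [x, x+w) x [y, y+h)."""

    def __init__(self, x, y, w, h, capacity=4, depth=0, max_depth=16):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.capacity = capacity    # points a leaf holds before it splits
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on duplicate points
        self.points = []            # only leaves hold points
        self.children = None        # four sub-cells once the node has split

    def insert(self, px, py):
        """Add a point; returns False if it lies outside this tree."""
        if not (self.x <= px < self.x + self.w and self.y <= py < self.y + self.h):
            return False
        self._add(px, py)
        return True

    def _add(self, px, py):
        if self.children is not None:
            self._child_for(px, py)._add(px, py)
        elif len(self.points) < self.capacity or self.depth >= self.max_depth:
            self.points.append((px, py))
        else:
            self._split()
            self._child_for(px, py)._add(px, py)

    def _split(self):
        hw, hh = self.w / 2, self.h / 2
        rest = (hw, hh, self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(self.x, self.y, *rest),
            QuadTree(self.x + hw, self.y, *rest),
            QuadTree(self.x, self.y + hh, *rest),
            QuadTree(self.x + hw, self.y + hh, *rest),
        ]
        for point in self.points:   # push existing points down into the children
            self._child_for(*point)._add(*point)
        self.points = []

    def _child_for(self, px, py):
        right = px >= self.x + self.w / 2
        upper = py >= self.y + self.h / 2
        return self.children[right + 2 * upper]

    def query(self, qx, qy, qw, qh):
        """Return all points inside [qx, qx+qw) x [qy, qy+qh)."""
        if (qx >= self.x + self.w or qx + qw <= self.x or
                qy >= self.y + self.h or qy + qh <= self.y):
            return []               # this cell does not touch the search region
        found = [(px, py) for px, py in self.points
                 if qx <= px < qx + qw and qy <= py < qy + qh]
        for child in self.children or []:
            found.extend(child.query(qx, qy, qw, qh))
        return found


tree = QuadTree(0, 0, 100, 100)
for i in range(10):
    for j in range(10):
        tree.insert(i * 10, j * 10)
print(sorted(tree.query(0, 0, 25, 25)))  # the 9 grid points with x, y in {0, 10, 20}
```

The search skips every cell that does not overlap the query rectangle, so it looks at only a small part of the tree when the region is small.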
Geohash
Geohash is a latitude/longitude geocode system that uses a base32-encoded string to represent a rectangular area on the Earth’s surface. The string is generated by dividing the Earth’s surface into a grid of rectangular cells and then encoding the cell in which a particular point falls. The precision of the geohash is determined by the length of the encoded string, with longer strings representing smaller cells and therefore more precise locations.
Geohash can be used for a variety of applications, such as spatial indexing, proximity search, and data visualization. For example, it can be used to efficiently store and retrieve geospatial data in a database, by using the geohash as an indexed key: points that share a geohash prefix fall in the same cell. It can also be used to find the points near a given point, by searching the cell of the given geohash together with its neighboring cells. The neighbors matter because two points that are very close together can sit on opposite sides of a cell border and therefore have different geohashes.
One of the main advantages of Geohash is that it can be used to index and search for geographic data efficiently, while minimizing the storage space required to store the data. Also, it can be used to index data by rectangular regions, which is useful in many GIS applications.
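The encoding step can be sketched as follows. The function name `geohash_encode` and its `precision` parameter are chosen for this example. It follows the usual scheme of alternately halving the longitude and latitude ranges and packing every five bits into one base32 character.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True                    # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid          # keep the eastern half
            else:
                bits <<= 1
                lon_hi = mid          # keep the western half
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid          # keep the northern half
            else:
                bits <<= 1
                lat_hi = mid          # keep the southern half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # five bits make one base32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # u4pru
print(geohash_encode(57.65, 10.41, precision=5))        # u4pru (same cell)
```

A longer geohash of the same point always starts with the shorter one, which is what makes prefix lookups useful for finding points in the same cell.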


Written by Vipul Vyas
