A chat app performs different functions for different people. It is extremely important to nail down the exact requirements. For example, you do not want to design a system that focuses on group chat when the interviewer has one-on-one chat in mind. It is important to explore the feature requirements.
It is vital to agree on the type of chat app to design. In the marketplace, there are one-on-one chat apps like Facebook Messenger, WeChat, and WhatsApp, office chat apps that focus on group chat like Slack, or game chat apps, like Discord, that focus on large group interaction and low voice chat latency.
The first set of clarification questions should nail down what the interviewer has in mind exactly when she asks you to design a chat system. At the very least, figure out if you should focus on a one-on-one chat or group chat app.
Questions to ask for exact scope
What kind of chat app shall we design? 1 on 1 or group based? – It should support both 1 on 1 and group chat.
Is this a mobile app? Or a web app? Or both? – both
What is the scale of this app? A startup app or massive scale? – It should support 50 million daily active users (DAU).
For group chat, what is the group member limit? – A maximum of 100 people
What features are important for the chat app? Can it support attachment? – 1 on 1 chat, group chat, online indicator. The system only supports text messages.
Is there a message size limit? – Yes, text length should be less than 100,000 characters long.
Is end-to-end encryption required? – Not required for now but we will discuss that if time allows.
How long shall we store the chat history? – forever
These are the requirements based on the questions above:
Clients do not communicate directly with each other. Instead, each client connects to a chat service, which supports all the features mentioned above. Let us focus on fundamental operations. The chat service must support the following functions:
When a client intends to start a chat, it connects the chats service using one or more network protocols. For a chat service, the choice of network protocols is important.
Requests are initiated by the client for most client/server applications. This is also true for the sender side of a chat application. When the sender sends a message to the receiver via the chat service, it uses the time-tested HTTP protocol, which is the most common web protocol. In this scenario, the client opens a HTTP connection with the chat service and sends the message, informing the service to send the message to the receiver. However, the receiver side is a bit more complicated. Since HTTP is client-initiated, it is not trivial to send messages from the server. Over the years, many techniques are used to simulate a server-initiated connection: polling, long polling, and WebSocket.
Polling – polling is a technique that the client periodically asks the server if there are messages available. Depending on polling frequency, polling could be costly. It could consume precious server resources to answer a question that offers no as an answer most of the time.
Long Polling – in long polling, a client holds the connection open until there are actually new messages available or a timeout threshold has been reached. Once the client receives new messages, it immediately sends another request to the server, restarting the process. Long polling has a few drawbacks:
Websocket – webSocket is the most common solution for sending asynchronous updates from server to client. WebSocket connection is initiated by the client. It is bi-directional and persistent. It starts its life as a HTTP connection and could be “upgraded” via some well-defined handshake to a WebSocket connection. Through this persistent connection, a server could send updates to a client.
Scalability
No technologist would design such a scale in a single server. Single server design is a deal breaker due to many factors. The single point of failure is the biggest among them. We suggest having a presence server.
Here the client maintains a persistent WebSocket connection to a chat server for real-time messaging.
Storage
Selecting the correct storage system that supports all of our use cases is crucial. We recommend key-value stores for the following reasons:
One on One chat flow
Message synchronization across multiple devices
Each device maintains a variable called cur_max_message_id, which keeps track of the latest message ID on the device. Messages that satisfy the following two conditions are considered as news messages:
With distinct cur_max_message_id on each device, message synchronization is easy as each device can get new messages from the KV store.
Group chat
In this chapter, you are asked to design a unique ID generator for a distributed system. Your first thought might be to use a primary key with the auto_increment attribute in a traditional database. However, auto_increment does not work in a distributed environment because a single database server is not large enough and generating unique IDs across multiple databases with minimal delay is challenging.
Here is an example.
Questions to ask for clear scope
What are the characteristics of unique IDs? – IDs must be unique and sortable.
For each new record, does ID increment by 1? – The ID increments by time but not necessarily only increments by 1. IDs created in the evening are larger than those created in the morning on the same day.
Do IDs only contain numerical values? – Yes, that is correct.
What is the ID length requirement? – IDs should fit into 64-bit.
What is the scale of the system? – The system should be able to generate 10,000 IDs per second.
Now here are the requirements gathered from questions above:
Solution
Datacenter IDs and machine IDs are chosen at the startup time, generally fixed once the system is up running. Any changes in datacenter IDs and machine IDs require careful review since an accidental change in those values can lead to ID conflicts. Timestamp and sequence numbers are generated when the ID generator is running.
Timestamp
The most important 41 bits make up the timestamp section. As timestamps grow with time, IDs are sortable by time. Figure below shows an example of how binary representation is converted to UTC. You can also convert UTC back to binary representation using a similar method.
Sequence number
12 bits. For every ID generated on that machine/process, the sequence number is incremented by 1. The number is reset to 0 every millisecond.
There are other alternatives but they don’t work as well as the soluton above according to our requirements.
UUID is worth mentioning here as an alternative. If our requirements include that IDs are 128 bits long instead of 64 bits long or can be non-numeric then UUID will work.
Welcome to my Java tutorial for all levels. Before you dive too deep into my java tutorial I would like to give an idea of what these tutorials are about. What I am teaching here is Java in it’s simplest form so that even a beginner can understand how to program in Java. I share my findings and researches on how to use Java in a practical program or project.
I hope that you will enjoy finding the solution for your bug here! If you are a beginner ride a long with me you never know it might interest you.