Zippy is a lightweight, gRPC-based in-memory data store that offers robust key-value storage capabilities, akin to Redis. Engineered for high performance, Zippy excels at handling multiple operations concurrently, making it an ideal solution for applications demanding efficient, real-time data storage and retrieval.
- Data Management: Efficiently add, update, and delete data with seamless CRUD operations.
- Data Persistence: Provides durability through regular merge-on-update snapshotting, so data can be recovered after a restart.
- Data Expiration: Removes entries after a specified time-to-live (TTL).
- Data Eviction: Manages memory efficiently by evicting less frequently used data.
- Concurrency: Handles multiple clients simultaneously with robust multi-threading and concurrent processing.
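To make the expiration and eviction features above concrete, here is a minimal sketch in Python (the language of the bundled client), not Zippy's actual C++ implementation. The class name `TinyStore`, its methods, the single coarse lock, and the lazy "expire on read" strategy are all assumptions made for illustration; the real server may expire and evict entries differently.

```python
import threading
import time


class TinyStore:
    """Toy key-value store with TTL expiry and least-frequently-used eviction.

    Hypothetical illustration only; assumes capacity >= 1.
    """

    def __init__(self, capacity, clock=time.monotonic):
        self._capacity = capacity
        self._clock = clock              # injectable so expiry can be tested
        self._lock = threading.Lock()    # one coarse lock guards all state
        self._data = {}                  # key -> (value, expires_at or None)
        self._hits = {}                  # key -> number of successful reads

    def set(self, key, value, ttl=None):
        with self._lock:
            # Only a brand-new key can push the store over capacity.
            if key not in self._data and len(self._data) >= self._capacity:
                self._evict_one()
            expires_at = None if ttl is None else self._clock() + ttl
            self._data[key] = (value, expires_at)
            self._hits.setdefault(key, 0)

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if expires_at is not None and self._clock() >= expires_at:
                self._remove(key)        # lazy expiry: drop the entry on read
                return None
            self._hits[key] += 1
            return value

    def delete(self, key):
        with self._lock:
            return self._remove(key)

    def _remove(self, key):
        self._hits.pop(key, None)
        return self._data.pop(key, None) is not None

    def _evict_one(self):
        # Evict the key with the fewest recorded reads.
        victim = min(self._hits, key=self._hits.get)
        self._remove(victim)
```

A production store would typically also sweep expired keys in the background and use finer-grained locking; the sketch keeps one lock and lazy expiry so the two ideas stay easy to follow.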
This project was born from a question discussed over the dinner table: "How do companies like Uber and WhatsApp manage millions of data transactions every second?" Intrigued by the complexity of this challenge, we delved into the inner workings of high-performance systems, inspired by an Uber Engineering blog post. We became particularly interested in Redis and wanted to understand how it works. What better way to learn system design than by implementing such a store ourselves?
Zippy is crafted for speed, reliability, and scalability, making it the optimal solution for applications demanding rapid data access and manipulation.
We prioritized ultra-low latency and high performance for our data store. Selecting C++ as our primary backend language was driven by the following advantages:
- Extremely fast execution
- Low-level control over memory
- Flexibility to design custom data structures
This choice equips Zippy with the capabilities needed to handle intensive workloads efficiently, ensuring swift and reliable data operations.
Our system has two main parts:
- Database: This is where we store the data. Think of it as a bedside table (storage for books you need fast access to).
- Registry: This is where we store data for longer. Think of it as a bookshelf (storage for books you don't need fast access to).
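One way to picture how the two parts could cooperate is sketched below in Python. It assumes, without confirmation from the source, that the Registry is a disk-backed snapshot and that the Database merges only the keys changed since the last snapshot into it. The class and method names, the JSON file format, and the restriction to string keys with JSON-serializable values are hypothetical choices for this illustration.

```python
import json
import os
import tempfile


class Registry:
    """Hypothetical long-term tier (the bookshelf): a JSON file on disk."""

    def __init__(self, path):
        self._path = path

    def load(self):
        if not os.path.exists(self._path):
            return {}
        with open(self._path, "r", encoding="utf-8") as f:
            return json.load(f)

    def merge(self, updates, deleted):
        # Merge changes into the previous snapshot instead of rewriting from scratch.
        snapshot = self.load()
        snapshot.update(updates)
        for key in deleted:
            snapshot.pop(key, None)
        # Write to a temp file, then rename, so a crash never leaves a partial file.
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(snapshot, f)
        os.replace(tmp_path, self._path)


class Database:
    """Hypothetical fast tier (the bedside table): an in-memory dict."""

    def __init__(self, registry):
        self._registry = registry
        self._data = registry.load()     # recover state on startup
        self._dirty = {}                 # keys written since the last snapshot
        self._deleted = set()            # keys deleted since the last snapshot

    def set(self, key, value):
        self._data[key] = value
        self._dirty[key] = value
        self._deleted.discard(key)

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        existed = key in self._data
        self._data.pop(key, None)
        self._dirty.pop(key, None)
        self._deleted.add(key)
        return existed

    def snapshot(self):
        self._registry.merge(self._dirty, self._deleted)
        self._dirty.clear()
        self._deleted.clear()
```

In this sketch every read and write hits the in-memory dict, and the disk is touched only when `snapshot()` is called, for example from a periodic timer.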
The benchmarking script measures the performance of Zippy by executing three types of operations: SET, GET, and DEL. For each operation, the script evaluates the average throughput and average time per operation across multiple clients.
- SET: Adds a key-value pair to the data store.
- GET: Retrieves the value associated with a key from the data store.
- DEL: Deletes a key-value pair from the data store.
- Multiple Clients: Simulates multiple clients to test concurrency.
- Operation Execution: Each client performs a specified number of requests for each operation.
- Performance Metrics:
  - Average Throughput: Number of operations per second.
  - Average Time per Operation: Time taken to complete each operation.
  - Wall-Clock Time: Total time for the benchmark.
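The metrics above can be computed along the following lines. This is a sketch of the measurement approach, not the repository's benchmarking script. `InMemoryBackend` is a hypothetical stand-in for a real Zippy client, so the numbers it produces say nothing about the server; `run_benchmark` and the result keys are likewise names invented for this example.

```python
import threading
import time


class InMemoryBackend:
    """Stand-in for a Zippy client; a real run would call the gRPC server."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def delete(self, key):
        with self._lock:
            return self._data.pop(key, None) is not None


def run_benchmark(backend, operation, num_clients, requests_per_client):
    """Run one operation type from several threads and summarise the timings."""
    actions = {
        "SET": lambda key: backend.set(key, "value"),
        "GET": backend.get,
        "DEL": backend.delete,
    }
    if operation not in actions:
        raise ValueError(f"unknown operation: {operation}")
    action = actions[operation]
    elapsed = [0.0] * num_clients        # seconds spent by each simulated client

    def client(client_id):
        start = time.perf_counter()
        for i in range(requests_per_client):
            action(f"client{client_id}:key{i}")
        # Clamp so a very coarse timer can never cause a division by zero.
        elapsed[client_id] = max(time.perf_counter() - start, 1e-9)

    threads = [threading.Thread(target=client, args=(n,)) for n in range(num_clients)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall_clock = time.perf_counter() - wall_start

    return {
        "avg_throughput_ops_per_sec":
            sum(requests_per_client / s for s in elapsed) / num_clients,
        "avg_time_per_op_sec":
            sum(s / requests_per_client for s in elapsed) / num_clients,
        "wall_clock_sec": wall_clock,
    }
```

Running the operations in the order SET, GET, DEL with the same client and request counts means each GET and DEL finds the keys written by the preceding SET run.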
Ensure you have the following installed:
- Git : Version control
- C++ Compiler : For compiling C++ code
- CMake : To build the project
- gRPC : Simplifies and accelerates inter-program communication
- protobuf : Serialization framework for data interchange
- Fork the repository.
- Clone your fork:
      git clone https://github.com/your_username/zippy
      cd zippy
- (Optional) Run the hooks and tests:
      ./scripts/setup_hooks.sh
      ./scripts/run_tests.sh
- Build the project:
      mkdir build
      cd build
      cmake ..
      make
- Run the server:
      ./zippy
  This starts the Zippy server.
From the root of the project:
- Change into the client directory:
      cd client
- Install the Python dependencies (ideally inside a virtual environment):
      pip install -r requirements.txt
- Start the Python client:
      python3 client.py
You can now communicate with the C++ server and use the in-memory data store!
- Implement snapshotting (saving the in-memory data to disk at intervals)
- Handle multiple clients concurrently using multi-threading or an event loop
- Make the tests run faster (tests/tDatabase.cpp)
- Simple Pub/Sub messaging
- Support for multiple data structures
- Build a web client in Flask to monitor the current status of the cache
This project is licensed under the MIT License - see the LICENSE file for details.
- Bhanu Gupta
- Pranav Iyer
- Purvav Punyani