Zippy

Zippy is a lightweight, gRPC-based in-memory data store that offers robust key-value storage capabilities, akin to Redis. Engineered for high performance, Zippy excels at handling multiple operations concurrently, making it an ideal solution for applications demanding efficient, real-time data storage and retrieval.

Features

Data Management: Efficiently add, update, and delete data with seamless CRUD operations.
Data Persistence: Provides durability through regular merge-on-update snapshots, so data can be recovered reliably.
Data Expiration: Removes entries after a specified time-to-live (TTL).
Data Eviction: Manages memory efficiently by evicting less frequently used data.
Concurrency: Handles multiple clients simultaneously with robust multi-threading and concurrent processing.
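To make the expiration and eviction features above concrete, here is a minimal Python sketch of a store that combines a per-key TTL with least-frequently-used eviction. It is an illustration only and not Zippy's C++ implementation: the `TinyStore` class, its method names, and the read-count eviction policy are all assumptions made for this example.

```python
import time


class TinyStore:
    """Toy key-value store with TTL expiry and least-frequently-used eviction.

    Hypothetical illustration; names and policy are not taken from Zippy.
    Assumes capacity >= 1.
    """

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity  # maximum number of live entries
        self.clock = clock        # injectable so expiry can be tested
        self.data = {}            # key -> (value, expires_at or None)
        self.hits = {}            # key -> read count, used to pick a victim

    def set(self, key, value, ttl=None):
        self._purge_expired()
        # Only a brand-new key can push the store over capacity.
        if key not in self.data and len(self.data) >= self.capacity:
            self._evict_one()
        expires_at = None if ttl is None else self.clock() + ttl
        self.data[key] = (value, expires_at)
        self.hits.setdefault(key, 0)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            self.delete(key)  # lazy expiry: drop the entry when it is read
            return None
        self.hits[key] += 1
        return value

    def delete(self, key):
        self.hits.pop(key, None)
        return self.data.pop(key, None) is not None

    def _purge_expired(self):
        now = self.clock()
        expired = [k for k, (_, exp) in self.data.items()
                   if exp is not None and now >= exp]
        for k in expired:
            self.delete(k)

    def _evict_one(self):
        # Evict the key with the fewest recorded reads.
        victim = min(self.hits, key=self.hits.get)
        self.delete(victim)
```

Passing a custom `clock` callable lets you advance time by hand in tests instead of sleeping until an entry expires.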

The WHY?

This project was born from a question discussed over the dinner table: "How do companies like Uber and WhatsApp manage millions of data transactions every second?" Intrigued by the complexity of this challenge, we delved into the inner workings of high-performance systems, inspired by an Uber Engineering blog post. We became interested in how Redis works internally and wanted to understand it in depth. What better way to learn system design than by implementing such a system ourselves?

Choice of Language

Zippy is designed for speed, reliability, and scalability, targeting applications that need rapid data access and manipulation.

We prioritized ultra-low latency and high performance for our data store. Selecting C++ as our primary backend language was driven by the following advantages:

Fast execution
Low-level control of memory
Flexibility to design custom data structures

This choice equips Zippy with the capabilities needed to handle intensive workloads efficiently, ensuring swift and reliable data operations.

Components of Zippy

Our system has two main parts:

Database: This is where we store the data. Think of it like a bedside table (storage for books you need fast access to).
Registry: This is where we store data for longer. Think of it like a bookshelf (storage for books you don't need fast access to).
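One way to picture the bedside table and bookshelf analogy is a two-tier lookup that checks the fast tier first and falls back to the longer-lived one. The Python sketch below is purely illustrative: the `TwoTierStore` class, the write-to-both policy, and the promotion on read are assumptions for this example, and the README does not say that Zippy's Database and Registry interact this way.

```python
class TwoTierStore:
    """Hypothetical two-tier store; plain dicts stand in for both tiers."""

    def __init__(self):
        self.database = {}  # fast tier: the "bedside table"
        self.registry = {}  # longer-lived tier: the "bookshelf"

    def set(self, key, value):
        self.database[key] = value
        # Assumption: every write is also recorded in the longer-lived tier.
        self.registry[key] = value

    def get(self, key):
        if key in self.database:
            return self.database[key]
        if key in self.registry:
            value = self.registry[key]
            self.database[key] = value  # promote back into the fast tier
            return value
        return None

    def drop_from_database(self, key):
        # Simulates the fast tier losing an entry (for example, via eviction).
        self.database.pop(key, None)
```

In this sketch, a key dropped from the fast tier is still readable, and the first read moves it back to the "bedside table".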

Architecture

Benchmarks

The benchmarking script measures the performance of Zippy by executing three types of operations: SET, GET, and DEL. For each operation, the script evaluates the average throughput and average time per operation across multiple clients.

SET: Adds a key-value pair to the data store.
GET: Retrieves the value associated with a key from the data store.
DEL: Deletes a key-value pair from the data store.

Multiple Clients: Simulates multiple clients to test concurrency.
Operation Execution: Each client performs a specified number of requests for each operation.
Performance Metrics:
- Average Throughput: Number of operations per second.
- Average Time per Operation: Time taken to complete each operation.
Wall-Clock Time: Total time for the benchmark.
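The measurement described above can be sketched in a few lines of Python. This is not the project's benchmarking script: `InMemoryStub` is a hypothetical stand-in for a real Zippy client (a locked dictionary rather than gRPC calls), and `run_benchmark`, its parameters, and the choice to average per-client timings are assumptions for illustration.

```python
import threading
import time


class InMemoryStub:
    """Hypothetical stand-in for a Zippy client: a locked dict, no gRPC."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def delete(self, key):
        with self._lock:
            return self._data.pop(key, None) is not None


def _client(op, client_id, requests, timings):
    # Each simulated client times its own batch of requests.
    start = time.perf_counter()
    for i in range(requests):
        op(f"client{client_id}:key{i}")
    timings[client_id] = time.perf_counter() - start


def run_benchmark(store, num_clients=4, requests_per_client=1000):
    """Run SET, GET, and DEL phases; return averaged per-client metrics."""
    operations = {
        "SET": lambda key: store.set(key, "value"),
        "GET": lambda key: store.get(key),
        "DEL": lambda key: store.delete(key),
    }
    results = {}
    wall_start = time.perf_counter()
    for name, op in operations.items():
        timings = [0.0] * num_clients
        threads = [
            threading.Thread(
                target=_client, args=(op, c, requests_per_client, timings)
            )
            for c in range(num_clients)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        results[name] = {
            # Mean over clients of (requests / client elapsed time).
            "avg_throughput_ops_per_sec": sum(
                requests_per_client / t for t in timings
            ) / num_clients,
            # Mean over clients of (client elapsed time / requests).
            "avg_time_per_op_sec": sum(
                t / requests_per_client for t in timings
            ) / num_clients,
        }
    results["wall_clock_sec"] = time.perf_counter() - wall_start
    return results
```

Calling `run_benchmark(InMemoryStub())` returns a dictionary keyed by operation name plus a `wall_clock_sec` entry. To measure a real server, you would swap the stub for an object that exposes the same three methods and forwards them to Zippy.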

Getting Started

Prerequisites

Ensure you have the following installed:

Git: Version control
C++ Compiler: For compiling C++ code
CMake: To build the project
gRPC: Simplifies and accelerates inter-program communication
protobuf: Serialization framework for data interchange

Installation

Fork the repository

Clone the repository:

git clone https://github.com/your_username/zippy
cd zippy

(Optional) Run Hooks & Tests

./scripts/setup_hooks.sh
./scripts/run_tests.sh

Build the project:
```
mkdir build
cd build
cmake ..
make
```
Run the server:
```
./zippy
```

This will start the Zippy server.

Usage

From the root of the project, change into the client directory:
```
cd client
```
Install the Python dependencies (ideally inside a virtual environment):
```
pip install -r requirements.txt
```
Start the Python client:
```
python3 client.py
```

You can now communicate with the C++ server and use the in-memory datastore!

Future Plans

~~Implement snapshotting (saving the in-memory data to disk at intervals)~~
~~Handle multiple clients concurrently using multi-threading or an event loop~~
Make the tests run faster - tests/tDatabase.cpp
Simple Pub/Sub messaging
Support for multiple data structures
Build a web client in Flask to monitor the current status of the cache

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributors

Bhanu Gupta

Pranav Iyer

Purvav Punyani
