The Adaptive Developer

🚀 High-Performance C++20 Distributed Cache

I built this cache to solve production bottlenecks found in traditional solutions like Redis and Memcached:

Beyond Single-Threaded: Moves past the Redis single-threaded event loop and Memcached’s per-slab locking.
Fine-Grained Concurrency: Implements reader-writer locking for high-throughput parallel access.
Intelligence: Distinguishes between "one-hit wonders" and frequently accessed keys.
Stampede Protection: Built-in Request Coalescing to collapse duplicate backend queries when a hot key expires.

C++20 was chosen for deterministic memory layout and zero-cost abstractions.

Feature	Purpose
std::shared_mutex	Striped reader-writer locking.
std::shared_future	Coalescing concurrent requests to prevent "thundering herds."
std::from_chars	Zero-allocation number parsing for the RESP protocol.
Move Semantics	Ensures the SET path is allocation-free after initial construction.
RAII	Exception-safe mutex management via std::unique_lock.

Why not other languages? > * Go: Unpredictable P99 latency due to the goroutine scheduler.

A breakdown of the lifecycle of a SET command:

Ingress: Client sends RESP frame: 3\\r\\n...3\\r\\nfoo\\r\\n$3\\r\\nbar\\r\\n.
Detection: The accept loop (using select() with 200ms timeout) spawns a HandleRespClient thread.
Parsing: ParseRESP validates array counts (<1024) and string lengths (<1MB).
Execution: ExecuteCommand hashes the key to a stripe index, acquires a unique_lock, and inserts the ValueEntry.
Cleanup: EnforceCapacity() triggers a 3-phase eviction if limits are exceeded.
Egress: Returns +OK\\r\\n via SendAll to handle partial writes.

The store is divided into 64 independent stripes to minimize lock contention.

Phase 1 (TTL): Scan stripes and erase expired entries.
Phase 2 (Demotion): If the "Protected Segment" (hot half) is full, demote coldest entries to "Probationary."
Phase 3 (Eviction): Rank probationary candidates by (segment, lfu_count, last_touch_ns) and evict the lowest score.

Throughput: 3.66M ops/sec (In-process, single thread).
Latency: < 0.001 ms (p99).
Deduplication: 87.5% ratio (7 out of 8 redundant requests collapsed during bursts).
Binary Size: 122 KB (dynamically linked).
Build Time: ~2 seconds.

Modern I/O: Moving from thread-per-connection to io_uring or epoll for 10k+ concurrent connections.
Persistence: Transitioning the WAL from in-memory to a file-backed append-only log with fdatasync().
Enhanced RESP: Mapping the SET command's optional arguments to the existing internal TTL infrastructure.