The Cloudflare Blog: Quicksilver v2: evolution of a globally distributed key-value store (Part 1)

Source URL: https://blog.cloudflare.com/quicksilver-v2-evolution-of-a-globally-distributed-key-value-store-part-1/
Source: The Cloudflare Blog
Title: Quicksilver v2: evolution of a globally distributed key-value store (Part 1)

Feedly Summary: This blog post is the first of a series, in which we share our journey in redesigning Quicksilver — Cloudflare’s distributed key-value store that serves over 3 billion keys per second globally.

Summary: The post traces how Cloudflare’s Quicksilver grew from a global configuration distribution system into a key-value store used across Cloudflare’s products. It then introduces a proxy-replica architecture that reduces disk usage and improves scalability without increasing request latency.

Detailed Description:

– **Project Background**: Quicksilver initially served as a global configuration distribution system but transitioned into a key-value store as its utility expanded within Cloudflare’s product suite.

– **Initial Challenges**: The first version (v1) replicated the entire dataset on every server, which became increasingly costly in disk space as data centers and workloads grew.

– **New Architecture (Quicksilver v1.5)**:
  – **Roles Redefined**:
    – **Replica**: Stores the entire dataset.
    – **Proxy**: Acts as a persistent cache, which helps reduce overall disk usage.

– **Disk Space Optimization**:
  – Moving towards fewer replicas and more proxies frees up to 50% of disk space.
  – Proxies keep their cache on disk in RocksDB, the same way full datasets are stored.
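
The split between the two roles can be pictured with a minimal sketch. It assumes a proxy answers from its own cache and forwards a miss to a replica. The class names are hypothetical, and plain dictionaries stand in for the on-disk RocksDB stores, so this illustrates the idea rather than Quicksilver's implementation.

```python
from typing import Dict, Optional


class Replica:
    """Holds the full dataset (a dict stands in for a full on-disk store)."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = dict(data)
        self.reads = 0  # number of lookups that reached this replica

    def get(self, key: str) -> Optional[bytes]:
        self.reads += 1
        return self._data.get(key)


class Proxy:
    """Stores only keys it has already served; asks a replica on a miss."""

    def __init__(self, replica: Replica) -> None:
        self._replica = replica
        # Assumption: the real cache is persistent; a dict keeps the sketch short.
        self._cache: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        if key in self._cache:
            return self._cache[key]  # cache hit: no remote call
        value = self._replica.get(key)  # cache miss: fall back to the replica
        if value is not None:
            self._cache[key] = value
        return value

    def cached_keys(self) -> int:
        return len(self._cache)
```

In this sketch a proxy's storage grows only with the keys it is actually asked for, which is where the disk savings would come from when most servers act as proxies.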

– **Asynchronous Replication and Consistency**:
  – **Problem**: Because replication is asynchronous, replicas can be at different points in the update stream, so reads may return inconsistent results.
  – **Solution**: Utilize multiversion concurrency control (MVCC) to maintain the consistency of reads.
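
A rough way to see how MVCC helps: if every write carries a monotonically increasing version number and older values are retained, a reader can ask for a key as of a specific version and get the same answer from any server that has reached that version. The sketch below assumes exactly that; `VersionedStore`, `put`, and `get_at` are hypothetical names, not Quicksilver's API.

```python
import bisect
from typing import Dict, List, Optional


class VersionedStore:
    """Keeps every version of each key so reads can target a past version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[int]] = {}
        self._values: Dict[str, List[Optional[bytes]]] = {}
        self.latest = 0  # highest version applied so far

    def put(self, key: str, value: Optional[bytes], version: int) -> None:
        """Apply a write; a value of None records a deletion."""
        if version <= self.latest:
            raise ValueError("versions must increase monotonically")
        self._versions.setdefault(key, []).append(version)
        self._values.setdefault(key, []).append(value)
        self.latest = version

    def get_at(self, key: str, version: int) -> Optional[bytes]:
        """Return the value of `key` as it was at `version`."""
        versions = self._versions.get(key, [])
        # Index of the first stored version strictly greater than `version`.
        i = bisect.bisect_right(versions, version)
        if i == 0:
            return None  # key did not exist yet at that version
        return self._values[key][i - 1]
```

A reader that is itself at version 2 would call `get_at(key, 2)` and would not observe a write applied at version 3, even if the server it asks has already moved ahead.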

– **Negative Lookups**: Implemented Bloom filters for efficiently handling requests for non-existent keys, thereby minimizing unnecessary load on the system.
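
For readers unfamiliar with the technique, here is a minimal Bloom filter sketch. A Bloom filter can report "definitely absent" or "possibly present", so a lookup for a key that was never added can usually be rejected without consulting the full dataset. The sizes, the hashing scheme, and where such a filter would sit in Quicksilver are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True means it may have been.
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A caller would check `might_contain(key)` first and go to the backing store only when it returns `True`.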

– **Discovery Mechanism**:
  – Introduced a Network Oracle to identify nearby replicas efficiently and to maintain performance despite varying data center sizes and statuses.

– **Performance Results**: Quicksilver v1.5 achieved considerable disk space savings without compromising request latency, with proxies sometimes even outperforming replicas.

In essence, this evolution of Quicksilver reflects Cloudflare’s commitment to optimizing resource use while maintaining high performance, both of which matter to professionals working on infrastructure and cloud security. The proxy-replica architecture, together with the new caching strategies and discovery mechanism, offers useful lessons for designing scalable distributed systems.