Hive: A Globally-Distributed Key/Value Store

White paper

As a cloud provider, we handle a considerable amount of data every day, especially with our Object Storage products. We needed a way to distribute this data globally, with a choice of consistency and replication levels, and with database sharding for linear read and write latency.

So we created Hive: a scalable, globally-distributed database designed, built, and deployed at Scaleway. Hive is meant to scale up to thousands of machines across data centers worldwide and to billions of database entries.

In this white paper, we cover:
- A global overview of the design of Hive
- The integration within the existing S3 product
- Benchmarks and evaluations of the database
- What we can build in the future with Hive

Introduction

Scaleway is a European cloud provider that runs a large number of cloud-oriented products. One of these products is Scaleway Object Storage, a storage API based on Amazon’s S3. Scaleway’s production environment has strict operational requirements for reliability, performance, and efficiency, and any platform must be highly scalable to support continuous growth. Reliability is one of the essential requirements because even the slightest outage has significant consequences, both financially and for customer trust.

This project was born to meet the reliability and scaling needs of the Scaleway S3 product. We needed a platform that could scale up to millions of different databases with billions of entries while keeping storage clients separated from one another and maintaining good latency and performance. We also needed the platform to be reliable, consistent, and flexible: in some regions we can replicate across multiple data centers, while in other regions we do not have the housing required for such operations.

Hive is a scalable, globally-distributed database designed, built, and deployed at Scaleway. At the highest level of abstraction, it is a database that stores key/value pairs and shards data across many Raft clusters, which are also called Raft groups. Replication is used for global availability and geographic locality; clients automatically fail over between replicas of a database. Hive is designed to scale up to thousands of machines across multiple data centers worldwide and to billions of database entries.
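To make the idea of sharding and client failover more concrete, here is a minimal, illustrative sketch. It is not Hive's implementation: the hash-based key placement, the class names (`Replica`, `Group`), and the helper functions (`shard_for`, `make_groups`, `put`, `get`) are all assumptions made for the example, and consensus itself is not modeled (a write is simply copied to every replica).

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a shard. `healthy` simulates reachability (hypothetical)."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)

    def read(self, key: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.data[key]


@dataclass
class Group:
    """A shard: several replicas holding the same keys (stands in for a Raft group)."""
    replicas: list

    def read(self, key: str) -> str:
        # Try replicas in order; fail over to the next one on a connection error.
        last_error = None
        for replica in self.replicas:
            try:
                return replica.read(key)
            except ConnectionError as err:
                last_error = err
        raise ConnectionError("all replicas are unreachable") from last_error


def shard_for(key: str, group_count: int) -> int:
    """Map a key to a group index with a stable hash (assumed placement scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % group_count


def make_groups(group_count: int, replicas_per_group: int) -> list:
    """Build `group_count` groups, each with `replicas_per_group` replicas."""
    return [
        Group([Replica(f"g{g}-r{r}") for r in range(replicas_per_group)])
        for g in range(group_count)
    ]


def put(groups: list, key: str, value: str) -> None:
    """Write to every replica of the owning group (simplification: no consensus)."""
    group = groups[shard_for(key, len(groups))]
    for replica in group.replicas:
        replica.data[key] = value


def get(groups: list, key: str) -> str:
    """Read from the owning group, failing over between its replicas."""
    return groups[shard_for(key, len(groups))].read(key)
```

In this sketch, a read still succeeds after one replica of the owning group is marked unhealthy, because the client moves on to the next replica. A real Raft group would additionally elect a leader and commit each write only once a majority of replicas has acknowledged it.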

As with any cloud provider, dealing with failures in an infrastructure comprising millions of components is a standard mode of operation; there are always a significant number of failures at any given time. As such, Hive was designed to treat failures as the typical case without impacting availability or performance.

Hive is not meant to become a general-purpose database like Redis or PostgreSQL. It is a highly optimized key-value store that works well for a few specific operations. That said, it remains a key-value store that can be used as-is for various other applications, although those applications will not benefit from the same performance optimizations.
