DBaaS Behind the Scenes

Scaleway Database is now generally available! The database team is very proud to present its managed database service. In this article, we will go behind the scenes to discover how our database service works.

Database Lifecycle

What happens when you click on the “Create Instance” button?
Well, so many things! In this section, we will learn about the complete lifecycle of a Scaleway database instance.

Instance Creation

We use many sub-resources to create your database instance. All those sub-resources are completely abstracted for you. They won’t show up in your console, you'll only see your database instance.

All databases use instances that are created from a custom instance base image. This image is built internally and let our team deploy any version of PostgreSQL we support. We call them golden images because they are the sole reference for our product. The database engine runs in a Debian environment using the official PostgreSQL distribution along with many other internal utilities.
We also use a load balancer in front of the database instance. This load balancer will always redirect your call on the active node and switch over automatically on the hot standby node in case of failure. It ensures that you will always be able to reach your database instance with the same IP address.
On the instance side, we provision an instance with our golden image.
For PostgreSQL with the high-availability option enabled, we use an anti-affinity mechanism to make sure that the instances are running on different hypervisors. In the case of a single hypervisor crash, your database instance is still up and running.
Storage-wise, we use the local volume of the instance to have high-performance disk access for your database instance.

Instance Configuration

Now that our instance is provisioned, we need to configure it for you.
That encompasses user creation with passwords and special options you might set up in the Advanced Settings tab in the console.
To limit access to your database instance, you can restrict access by using the Allowed IPs feature. When enabled, a whitelist IP restriction is configured on your load balancer.

Readiness

When the instance is instantiated and configured, it is ready to serve SQL requests! You are billed only when the database is ready to use.

Observability

Now that your database instance is running, it is our mission to ensure it keeps working correctly. Monitoring and logging help us have a view over the instance on which your database is running.

Monitoring

Your database instances are continuously monitored to ensure they are working correctly. Our monitoring stack is based on Prometheus which is going to probe metrics exporters (including node exporter) that lives on your instance. This setup allows us to have metrics about the instance on which your database is currently running.
For instance, a system is in charge of inspecting that your database instance got enough free space. If it is not the case, it will switch the default transaction mode to read-only, to make you notice the issue while keeping the service up, in a degraded mode. This operation is done to give you enough time to clean your logical databases or upgrade your database instance to a different plan.

Logging

While running, PostgreSQL generates logs. Those logs are here to help understand what is currently happening inside your database engine.
You can fetch those logs from the console.
Select the time interval you wish to extract logs from and they will be made ready for you. When your database instance is highly available, we fetch the logs from all active nodes. When you ask for them on the console you will get a log file per node.

Backup and Restore

Backup and restore is an essential part of building a durable system.
When you create a database instance, by default, your instance backup is automatically performed using a managed backup schedule. You can also perform on-demand managed backup and restore using the console.

Backup

When you enable automated backup for your database instance, we schedule a backup process that will be triggered at a fixed frequency.
A backup task will create a dump of your database as a compressed dump file.
This file is then stored internally in another region using the Scaleway Elements Object Storage.
We use a different geographical region to ensure data sustainability in the event of a disaster in one of our availability zone.
This task can also be triggered manually when you see fit. You can trigger a manual backup from the tab Backup of your administration interface.

Restore

A restore operation will fetch a backup from the object storage and restore this dump as your database.
By default, the destination database is the origin database of the backup.
As destination backup, you can specify a different database name to restore your backup in a different database than its origin.

A backup restoration is not limited to one instance, you can perform a restore to across your database instances. However, it is possible to specify a destination database name where to restore the backup.

High-Availability

Scaleway Database supports high availability setup. This allows your database to still be available if a node crashes. We highly recommend activating backups and high availability for production uses.

PostgreSQL Native Replication

PostgreSQL supports natively many types of replication.
We use a Write-Ahead Log Shipping strategy for replication.
It means that when a write query is sent to the active database, the transaction is acknowledged only when active and standby instances acknowledged the write entry is committed on both sides.
By doing this, we guarantee that in case of the loss of one of the instance node, the system will continue working and stay consistent.
PostgreSQL supports natively replication methods.
However automatic failover is not supported by default.
We need to have a mechanism to redirect calls on the right node. That’s why we need another system to arbitrate where to redirect queries based on the node status (active/standby).

Failover in Case of High-Availability

When you request a high available option on your database instance, we enable a fail-over system which is composed of several elements:

  • A distributed consensus store (DCS) The DCS ensures that only one database will have the principal lock at a time.
    It also makes sure all logical databases got the same consistent data.
    This is where the lock will live.
    A lock is a concurrency concept. It ensures that only one component can be active at a time.
    This consensus system also checks the database engine status before adding a node in the pool. With this system, we can make sure your database instance stays consistent.
  • A configuration daemon. This daemon is running on each database instance. It is in charge of generating the configuration of the engine and decide whether or not it is an active or a standby node based on a consensus mechanism. This daemon will do the heavy lifting of managing the fail-over mechanism inside your database cluster.
    When the system is in a fully operational state, the active database instance will periodically perform a write operation in the DCS. This periodical write operation is named heartbeat.
    It keeps the active database as the owner of the active lock.
    The standby also reads this DCS and sees that an active node exists and therefore stays in standby.If for a certain amount of time, the active database fails to perform a write operation in the DCS, the active database is considered faulty. When a backend is detected as faulty, connections to it are cut off at the Load-Balancer level. The DCS also releases the lock if the active database is not manifesting itself. When this happens, the standby tries to be promoted as active by writing in the DCS. By doing this, the standby node informs the DCS it is ready to become active.
    Once a consensus is established with the DCS, the standby node is promoted as principal and the database instance can keep working.
    In case an instance fails, the internal monitoring system of the database team will be triggered and will generate a new standby instance node to replace the faulty one.

Conclusion

Scaleway Elements provides an extensive catalog of managed PostgreSQL Database services. Benefit from a PostgreSQL managed Database starting at €8/month (or €0.016/hour) and deployed in less than 5 minutes. You can choose the database that matches your project amongst our new offers, with up to 256GB RAM and 585GB SSD NVMe of storage to boost your performance.

Discover how to start your first Database thanks to our Scaleway Elements documentation and FAQ. Or learn how to migrate your PostgreSQL instance to a PostgreSQL Scaleway Elements Database instance via our API documentation.

Recommended articles

Understanding Kubernetes Autoscaling

Kubernetes provides a series of features to ensure your clusters have the right size to handle any load. Let's look into the different auto-scaling tools and learn the difference between them.

KubernetesScaling