You can find the link to the latest version of Duplicity on their website. You may replace the link used with wget
(and therefore tar
and cd
) if a newer version is available.
Backing up your dedicated server on Scaleway Object Storage with Duplicity
- duplicity
- backup
- gpg
- s3
In this article, we will learn how to back up data from a Scaleway Instance, using Scaleway’s Object Storage service and Duplicity. Duplicity is a free and open-source tool that backs up folders to a remote server, in this case, an Object Storage bucket. Using libsync and GPG, Duplicity can make space-efficient encrypted backups.
Our objectives are to:
- Make production-ready automatic backups of our data.
- Save cost using bandwidth and space-efficient incremental backups.
- Preserve a robust layer of security and privacy with asymmetrical encryption.
To achieve this, we will:
- Create a bucket in Scaleway’s Object Storage
- Install Duplicity
- Generate a GPG key
- Combine Duplicity and Scaleway Object Storage
Although this tutorial focuses on backing up an Instance, you can also back up a dedicated server with Duplicity. When you are using a Scaleway Dedibox or a Scaleway Elastic Metal server, the data transfer is free of charge within the same region.
Before you start
To complete the actions presented below, you must have:
- A Scaleway account logged into the console
- Owner status or IAM permissions allowing you to perform actions in the intended Organization
- A valid API key
- An Instance running Ubuntu or Debian
Creating an Object Storage bucket
Follow the instructions to create an Object Storage bucket. This bucket will be used to back up your Instance’s data.
Installing software requirements
In this step, we install the various software needed. As well as installing Duplicity itself, Duplicity needs us to generate a GPG key so that it can encrypt the data. For GPG key generation, we need to create some entropy, so we also install Haveged to generate a small amount of entropy.
Run the following command to update the APT package manager, upgrade the software already installed on the server and download and install Duplicity and the other required software:
apt update && apt upgradeapt install -y python3-boto python3-pip haveged gettext librsync-devwget https://gitlab.com/duplicity/duplicity/-/archive/rel.2.1.2/duplicity-rel.2.1.2.tar.gztar xaf duplicity-rel.2.1.2.tar.gzcd duplicity-rel.2.1.2/pip3 install -r requirements.txtpython3 setup.py install
Creating a GPG key
-
Generate the GPG key with the following command.
gpg --full-generate-keyEnter and remember a passphrase. You are asked to define the characteristics of your keys. We go with the default settings:
- What kind of key you want: (1) RSA and RSA (default)
- What key size you want: (3072)
- How long the key should be valid: 0 = key does not expire
GPG then asks for a name for your key, an address, and a description.
gpg --full-generate-keygpg (GnuPG) 2.1.18; Copyright (C) 2017 Free Software Foundation, Inc.This is free software: you are free to change and redistribute it.There is NO WARRANTY, to the extent permitted by law.Please select what kind of key you want:(1) RSA and RSA (default)(2) DSA and Elgamal(3) DSA (sign only)(4) RSA (sign only)Your selection? 1RSA keys may be between 1024 and 4096 bits long.What keysize do you want? (3072) 3072Requested keysize is 3072 bitsPlease specify how long the key should be valid.0 = key does not expire<n> = key expires in n days<n>w = key expires in n weeks<n>m = key expires in n months<n>y = key expires in n yearsKey is valid for? (0) 0Key does not expire at allIs this correct? (y/N) yGnuPG needs to construct a user ID to identify your key.Real name: backupsEmail address: me@scaleway.comComment: Scaleway Object Storage backupsYou selected this USER-ID:backups (Scaleway Object Storage backups) <me@scaleway.com>Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? OWe need to generate a lot of random bytes. It is a good idea to performsome other action (type on the keyboard, move the mouse, utilize thedisks) during the prime generation; this gives the random numbergenerator a better chance to gain enough entropy.gpg: key XXXXXXXXXXXXXXXX marked as ultimately trustedpublic and secret key created and signed.pub rsa3072 2020-03-26 [SC]XXXXXXXXXXXXX-FINGERPRINT-XXXXXXXXXXXXXXuid backups (Scaleway Object Storage backups) <me@scaleway.com>sub rsa3072 2020-03-26 [E] -
Copy the GPG key fingerprint (its location in the output is shown above), which could be an 8, 16 or 40 char long hash. You can also find the fingerprint of your key with the following command:
gpg --list-keysgpg: checking the trustdbgpg: marginals needed: 3 completes needed: 1 trust model: pgpgpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u/home/me/.gnupg/pubring.kbx------------------------------pub rsa3072 2020-03-26 [SC]XXXXXXXXXXXXX-FINGERPRINT-XXXXXXXXXXXXXXuid [ultimate] backups (Scaleway Object Storage backups) <me@scaleway.com>sub rsa3072 2020-03-26 [E]
If you lose access to your current server, having the GPG private and public keys stored somewhere else will come in handy. Export the GPG keys with:
gpg --armor --export backupsgpg --armor --export-secret-key backups
Combining Duplicity and Scaleway
As everything is installed and ready, we will now configure and script our interactions between our server and the cloud.
Editing the configuration
- Create our initial, empty script files and log files and make them executable.
touch scw-backups.sh scw-restore.sh .scw-configrcchmod 700 scw-backups.sh scw-restore.shchmod 600 .scw-configrcmkdir -p /var/log/duplicitytouch /var/log/duplicity/logfile{.log,-recent.log}
- Add the following lines to
.scw-configrc
. Make sure you replace the necessary values with the details of your Scaleway API key, Object Storage bucket, and GPG key. You also need to enter a path to the desired backup folder:# Scaleway credentials keysexport AWS_ACCESS_KEY_ID="<SCALEWAY ACCESS KEY>"export AWS_SECRET_ACCESS_KEY="<SCALEWAY SECRET ACCESS KEY>"export SCW_REGION="<REGION OF YOUR BUCKET>"export SCW_ENDPOINT_URL="https://s3.${SCW_REGION}.scw.cloud"# set SCW_BUCKET as follows for duplicity < 0.8.23# for higher versions, see below# export SCW_BUCKET="s3://s3.${SCW_REGION}.scw.cloud/<NAME OF YOUR BUCKET>"# set the next two variables for duplicity >= 0.8.23# it uses the boto3 library, which uses a different naming scheme for bucket namesexport SCW_BUCKET="s3://<NAME OF YOUR BUCKET>"# GPG Key informationexport PASSPHRASE="<YOUR GPG KEY PASSPHRASE>"export GPG_FINGERPRINT="<YOUR GPG KEY FINGERPRINT>"# Folder to backupexport SOURCE="<PATH TO FOLDER TO BACKUP>"# Will keep backup up to 1 monthexport KEEP_BACKUP_TIME="1M"# Will make a full backup every 10 daysexport FULL_BACKUP_TIME="10D"# Log filesexport LOGFILE_RECENT="/var/log/duplicity/logfile-recent.log"export LOGFILE="/var/log/duplicity/logfile.log"log () {date=`date +%Y-%m-%d`hour=`date +%H:%M:%S`echo "$date $hour $*" >> ${LOGFILE_RECENT}}export -f log
The backup policy described here makes a full backup every 10 days and removes all backups older than one month.
Backup script
Using the configuration and Duplicity, we automatize the backup process with the scw-backups.sh
script.
-
Copy the following script to
scw-backups.sh
:#!/bin/bashsource <FULL PATH TO>/.scw-configrccurrently_backuping=$(ps -ef | grep duplicity | grep python | wc -l)if [ $currently_backuping -eq 0 ]; then# Clear the recent log filecat /dev/null > ${LOGFILE_RECENT}log ">>> removing old backups"duplicity remove-older-than \--s3-endpoint-url ${SCW_ENDPOINT_URL} \--s3-region-name ${SCW_REGION} \${KEEP_BACKUP_TIME} ${SCW_BUCKET} >> ${LOGFILE_RECENT} 2>&1# duplicity >= 0.8.23# determine S3_ENDPOINT_URL for scalewayS3_ENDPOINT_URL="https://s3.${S3_REGION_NAME}.scw.cloud"log ">>> creating and uploading backup to Scaleway Glacier"duplicity \incr --full-if-older-than ${FULL_BACKUP_TIME} \--asynchronous-upload \--s3-use-glacier \--s3-endpoint-url ${SCW_ENDPOINT_URL} \--s3-region-name ${SCW_REGION} \--encrypt-key=${GPG_FINGERPRINT} \--sign-key=${GPG_FINGERPRINT} \${SOURCE} ${SCW_BUCKET} >> ${LOGFILE_RECENT} 2>&1cat ${LOGFILE_RECENT} >> ${LOGFILE}fi -
Run the script
./scw-backups.sh
to make sure the configuration is correctly set. Check the Object Storage bucket on the Scaleway console to see the backup files, or examine the logs with the following command:cat /var/log/duplicity/logfile-recent.log
Script the recovery of data
Duplicity also allows you to recover a backup. We will create a script to make the process easier.
- Add the following script to
scw-restore.sh
:#!/bin/bashsource <FULL PATH TO>/.scw-configrcif [ $# -lt 2 ]; thenecho -e "Usage $0 <time or delta> [file to restore] <restore to>Exemple:\t$ $0 2018-7-21 recovery/ ## recovers * from closest backup to date\t$ $0 0D secret data/ ## recovers most recent file nammed 'secret'";exit; fiif [ $# -eq 2 ]; thenduplicity \--s3-endpoint-url ${SCW_ENDPOINT_URL} \--s3-region-name ${SCW_REGION} \--time $1 \${SCW_BUCKET} $2fiif [ $# -eq 3 ]; thenduplicity \--s3-endpoint-url ${SCW_ENDPOINT_URL} \--s3-region-name ${SCW_REGION} \--time $1 \--file-to-restore $2 \${SCW_BUCKET} $3fi - Recover the data you uploaded in the previous section with the following command:
./scw-restore.sh 0D /tmp/backup-recovery-test/
- Alternatively, recover one specific file with the following format from a backup 5 days ago with:
./scw-restore.sh 5D <file> /tmp/backup-recovery-test/
Automation of the backups
- Use the command
crontab -e
to edit your crontab file and add the line to create a script that will run twice a day at 1:00 and 13:00 (1 AM and 1 PM):crontab -e - Select your editor and write:
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/binSHELL=/bin/bash00 1,13 * * * <FULL PATH TO>/scw-backups.sh > /dev/null 2>&1
Conclusion
We created an automatic backup using Duplicity, hosting the encrypted data on Scaleway’s Object Storage. We are able to recover specific files or the entire backup itself at a given date.
To continue your implementation, you may want to consider the following:
- Duply, giving you a front-end view of the backups.
- Backing up more than one folder using
“SOURCE=/”
and the--include=
and--exclude=
options of duplicity. - Adapt the backup policy with more frequent backups, that are conserved.