Replace rdb_host
with your Database Instance’s IP address and rdb_port
with the port, which can be found under Endpoint on the Database information page. Replace <rdb_user>
with the name of the user you granted access to airquality
.
Visualizing time-Series data with TimescaleDB and Grafana
- time-Series
- TimescaleDB
- Grafana
- Database
TimescaleDB is a time series database built on PostgreSQL.
A time-series database typically stores “time-series data”, as the name indicates. It can be seen as a sequence of data points, that measure the same thing over time and store the measurement results in time order.
Each of these measurements is paired with a timestamp, defined by a name and a set of labeled dimensions (or “tags”).
Time-series data is everywhere around us and an important part of our everyday life. Whether it is a factory that measures the production output of a specific machine, a farmer observing the humidity of the soil, or a city measuring the regularity between trains, all of this is time-series data.
In this tutorial, we use a Database for PostgreSQL with the TimescaleDB extension and a Dataset of the air quality at the Franklin D. Roosevelt Metro station in Paris, provided by RATP, the Parisian transport authority. This kind of data is a perfect example of a time-series database, as it monitors certain values over a period of time. Other usages could include data from diverse backgrounds, such as weather data, economic information, or productivity information of an assembly line.
Before you start
To complete the actions presented below, you must have:
- A Scaleway account logged into the console
- Owner status or IAM permissions allowing you to perform actions in the intended Organization
- An Instance running on Ubuntu Bionic Beaver (18.04) or later to host the Grafana interface
- A Database for PostgreSQL
sudo
privileges or access to the root user
Creating a new database
- Connect to the Scaleway console and click the Managed Databases section.
- Click on your Database Instance.
- Click the Managed Databases tab, then on + Create Database.
- Enter the name of the new database. In this tutorial, we use
airquality
. - Click the Users tab, followed by «See more Icon» and Update Permissions.
- Grant your user permissions to the newly created
airquality
database.
Enabling the TimescaleDB extension
-
Connect to your Database using the
psql
client:psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user>Enter the password for your database user when prompted.
Tip -
Enable the
timescaledb
extension for your database.CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;The following output displays in your terminal:
WARNING:WELCOME TO_____ _ _ ____________|_ _(_) | | | _ \ ___ \| | _ _ __ ___ ___ ___ ___ __ _| | ___| | | | |_/ /| | | | _ ` _ \ / _ \/ __|/ __/ _` | |/ _ \ | | | ___ \| | | | | | | | | __/\__ \ (_| (_| | | __/ |/ /| |_/ /|_| |_|_| |_| |_|\___||___/\___\__,_|_|\___|___/ \____/Running version 1.5.1For more information on TimescaleDB, please visit the following links:1. Getting started: https://docs.timescale.com/getting-started2. API reference documentation: https://docs.timescale.com/api/latest3. How TimescaleDB is designed: https://docs.timescale.com/introduction/architectureNote: TimescaleDB collects anonymous reports to better understand and assist our users.For more information and how to disable, please see our docs https://docs.timescaledb.com/using-timescaledb/telemetry.CREATE EXTENSION
Prepare the table for TimescaleDB
-
Create a table called
airquality
, which will be used to store the information.CREATE TABLE airquality(DATETIME TIMESTAMP WITH TIME ZONE NOT NULL,NO INTEGER,NO2 INTEGER,PM10 INTEGER,CO2 INTEGER,TEMP NUMERIC(4,1),HUMI NUMERIC(4,1));The command above creates a new table called
airquality
with seven columns:DATETIME
contains a timestamp which includes the time zone.NO
contains the average NO concentration in µg/m3.NO2
contains the average NO2 concentration in µg/m3.PM10
contains the average PM10 particles concentration in in µg/m3CO2
contains the average CO2 concentration in ppm.TEMP
contains the average ambient temperature in °C.HUMI
contains the relative humidity in %.
-
Convert the table you created into a TimescaleDB hypertable.
SELECT create_hypertable('airquality', 'datetime');The following output displays in the terminal window:
airquality=> SELECT create_hypertable('airquality', 'datetime');create_hypertable-------------------------(1,public,airquality,t)(1 row)
Adding and removing data
-
Add a sample row to the database. The
RETURNING
statement visualizes the data added to the table:INSERT INTO airquality(DATETIME,NO,NO2,PM10,CO2,TEMP,HUMI) VALUES (NOW(),4,38,14,398,27.0,47.4) RETURNING *;An output like the following displays:
datetime | no | no2 | pm10 | co2 | temp | humi-------------------------------+----+-----+------+-----+------+------2019-12-10 15:31:28.116471+00 | 4 | 38 | 14 | 398 | 27.0 | 47.4(1 row)INSERT 0 1 -
Remove the data from the table. The following command deletes all data where the value for
temp
is larger than26
:DELETE FROM airquality WHERE temp > 26;DELETE 1Run a garbage collection using the
VACCUM
command to clean up the table and to free unused space:VACUUM airquality;Manually adding data works for a small-scale database, but when it comes to dealing with timescale data you have large sets of data.
Importing the dataset
In this tutorial, we use an air quality dataset from a Parisian metro station. The data is provided by Paris’ transport authority RATP.
- Download the sample data in CSV-format on your local computer.
- Convert the downloaded CSV file into an SQL database file:
cat qualite-de-lair-mesuree-dans-la-station-franklin-d-roosevelt.csv | awk -F';' '{ printf "INSERT INTO airquality(DATETIME,NO,NO2,PM10,CO2,TEMP,HUMI) VALUES (\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27);",$1,$2,$3,$4,$5,$6,$7;print ""}' > airquality.sql
- Import the file to your database:
psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user> < airquality.sql
Querying Data
Now that you have a large dataset available, you can query the data using the psql
client.
-
Connect to your database.
psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user> -
To receive a list of the 10 most recent entries in the database run the following command.
SELECT * FROM airquality ORDER BY datetime DESC LIMIT 10;A list like the following displays:
datetime | no | no2 | pm10 | co2 | temp | humi------------------------+----+-----+------+-----+------+------2019-12-02 00:00:00+00 | 12 | 42 | 25 | | 14.8 | 36.32019-12-01 23:00:00+00 | 11 | 40 | 23 | | 15.0 | 36.62019-12-01 22:00:00+00 | 14 | 42 | 23 | | 15.0 | 38.12019-12-01 21:00:00+00 | 29 | 48 | 26 | | 14.4 | 41.72019-12-01 20:00:00+00 | 21 | 43 | 30 | | 13.9 | 42.32019-12-01 19:00:00+00 | 18 | 40 | 31 | | 14.1 | 42.32019-12-01 18:00:00+00 | 17 | 40 | 29 | | 14.2 | 41.82019-12-01 17:00:00+00 | 16 | 39 | 27 | | 14.6 | 40.92019-12-01 16:00:00+00 | 15 | 40 | 26 | | 15.0 | 41.42019-12-01 15:00:00+00 | 13 | 40 | 20 | | 14.8 | 41.8(10 rows) -
Run the following command to retrieve the average temperature on all data:
SELECT percentile_cont(0.5)WITHIN GROUP (ORDER BY temp)FROM airquality;An output like the following displays:
percentile_cont-----------------18.7(1 row)
For more information on how to read data from your database, refer to TimescaleDB’s read data documentation page.
Visualizing data with Grafana
You can visualize the time-series data by creating a dashboard using Grafana, an open-source analytics & monitoring solution.
-
Create a new Instance running on Ubuntu Bionic (18.04) from your Scaleway console
-
Log into the Instance using SSH.
-
Add the Grafana repository to the
apt
package cache.add-apt-repository "deb https://packages.grafana.com/oss/deb stable main" -
Install the GPG key of Grafana to validate the packages.
wget -q -O - https://packages.grafana.com/gpg.key | apt-key add - -
Update the
apt
package cache.apt update -
Install Grafana using
apt
.apt install grafana -
Reload the
systemd
daemon, enable the Grafana service to start the application automatically during system boot, and start the application manually.systemctl daemon-reload && systemctl enable grafana-server.service && systemctl start grafana-server.service -
Open a web browser and point it to
http://<instance_ip>:3000/
.NoteReplace
<instance_ip>
with the IP address of your Instance. -
The Grafana login screen displays. Log yourself into Grafana using the username and password
admin
:Upon the first connection, you will be asked to set a new password.
-
Enter it twice and validate the form. The default dashboard displays.
-
Click Add Datasource to configure a new datasource.
-
Enter the credentials of your Database Instance:
- Host: Enter the IP address and the port of your Database Instance separated by a double point.
- Managed Databases: The name of the database (for example,
airquality
) - User: The database user
- Password: The database password
- SSL Mode: The desired SSL validation mode. Set the value to
required
. - Version: This value must correspond to the PostgreSQL version of your Database Instance
- TimescaleDB: Enable this value
-
Click Save & Test to test the database connection and save the values.
-
Click the Grafana logo to return to the base dashboard.
-
Create a JSON configuration to query the database in Grafana.
-
Click + Create > Import and click Import .json File to upload the previously downloaded configuration file to Grafana.
Grafana starts to generate graphs from the time-series data stored in TimescaleDB. You can now browse the data in a visual interface and edit the queries of the database according to your needs:
Conclusion
You now have configured a Timescale database, imported time-series data, and learned how to query, modify, and visualize the data using Grafana. For more information about TimescaleDB, refer to the official Documentation.