ITSquad/Mastodon/Maintenance

This page aims to describe procedures to maintain the Pirate Party's Mastodon instance.

Storage Box

Due to the size of the media, we decided to store them on a storage box provided by Hetzner. This can also be used for storing our backups.

This storage box is mounted with sshfs.
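If sshfs is not installed yet, we can install it from the distribution's packages (a sketch, assuming a Debian-based host):

apt install sshfs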

First, we need to create an SSH key pair for the root user and install it on the storage box. In the ITSquad, we decided to create one sub-account on the storage box per server, so we can safely overwrite the storage box's authorized_keys file:

ssh-keygen -t ed25519 -f /root/.ssh/storage-box
scp -P 23 /root/.ssh/storage-box.pub <user>@<storage box server>:/home/.ssh/authorized_keys
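We can then check that key-based login works (an optional sanity check; Hetzner storage boxes accept SFTP on port 23):

sftp -P 23 -i /root/.ssh/storage-box <user>@<storage box server>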

Then, we create the directory that will be used as the storage box's mountpoint:

mkdir /mnt/mastodon

We add the following line to /etc/fstab to mount the storage box volume:

# <file system>                     <mount point>  <type>      <options>  <dump>  <pass>
<storage box>:/home/media  /mnt/mastodon  fuse.sshfs delay_connect,_netdev,user,idmap=user,transform_symlinks,allow_other,default_permissions,reconnect,uid=mastodon,gid=mastodon 0 0

We give ownership of this volume to the mastodon user in order to avoid HTTP error 500 on file uploads.[1]

We configure the SSH client in /root/.ssh/config to define which port, user and private key to use when connecting to the storage box:

Host <storage box>
  User <user>
  Port 23
  IdentityFile /root/.ssh/storage-box
  PreferredAuthentications publickey,password

We mount the storage box volume:

mount /mnt/mastodon
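We can verify that the volume is mounted as expected:

mount | grep /mnt/mastodon # the mountpoint should be listed as fuse.sshfs
df -h /mnt/mastodon        # and report the storage box's capacity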

Finally, we create a symbolic link in the mastodon directory:

ln -s /mnt/mastodon/media /path/to/mastodon/public/system

And we restart the Mastodon instance \o/

Note: We first tried to use autofs to mount the storage box, but this resulted in socket errors because the docker container was not notified that the volume had been disconnected and remounted.

How to upgrade?

A script called upgrade.sh, located in /home/mastodon, reproduces the steps below. Note: this script stops at step 12 and does not go further.

Before running this script or the following commands, please check the instructions for the current release. Sometimes, additional actions are needed (other than migrating the database and compiling the assets).

  1. git fetch -t # update the git branch, including new tags
  2. git stash # prevent changes made to the files to be overwritten (mainly, the docker-compose.yml file)
  3. git pull
  4. git stash pop # restore your changes
  5. docker-compose build # build the image
  6. docker-compose exec -u postgres db pgbackrest --retention-diff 3 --stanza main --type incr backup # backup the database (see #WAL_archiving_with_pgBackRest)
  7. docker-compose exec -u postgres db pgbackrest --stanza main info # check that the backup worked by listing the backups present in the repository
  8. docker-compose run --rm -e SKIP_POST_DEPLOYMENT_MIGRATIONS=true web rails db:migrate # upgrade the database (pre-install)
  9. docker-compose down # shut down the containers.
  10. docker-compose run --rm web bin/tootctl cache clear # clear the cache
  11. docker-compose run --rm web rails db:migrate # upgrade the database
  12. docker-compose up -d # start the mastodon instance (create new volumes)
  13. docker-compose logs -ft web # (optional) if you want to monitor the progress; press Ctrl+C once it's done
  14. docker system prune -a # remove all unused containers, networks and old images

What to do when something went wrong?

Don't panic.

If you made a dump of the database before upgrading, you can restore the database as follows:

  1. docker-compose stop
  2. docker-compose start db
  3. docker-compose exec db dropdb postgres -U postgres # remove the db !!!!!!!
  4. docker-compose exec db createdb postgres -U postgres # create a fresh and new db
  5. cat ../your_dump_file.sql | docker exec -i mastodon_db_1 pg_restore -U postgres -d postgres # restore the database, with "your_dump_file" being a database backup (pg_restore is needed because the dump was created with -Fc)
  6. docker-compose down
  7. docker-compose up -d

If you are using pgBackRest, you can follow these steps: https://pgbackrest.org/user-guide.html#quickstart/perform-restore

As the tool is inside a docker container, you can enter the container like this:

docker-compose run --rm -u postgres db bash

Then, you can execute the commands from there.
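For reference, a minimal restore sketch inside that container, following the pgBackRest quick start (stanza name and data directory as configured in #WAL_archiving_with_pgBackRest; the regular db container must be stopped first):

find /var/lib/postgresql/data -mindepth 1 -delete # the data directory must be empty before restoring
pgbackrest --stanza=main restore # restore the latest backup

Then exit the container and start everything again with docker-compose up -d.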

What to backup?

Backups of the database are not enough. We also need to back up media, user feeds, etc.

According to the official Mastodon documentation,[2] we need to take special care of the following files and directories:

  • The public/system directory, which contains user-uploaded images and videos
  • The .env.production and docker-compose.yml files, which contain server config and secrets
  • The Postgres database, using pg_dump (see below)
  • The /etc directory, which contains the system's configuration files

Moreover, the backups must be encrypted on the storage server.

How to backup?

Currently: Every night, an encrypted backup of the database is made on the server and stored on at least two other locations. The database backups are incremental every day, differential every Sunday, and full once a month. Backups of the configuration files are also created every night; they are incremental, kept for 6 months, and stored on at least two different locations. Finally, backups and media are stored on a storage box provided by Hetzner. A snapshot of this storage box is scheduled every night.

Database

For an instant backup, we can just dump the database as follows:

docker-compose exec -u postgres db pg_dump -Fc postgres > /home/mastodon/backup/db/dump_$(date +%d-%m-%Y"_"%H_%M_%S).sql

In short, the option -Fc selects the custom format, which is compressed by default. Postgres ensures data consistency during the backup[3], which means that we don't have to stop Mastodon while dumping the database ;)
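Since -Fc produces a custom-format archive, the dump can be inspected with pg_restore before relying on it (the file name is a placeholder):

pg_restore -l /home/mastodon/backup/db/your_dump_file.sql # list the dump's table of contents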

WAL archiving with pgBackRest

However, over a long time period, we usually want incremental backups to avoid duplicating the whole database each day. Fortunately, Postgres provides all the tools we need to set up incremental backups.[4]

First, we have to enable WAL (Write Ahead Log) archiving.[5] We just modify the Postgres config, which is located in /path/to/mastodon/postgres/postgresql.conf:

# The WAL level must be replica or higher.
wal_level = replica

# This is a soft upper limit on the total size of WAL files.
max_wal_size = 1GB

# The archive_mode must be set to on for archiving to happen.
archive_mode = on

# This is the command to invoke for each WAL file to be archived.
archive_command = '/usr/local/bin/pgbackrest --log-level-console=info --stanza=main archive-push %p'

# Maximum number of WAL sender processes.
max_wal_senders = 3

This configuration tells PostgreSQL to archive transactions by executing the command:

/usr/local/bin/pgbackrest --log-level-console=info --stanza=main archive-push %p

Note that this command must be executable by the postgres user. If the command fails, PostgreSQL will retry it again and again. Please check the Postgres logs to be sure that everything is fine.

Then, we must install pgBackRest[6], the tool that will back up the WAL files. Although the pgBackRest user guide is dense, it is quite straightforward, so we recommend reading it.

Since we are using docker, we had to create a custom docker image that includes pgBackRest. We created the following Dockerfile in /home/mastodon/pgbackrest/Dockerfile:

FROM postgres:12-alpine

ARG PGBACKREST_VERSION="2.31"
ARG PGBACKREST_SHA256SUM="7157ec4ad2428379243c30acf2b15c2e9339beeec14697714f8eac2ce4c19896"
RUN apk add --virtual build-dependencies \
      build-base gcc make wget postgresql-dev openssl-dev libxml2-dev pkgconfig lz4-dev bzip2-dev \
    && apk add lz4-libs coreutils libbz2 \
    && pgbackrest_archive="pgbackrest-${PGBACKREST_VERSION}.tar.gz" \
    && pgbackrest_dir="/opt/pgbackrest-release-${PGBACKREST_VERSION}" \
    && wget -O $pgbackrest_archive -q https://github.com/pgbackrest/pgbackrest/archive/release/${PGBACKREST_VERSION}.tar.gz \
    && echo "${PGBACKREST_SHA256SUM}  $pgbackrest_archive" | sha256sum -c - \
    && tar zxf $pgbackrest_archive -C /opt \
    && cd $pgbackrest_dir/src \
    && ./configure \
    && make \
    && make install \
    && ln -s /usr/local/bin/pgbackrest /usr/bin/pgbackrest \
    && apk del build-dependencies \
    && cd / \
    && rm -rf $pgbackrest_archive $pgbackrest_dir

COPY --chown=postgres:postgres pgbackrest.sh /opt/
RUN chmod 744 /opt/pgbackrest.sh

COPY --chown=postgres:postgres pgbackrest.conf /etc/
RUN chmod 640 /etc/pgbackrest.conf

CMD ["postgres", \ 
      "-c", "archive_command=/usr/local/bin/pgbackrest --log-level-console=info --stanza=main archive-push %p", \
      "-c", "archive_mode=on", \
      "-c", "wal_level=replica", \
      "-c", "max_wal_senders=3"]

Because pgbackrest wasn't packaged for Alpine images, we had to build it from source.

We also create the pgBackRest config file in /home/mastodon/pgbackrest/pgbackrest.conf:

[global]
log-level-file=off
log-level-console=info

repo1-path=/mnt/pgbackrest
repo1-retention-full=2
repo1-retention-diff=2
repo1-cipher-pass=<pgbackrest password>
repo1-cipher-type=aes-256-cbc

[main]
pg1-path=/var/lib/postgresql/data

Don't forget to generate your pgbackrest password.
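For example, a suitable random passphrase can be generated with OpenSSL, as suggested in the pgBackRest user guide:

openssl rand -base64 48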

Finally, change the /path/to/mastodon/docker-compose.yml file to build and use this custom postgres docker image:

db:
  ...
  build: /home/mastodon/pgbackrest
  image: pgbackrest:12
  ...
  volumes:
    ...
    - /mnt/pgbackrest:/mnt/pgbackrest
  ...

Here, we can see that we are mounting a volume that will contain the backups. We use the same method as for storing the media (see #Storage_Box), and we gave ownership to the postgres user, i.e. uid 70, since this user only exists inside the postgres docker container.
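A sketch of the corresponding fstab entry (the remote path /home/pgbackrest is an assumption; note the numeric uid/gid, since the postgres user only exists inside the container):

<storage box>:/home/pgbackrest /mnt/pgbackrest fuse.sshfs delay_connect,_netdev,user,idmap=user,transform_symlinks,allow_other,default_permissions,reconnect,uid=70,gid=70 0 0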

Note: We upgraded the PostgreSQL database from 9.6 to 12, so adapt this configuration according to your needs.

Then, build the docker image and recreate the database container:

docker-compose build db
docker-compose down
docker-compose up -d
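Before the first backup, the pgBackRest stanza must be created and the configuration checked (a step from the pgBackRest user guide, using the service and stanza names configured above):

docker-compose exec -u postgres db pgbackrest --stanza=main stanza-create
docker-compose exec -u postgres db pgbackrest --stanza=main check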

Media

It can be interesting to execute the following command before making a backup of the media:

docker-compose run --rm web bin/tootctl media remove --days=14

This removes the local cache of media older than NUM_DAYS (7 by default, but here we set it to 14 days). Note that this command is executed daily on our instance.
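For instance, a nightly entry in root's crontab along these lines could run it (a sketch; the path and schedule are assumptions):

0 3 * * * cd /path/to/mastodon && docker-compose run --rm web bin/tootctl media remove --days=14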

Important note: Due to the size of the media, we decided to store them on a storage box (500 GB) provided by Hetzner. Every night, a snapshot of this storage box is scheduled, and we keep 4 snapshots.

Configuration files

We are backing up the config files with Borg[7] and borgmatic[8].

We created a storage box mountpoint at /mnt/backup (see #Storage_Box) and gave ownership to the root user and the backup-sync group.

First, we initialize the Borg repository:

borg init -e keyfile --umask 0027 /mnt/backup/borg

This will prompt for a passphrase. You'll have to back up the keyfile, as it is needed to decrypt the borg repository; it is stored on the server under /root/.config/borg/keys.
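The keyfile can be exported with borg itself so it can be copied somewhere safe (the destination path is an assumption):

borg key export /mnt/backup/borg /root/borg-key-backup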

The umask option gives read-write permission to the owner, read-only permission to the group, and no permissions to other users. See also the Wikipedia article about umask.

Then, we create the borgmatic config file at /etc/borgmatic/config.yaml:

location:
  exclude_patterns:
  - /path/to/elasticsearch
  - /path/to/redis
  repositories:
  - <storage box>:/home/backup/borg
  source_directories:
  - /path/to/mastodon
  - /etc
retention:
  keep_daily: 7
  keep_hourly: 24
  keep_monthly: 6
  keep_weekly: 4
storage:
  compression: zlib,7
  encryption_passphrase: <borg passphrase>
  umask: '0027'

We can now trigger a backup with this command:

borgmatic --create --prune

We can also check the consistency of the backup with:

borgmatic --check

Important note: These two commands are executed every night, and backups are stored on at least two locations.
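To restore files, we can list the archives in the repository and extract the one we need (a sketch using plain borg; the archive name is a placeholder):

borg list /mnt/backup/borg # show the available archives
borg extract /mnt/backup/borg::<archive name> # extract into the current directory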

References

  1. https://github.com/tootsuite/documentation/blob/master/Running-Mastodon/Docker-Guide.md#using-a-prebuilt-image
  2. The official Mastodon documentation
  3. https://www.postgresql.org/docs/current/app-pgdump.html
  4. https://www.opsdash.com/blog/postgresql-backup-restore.html
  5. https://www.opsdash.com/blog/postgresql-wal-archiving-backup.html
  6. https://pgbackrest.org/user-guide.html
  7. https://simonlefort.be/informatique:borg
  8. https://simonlefort.be/informatique:borgmatic