ITSquad/Mastodon/Maintenance

From Pirate Party Belgium
Latest revision as of 21:59, 10 July 2019

This page aims to describe procedures to maintain the Pirate Party's Mastodon instance.

How to upgrade?

A script called update.sh located in /home/mastodon will reproduce the steps below. Note: this script stops at step 11 and does not go further.

Before running this script or the following commands, please check the instructions for the release you are upgrading to. Sometimes, additional actions are needed (beyond migrating the database and compiling the assets).

  1. git fetch -t # update the git branch, including new tags
  2. git stash # prevent local changes (mainly to the docker-compose.yml file) from being overwritten
  3. git checkout vX.X.X # jump to the version we want to upgrade to
  4. git stash pop # restore your changes
  5. docker-compose pull # pull from Docker Hub: https://hub.docker.com/r/tootsuite/mastodon/
  6. docker-compose exec -T db pg_dump -U postgres -Fc postgres > ../dump_$(date +%d-%m-%Y"_"%H_%M_%S).sql # back up the database (-T disables the pseudo-TTY, which can corrupt the binary dump)
  7. tail ../your_dump_file.sql # check that the backup worked, with your_dump_file.sql being the dump file created in the previous step. The -Fc format is binary, so expect unreadable output; an empty file means the dump failed.
  8. docker-compose down # shut down the containers.
    # You may get "ERROR: An HTTP request took too long to complete" and other errors. Ignore them and wait until it is done.
  9. docker-compose run --rm web rails db:migrate # upgrade the database
  10. docker-compose run --rm web rails assets:precompile # compile the assets
  11. docker-compose up -d # start the Mastodon instance (creates new volumes)
  12. docker-compose logs -ft web # (optional) monitor the progress; press Ctrl+C once it is done
  13. docker system prune -a # remove stopped containers, unused networks, unused images and the build cache
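
The check in step 7 can also be done mechanically. The sketch below is an assumption on our side and is not part of update.sh: is_pg_custom_dump is a hypothetical helper that relies on the fact that dumps created with pg_dump -Fc start with the five bytes "PGDMP".

```shell
#!/bin/sh
# Hypothetical helper, not part of update.sh.
# Return 0 if the file looks like a pg_dump custom-format archive, 1 otherwise.
is_pg_custom_dump() {
    # A missing or empty file can never be a valid dump.
    [ -s "$1" ] || return 1
    # Read only the first five bytes and compare them with the magic string.
    [ "$(head -c 5 "$1")" = "PGDMP" ]
}
```

For example, `is_pg_custom_dump ../your_dump_file.sql || echo "the dump looks broken"` could be run before shutting down the containers in step 8. It only checks the file header, so it does not prove that the dump is complete.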

What to do when something went wrong?

Don't panic. You can restore the database backup as follows:

  1. docker-compose stop
  2. docker-compose start db
  3. docker-compose exec db dropdb postgres -U postgres # remove the db !!!!!!!
  4. docker-compose exec db createdb postgres -U postgres # create a fresh and new db
  5. docker exec -i mastodon_db_1 pg_restore -U postgres -d postgres < ../your_dump_file.sql # restore the database, with "your_dump_file.sql" being a database backup; dumps made with -Fc must be restored with pg_restore, not psql
  6. docker-compose down
  7. docker-compose up -d

You can also go to a previous version of Mastodon with:

  1. git checkout vX.X.X
  2. sed -r -i 's/(image:.+mastodon)(:.+)?$/\1:vX.X.X/g' docker-compose.yml # -i edits the file in place; don't forget to change this back to "latest" when things get fixed
  3. docker-compose pull
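
To see what the substitution in step 2 does before touching the real file, it can be tried on a stream first. This is a minimal sketch: pin_mastodon_image is a hypothetical helper name, and it uses sed -E (the portable spelling of -r) without in-place editing.

```shell
#!/bin/sh
# Hypothetical helper: read docker-compose.yml content on stdin and print it
# with the Mastodon image pinned to the tag given as $1.
pin_mastodon_image() {
    # \1 keeps "image: ...mastodon"; any existing ":tag" suffix is replaced.
    sed -E "s/(image:.+mastodon)(:.+)?\$/\1:$1/"
}
```

Running `pin_mastodon_image vX.X.X < docker-compose.yml` prints the modified file, so the result can be reviewed before it is applied. Lines for other images, such as the database, are left unchanged because they do not contain "mastodon".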

What to backup?

Backups of the database are not enough. We also need to back up media files, user feeds, etc.

According to the official Mastodon documentation,[1] we need to take special care of the following files and directories:

  • The public/system directory, which contains user-uploaded images and videos
  • The .env.production and docker-compose.yml files, which contain server config and secrets
  • The Postgres database, using pg_dump (see below)
  • The /etc directory, which contains the system's configuration files

Moreover, the backups must be encrypted on the storage server.

How to backup?

Currently: every night, a backup of the database is made on the server, but it is overwritten by the next nightly backup. This is clearly not optimal. We do not back up media and configuration files yet, and we still need to implement a more reliable solution. Below is the ideal situation we would like to reach.

Database

For an instant backup, we can just dump the database as follows:

docker-compose exec -T db pg_dump -U postgres -Fc postgres > /home/mastodon/backup/db/dump_$(date +%d-%m-%Y"_"%H_%M_%S).sql

In short, the option -Fc selects Postgres' custom archive format, which is compressed by default. Postgres ensures data consistency during the backup[2], which means that we don't have to stop Mastodon while dumping the database ;)
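
Because every dump gets a timestamped name, old dumps pile up unless something removes them. The following is only a sketch of one possible cleanup, not something currently deployed: prune_old_dumps is a hypothetical helper that keeps the most recent files by modification time, and it assumes the file names contain no newlines (true for the dump_*.sql pattern used here).

```shell
#!/bin/sh
# Hypothetical helper: delete all but the $2 most recent dump_*.sql files
# found in directory $1.
prune_old_dumps() {
    dir=$1
    keep=$2
    # ls -t sorts newest first; tail skips the first $keep entries,
    # so only the older files reach the loop and get removed.
    ls -1t "$dir"/dump_*.sql 2>/dev/null | tail -n +"$((keep + 1))" |
    while IFS= read -r old; do
        rm -f -- "$old"
    done
}
```

For example, `prune_old_dumps /home/mastodon/backup/db 7` would keep the seven newest dumps in the directory used above.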

WAL archiving

However, over a long time period, we usually want incremental backups to avoid duplicating the whole database each day. Fortunately, Postgres provides all the tools we need to set up incremental backups.[3]

First, we have to enable WAL (Write Ahead Log) archiving.[4] We just modify the Postgres config, which is located in /path/to/mastodon/postgres/postgresql.conf:

# The WAL level must be archive or higher.
wal_level = archive

# Ensure there is at least one WAL file for each "archive_timeout" duration.
archive_timeout = 1h

# This is a soft upper limit on the total size of WAL files.
max_wal_size = 1GB

# Keep around at least these many WAL files (aka segments).
wal_keep_segments = 10

# The archive_mode must be set to on for archiving to happen.
archive_mode = on

# This is the command to invoke for each WAL file to be archived.
archive_command = '/opt/wal-backup.sh wal %p'

This configuration tells Postgres to archive its transaction log (WAL) at least once an hour by executing the script /opt/wal-backup.sh. Note that this script must be executable by the postgres user. If the script fails, Postgres will keep retrying it. Please check the Postgres logs to be sure that everything is fine.
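
The retry behaviour follows from the archive_command contract: Postgres treats a segment as archived only when the command exits with status 0. To illustrate that contract without any external tool, here is a minimal sketch that archives to a local directory. It is not the script used on this page, and archive_wal_segment is a hypothetical name.

```shell
#!/bin/sh
# Illustration only: copy WAL segment $1 into archive directory $2.
# Exit 0 only when the segment is safely stored; any other status
# tells Postgres to keep the segment and retry later.
archive_wal_segment() {
    src=$1
    dest="$2/$(basename "$1")"
    # Never overwrite an already archived segment; report failure instead.
    [ ! -e "$dest" ] || return 1
    # Copy under a temporary name first so a partial file never looks complete.
    cp -- "$src" "$dest.tmp" && mv -- "$dest.tmp" "$dest"
}
```

Any real archive script, including the WAL-G one, should behave the same way with respect to its exit status.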

Then, we must install a tool that will back up the WAL files. We can either use WAL-E[5] or WAL-G[6], which is a rewrite of the former in Go. The two tools are configured in very similar ways.

For WAL-G, we would end up with the following script:

#!/bin/sh
# If the script runs on the same host as Postgres, we can connect through the Postgres socket
export PGHOST=/var/run/postgresql
export PGUSER=postgres

# This variable expects a string containing the ASCII-armored public key.
# The quotes keep the multi-line key intact in every shell.
export WALG_PGP_KEY="$(cat /path/to/your-key.pub)"

export WALG_FILE_PREFIX=/path/to/backups/postgres/

# Postgres doesn't export the $PATH variable, so we must set it up
export PATH=/usr/bin:/usr/local/bin:.

# The actual backup command. $1 is either "wal" or "backup", and $2 is the path to the file or database.
# The arguments are quoted so that paths containing spaces are passed intact.
wal-g "$1-push" "$2"
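
The script above trusts its first argument blindly. As a hedged suggestion, a small guard could reject anything other than "wal" or "backup" before WAL-G is called. walg_subcommand is a hypothetical helper, not part of WAL-G.

```shell
#!/bin/sh
# Hypothetical guard: map the first script argument to a WAL-G subcommand
# and reject anything else with a usage message and exit status 2.
walg_subcommand() {
    case "$1" in
        wal|backup) printf '%s-push\n' "$1" ;;
        *) echo "usage: wal-backup.sh wal|backup <path>" >&2; return 2 ;;
    esac
}
```

The last line of the script could then become `cmd=$(walg_subcommand "$1") || exit 2` followed by `wal-g "$cmd" "$2"`. A typo in archive_command then fails loudly in the Postgres logs instead of invoking an unexpected WAL-G command.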

You can export your public key to a file with:

gpg -a --export <gpg key id> > /path/to/your-key.pub

Note that WAL-G doesn't currently support GPG keys using elliptic curves.

Finally, we must create an initial base backup once:

/opt/wal-backup.sh backup /path/to/mastodon/postgres

Medias

It can be worth running the following command before making a backup of the media files:

docker-compose run --rm web bin/tootctl media remove --days=14

This removes the local cache of media files older than the given number of days (7 by default; here we set it to 14). Note that this command already runs daily on our instance.


We can back up the media files using duplicity:[7][8][9]

duplicity --no-compression --encrypt-key <gpg encrypt key> --full-if-older-than 1W --name mastodon_medias --num-retries 3 public/system rsync://<server>:/path/to/backups

This will store the encrypted backup on a remote server. The backups are incremental, but each week a full backup is made. Note that we don't compress the medias, since they are already compressed by dedicated video and image algorithms.


As media files take up a lot of space, we don't keep more than one full backup (with all its daily incremental backups) at a time:

duplicity remove-all-but-n-full 1 --force --name mastodon_medias rsync://<server>:/path/to/backups

This command must be executed after each backup.
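
Since the cleanup must follow every backup, both commands can live in one wrapper that only prunes when the backup succeeded. This is a sketch under assumptions: backup_media and the variables DUPLICITY, GPG_KEY and TARGET are hypothetical names, and the placeholders must be replaced with the real key and server.

```shell
#!/bin/sh
# Hypothetical settings; replace the placeholders with the real key id and rsync target.
DUPLICITY=${DUPLICITY:-duplicity}
GPG_KEY=${GPG_KEY:-"<gpg encrypt key>"}
TARGET=${TARGET:-"rsync://<server>:/path/to/backups"}

# Back up public/system, then remove old backup chains only if the backup succeeded.
backup_media() {
    "$DUPLICITY" --no-compression --encrypt-key "$GPG_KEY" \
        --full-if-older-than 1W --name mastodon_medias --num-retries 3 \
        public/system "$TARGET" || return 1
    "$DUPLICITY" remove-all-but-n-full 1 --force --name mastodon_medias "$TARGET"
}
```

Stopping after a failed backup means the previous full backup is never deleted unless a newer one has been written.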

Configuration files

We proceed as for the media files, although we can space out the full backups, as the files change less often:

duplicity --encrypt-key <gpg encrypt key> --full-if-older-than 1M --name mastodon_config --num-retries 3 --include /etc --include /path/to/mastodon/.env.production --include /path/to/mastodon/docker-compose.yml --exclude '**' / rsync://<server>:/path/to/backups

In this example, we exclude everything except the /etc directory and Mastodon's config files.

We can keep the config files for one year, as they weigh almost nothing:

duplicity remove-older-than 1Y --force --name mastodon_config rsync://<server>:/path/to/backups

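References

  1. Official Mastodon documentation
  2. https://www.postgresql.org/docs/current/app-pgdump.html
  3. https://www.opsdash.com/blog/postgresql-backup-restore.html
  4. https://www.opsdash.com/blog/postgresql-wal-archiving-backup.html
  5. https://github.com/wal-e/wal-e
  6. https://github.com/wal-g/wal-g
  7. http://duplicity.nongnu.org/
  8. https://splone.com/blog/2015/7/13/encrypted-backups-using-rsync-and-duplicity-with-gpg-and-ssh-on-linux-bsd/
  9. https://blog.rom1v.com/2013/08/duplicity-des-backups-incrementaux-chiffres/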