2024-11-14 Setting Up A Data Backup Plan
I have spent time setting up HA PostgreSQL databases, mainly to prevent service data loss due to machine failure. But a home lab also holds other valuable data, mostly stored as files and folders, that deserves the same protection. I'd like a way to keep it safe, too.
When it comes to file backup, there are several aspects to consider: security, encryption, syncing, archiving, and so on. I am considering Syncthing as the general tool for keeping important files backed up regularly, and Vaultwarden for passwords and sensitive data (also for its browser extensions). These self-hosted services are not public-facing, so I plan to let Vaultwarden use SQLite for data storage and back up the data via Syncthing. Later, when I set up a note-taking app for personal use (probably memos), the same approach may be used.
Syncthing
Visit Syncthing and check the installation and getting-started guides, then head to its GitHub repo to find this for Docker installation. As usual, create a folder:

mkdir syncthing
cd syncthing

and put the docker-compose.yaml file in it:
networks:
  mynet:
    name: mynet
    external: true

services:
  syncthing:
    image: syncthing/syncthing
    container_name: syncthing
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ./data:/var/syncthing
    ports:
      - 192.168.111.x:8384:8384       # Web UI
      - 192.168.111.x:22000:22000/tcp # TCP file transfers
      - 192.168.111.x:22000:22000/udp # QUIC file transfers
      - 192.168.111.x:21027:21027/udp # Receive local discovery broadcasts
    networks:
      - mynet
    healthcheck:
      test: curl -fkLsS -m 2 127.0.0.1:8384/rest/noauth/health | grep -o --color=never OK || exit 1
      interval: 1m
      timeout: 10s
      retries: 3
Note that I set up Syncthing to operate on the mesh VPN 192.168.111.0/24.
Since the main goal is to store backups in multiple but known places, I turn off discovery and relaying, and join devices by specifying IP addresses explicitly. That is, when adding a remote device, I also set the Addresses field in the Advanced settings to the target address tcp4://192.168.111.x.
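For reference, these toggles live under the options element in Syncthing's config.xml (they can also be flipped under Settings, Connections in the Web UI). A fragment of the relevant settings as I understand them; verify the element names against your own config file:

```xml
<options>
    <!-- No global discovery servers: peers are found only via the
         addresses listed explicitly per device -->
    <globalAnnounceEnabled>false</globalAnnounceEnabled>
    <!-- No LAN broadcast discovery either -->
    <localAnnounceEnabled>false</localAnnounceEnabled>
    <!-- Never route traffic through public relays -->
    <relaysEnabled>false</relaysEnabled>
</options>
```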
I check the Default Folder box, and create some files in the default Sync folder (under ./data) on the joined devices. All these files are collected into the same folder on each device.
Note that because modifying a file on one device propagates the change to the others, the implications need to be addressed. For backup purposes, the machine that creates a backup should be the only producer; the other devices just receive and keep the data. Here are some rules that can be used jointly:
- Employ a naming mechanism, e.g. folder names and timestamps, so that a device only dumps data into the folders that belong to it
- Use appropriate Syncthing settings (such as the Send Only and Receive Only folder types) to prevent a device from sending files generated by others
- Do not treat the files in shared folders as documents for collaborative editing.
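The first rule can be sketched as follows; the directory layout and file names here are hypothetical, just to illustrate a per-device namespace inside the shared folder:

```shell
# Each device writes only under a subfolder named after itself plus
# a timestamp, so peers never modify each other's files.
HOST=$(hostname)
DATE=$(date +%Y-%m-%d)
DEST="Sync/backups/$HOST/$DATE"

mkdir -p "$DEST"
echo "example payload" > "$DEST/data.txt"
```

With this convention, a device only ever creates or overwrites files below `Sync/backups/$HOST/`, and everything outside that subtree is treated as read-only.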
Vaultwarden
Visit the GitHub repo directly, create a folder:

mkdir vaultwarden
cd vaultwarden

and adapt the docker-compose.yaml file as below:
networks:
  mynet:
    name: mynet
    external: true

services:
  vaultwarden:
    image: vaultwarden/server:latest
    container_name: vaultwarden
    restart: unless-stopped
    environment:
      DOMAIN: "https://example.com"
    volumes:
      - ./data/:/data/
    expose:
      - 80
    networks:
      - mynet
Vaultwarden requires a domain name with HTTPS access to run properly, so I created a domain name and serve the app behind NPM (Nginx Proxy Manager). Launch the container and see it working:
docker-compose up -d && docker-compose logs -f
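One detail worth noting: Vaultwarden's admin page is only enabled when the ADMIN_TOKEN environment variable is set. A fragment to merge into the environment section above; the token value is a placeholder, so generate your own long random secret:

```yaml
    environment:
      DOMAIN: "https://example.com"
      # Enables the /admin page; replace with a long random secret
      ADMIN_TOKEN: "changeme-long-random-string"
```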
I also install the browser extension suggested in the admin UI: Bitwarden. When installing the extension, there is a dropdown letting you choose the server; select Self-hosted. Also, there is no place to add folders on the front page (only a No Folder entry), so you need to go to Settings, then Vault, and create folders there.
Backup Vaultwarden
It is tempting to just sync the data volume to another place using Syncthing. But copying data while the app is still operating on it is generally not a good idea. To back up Vaultwarden, there is this reference from Vaultwarden; here is also a nice writeup.
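A lighter-weight alternative to stopping the container is to snapshot just the database with SQLite's online .backup command, which produces a consistent copy even while the database is in use. A sketch, assuming the sqlite3 CLI is available on the host; the sample database here is a stand-in for Vaultwarden's db.sqlite3:

```shell
mkdir -p sqlite-demo

# Create a sample database standing in for Vaultwarden's db.sqlite3
sqlite3 sqlite-demo/db.sqlite3 \
  "CREATE TABLE IF NOT EXISTS t(x INTEGER); INSERT INTO t VALUES(1);"

# .backup takes a consistent snapshot even while the database is in
# use, so the container could keep running during the backup
sqlite3 sqlite-demo/db.sqlite3 ".backup 'sqlite-demo/db-snapshot.sqlite3'"
```

Note that this only covers the database; attachments and the rest of the data directory would still need to be archived separately, which is why I keep the stop-and-tar approach below.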
So, after creating a shared folder Backups in Syncthing, I adjust the script:
#!/bin/bash
# Exit immediately if any command fails
set -e

DATE=$(date +%Y-%m-%d)
BACKUP_DIR=~/.../syncthing/data/Backups/vaultwarden
BACKUP_FILE=vaultwarden-$DATE.tar.gz
CONTAINER=vaultwarden
CONTAINER_DATA_DIR=~/.../vault/data

# Create the backup directory if it does not exist
mkdir -p "$BACKUP_DIR"

# Stop the container so the database is not copied mid-write
/usr/bin/docker stop "$CONTAINER"

# Archive the vaultwarden data directory into the backup directory
tar -czf "$BACKUP_DIR/$BACKUP_FILE" -C "$CONTAINER_DATA_DIR" .

# Restart the container
/usr/bin/docker restart "$CONTAINER"

# Delete backup files older than 30 days
find "$BACKUP_DIR" -type f -mtime +30 -exec rm {} \;
Note that the script assumes all commands are run by the current user (hence the ~), who has permission to perform all the tasks.
Run the script directly, and see the data flow from ./vaultwarden/data in one container to ./data/Backups in another. If it works, you can create a cron job under that user:
$ crontab -l
$ crontab -e
...
# add this line at the end
0 0 * * * /.../vaultwarden/backup.sh
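Finally, a backup is only useful if it restores. A quick sanity check of the archive format the script produces; the file and directory names here are made up for the demonstration:

```shell
# Build a sample archive the same way the backup script does
mkdir -p sample-data
echo "db contents" > sample-data/db.sqlite3
tar -czf vaultwarden-test.tar.gz -C sample-data .

# List the contents without extracting
tar -tzf vaultwarden-test.tar.gz

# Restore into a scratch directory and inspect the result
mkdir -p restore-test
tar -xzf vaultwarden-test.tar.gz -C restore-test
```

Doing this once against a real backup archive (into a throwaway directory, not the live data volume) confirms the tarball is intact before you ever need it.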