# Docker Services Recovery

This document describes how to restore the Docker services (`netbird`, `pocket-id`, `caddy`, `vaultwarden`) from an Internxt backup using `docker-recover.sh`.

## Overview

The recovery script does the following:

1. Lists available backups on Internxt and prompts you to choose one (or accepts one as an argument)
2. Downloads the selected archive from Internxt via rclone
3. Extracts it to a staging directory
4. Stops any existing containers for the four services
5. Moves existing `/opt/<service>` directories aside as `<service>.pre-recovery.<timestamp>`
6. Restores each service's directory from the archive into `/opt/`
7. For Netbird specifically: creates the containers/volumes without starting them, copies the management database into the named volume, then starts everything
8. Brings all services back up with `docker compose up -d`

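The move-aside and restore steps (5 and 6) can be sketched in shell. This is a minimal illustration, not the script's actual code: the variable names are made up, and throwaway temp directories stand in for `/opt` and the extraction staging area so the sketch runs anywhere.

```bash
# Demo setup: temp dirs stand in for /opt and the extracted archive
# (illustrative names; the real script's internals may differ).
OPT=$(mktemp -d); STAGING=$(mktemp -d)
SERVICES="netbird pocket-id caddy vaultwarden"
for svc in $SERVICES; do
  mkdir -p "$OPT/$svc" "$STAGING/$svc"
  echo old > "$OPT/$svc/marker"
  echo restored > "$STAGING/$svc/marker"
done

TS=$(date +%Y-%m-%d_%H-%M-%S)
for svc in $SERVICES; do
  # Step 5: keep the previous state around for rollback.
  if [ -d "$OPT/$svc" ]; then
    mv "$OPT/$svc" "$OPT/$svc.pre-recovery.$TS"
  fi
  # Step 6: restore the service directory from the extracted archive.
  cp -a "$STAGING/$svc" "$OPT/$svc"
done
```

The `.pre-recovery.<timestamp>` copies are what make the rollback described later possible.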
## Prerequisites

Before running recovery, the target machine needs:

- **Docker** with the `compose` plugin (v2)
- **rclone** installed and configured with a remote named `internxt` pointing to the same bucket/path that holds the backups
- **Root access** (the script must be run as root or via `sudo`)
- The `docker-recover.sh` script installed and executable, e.g. at `/usr/local/bin/docker-recover.sh`

### Setting up rclone on a new machine

If you're recovering to a fresh VPS, the easiest way to get rclone configured is to copy the config file from the old machine:

```bash
# On the old machine, find the config file:
rclone config file

# Copy that file to the new machine at:
# /root/.config/rclone/rclone.conf
```

Alternatively, run `rclone config` on the new machine and reconfigure the Internxt remote from scratch.

Verify it works:

```bash
rclone listremotes
# should show: internxt:

rclone lsf internxt:vps-backups/
# should list the available backup archives
```

## Usage

```bash
sudo docker-recover.sh            # interactive: lists backups, prompts for choice
sudo docker-recover.sh latest     # restores the most recent backup automatically
sudo docker-recover.sh strato-docker_2026-05-07_03-00-00.tar.gz   # restore a specific archive
sudo docker-recover.sh --dry-run         # show what would happen, change nothing
sudo docker-recover.sh --dry-run latest  # dry-run on the latest backup
sudo docker-recover.sh -n latest         # short form of --dry-run
sudo docker-recover.sh --help            # show usage
```

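The `latest` shortcut can work with a plain lexicographic sort, because the archive names embed a zero-padded timestamp. A sketch with a canned listing (the script's actual selection logic may differ):

```bash
# Canned stand-in for the output of `rclone lsf internxt:vps-backups/`.
listing='strato-docker_2026-05-05_03-00-00.tar.gz
strato-docker_2026-05-07_03-00-00.tar.gz
strato-docker_2026-05-06_03-00-00.tar.gz'

# Zero-padded timestamps sort chronologically as plain strings,
# so the last entry after sorting is the newest backup.
latest=$(printf '%s\n' "$listing" | sort | tail -n 1)
echo "$latest"   # strato-docker_2026-05-07_03-00-00.tar.gz
```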
## Recommended workflow

### 1. Verify the backup with a dry-run

Before doing anything destructive, always run a dry-run first to see exactly what the script intends to do:

```bash
sudo docker-recover.sh --dry-run latest
```

Every operation is logged as `WOULD RUN: <command>` and prefixed with `[DRY-RUN]`. No files are changed, no containers are touched, nothing is downloaded.

For an even more thorough dry-run, use this two-step approach:

1. Start a real recovery and abort at the confirmation prompt — this downloads the archive to `/tmp/docker-recovery/` but doesn't change anything else
2. Run `--dry-run` afterwards — since the archive is now cached locally, the script will list the actual top-level entries from the archive

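Dry-run modes of this sort are often implemented by routing every state-changing command through a tiny wrapper. A hypothetical sketch, not the script's actual code (the function name `run` and the variable `DRY_RUN` are illustrative):

```bash
DRY_RUN=1   # would be set by --dry-run / -n

# Wrapper: in dry-run mode, print the command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "[DRY-RUN] WOULD RUN: $*"
  else
    "$@"
  fi
}

run rm -rf /tmp/docker-recovery/extracted
# prints: [DRY-RUN] WOULD RUN: rm -rf /tmp/docker-recovery/extracted
```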
### 2. (If migrating to a new VPS) Update DNS first

If you're recovering to a different machine with a different IP address, update your DNS A/AAAA records **before** running recovery, and ideally wait for propagation. Otherwise:

- Caddy may fail ACME challenges if it needs to issue or renew a certificate on startup, because the domains still resolve to the old IP
- Netbird clients won't be able to reach the new server until DNS resolves correctly

### 3. Run the recovery

```bash
sudo docker-recover.sh latest
```

The script will:

- List available backups
- Ask for confirmation before doing anything destructive
- Stream progress to the terminal and log everything to `/var/log/docker-recover.log`

### 4. Verify the services came up

```bash
docker ps
```

All four services should be running. Check individual logs if anything looks off:

```bash
cd /opt/netbird && docker compose logs --tail=50
cd /opt/pocket-id && docker compose logs --tail=50
cd /opt/caddy && docker compose logs --tail=50
cd /opt/vaultwarden && docker compose logs --tail=50
```

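The same check can be scripted by comparing the expected service set against the running container names. The sketch below uses a canned listing so it runs anywhere; in practice you would feed it `docker ps --format '{{.Names}}'`, and the actual container names depend on your compose files:

```bash
# Canned stand-in for: docker ps --format '{{.Names}}'
running='caddy
vaultwarden
pocket-id'

expected="netbird pocket-id caddy vaultwarden"
missing=""
for svc in $expected; do
  # -x matches the whole line, -q suppresses output.
  printf '%s\n' "$running" | grep -qx "$svc" || missing="$missing $svc"
done

if [ -n "$missing" ]; then
  echo "not running:$missing"   # not running: netbird
else
  echo "all services up"
fi
```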
### 5. Verify each service's functionality

- **Caddy**: `curl -I https://yourdomain.example.com` should return a valid response with the existing TLS certificate
- **Pocket-ID**: log in via the web UI; check that previously registered passkeys still work
- **Netbird**: check that existing peers reconnect and the dashboard shows your network
- **Vaultwarden**: log in with an existing account; verify your vault items are present

### 6. Clean up the pre-recovery directories

The script preserves the previous state of `/opt/<service>` as `/opt/<service>.pre-recovery.<timestamp>`. Once you've verified everything works correctly:

```bash
sudo rm -rf /opt/*.pre-recovery.*
```

If something looks wrong, you can roll back by stopping the new containers, removing `/opt/<service>`, and renaming the `.pre-recovery.<timestamp>` directory back.

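A slightly more conservative cleanup restricts the match to top-level directories, so a stray file that happens to match the glob can't be caught. A demo with a temp directory standing in for `/opt`:

```bash
OPT=$(mktemp -d)   # stand-in for /opt
mkdir -p "$OPT/netbird" "$OPT/netbird.pre-recovery.2026-05-07_03-00-00" \
         "$OPT/caddy"   "$OPT/caddy.pre-recovery.2026-05-07_03-00-00"

# Same pattern as the rm -rf above, but restricted to directories
# directly under $OPT (-maxdepth 1 prevents descending into them).
find "$OPT" -maxdepth 1 -type d -name '*.pre-recovery.*' -exec rm -rf {} +

ls "$OPT"   # only the live service dirs remain
```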
## What gets restored

For each of `netbird`, `pocket-id`, `caddy`, `vaultwarden`:

- The full contents of `/opt/<service>/` — `docker-compose.yml`, `.env` files, configuration files, bind-mounted data directories

For Netbird specifically, the management database is restored separately:

- The archive contains a `netbird/_netbird-db/` subdirectory holding the database extracted from `/var/lib/netbird/` inside the management container
- The script creates the Netbird containers (and the named Docker volume that backs `/var/lib/netbird/`) without starting them, copies the database in, then starts everything

This is necessary because Netbird stores its database in a Docker named volume rather than in the bind-mounted `/opt/netbird/` directory, so a plain tarball of `/opt/netbird/` is not enough on its own.

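The archive layout described above can be exercised with a throwaway tarball. File names and contents here are illustrative (real archives hold the full service directories), but the `netbird/_netbird-db/` structure matches the description:

```bash
work=$(mktemp -d)

# Build a miniature archive with the layout described above.
mkdir -p "$work/src/netbird/_netbird-db" "$work/src/caddy"
echo db-contents > "$work/src/netbird/_netbird-db/store.db"
tar -C "$work/src" -czf "$work/backup.tar.gz" .

# Extract to a staging area, as the script does before touching /opt.
mkdir -p "$work/staging"
tar -C "$work/staging" -xzf "$work/backup.tar.gz"

ls "$work/staging/netbird/_netbird-db"   # store.db
```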
## Caddy TLS certificates

Caddy's TLS certificates and ACME account info live in `/opt/caddy/data/` (assuming a bind mount). Restoring this directory means:

- HTTPS works immediately on the new machine without re-issuing certificates
- No risk of hitting Let's Encrypt rate limits during migration

If you ever change domain names during a migration, you may need to remove old certificates from `/opt/caddy/data/caddy/certificates/` and let Caddy re-issue them for the new names.

## Troubleshooting

### `rclone remote 'internxt' not configured`

Run `rclone listremotes` as the same user that's running the script (root if using `sudo`). If `internxt:` doesn't appear, the rclone config file isn't where root expects it. Copy it to `/root/.config/rclone/rclone.conf` or set `RCLONE_CONFIG=/path/to/rclone.conf` in the environment.

### `No backups found matching strato-docker_*.tar.gz`

Verify the remote contents directly:

```bash
sudo rclone lsf internxt:vps-backups/
```

If the archives use a different prefix or are in a different folder, adjust `ARCHIVE_PREFIX` and `RCLONE_PATH` at the top of the script.

### Netbird database not restored

If the script logs `could not detect Netbird management container; skipping DB restore`, this means it couldn't find a service named `netbird-server` (newer setup) or `management` (older setup) in `/opt/netbird/docker-compose.yml`. Check the compose file and adjust the detection logic in the script if your service has a different name.

### Containers fail to start after recovery

Check the logs (`docker compose logs`) for the affected service. Common issues:

- **Port conflicts**: another service on the host is using the same port
- **Missing environment variables**: the `.env` file wasn't included in the backup or was misplaced
- **Volume permission issues**: file ownership in the restored directories doesn't match what the container expects (especially for non-root containers)

### Pre-recovery rollback

If recovery succeeds but a service is broken, you can revert to the pre-recovery state:

```bash
cd /opt/<service> && docker compose down
sudo rm -rf /opt/<service>
sudo mv /opt/<service>.pre-recovery.<timestamp> /opt/<service>
cd /opt/<service> && docker compose up -d
```

For Netbird specifically, you may also need to remove and recreate the Docker named volume to clear out the restored database:

```bash
cd /opt/netbird
docker compose down -v   # the -v removes the named volumes too
# then bring up the rolled-back state
docker compose up -d
```

## Logs

All recovery operations are logged to `/var/log/docker-recover.log` (in addition to being printed to the terminal). The log persists across runs, so you can review past recoveries.

## Related files

- `/usr/local/bin/docker-backup.sh` — the backup script that produces these archives, run nightly via cron
- `/usr/local/bin/docker-recover.sh` — this recovery script
- `/var/log/docker-backup.log` — backup logs
- `/var/log/docker-recover.log` — recovery logs
- `/tmp/docker-recovery/` — staging directory used during recovery (downloaded archives are cached here)