Backup & Recovery
Overview
The backup strategy covers persistent data, cluster configuration, and disaster recovery procedures.
Backup Scope
graph TB
subgraph Data
DB[(Databases)]
PVC[Persistent Volumes]
end
subgraph Configuration
K8s[K8s Manifests]
Secrets[Secrets]
ConfigMaps[ConfigMaps]
end
subgraph External
Git[Git Repository]
S3[Backup Storage]
end
DB --> S3
PVC --> S3
K8s --> Git
Secrets --> S3
Database Backups
MongoDB
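#!/bin/bash
# MongoDB backup script
set -euo pipefail

BACKUP_DIR="/backup/mongodb/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Dump all databases, gzip-compressing each collection file
mongodump \
  --uri="mongodb://user:pass@mongodb:27017" \
  --out="$BACKUP_DIR" \
  --gzip

# Retention: 7 days (remove whole dated directories, never /backup/mongodb itself)
find /backup/mongodb -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +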
MySQL
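#!/bin/bash
# MySQL backup script
set -euo pipefail   # pipefail: a mysqldump failure is not masked by gzip

BACKUP_FILE="/backup/mysql/backup_$(date +%Y%m%d).sql.gz"
mkdir -p "$(dirname "$BACKUP_FILE")"

mysqldump \
  -u root -p"$MYSQL_ROOT_PASSWORD" \
  --all-databases \
  --single-transaction \
  --routines \
  --triggers \
  | gzip > "$BACKUP_FILE"

# Retention: 7 days
find /backup/mysql -type f -name 'backup_*.sql.gz' -mtime +7 -delete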
InfluxDB
#!/bin/bash
# InfluxDB backup script
set -euo pipefail

BACKUP_DIR="/backup/influxdb/$(date +%Y%m%d)"

influx backup "$BACKUP_DIR" \
  --bucket metrics \
  --org default

# Retention: 4 weeks (remove whole dated directories)
find /backup/influxdb -mindepth 1 -maxdepth 1 -type d -mtime +28 -exec rm -rf {} +
Elasticsearch
# Create snapshot in the "backup" repository (the repository must already be registered)
curl -X PUT "elasticsearch:9200/_snapshot/backup/snap_$(date +%Y%m%d)" \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": "logs-*,metrics-*",
    "include_global_state": false
  }'
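The Backup Schedule table gives Elasticsearch a 7-day retention, but the snapshot call above never removes old snapshots. The following is a minimal sketch of one way to prune them. It assumes snapshots keep the `snap_YYYYMMDD` naming used above, that the repository is called `backup`, and that GNU `date` is available. The function names and the `ES_URL` and `REPO` variables are illustrative and not part of the original setup.

```shell
#!/bin/bash
# Sketch: prune Elasticsearch snapshots named snap_YYYYMMDD that fall
# outside a retention window. ES_URL and REPO are assumed values.
ES_URL="${ES_URL:-http://elasticsearch:9200}"
REPO="${REPO:-backup}"

# Succeeds when NAME looks like snap_YYYYMMDD and its date is strictly
# before CUTOFF (also YYYYMMDD). Other names are never treated as expired.
snapshot_expired() {
  local name="$1" cutoff="$2" stamp
  stamp="${name#snap_}"
  case "$stamp" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) return 1 ;;
  esac
  [ "$name" != "$stamp" ] && [ "$stamp" -lt "$cutoff" ]
}

# Lists snapshot ids in the repository and deletes the expired ones.
prune_snapshots() {
  local days="${1:-7}" cutoff name
  cutoff="$(date -d "${days} days ago" +%Y%m%d)"   # GNU date syntax
  curl -fsS "${ES_URL}/_cat/snapshots/${REPO}?h=id" | while read -r name; do
    if snapshot_expired "$name" "$cutoff"; then
      curl -fsS -X DELETE "${ES_URL}/_snapshot/${REPO}/${name}"
    fi
  done
}
```

Calling `prune_snapshots 7` after the daily snapshot would remove anything older than a week. Elasticsearch's snapshot lifecycle management is an alternative if retention should live inside the cluster.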
Volume Backups
Longhorn Snapshots
apiVersion: longhorn.io/v1beta1
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"
  task: backup
  groups:
    - default
  retain: 7
  concurrency: 2
Manual Snapshot
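# Create snapshot
# Assumes Longhorn 1.3+, where the Snapshot CRD is served as longhorn.io/v1beta2
# and spec.createSnapshot requests a new snapshot; check your installed version.
kubectl apply -f - <<EOF
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: manual-snapshot
  namespace: longhorn-system
spec:
  volume: my-volume
  createSnapshot: true
EOF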
Configuration Backup
GitOps
All Kubernetes manifests are stored in Git:
| Repository | Content |
|------------|---------|
| hub | Hub application manifests |
| wiki | Wiki manifests |
| infra | Infrastructure configs |
Secrets Backup
# Export secrets (encrypted)
kubectl get secrets -A -o yaml | \
kubeseal --format yaml > secrets-backup.yaml
Backup Schedule
| Data | Frequency | Retention | Location |
|------|-----------|-----------|----------|
| MongoDB | Daily 2AM | 7 days | NAS |
| MySQL | Daily 2AM | 7 days | NAS |
| InfluxDB | Weekly | 4 weeks | NAS |
| Elasticsearch | Daily | 7 days | NAS |
| Longhorn volumes | Daily | 7 days | NAS |
| Kubernetes configs | Git | Infinite | GitHub |
Recovery Procedures
MongoDB Restore
# Full restore
mongorestore \
  --uri="mongodb://user:pass@mongodb:27017" \
  --gzip \
  /backup/mongodb/20240115/
# Single database
mongorestore \
  --uri="mongodb://user:pass@mongodb:27017" \
  --gzip \
  --db hub \
  /backup/mongodb/20240115/hub/
MySQL Restore
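# Full restore
gunzip < /backup/mysql/backup_20240115.sql.gz | \
  mysql -u root -p
# Single database (assumes backup.sql is an uncompressed dump of app_data only;
# to pull one database out of the full dump, pipe it through: mysql --one-database app_data)
mysql -u root -p app_data < backup.sql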
Longhorn Restore
- Create volume from backup in Longhorn UI
- Create PVC referencing restored volume
- Update deployment to use new PVC
Disaster Recovery
| Scenario | RTO | RPO | Procedure |
|----------|-----|-----|-----------|
| Pod failure | Minutes | 0 | Automatic reschedule |
| Node failure | Minutes | 0 | Pod migration |
| Database corruption | 1 hour | 24 hours | Restore from backup |
| Cluster failure | 4 hours | 24 hours | Full rebuild |
Verification
Backup Testing
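# Test MongoDB backup (dry run: reads the dump without importing any data;
# "latest" is assumed to point at the newest dated directory)
mongorestore --dryRun --gzip /backup/mongodb/latest/
# Test MySQL backup (test_db must already exist; backup.sql is an uncompressed dump)
mysql -u root -p test_db < backup.sql
mysql -u root -p -e "SELECT COUNT(*) FROM test_db.users"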
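Before loading a dump into a test database, a cheaper pre-check can catch truncated or corrupt files. The sketch below assumes the gzip-compressed dumps produced by the MySQL backup script, and relies on the `-- Dump completed` comment that `mysqldump` appends by default (it is absent if comments are disabled). The function name `verify_mysql_dump` is illustrative.

```shell
#!/bin/bash
# Sketch: cheap sanity checks on a gzip-compressed mysqldump file.

# Succeeds when the archive exists, decompresses cleanly, and ends with
# the "-- Dump completed" trailer written by mysqldump.
verify_mysql_dump() {
  local file="$1"
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
  if ! gunzip -c "$file" | tail -n 5 | grep -q '^-- Dump completed'; then
    echo "dump looks truncated: $file" >&2
    return 1
  fi
}
```

For example, `verify_mysql_dump /backup/mysql/backup_20240115.sql.gz` returns non-zero for a file that was cut short mid-dump. It does not replace a real restore test, since it says nothing about whether the SQL itself loads.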
Monthly DR Test
- Spin up test environment
- Restore all backups
- Verify application functionality
- Document any issues
- Update procedures
Monitoring
Backup Alerts
| Alert | Condition |
|-------|-----------|
| Backup failed | Exit code != 0 |
| Backup old | Last backup > 24h |
| Storage low | Backup disk > 80% |
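The "Backup old" and "Storage low" conditions can be evaluated from the backup host itself. The following is a rough sketch, assuming backups live under a directory such as `/backup/mongodb` and that file modification times reflect when the backup ran. The function names are hypothetical, and how the result reaches an alerting system is left open.

```shell
#!/bin/bash
# Sketch: evaluate the "Backup old" and "Storage low" conditions for one path.

# Succeeds when no regular file under DIR was modified within the last
# MAX_HOURS hours (a missing directory also counts as stale).
backup_is_stale() {
  local dir="$1" max_hours="${2:-24}" recent
  recent="$(find "$dir" -type f -mmin "-$((max_hours * 60))" -print 2>/dev/null | head -n 1)"
  [ -z "$recent" ]
}

# Prints the used percentage (integer, no % sign) of the filesystem holding the path.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeeds when usage is above THRESHOLD percent.
storage_is_low() {
  local path="$1" threshold="${2:-80}"
  [ "$(disk_usage_pct "$path")" -gt "$threshold" ]
}
```

A cron job could run `backup_is_stale /backup/mongodb 24 && echo "MongoDB backup is older than 24h"` and forward the message. The weekly InfluxDB backup would need a larger window than 24 hours.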
Metrics
- Last backup timestamp
- Backup size
- Backup duration
- Storage utilization
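One way to collect the first three metrics is to wrap each backup command in a small function that records them. This is a sketch under assumptions: the metric names are made up for illustration, the line format is loosely modelled on the Prometheus text format, and storage utilization is left out because it can be read from `df` separately.

```shell
#!/bin/bash
# Sketch: run a backup command and record its exit code, duration,
# output size, and (on success) completion timestamp.

# Usage: run_with_metrics NAME OUTPUT_PATH METRICS_FILE COMMAND [ARGS...]
run_with_metrics() {
  local name="$1" output="$2" metrics="$3"
  shift 3
  local start end status=0 size_kb
  start="$(date +%s)"
  "$@" || status=$?
  end="$(date +%s)"
  # du -sk reports the size in KiB; fall back to 0 if the path is missing.
  size_kb="$(du -sk "$output" 2>/dev/null | awk '{ print $1 }')"
  {
    echo "backup_exit_code{job=\"$name\"} $status"
    echo "backup_duration_seconds{job=\"$name\"} $((end - start))"
    echo "backup_size_kilobytes{job=\"$name\"} ${size_kb:-0}"
    # Only a successful run advances the "last backup" timestamp.
    if [ "$status" -eq 0 ]; then
      echo "backup_last_success_timestamp{job=\"$name\"} $end"
    fi
  } > "$metrics"
  return "$status"
}
```

For instance, `run_with_metrics mysql /backup/mysql /var/tmp/mysql_backup.prom ./mysql-backup.sh` (both paths hypothetical) keeps the wrapped script's exit code, so the "Backup failed" alert can still key off it.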
Best Practices
- 3-2-1 Rule - 3 copies, 2 media types, 1 offsite
- Test restores - Regularly verify backups work
- Encrypt backups - Protect sensitive data
- Document procedures - Clear runbooks
- Automate - Reduce human error
- Monitor - Alert on failures
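When copies are spread across several locations under the 3-2-1 rule, a checksum manifest is one simple way to confirm that a copy still matches what was originally written. The sketch below assumes GNU coreutils and findutils (`sha256sum`, `sort -z`, `xargs -r`). The function names and the `SHA256SUMS` file name are illustrative choices, not part of the existing scripts.

```shell
#!/bin/bash
# Sketch: write and verify a SHA-256 manifest for one backup directory.

# Records a checksum for every file under DIR in DIR/SHA256SUMS.
write_manifest() {
  local dir="$1"
  ( cd "$dir" && find . -type f ! -name SHA256SUMS -print0 \
      | sort -z | xargs -0 -r sha256sum > SHA256SUMS )
}

# Succeeds only when every file listed in the manifest still matches.
verify_manifest() {
  local dir="$1"
  ( cd "$dir" && sha256sum --quiet -c SHA256SUMS )
}
```

Running `write_manifest` right after a backup and `verify_manifest` on each copy (for example after syncing to offsite storage) flags silent corruption. It complements restore tests and does not replace them.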