====== Debian package ======
Chapril's backup is deployed through a Debian package hosted on a private repository. Since the repository is not published, this page describes the essential parts of the package, along with the configuration of the archive integrity check.
===== Backup =====
==== Backup script ====
The backup itself is performed by [[https://torsion.org/borgmatic/ | borgmatic]].
We add a configuration file under ''/etc'':
location:
    source_directories:
        - /
    exclude_patterns:
        - '/dev'
        - '/media/*'
        - '/mnt/*'
        - '/proc'
        - '/run/*'
        - '/srv/backups/*.chapril.org'
        - '/sys'
        - '/var/cache/*'
        - '/var/lib/backuppc/*'
        - '/var/lib/libvirt/images/'
    repositories:
        - 'ssh://backup@backup.chapril.org:/srv/backups/{fqdn}'
storage:
    ssh_command: ssh -p 2242 -A
    archive_name_format: '{now:%Y-%m-%dT%H:%M:%S}'
    # for bullseye: borg_cache_directory: /var/cache/borg
consistency:
    check_last: 2
    prefix: '20'
retention:
    keep_daily: 7
    keep_weekly: 4
    prefix: '20'
hooks:
    before_backup:
        - echo "Launching root backup at $(date -Iseconds)"
        - for file in /etc/borg/scripts/pre-hooks/* ; do test -e "$file" || continue; echo "Executing $file..."; $file; done
    after_backup:
        - for file in /etc/borg/scripts/post-hooks/* ; do test -e "$file" || continue; echo "Executing $file..."; $file; done
        - echo "Succeeded root backup at $(date -Iseconds)"
        - borgmatic info --archive latest --json
    on_error:
        - echo "Failed root backup at $(date -Iseconds)"
    # for bullseye:
    # after_check:
    #     - echo "Succeeded root checks at $(date -Iseconds)"
    # after_prune:
    #     - echo "Succeeded root prune at $(date -Iseconds)"
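The ''prefix: '20''' setting only makes sense together with ''archive_name_format'': archive names are timestamps, so every archive created in the 2000s starts with ''20''. The following sketch illustrates that relationship. It assumes that the prefix is compared as a plain "starts with" test, and the helper names ''archive_name'' and ''is_managed'' are hypothetical.

```python
from datetime import datetime

# strftime pattern taken from archive_name_format, without the {now:...} wrapper.
ARCHIVE_NAME_FORMAT = "%Y-%m-%dT%H:%M:%S"
# Value of `prefix` in the consistency and retention sections.
PREFIX = "20"


def archive_name(moment: datetime) -> str:
    """Build the archive name that would be used for a backup started at `moment`."""
    return moment.strftime(ARCHIVE_NAME_FORMAT)


def is_managed(name: str) -> bool:
    """Assumption: checks and pruning only consider names starting with PREFIX."""
    return name.startswith(PREFIX)


name = archive_name(datetime(2024, 3, 5, 1, 42, 7))
print(name, is_managed(name))               # 2024-03-05T01:42:07 True
print(is_managed("manual-before-upgrade"))  # False
```

Under that assumption, an archive created by hand with a name that does not start with ''20'' would be left alone by both the retention policy and the consistency check.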
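==== Systemd entry ====
The backup is triggered by a systemd timer that adds a random delay before starting, so that the clients do not all hit [[admin:machines_virtuelles:felicette|Félicette]] at the same time. The timer unit comes first, followed by the ''borgmatic.service'' unit it activates.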
[Unit]
Description=Run borgmatic backup
[Timer]
# Will trigger at 01:00 each day
# + 0-60 random minutes
# + 30 minutes delay from borgmatic.service
OnCalendar=*-*-* 01:00:00
Persistent=true
RandomizedDelaySec=60 minutes
[Install]
WantedBy=timers.target
[Unit]
Description=borgmatic backup
Wants=network-online.target
After=network-online.target
ConditionACPower=true
[Service]
Type=oneshot
## Lower CPU and I/O priority.
Nice=19
CPUSchedulingPolicy=batch
IOSchedulingClass=best-effort
IOSchedulingPriority=7
IOWeight=100
## Logs
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=borgmatic
# Prevent rate limiting of borgmatic log events.
LogRateLimitIntervalSec=0
## Launcher
# Delay start to prevent backups immediately upon system startup
ExecStartPre=sleep 30m
ExecStart=borgmatic -v1
Restart=no
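The three delays in these units add up, which is easy to lose track of. As a rough sketch, the window in which borgmatic actually starts on a normal night can be computed as follows (the constant and function names are hypothetical, the values are copied from the units above):

```python
from datetime import datetime, timedelta

ON_CALENDAR = timedelta(hours=1)          # OnCalendar=*-*-* 01:00:00
MAX_RANDOM_DELAY = timedelta(minutes=60)  # RandomizedDelaySec=60 minutes
PRE_START_SLEEP = timedelta(minutes=30)   # ExecStartPre=sleep 30m


def backup_window(day: datetime):
    """Return the earliest and latest moment at which borgmatic starts on `day`."""
    midnight = day.replace(hour=0, minute=0, second=0, microsecond=0)
    earliest = midnight + ON_CALENDAR + PRE_START_SLEEP
    latest = earliest + MAX_RANDOM_DELAY
    return earliest, latest


earliest, latest = backup_window(datetime(2024, 3, 5))
print(earliest.time(), latest.time())  # 01:30:00 02:30:00
```

This only covers the regular schedule: with ''Persistent=true'', a run missed while the machine was off is caught up later and can fall outside this window.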
==== Pre-hook scripts ====
#!/bin/bash
# Dump every MySQL database to /var/backups/mysql before the backup runs.
if ! test -x /usr/bin/mysql ; then
    exit 0
fi
backup_dir=/var/backups/mysql
databases=$(mysql --defaults-file=/etc/mysql/debian.cnf -B -N --execute="SHOW DATABASES" | grep -v 'lost+found\|performance_schema\|information_schema')
for db in $databases ; do
    mkdir -p $backup_dir
    chmod 700 $backup_dir
    mysqldump --defaults-file=/etc/mysql/debian.cnf --events $db | bzip2 - > $backup_dir/$db.sql.bz2
done

#!/bin/bash
# Dump every PostgreSQL database (except template0) to /var/backups/pgsql.
if ! test -x /usr/bin/psql ; then
    exit 0
fi
backup_dir=/var/backups/pgsql
databases=$(su - postgres -c 'psql -c "\l"' | tail -n+4|cut -d'|' -f 1|sed -e '/^ *$/d'|sed -e '$d'| grep -v '^[[:space:]]*template0[[:space:]]*$')
for db in $databases ; do
    mkdir -p $backup_dir
    chmod 700 $backup_dir
    su - postgres -c "pg_dump $db" | bzip2 - > $backup_dir/$db.sql.bz2
done

#!/bin/bash
# Back up the icinga2 InfluxDB database and drop dump files older than two days.
if test -x /usr/bin/influxd ; then
    backup_dir=/var/backups/influxdb
    db=icinga2
    # Prepare.
    mkdir -p $backup_dir
    chmod 700 $backup_dir
    # Backup.
    influxd backup -portable -database $db -host localhost:8088 $backup_dir/$db
    # Prune.
    find $backup_dir/$db -type f -mtime +2 -delete
fi

#!/bin/bash
# Save the list of installed packages; the target is a single file, not a directory.
backup_file=/var/backups/selections
dpkg --get-selections > $backup_file
==== Post-install script ====
#!/bin/sh
# postinst script for backup-chapril
#
# see: dh_installdeb(1)
# summary of how this script can be called:
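#        * <postinst> `configure' <most-recently-configured-version>
#        * <old-postinst> `abort-upgrade' <new version>
#        * <conflictor's-postinst> `abort-remove' `in-favour' <package>
#          <new-version>
#        * <postinst> `abort-remove'
#        * <deconfigured's-postinst> `abort-deconfigure' `in-favour'
#          <failed-install-package> <version> `removing'
#          <conflicting-package> <version>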
# for details, see https://www.debian.org/doc/debian-policy/ or
# the debian-policy package
case "$1" in
    configure)
        backup_host="backup@backup.chapril.org"
        err=1

        # Test SSH connectivity first, then find out whether the repository must be initialised.
        ssh -p 2242 -A $backup_host -o BatchMode=yes true
        if [ 0 -eq $? ]
        then
            # SSH works: check whether the repository already exists.
            borg_bin="/usr/bin/borg"
            export BORG_RSH="ssh -p 2242 -A"
            backup_dest="$backup_host:/srv/backups/`hostname --fqdn`"
            $borg_bin list $backup_dest
            if [ $? -ne 0 ]
            then
                # The repository cannot be listed: initialise it.
                $borg_bin init --encryption none $backup_dest
                if [ 0 -eq $? ]
                then
                    echo " ############################################################ "
                    echo " # Repository initialised                                   # "
                    echo " ############################################################ "
                    err=0
                fi
            else
                echo "Repository already initialised"
                err=0
            fi
        fi
        if [ 0 -ne $err ]
        then
            # Otherwise, explain how to initialise the repository by hand.
            borg_bin="/usr/bin/borg"
            backup_dest="$backup_host:/srv/backups/`hostname --fqdn`"
            echo " ############################################################ "
            echo " # Unable to check and/or initialise the repository.        # "
            echo " #                                                          # "
            echo " # Check SSH connectivity:                                  # "
            echo " #   ssh -p 2242 -A backup@backup.chapril.org               # "
            echo " #                                                          # "
            echo " # Then initialise the repository by hand:                  # "
            echo BORG_RSH=\"ssh -p 2242 -A\" $borg_bin init --encryption none $backup_dest
            echo " #                                                          # "
            echo " ############################################################ "
        fi
    ;;
    abort-upgrade|abort-remove|abort-deconfigure)
    ;;
    *)
        echo "postinst called with unknown argument \`$1'" >&2
        exit 1
    ;;
esac
# dh_installdeb will replace this with shell code automatically
# generated by other debhelper scripts.
#DEBHELPER#
exit 0
==== Rsyslog ====
if $programname == 'borgmatic' then /var/log/borgmatic.log
& stop
==== Log rotate ====
/var/log/borgmatic.log
{
rotate 6
weekly
compress
missingok
notifempty
}
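===== Host configuration =====
On the backup host the configuration is mostly SSH: each client key in ''authorized_keys'' is restricted to ''borg serve'' on that client's own repository path.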
command="borg serve --restrict-to-path /srv/backups/dns.cluster.chapril.org",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc ssh-ed25519 ... root@dns.cluster.chapril.org
command="borg serve --restrict-to-path /srv/backups/admin.cluster.chapril.org",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc ssh-ed25519 ... root@admin.cluster.chapril.org
command="borg serve --restrict-to-path /srv/backups/mail.cluster.chapril.org",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc ssh-ed25519 ... root@mail.cluster.chapril.org
...
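All entries follow the same pattern, so they lend themselves to generation. Here is a minimal sketch of a helper that builds one line per client. The function name ''authorized_keys_entry'' is hypothetical, and the public key has to be supplied by the caller since it is elided on this page.

```python
# Restrictions applied to every client key, copied from the entries above.
RESTRICTIONS = "no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc"


def authorized_keys_entry(fqdn: str, public_key: str) -> str:
    """Build the authorized_keys line confining `fqdn` to its own repository.

    `public_key` is the key type followed by the key material,
    for example "ssh-ed25519 <key material>".
    """
    command = f'command="borg serve --restrict-to-path /srv/backups/{fqdn}"'
    return f"{command},{RESTRICTIONS} {public_key} root@{fqdn}"


# The key material is a placeholder, not a real key.
print(authorized_keys_entry("dns.cluster.chapril.org", "ssh-ed25519 <key material>"))
```

Because the forced command carries ''--restrict-to-path'', a compromised client can only reach the repository that bears its own name.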
===== Monitoring configuration =====
A script, deployed by the monitoring-plugins-chapril package, parses the backup log on each machine:
#!/usr/bin/env python3
import datetime, itertools, re, sys

now = datetime.datetime.now(datetime.timezone.utc)
# A backup older than 1 day and 2 hours is considered late.
max_backup_delay = datetime.timedelta(1, 7200)
# Names of the backups whose last run failed or is too old.
failure = []

def get_name(match):
    return match.group('name')

def check_backup(filename):
    with open(filename) as f:
        logs = f.read()
    # Lines written by the borgmatic hooks, e.g. "Succeeded root backup at <ISO date>".
    mixed_statuses = list(re.finditer(r'(?P<status>Succeeded|Failed) (?P<name>\w+) backup at (?P<date>\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d)$', logs, re.MULTILINE))
    for name, statuses in itertools.groupby(sorted(mixed_statuses, key=get_name), key=get_name):
        last = sorted(statuses, key=lambda x: x.group('date'))[-1]
        print('{name}: {status} at {date}'.format(**last.groupdict()))
        last_date = datetime.datetime.fromisoformat(last.group('date'))
        last_status = last.group('status')
        if last_status != 'Succeeded' or now - last_date > max_backup_delay:
            failure.append(name)

# Fall back to the rotated log when the current one cannot be read.
try:
    check_backup("/var/log/borgmatic.log")
except Exception:
    check_backup("/var/log/borgmatic.log.1")

if failure:
    sys.exit(1)
else:
    sys.exit(0)
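To make the parsing rule concrete, here is a self-contained sketch of the same idea applied to an in-memory log excerpt. The sample lines are hypothetical (syslog prefixes omitted) but follow the format produced by the ''before_backup'' and ''after_backup'' hooks. The function name ''failing_backups'' is an assumption, and dates are compared as parsed timestamps rather than as strings.

```python
import datetime
import re

# Status, backup name and ISO 8601 date with UTC offset, as echoed by the hooks.
STATUS_RE = re.compile(
    r"(?P<status>Succeeded|Failed) (?P<name>\w+) backup at "
    r"(?P<date>\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d)$",
    re.MULTILINE,
)
MAX_BACKUP_DELAY = datetime.timedelta(days=1, seconds=7200)  # 26 hours


def failing_backups(logs, now):
    """Return the names whose most recent status is a failure or is too old."""
    latest = {}  # name -> (date, status) of the most recent matching line
    for match in STATUS_RE.finditer(logs):
        date = datetime.datetime.fromisoformat(match.group("date"))
        name = match.group("name")
        if name not in latest or date >= latest[name][0]:
            latest[name] = (date, match.group("status"))
    return sorted(
        name
        for name, (date, status) in latest.items()
        if status != "Succeeded" or now - date > MAX_BACKUP_DELAY
    )


# Hypothetical log excerpt; "Launching" lines are ignored by the pattern.
SAMPLE_LOG = """\
Launching root backup at 2024-03-05T01:42:07+01:00
Succeeded root backup at 2024-03-05T02:10:31+01:00
"""

tz = datetime.timezone(datetime.timedelta(hours=1))
print(failing_backups(SAMPLE_LOG, datetime.datetime(2024, 3, 5, 8, 0, tzinfo=tz)))  # []
print(failing_backups(SAMPLE_LOG, datetime.datetime(2024, 3, 6, 8, 0, tzinfo=tz)))  # ['root']
```

The 26-hour threshold leaves room for the random start delay and for the duration of the backup itself before a daily run is reported as missing.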
And the icinga2 configuration:
object CheckCommand "backup" {
    command = [ "sudo", PluginDir + "/check_borgmatic" ]
}
apply Service "Backup " {
    import "generic-service"
    check_command = "backup"
    command_endpoint = host.vars.client_endpoint
    assign where host.address && !host.vars.external
}
===== Integrity check =====
The check runs every night, directly on the machine that stores the backups ([[admin:machines_virtuelles:felicette|Félicette]]).
==== Check script ====
#! /bin/bash
logger="/var/log/check_backup.log"
borg_bin="/usr/bin/borg"
backup_dest="/srv/backups/"

echo ======================================================================== >> $logger
echo " New backup check" >> $logger
echo ======================================================================== >> $logger
date >> $logger
echo "" >> $logger

cd $backup_dest
for repository in $(ls -d $backup_dest/*$(hostname -d))
do
    echo "== Checking $repository" >> $logger
    date >> $logger
    echo "" >> $logger
    # Send stdout to the log first, then point stderr at the same place.
    $borg_bin check $repository >> $logger 2>&1
    rc=$?
    # Stop at the first failing repository; no "Returned" line is written in that case.
    if [[ $rc != 0 ]]; then exit $rc; fi
done

echo "" >> $logger
date >> $logger
echo Returned $rc >> $logger
echo ======================================================================== >> $logger
exit $rc
==== Cron entry ====
00 4 * * * root bash /srv/bin/check_backup.sh
==== Log rotate ====
/var/log/check_backup.log {
weekly
rotate 52
compress
delaycompress
missingok
notifempty
create 644 backup backup
}
==== Monitoring configuration ====
A script on the machine parses the check_backup log:
#!/usr/bin/env python
# -*- encoding:utf8 -*-
import datetime, os, re, locale

today = datetime.datetime.now()
max_backup_delay = datetime.timedelta(1, 7200)

def last_backup(log_file):
    with open(log_file) as s:
        # Date line that directly precedes the last "Returned 0" footer.
        logs_ok = re.findall(r'^([ a-zéûA-Z:,0-9]*)( \(UTC\+0[12]00\))?\nReturned 0\n={30}', s.read(), re.MULTILINE)[-1][0]
    print("Last backup check : " + logs_ok)
    try:
        return datetime.datetime.strptime(logs_ok, '%a %b %d %X %Z %Y')
    except ValueError:
        # The date may have been written with the French locale.
        locale.setlocale(locale.LC_ALL, 'fr_FR.UTF-8')
        return datetime.datetime.strptime(logs_ok, '%A %d %B %Y, %X')

# Fall back to the rotated log when the current one has no successful run.
try:
    last_backup_date = last_backup("/var/log/check_backup.log")
except Exception:
    last_backup_date = last_backup("/var/log/check_backup.log.1")

if today - last_backup_date < max_backup_delay:
    exit(0)
else:
    exit(1)
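The plugin relies on the footer that the check script writes at the end of a successful run: a date line, then ''Returned 0'', then a row of ''=''. The sketch below applies the same pattern to a hypothetical log excerpt using Python 3. It assumes the date was written in the C locale layout of ''date''; the French fallback is left out, and the function names are assumptions.

```python
import datetime
import re

# Date line directly followed by "Returned 0" and the closing row of "=".
CHECK_OK_RE = re.compile(
    r"^([ a-zéûA-Z:,0-9]*)( \(UTC\+0[12]00\))?\nReturned 0\n={30}",
    re.MULTILINE,
)
MAX_CHECK_DELAY = datetime.timedelta(days=1, seconds=7200)  # 26 hours


def last_successful_check(logs):
    """Return the date of the last run that ended with "Returned 0"."""
    date_text = CHECK_OK_RE.findall(logs)[-1][0]
    # Assumption: C locale output, e.g. "Tue Mar  5 04:12:44 UTC 2024".
    return datetime.datetime.strptime(date_text, "%a %b %d %X %Z %Y")


def is_fresh(logs, now):
    """True when the last successful check is recent enough."""
    return now - last_successful_check(logs) < MAX_CHECK_DELAY


# Hypothetical end of a check_backup.log written by a successful run.
SAMPLE_LOG = """\
== Checking /srv/backups//dns.cluster.chapril.org
Tue Mar  5 04:00:01 UTC 2024


Tue Mar  5 04:12:44 UTC 2024
Returned 0
========================================================================
"""

print(last_successful_check(SAMPLE_LOG))                       # 2024-03-05 04:12:44
print(is_fresh(SAMPLE_LOG, datetime.datetime(2024, 3, 5, 9)))  # True
```

Since the check script exits before writing the footer when a repository fails, a failed night leaves no ''Returned 0'' entry and the last success eventually becomes too old.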
And the icinga2 configuration:
object CheckCommand "check_backup" {
    command = [ "/usr/local/lib/nagios/plugins/check_check_backup" ]
}
/* Backup checks */
apply Service "Check Backup " {
    import "generic-service"
    check_command = "check_backup"
    command_endpoint = host.vars.client_endpoint
    assign where host.name == "felicette.cluster.chapril.org"
}