
Redesign backup handling
Closed, Resolved · Public


Currently, backups are handled by Bacula on RamNode. This requires substantial disk space, and therefore cost, for a service that is thoroughly unmaintained, relatively unused and rarely accessed or required.

We can produce a significant cost saving (up to 75% from rough calculations) if we utilise OVH's Public Cloud Archive service.

Given that our backup policy requires backups to be off-site and off our main provider, this work is unfortunately blocked on us moving away from OVH for our main service hosting.

Currently, for database backups, we keep a live copy on the server for routine work (accidental data deletion or recovery of deleted databases), which covers a theoretical 99% of our backup usage needs. We could adopt a similar policy for the other Bacula backups (except Gluster?) and use OVH's PCA for long-term storage of backup snapshots for things like disaster recovery. We need to at least match Bacula's recovery period, but we might be able to beat it in some instances.

Event Timeline

John triaged this task as Low priority. Nov 28 2021, 17:02
John created this task.
John raised the priority of this task from Low to Normal.

Upping to Normal as we are looking to decommission Bacula in the very near future.

Unknown Object (User) unsubscribed. Feb 12 2022, 07:25

Started work on this using a Python handler for interacting with OVH's PCA via Swift.
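
For illustration only, here is a minimal sketch of what such a handler's upload step might look like. It is not the actual handler: the function names, the container name and the object naming layout are all hypothetical. It assumes a connection object exposing `put_object(container, obj, contents=...)`, which is the shape of python-swiftclient's `Connection.put_object`; an in-memory stand-in is used here so the sketch runs without OVH credentials.

```python
import datetime
import io


def object_name(backup_type, day):
    """Build a dated object name (hypothetical layout, not from the task)."""
    return f"{backup_type}/{day.isoformat()}.tar.gz"


def upload_backup(conn, container, backup_type, day, payload):
    """Upload one archive through a Swift-style connection.

    `conn` is assumed to provide put_object(container, obj, contents=...),
    as python-swiftclient's Connection does. Returns the object name used.
    """
    name = object_name(backup_type, day)
    conn.put_object(container, name, contents=io.BytesIO(payload))
    return name


class FakeConnection:
    """In-memory stand-in for a Swift connection (illustration only)."""

    def __init__(self):
        self.objects = {}

    def put_object(self, container, obj, contents=None, **kwargs):
        # Store the uploaded bytes keyed by (container, object name).
        self.objects[(container, obj)] = contents.read()
```

With a real deployment, `conn` would presumably be an authenticated Swift connection pointed at the PCA container instead of `FakeConnection`.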

Running and supporting effective backups looks like it first requires us to reduce the strain on existing infrastructure, so this is unfortunately blocked on bigger projects over the next few months.

Given recent events, a workaround will be worked on and hopefully released this weekend.

root@puppet141:~/private# /usr/local/bin/miraheze-backup backup private
Starting backup of 'private' for date 2022-12-27...
Completed! This took 8.501368522644043s
root@puppet141:~/private# /usr/local/bin/miraheze-backup backup sslkeys
Starting backup of 'sslkeys' for date 2022-12-27...
Completed! This took 7.49277400970459s
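
As a rough sketch of how a wrapper could produce output in the style shown above, the snippet below times an arbitrary backup action and prints the two status lines. It is an assumption about the structure, not the real `miraheze-backup` script: `run_backup` and its parameters are hypothetical, and the actual archiving and upload are left to the caller-supplied `action`.

```python
import datetime
import time


def run_backup(name, action, today=None):
    """Run `action` and report progress; returns the elapsed seconds."""
    today = today or datetime.date.today()
    print(f"Starting backup of '{name}' for date {today.isoformat()}...")
    # A monotonic clock avoids negative durations if the wall clock is adjusted.
    started = time.monotonic()
    action()
    elapsed = time.monotonic() - started
    print(f"Completed! This took {elapsed}s")
    return elapsed
```

For example, `run_backup("private", lambda: None)` prints both lines with a near-zero duration.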


Backup schedules defined:

  • Private - weekly
  • SSL Keys - weekly
  • SQL - fortnightly
  • mediawiki-xml - MediaWiki (SRE): can someone propose a time frame for XML dumps, please? Every 3 months?
  • Phabricator Static - fortnightly

Since we've got SQL backups as well, I'd say every 3 months for XML dumps would be reasonable.
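
The schedule discussed above could be expressed as a simple interval table, sketched below. This is only an illustration: the keys `sql` and `phabricator` are hypothetical names, "fortnightly" is read as 14 days, and "every 3 months" is approximated as 90 days, which is an assumption rather than an agreed value.

```python
import datetime

# Interval in days per backup type. 'private', 'sslkeys' and 'mediawiki-xml'
# follow names used in this task; 'sql' and 'phabricator' are hypothetical.
SCHEDULE_DAYS = {
    "private": 7,         # weekly
    "sslkeys": 7,         # weekly
    "sql": 14,            # fortnightly
    "mediawiki-xml": 90,  # roughly every 3 months (assumed as 90 days)
    "phabricator": 14,    # fortnightly
}


def is_due(backup_type, last_run, today):
    """Return True if a backup has never run or its interval has elapsed."""
    if last_run is None:
        return True
    return (today - last_run).days >= SCHEDULE_DAYS[backup_type]
```

A scheduler (cron or a systemd timer, for instance) could call `is_due` daily and only trigger the backups whose interval has passed.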