Differences between revisions of "Server/horror/Technical documentation"
Revision as of 17:19, 3 May 2023
This page is the public technical documentation for the server ⚙️ horror, dedicated to off-site backups and useful for #Disaster recovery.
In short
In short, the server ⚙️ horror can receive an additional off-site backup from other servers. These off-site copies are kept and maintained for multiple days.
Each server pushes to server ⚙️ horror what should be saved off-site. So, the server ⚙️ horror does not decide what should be saved.
Example of night transfer activity:
┌─────────┐          ┌──────┐          ┌───────┐
│intreccio│          │horror│          │lessema│
└────┬────┘          └──┬───┘          └───┬───┘
     │                  │                  │
     │─────────────────>│                  │
     │                  │                  │
     │                  │                  │
     │                  │ <────────────────│
┌────┴────┐          ┌──┴───┐          ┌───┴───┐
│intreccio│          │horror│          │lessema│
└─────────┘          └──────┘          └───────┘
Authorization
Server administrators must be authorized before they can do a #Server login on the ⚙️ horror backup server. To be authorized:
- You need
- a good reason
- for example #Add a project under the backup umbrella
- for example #Disaster recovery
- Unix-like sysadmin experience
- Instructions
Request access policy:
Server login
Access to the backup server is exclusively via SSH login. There are no other forms of access, since SSH is considered the most secure option available. To log in:
- You need
- #Authorization
- SSH experience
- Instructions
Just log in via SSH using the username assigned to you during your #Authorization process:
ssh name-surname@horror.wikimedia.it
If it doesn't work, stop immediately and repeat #Authorization.
Do not make random attempts, or you may be blocked, reported, fired or even sued. Your life can be terminated by an AI.
Change user
If you have a personal #Server login with enough privileges and you need to change user, use sudo:
sudo --login --user=ANOTHER_USER
Then you can do anything as that user, for example:
crontab -l
This is useful if you want to test a specific pull backup user.
Filesystem overview
You can explore the filesystem only after #Server login. All recent backups are here:
/var/backups/wmi
/var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi/lessema.wikimedia.it
/var/backups/wmi/...
Older copies can be reached by adding a numeric suffix. For example, the 2-day-old backups are here:
/var/backups/wmi.2
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.2/lessema.wikimedia.it
/var/backups/wmi.2/...
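The numeric-suffix convention can be sketched as a small shell helper. This is only an illustration: the function name backup_root is hypothetical and is not installed on the server. It assumes that an age of 0 days means the latest copy, with no suffix.

```shell
# Hypothetical helper (not part of the server's tooling):
# print the backup root for a copy that is DAYS days old.
# 0 means the latest copy; N > 0 maps to the rotated copy /var/backups/wmi.N.
backup_root() {
    days="${1:-0}"
    case "$days" in
        # Reject anything that is not a plain non-negative integer.
        *[!0-9]*) echo "usage: backup_root DAYS" >&2; return 2 ;;
    esac
    if [ "$days" -eq 0 ]; then
        printf '%s\n' /var/backups/wmi
    else
        printf '%s\n' "/var/backups/wmi.$days"
    fi
}
```

For example, ls -l "$(backup_root 2)" would list the projects available in the 2-day-old copy.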
Note that each sub-directory can be accessed only by its dedicated user.
For example the user intreccio MUST be the only one able to write in this position:
/var/backups/wmi/intreccio.wikimedia.it
For example the user intreccio MUST NOT be able to read/write old copies:
/var/backups/wmi.1/intreccio.wikimedia.it
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.3/intreccio.wikimedia.it
Filesystem policy
The filesystem rule is the standard one in Unix-like systems: give as few privileges as possible.
Here is a summary of the main filesystem pathnames:
Path | owner:group | Permissions | Description |
---|---|---|---|
/var/backups/wmi*/ | root:root | 755 | Everyone should be allowed to list its sub-directories, to see the latest available backups. Note: you may be allowed to list the sub-directories, but by default you must not be allowed to access them. |
/var/backups/wmi*/project | project:project | 750 | The user project must be the only one allowed to access its sub-directory. |
Note: the location /var/backups/wmi is automatically rotated to /var/backups/wmi.1 etc., and the oldest copy is automatically deleted. Permissions are preserved.
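The policy table can be checked mechanically. The following is a minimal sketch, assuming GNU coreutils stat (the -c option) is available, as on typical Linux servers. The function name check_backup_dir is hypothetical.

```shell
# Hypothetical checker (not part of the server's tooling).
# Usage: check_backup_dir DIR OWNER:GROUP MODE
# Assumes GNU coreutils `stat`, which supports the -c format option.
check_backup_dir() {
    dir="$1"
    expected_owner="$2"
    expected_mode="$3"
    # %U:%G prints the owner and group names, %a the octal permission bits.
    actual_owner="$(stat -c '%U:%G' "$dir")" || return 2
    actual_mode="$(stat -c '%a' "$dir")" || return 2
    if [ "$actual_owner" != "$expected_owner" ] || [ "$actual_mode" != "$expected_mode" ]; then
        echo "$dir: found $actual_owner $actual_mode, expected $expected_owner $expected_mode" >&2
        return 1
    fi
}
```

For example, check_backup_dir /var/backups/wmi root:root 755 should succeed silently if the first row of the table is respected.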
Add a project under the backup umbrella
- You need
- a good understanding of what data needs to be saved
- a good understanding of how to transfer that data (e.g. rsync + SSH)
- #Server login
- Instructions
In short, you just need to create a directory on server ⚙️ horror and a dedicated user able to read/write in that directory. Then you can push backups to that directory.
Some pseudo-instructions, to be executed on server ⚙️ horror, to create a new project foo and add it under the backup umbrella:
USERNAME=foo
PROJECT=fooproject
sudo adduser --disabled-password "$USERNAME"
sudo mkdir --parents /var/backups/wmi/"$PROJECT"
sudo chown "$USERNAME":"$USERNAME" /var/backups/wmi/"$PROJECT"
# mode 750, as listed in the #Filesystem policy table
sudo chmod 750 /var/backups/wmi/"$PROJECT"
The final purpose is to execute this command daily from your server foo to push your backups to server horror:
rsync /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject
You can also execute this command daily from server horror to pull data from server foo:
rsync mysource@myserver:/my/source/path /var/backups/wmi/fooproject
- Schedule time policy
Your backup logic can write to the backup location during these periods:
- 12:00-23:59 Europe/Rome
- 00:00-04:59 Europe/Rome
You must not write there during this period, otherwise you may collide with the rotation logic:
- 05:00-11:59 Europe/Rome
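A backup script can guard itself against the rotation window. This is a minimal sketch; the function names are hypothetical. It assumes the forbidden window covers the hours from 05 to 11 inclusive, Europe/Rome time, so that 12:00 is already writable, as in the first allowed period above.

```shell
# Hypothetical guard (not part of the server's tooling).
# Succeeds when HOUR (0-23, Europe/Rome) is outside the rotation window,
# assumed here to cover hours 05 to 11 inclusive.
may_write_at_hour() {
    # Strip one leading zero ("08" -> "8"); an empty result means hour 0.
    hour="${1#0}"
    hour="${hour:-0}"
    [ "$hour" -lt 5 ] || [ "$hour" -ge 12 ]
}

# Print the current hour (00-23) in the Europe/Rome time zone.
# Requires the time zone database to be installed on the machine.
current_rome_hour() {
    TZ=Europe/Rome date +%H
}
```

A push script could then start with a check such as: may_write_at_hour "$(current_rome_hour)" || exit 0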
- Available backup tools
- rsync
- rclone
- mysqldump
- https://gitpull.it/source/micro-backup-script/ (just a stupid script that encapsulates those above)
- ...
- Success checklist
- your data is saved (by you, or by your new crontab rule) at midnight in /var/backups/wmi/fooproject
- your data is automatically rotated to /var/backups/wmi.1/fooproject the next day
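The first item of the checklist can be verified with a freshness check. This is only a sketch under assumptions: the function name has_fresh_backup is hypothetical, and "fresh" is taken to mean that at least one regular file was modified less than 24 hours ago.

```shell
# Hypothetical check (not part of the server's tooling).
# Succeeds if DIR contains at least one regular file modified
# less than 24 hours ago.
has_fresh_backup() {
    dir="$1"
    # -mtime -1 matches files whose age, in whole days, is less than 1.
    [ -n "$(find "$dir" -type f -mtime -1 -print 2>/dev/null | head -n 1)" ]
}
```

For example, run as the project's dedicated user: has_fresh_backup /var/backups/wmi/fooproject || echo "no recent backup found"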
Disaster recovery
- You need
- an understanding of whether the Unix user pushing backups has been compromised; if so, DISABLE IT IMMEDIATELY and DISABLE ALL SSH KEYS of that user
- a good understanding of what data is to be recovered and from what date
- check if the provider has native backups/snapshots (if yes, try to use them: they may be simpler to restore)
- check if there are on-site backups (if yes, try to use them: they may be more up to date)
- #Server login
- Recovery Instructions
- please create a public Task in phabricator:tag/wmit-infrastructure/ to briefly describe the incident, and notify Infrastruttura
- do a #Server login
- explore the filesystem to find the most relevant backup
- Example for latest copy:
ls -l /var/backups/wmi
- Example for 13 days ago:
ls -l /var/backups/wmi.13
- just use standard utilities to download the needed data
- Example:
rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz ./my-destination/
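After the download, it may be worth checking that a compressed dump is not truncated or corrupted before restoring it. This is a minimal sketch: the function name verify_gzip is hypothetical, and it only tests the gzip container, not the SQL content inside it.

```shell
# Hypothetical check (not part of the server's tooling).
# Succeeds if FILE is a complete, readable gzip archive.
verify_gzip() {
    # -t tests the compressed file integrity without writing any output.
    gzip -t -- "$1" 2>/dev/null
}
```

For example: verify_gzip ./my-destination/matomo.sql.gz && echo "archive looks intact"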