Server/horror/Technical documentation

From Wikimedia Italia.
Page related to the server ⚙️ horror

This page is the public technical documentation for the server ⚙️ horror, which is dedicated to off-site backups and is useful for #Disaster recovery.

In short

In short, the server ⚙️ horror can receive an additional off-site backup from other servers. These off-site copies are kept for multiple days.

Each server pushes to ⚙️ horror whatever it wants saved off-site. The server ⚙️ horror therefore does not decide what is saved.

Example of nightly transfer activity:

┌─────────┐
│intreccio│ (push)
└────┬────┘
     ↓
┌─────────┐
│ horror  │ (receiver)
└─────────┘
     ↑
┌────┴────┐
│lessema  │ (push)
└─────────┘

Authorization

Server administrators must be authorized before they can do a #Server login on the ⚙️ horror backup server. To be authorized:

You need
  1. a good reason
    for example #Add a project under the backup umbrella
    for example #Disaster recovery
  2. Unix-like sysadmin experience
Instructions

Request access policy:

Authorized Users

Authorized server operators in ⚙️ horror:

List of SSH usernames and users:

If the people above are unavailable, contact a superadmin of the service provider and another trusted server administrator to recover access (if necessary, ask the provider's support for help):

Server login

Access to the backup server is exclusively via SSH login. No other form of access is available, since SSH is a secure and well-established method. To log in:

You need
  1. #Authorization
  2. SSH experience
Instructions

Just log in via SSH using the username assigned to you during your #Authorization process:

ssh name-surname@horror.wikimedia.it

If it doesn't work, stop immediately and repeat #Authorization.

Do not make random attempts, or you may be blocked, reported, fired or even sued. Your life can be terminated by an AI.
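
To avoid repeated attempts, a single non-interactive test can tell whether your key is accepted. This is only a sketch: the function name try_login and the SSH variable are hypothetical, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Sketch: one non-interactive login test against horror.
# BatchMode makes ssh fail at once instead of prompting for a password,
# so a broken setup gives one clean failure and no repeated attempts.
SSH=${SSH:-ssh}    # hypothetical override, e.g. to try the function offline

try_login() {
    user=$1        # the username assigned during #Authorization
    "$SSH" -o BatchMode=yes -o ConnectTimeout=10 "$user@horror.wikimedia.it" true
}
```

Calling try_login name-surname returns success only if the login works. If it fails, stop and go back to #Authorization.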

Change user

If you have a personal #Server login with enough privileges and you need to change user, use sudo:

sudo --login --user=ANOTHER_USER

Then you can do anything as that user, for example:

crontab -l

This is useful if you want to test a specific pull backup user.

Root user

The root user should not be used in normal conditions.

During an emergency, you can use sudo to add your SSH keys in the usual location:

/root/.ssh/authorized_keys
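
A minimal sketch of that step, assuming your public key has already been copied to a file on horror. The function name add_root_key and the SUDO variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: append a public key to root's authorized_keys through sudo.
# SUDO can be emptied (SUDO=) to try the function on a scratch file.
add_root_key() {
    pubkey=$1                                  # file holding your public key
    target=${2:-/root/.ssh/authorized_keys}    # location documented above
    # "sudo tee -a" is used because a plain ">>" redirection would be
    # opened by the calling shell, which cannot write under /root.
    ${SUDO-sudo} tee -a "$target" < "$pubkey" > /dev/null
}
```

For example, add_root_key ~/my-key.pub appends the content of the (hypothetical) file ~/my-key.pub. Remember to remove the key again when the emergency is over.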

Filesystem overview

You can explore the filesystem only after #Server login. All recent backups are here:

/var/backups/wmi
/var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi/lessema.wikimedia.it
/var/backups/wmi/...

Older copies are found by adding a numeric suffix. For example, the 2-day-old backups are here:

/var/backups/wmi.2
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.2/lessema.wikimedia.it
/var/backups/wmi.2/...
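
The suffix rule can be captured in a small helper. This is only a sketch: the function name backup_path is hypothetical and not part of the server's tooling. It just builds the pathname from a number of days and a sub-directory name.

```shell
#!/bin/sh
# Hypothetical helper: print the backup directory of a project as it was
# N days ago. 0 means the latest copy, without suffix.
backup_path() {
    days=$1      # how many days back (0 = latest)
    project=$2   # sub-directory name, e.g. intreccio.wikimedia.it
    case "$days" in
        ''|*[!0-9]*) echo "days must be a non-negative integer" >&2; return 1 ;;
        0) printf '%s\n' "/var/backups/wmi/$project" ;;
        *) printf '%s\n' "/var/backups/wmi.$days/$project" ;;
    esac
}

backup_path 0 intreccio.wikimedia.it   # prints /var/backups/wmi/intreccio.wikimedia.it
backup_path 2 intreccio.wikimedia.it   # prints /var/backups/wmi.2/intreccio.wikimedia.it
```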

Note that each sub-directory can be accessed only by its dedicated user.

For example, the user intreccio MUST be the only one able to write in this location:

/var/backups/wmi/intreccio.wikimedia.it

Likewise, the user intreccio MUST NOT be able to read or write old copies:

/var/backups/wmi.1/intreccio.wikimedia.it
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.3/intreccio.wikimedia.it

Filesystem policy

The filesystem rule is the standard one in Unix-like systems: give as few privileges as possible.

Here is a summary of the main filesystem pathnames:

Path | owner:group | Permissions | Description
/var/backups/wmi*/ | root:root | 755 | Everyone should be allowed to list its sub-directories, to see the latest available backups. Note: you may be allowed to list the sub-directories, but by default you must not be allowed to enter them.
/var/backups/wmi*/project | project:project | 750 | The user project must be the only one allowed to access its sub-directory.

Note: the location /var/backups/wmi is automatically rotated to /var/backups/wmi.1 etc. and the oldest copy is automatically deleted. Permissions are preserved.
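
The policy can be verified with a few lines of shell. The following is a sketch, not a tool installed on horror: the function name check_dir is hypothetical, and it assumes a stat command that supports -c (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# Sketch: compare a directory's owner:group and mode with expected values.
check_dir() {
    dir=$1; want_owner=$2; want_mode=$3
    # %U:%G = owner and group names, %a = permissions in octal
    actual=$(stat -c '%U:%G %a' "$dir") || return 1
    if [ "$actual" = "$want_owner $want_mode" ]; then
        echo "OK   $dir ($actual)"
    else
        echo "FAIL $dir: found $actual, expected $want_owner $want_mode"
        return 1
    fi
}

# Example: check every rotation directory that exists on this machine.
for d in /var/backups/wmi /var/backups/wmi.[0-9]*; do
    if [ -d "$d" ]; then check_dir "$d" root:root 755; fi
done
```

A project directory can be checked in the same way, for example check_dir /var/backups/wmi/fooproject foo:foo 750.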

Add a project under the backup umbrella

You need
  1. a good understanding of what data needs to be saved
  2. a good understanding of how to transfer that data (e.g. rsync + SSH)
  3. #Server login
Instructions

In short, you just need to create a directory on server ⚙️ horror and a dedicated user able to read and write in that directory. Then you can push backups to that directory.

Some pseudo-instructions, to be executed on server ⚙️ horror, to create a new user foo and its project fooproject under the backup umbrella:

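USERNAME=foo          # dedicated Unix user of the project
PROJECT=fooproject    # sub-directory name under /var/backups/wmi

# Create the user without a password (login is still possible with SSH keys)
sudo adduser --disabled-password "$USERNAME"

# Create the backup directory and apply the #Filesystem policy
sudo mkdir --parents               /var/backups/wmi/"$PROJECT"
sudo chown "$USERNAME":"$USERNAME" /var/backups/wmi/"$PROJECT"
sudo chmod 750                     /var/backups/wmi/"$PROJECT"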

The final goal is to run this command daily from your server foo to push your backups to server horror:

rsync --archive /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject

You can also run this command daily on server horror to pull data from server foo:

rsync --archive mysource@myserver:/my/source/path /var/backups/wmi/fooproject
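
The daily push can be wrapped in a small script for cron. This is a sketch under assumptions: the function name push_backup, the RSYNC/SOURCE/DEST variables and the "run" argument are hypothetical, and the paths are the placeholders used above.

```shell
#!/bin/sh
# Sketch of a nightly push, meant to be started by cron on server foo.
RSYNC=${RSYNC:-rsync}    # overridable, e.g. for a test run
SOURCE=${SOURCE:-/my/source/path}
DEST=${DEST:-foo@horror.wikimedia.it:/var/backups/wmi/fooproject}

push_backup() {
    # --archive recurses into directories and keeps permissions,
    # timestamps and symlinks (and ownership, when possible).
    "$RSYNC" --archive "$SOURCE" "$DEST"
}

# Transfer only when explicitly asked, so the file can also be sourced.
if [ "${1:-}" = run ]; then
    push_backup
fi
```

A crontab line such as 0 1 * * * /usr/local/bin/push-backup.sh run (hypothetical path) would start it at 01:00 in the timezone of server foo. Check that this falls inside the allowed period of the #Schedule time policy.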
Schedule time policy

Your backup logic can write to the backup location during these periods:

  • 12:00-23:59 Europe/Rome
  • 00:00-04:59 Europe/Rome

You must not write there during this period, otherwise you may collide with the rotation logic:

  • 05:00-11:59 Europe/Rome
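
A backup script can protect itself against a badly timed run by checking the hour first. The following is a sketch: the function name write_allowed is hypothetical, and 12:00 is treated as allowed, as in the first list.

```shell
#!/bin/sh
# Sketch: succeed only when the given hour (00-23, Europe/Rome) is inside
# the window in which writing to the backup location is allowed.
write_allowed() {
    case "$1" in
        05|06|07|08|09|10|11) return 1 ;;   # rotation window: do not write
        *) return 0 ;;
    esac
}

# Current hour in Rome, whatever the local timezone of the server is.
hour=$(TZ=Europe/Rome date +%H)
if write_allowed "$hour"; then
    echo "hour $hour: writing is allowed"
else
    echo "hour $hour: rotation window, do not write"
fi
```

The hour is compared as a string, so values with a leading zero such as 08 cause no problems.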
Available backup tools
Success checklist
  1. your data is saved (by you, or by your new crontab rule) at midnight in /var/backups/wmi/fooproject
  2. your data is automatically rotated to /var/backups/wmi.1/fooproject on the next day
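
Both items can be checked on horror with a short script. This is a sketch: the function name check_backup is hypothetical, and "fresh" is assumed to mean at least one file modified during the last 24 hours.

```shell
#!/bin/sh
# Sketch: verify the two checklist items for one project.
check_backup() {
    project=$1
    base=${2:-/var/backups/wmi}    # backup root, default as documented
    status=0
    # 1. at least one file was written during the last 24 hours
    if [ -n "$(find "$base/$project" -type f -mtime -1 2>/dev/null | head -n 1)" ]; then
        echo "OK   fresh data in $base/$project"
    else
        echo "FAIL no file newer than 24h in $base/$project"
        status=1
    fi
    # 2. the previous copy has been rotated
    if [ -d "$base.1/$project" ]; then
        echo "OK   rotated copy in $base.1/$project"
    else
        echo "FAIL missing $base.1/$project"
        status=1
    fi
    return $status
}
```

Run check_backup fooproject as the project user (see #Change user), since only that user can read inside the project directory.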

Disaster recovery

You need
  1. to understand whether the Unix user pushing backups has been compromised; in that case, DISABLE IT IMMEDIATELY and DISABLE ALL SSH KEYS of that user
  2. a good understanding of what data is to be recovered and from what date
  3. to check whether the provider has native backups/snapshots (if yes, try to use them: they may be simpler to restore)
  4. to check whether there are on-site backups (if yes, try to use them: they may be more up to date)
  5. #Server login
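
Disabling a compromised user, as required by the first point, could look like the sketch below. It rests on assumptions: the home directory is /home/USER (the adduser default), the keys are in the default authorized_keys file, and the names run and disable_user are hypothetical. Set DRY_RUN=1 to print the commands without running them.

```shell
#!/bin/sh
# Sketch: lock out a compromised backup user on horror.
run() {
    # With DRY_RUN=1 only print the command, otherwise execute it.
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

disable_user() {
    user=$1
    # Expire the account, so that new logins are refused (SSH keys too)
    run sudo usermod --expiredate 1 "$user"
    # Put the SSH keys out of use, but keep the file for later analysis
    run sudo mv "/home/$user/.ssh/authorized_keys" "/home/$user/.ssh/authorized_keys.disabled"
    # Terminate the processes of that user that are still running
    run sudo pkill -KILL -u "$user"
}
```

For example, DRY_RUN=1 followed by disable_user foo shows the three commands for the user foo.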
Recovery Instructions
  1. please create a public Task in phabricator:tag/wmit-infrastructure/ to describe the incident briefly, and notify Infrastruttura
  2. do a #Server login
  3. make sure you are able to become the #Root user
  4. explore the filesystem to find the most relevant backup
    Example for latest copy:
    ls -l /var/backups/wmi
    Example for 13 days ago:
    ls -l /var/backups/wmi.13
  5. just use standard utilities to download the needed data
    Example:
    rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz ./my-destination/
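
To choose the right date, it helps to see every rotated copy of the wanted file with its size and timestamp. This is a sketch: the function name list_copies is hypothetical, and it assumes the rotation suffixes are consecutive (wmi.1, wmi.2, ...).

```shell
#!/bin/sh
# Sketch: list every rotated copy of one file, newest first.
# Run it on horror as a user allowed to read the project (see #Root user).
list_copies() {
    relpath=$1                     # path below the backup root
    base=${2:-/var/backups/wmi}    # backup root, default as documented
    if [ -e "$base/$relpath" ]; then ls -l "$base/$relpath"; fi
    i=1
    while [ -d "$base.$i" ]; do
        if [ -e "$base.$i/$relpath" ]; then ls -l "$base.$i/$relpath"; fi
        i=$((i + 1))
    done
}
```

For example, list_copies intreccio.wikimedia.it/daily/databases/matomo.sql.gz shows one line per day in which that dump exists. After the download, gzip --test matomo.sql.gz checks that the archive is not corrupted.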