Difference between revisions of "Server/horror/Technical documentation"

From Wikimedia Italia.
m (horror)
(add random info)
 
(17 intermediate revisions by the same user are not shown)
Line 1: Line 1:
 
{{Server|horror}}
 
{{Server|horror}}
Public technical documentation for the server {{Server link|horror}}, dedicated to ''off-site'' backups.
+
This page is the public technical documentation for the server {{Server link|horror}}, dedicated to ''off-site'' backups, useful for a [[#Disaster recovery]].
  
== Server access ==
+
== In short ==
  
Server administrators can be authorized to enter with a dedicated account using SSH.
+
In short the server {{Server link|horror}} can ''receive'' an additional off-site backup from other servers. These off-site copies are kept and maintained for multiple days.
  
; You need: A good reason.
+
Each server ''pushes'' to server {{Server link|horror}} whatever should be saved off-site. So the server {{Server link|horror}} does ''not'' decide what is saved.
  
; Instructions
+
Example of nightly transfer activity:
  
 
<pre>
 
<pre>
ssh name-surname@horror.wikimedia.it
+
┌─────────┐
 +
│intreccio│ (push)
 +
└────┬────┘
 +
    ↓
 +
┌─────────┐
 +
│ horror  │ (receiver)
 +
└─────────┘
 +
    ↑
 +
┌────┴────┐
 +
│lessema  │ (push)
 +
└─────────┘
 
</pre>
 
</pre>
  
Be sure to be authorized before trying. Do not try random attempts or you will be blocked.
+
== Authorization ==
 +
 
 +
Server administrators must be authorized before they can perform a [[#Server login]] on the {{Server link|horror}} backup server. To be authorized:
 +
 
 +
; You need:
 +
 
 +
# a good reason
 +
#: for example [[#Add a project under the backup umbrella]]
 +
#: for example [[#Disaster recovery]]
 +
# Unix-like sysadmin experience
 +
 
 +
; Instructions
  
 
Request access policy:
 
Request access policy:
Line 20: Line 41:
 
* https://wiki.wikimedia.it/wiki/Infrastruttura
 
* https://wiki.wikimedia.it/wiki/Infrastruttura
  
== Overview ==
+
== Authorized Users ==
 +
 
 +
Authorized server operators in {{Server link|horror}}:
 +
 
 +
* [[:Categoria:Accessi/Server horror/sistemisti]]
 +
 
 +
List of SSH usernames and users:
 +
 
 +
* <code>valerio-bozzolan</code> - [[User:Valerio Bozzolan]]
 +
* <s><code>anylink-...</code> - [[m:User:DavideCuteri-WMIT]]</s>
 +
 
 +
If the people listed above are no longer available, contact a superadmin of the service provider and another trusted server administrator to recover access (if necessary, ask the provider's support for help):
 +
 
 +
* [[:Categoria:Accessi/Fornitore ctb/superadmin]]
 +
 
 +
== Server login ==
 +
 
 +
Access to the backup server is exclusively via SSH login. There are <u>no</u> other forms of access, since SSH is a well-established secure method. To log in:
  
A system administrator with [[#Server access]] and enough privileges can login in the server via SSH.
+
; You need
  
; You need: sysadmin experience with GNU/Linux.
+
# [[#Authorization]]
 +
# SSH experience
  
 
; Instructions
 
; Instructions
  
All recent backups are here:
+
Just log in via SSH using the username we assigned to you during your [[#Authorization]] process:
 +
 
 +
ssh ''name-surname''@horror.wikimedia.it
 +
 
 +
If it doesn't work, stop <u>immediately</u> and repeat [[#Authorization]].
 +
 
 +
Do not make random attempts, or you may be blocked, reported, fired or even sued. Your life can be terminated by an AI.
 +
 
 +
== Change user ==
 +
 
 +
If you have a personal [[#Server login]] with enough privileges and you need to change user, use sudo:
 +
 
 +
sudo --login --user=ANOTHER_USER
 +
 
 +
Then you can do anything as that user, for example:
 +
 
 +
crontab -l
 +
 
 +
This is useful if you want to test a specific ''pull'' backup user.
 +
 
 +
== Root user ==
 +
 
 +
The root user should not be used in normal conditions.
 +
 
 +
During an emergency, you can use sudo to add your SSH public key to the usual location:
 +
 
 +
/root/.ssh/authorized_keys
 +
 
 +
== Filesystem overview ==
 +
 
 +
You can explore the filesystem only after [[#Server login]]. All recent backups are here:
  
 
  /var/backups/wmi
 
  /var/backups/wmi
 +
/var/backups/wmi/intreccio.wikimedia.it
 +
/var/backups/wmi/lessema.wikimedia.it
 +
/var/backups/wmi/...
  
 
Older copies can be reached by adding a numeric suffix. For example, the 2-day-old backups are here:
 
Older copies can be reached by adding a numeric suffix. For example, the 2-day-old backups are here:
  
 
  /var/backups/wmi.2
 
  /var/backups/wmi.2
 +
/var/backups/wmi.2/intreccio.wikimedia.it
 +
/var/backups/wmi.2/lessema.wikimedia.it
 +
/var/backups/wmi.2/...
  
 
Note that each sub-directory can be accessed only by its dedicated user.
 
Note that each sub-directory can be accessed only by its dedicated user.
  
For example all of these are owned by the user <code>lessema</code>:
+
For example, the user <code>intreccio</code> MUST be the only one able to write to this location:
  
  /var/backups/wmi/lessema.wikimedia.it
+
  /var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi.1/lessema.wikimedia.it
 
/var/backups/wmi.2/lessema.wikimedia.it
 
/var/backups/wmi.3/lessema.wikimedia.it
 
  
So to get these copies do something like this:
+
Likewise, the user <code>intreccio</code> MUST NOT be able to read or write old copies:
  
  rsync lessema@horror.wikimedia.it:/var/backups/wmi/lessema.wikimedia.it .
+
  /var/backups/wmi.1/intreccio.wikimedia.it
 +
/var/backups/wmi.2/intreccio.wikimedia.it
 +
/var/backups/wmi.3/intreccio.wikimedia.it
  
If it does not work, make sure to have the right [[#Server access]] privileges.
+
== Filesystem policy ==
  
== Filesystem policies ==
+
The filesystem rule is the standard one in Unix-like systems: give as <u>few</u> privileges as possible.
  
Here is a summary of the main filesystem pathnames
+
Here is a summary of the main filesystem pathnames:
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 61: Line 135:
 
! Description
 
! Description
 
|-
 
|-
| /var/backups/wmi
+
| /var/backups/wmi*/
 
| root:root
 
| root:root
 
| 755
 
| 755
 
| Everyone should be allowed to list its sub-directories to list the available latest backups.
 
| Everyone should be allowed to list its sub-directories to list the available latest backups.
  
* Note: You may be allowed to list sub-directories but you are not allowed to access them as default.
+
Note: You may be allowed to list sub-directories, but you must not be allowed to access them by default.
|-
 
| /var/backups/wmi.*
 
| root:root
 
| 750
 
| Everyone should be allowed to list its sub-directories to know the available old backups.
 
 
 
* Note: You may be allowed to list sub-directories but you are not allowed to access them as default.
 
 
|-
 
|-
| /var/backups/wmi/''project''
+
| /var/backups/wmi*/''project''
 
| ''project'':''project''
 
| ''project'':''project''
 
| 750
 
| 750
 
| The user ''project'' must be the only one allowed to access in its sub-directory.
 
| The user ''project'' must be the only one allowed to access in its sub-directory.
 
|}
 
|}
 +
 +
Note: the location <code>/var/backups/wmi</code> is automatically rotated to <code>/var/backups/wmi.1</code> and so on, and the oldest copy is automatically deleted. Permissions are preserved.
  
 
== Add a project under the backup umbrella ==
 
== Add a project under the backup umbrella ==
Line 85: Line 154:
 
; You need
 
; You need
  
* a GNU/Linux server (''foo'') with some files to be saved
+
# a good understanding of what data needs to be saved
* SSH access to server ''foo''  
+
# a good understanding of how to transfer that data (e.g. ''rsync + SSH'')
* SSH access to server {{Server link|horror}} ([[#Server access]]) and <code>sudo</code>
+
# [[#Server login]]
* knowledge of SSH keys
 
* knowledge of data transfers over SSH (e.g. using rsync)
 
  
 
; Instructions
 
; Instructions
Line 99: Line 166:
 
<pre>
 
<pre>
 
USERNAME=foo
 
USERNAME=foo
PROJECT=foo.wikimedia.it
+
PROJECT=fooproject
  
 
sudo adduser --disabled-password $USERNAME
 
sudo adduser --disabled-password $USERNAME
Line 107: Line 174:
 
</pre>
 
</pre>
  
The final purpose is to execute this command daily <u>from you server ''foo''</u>:
+
The final goal is to execute this command daily <u>from your server ''foo''</u> to <u>push</u> your backups to server horror:
 +
 
 +
rsync /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject
 +
 
 +
You can also execute this command daily <u>from server ''horror''</u> to <u>pull</u> data from server ''foo'':
 +
 
 +
  rsync mysource@myserver:/my/source/path /var/backups/wmi/fooproject
  
rsync /my/important/pathname foo@horror.wikimedia.it:/var/backups/wmi/foo.wikimedia.it
+
; Schedule time policy
  
For example using a crontab.
+
Your backup logic may write to the backup location during these periods:
  
It's that simple.
+
* 12:00-23:59 Europe/Rome
 +
* 00:00-04:59 Europe/Rome
  
If want to have  don't want to manually run an rsync to push backups but you want some syntax sugar or you want to also do dumps or start/stop services, here some useful backup scripts which can be used to make on-site backups, and then send the copy to server {{Server link|horror}}:
+
You <u>must not</u> write there during this period, otherwise you may collide with the rotation logic:
  
* https://gitpull.it/source/micro-backup-script/
+
* 05:00-11:59 Europe/Rome
 +
 
 +
; Available backup tools
 +
 
 +
* rsync
 +
* rclone
 +
* mysqldump
 +
* https://gitpull.it/source/micro-backup-script/ (just a stupid script that encapsulates those above)
 
* ...
 
* ...
 +
 +
; Success checklist
 +
 +
# your data is saved (by you, or by your new crontab rule) at midnight in <code>/var/backups/wmi/''fooproject''</code>
 +
# your data is automatically rotated to <code>/var/backups/wmi.1/''fooproject''</code> on the next day
 +
 +
== Disaster recovery ==
 +
 +
; You need
 +
 +
# an understanding of whether the Unix user pushing backups has been compromised; if so, DISABLE IT IMMEDIATELY and DISABLE ALL SSH KEYS of that user
 +
# a good understanding of what data is to be recovered and from what date
 +
# check whether the provider has native backups/snapshots (if yes, try to use them, as they may be simpler to restore)
 +
# check if there are on-site backups (if yes, try to use them - they may be more up to date)
 +
# [[#Server login]]
 +
 +
; Recovery Instructions
 +
 +
# please create a public Task in [[phabricator:tag/wmit-infrastructure/]] to briefly describe the incident, and notify [[Infrastruttura]]
 +
# do a [[#Server login]]
 +
# make sure you can become [[#Root user]]
 +
# explore the filesystem to find the most relevant backup
 +
#: Example for latest copy:
 +
#: <code>ls -l /var/backups/wmi</code>
 +
#: Example for 13 days ago:
 +
#: <code>ls -l /var/backups/wmi.13</code>
 +
# just use standard utilities to download the needed data
 +
#: Example:
 +
#: <code>rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz  ./my-destination/</code>
  
 
[[Categoria:Documentazione tecnica|horror]]
 
[[Categoria:Documentazione tecnica|horror]]

Latest revision as of 17:19, 16 May 2024

Page related to the server ⚙️ horror

This page is the public technical documentation for the server ⚙️ horror, dedicated to off-site backups, useful for a #Disaster recovery.

In short

In short the server ⚙️ horror can receive an additional off-site backup from other servers. These off-site copies are kept and maintained for multiple days.

Each server pushes to server ⚙️ horror whatever should be saved off-site. So the server ⚙️ horror does not decide what is saved.

Example of nightly transfer activity:

┌─────────┐
│intreccio│ (push)
└────┬────┘
     ↓
┌─────────┐
│ horror  │ (receiver)
└─────────┘
     ↑
┌────┴────┐
│lessema  │ (push)
└─────────┘

Authorization

Server administrators must be authorized before they can perform a #Server login on the ⚙️ horror backup server. To be authorized:

You need
  1. a good reason
    for example #Add a project under the backup umbrella
    for example #Disaster recovery
  2. Unix-like sysadmin experience
Instructions

Request access policy:

Authorized Users

Authorized server operators in ⚙️ horror:

List of SSH usernames and users:

If the people listed above are no longer available, contact a superadmin of the service provider and another trusted server administrator to recover access (if necessary, ask the provider's support for help):

Server login

Access to the backup server is exclusively via SSH login. There are no other forms of access, since SSH is a well-established secure method. To log in:

You need
  1. #Authorization
  2. SSH experience
Instructions

Just log in via SSH using the username we assigned to you during your #Authorization process:

ssh name-surname@horror.wikimedia.it

If it doesn't work, stop immediately and repeat #Authorization.

Do not make random attempts, or you may be blocked, reported, fired or even sued. Your life can be terminated by an AI.

Change user

If you have a personal #Server login with enough privileges and you need to change user, use sudo:

sudo --login --user=ANOTHER_USER

Then you can do anything as that user, for example:

crontab -l

This is useful if you want to test a specific pull backup user.

Root user

The root user should not be used in normal conditions.

During an emergency, you can use sudo to add your SSH public key to the usual location:

/root/.ssh/authorized_keys
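
A minimal sketch of that emergency step, assuming it runs in a root shell and that your public key file was copied to the server beforehand. The function name add_key_once and the key path in the usage line are illustrative only and are not part of the server setup:

```shell
# Append a public key to an authorized_keys file unless it is already listed.
# Usage: add_key_once PUBKEY_FILE AUTHORIZED_KEYS_FILE
add_key_once() {
    pubkey_file=$1
    auth_file=$2
    key=$(cat "$pubkey_file") || return 1

    # -q quiet, -x match the whole line, -F treat the key as a fixed string
    if ! grep -qxF -- "$key" "$auth_file" 2>/dev/null; then
        # if the file does not end with a newline, add one first so the
        # new key does not get glued to the last existing line
        if [ -s "$auth_file" ] && [ -n "$(tail -c 1 "$auth_file")" ]; then
            printf '\n' >> "$auth_file"
        fi
        printf '%s\n' "$key" >> "$auth_file"
    fi

    # sshd may ignore key files that are readable or writable by others
    chmod 600 "$auth_file"
}
```

Hypothetical usage: add_key_once /tmp/name-surname.pub /root/.ssh/authorized_keys. Calling it twice with the same key leaves a single entry.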

Filesystem overview

You can explore the filesystem only after #Server login. All recent backups are here:

/var/backups/wmi
/var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi/lessema.wikimedia.it
/var/backups/wmi/...

Older copies can be reached by adding a numeric suffix. For example, the 2-day-old backups are here:

/var/backups/wmi.2
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.2/lessema.wikimedia.it
/var/backups/wmi.2/...
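
The naming scheme above can be sketched as a small helper. This is only an illustration: the function name backup_path is hypothetical, and it assumes that an age of 0 days means the most recent copy in /var/backups/wmi:

```shell
# Print the directory that holds the copy of PROJECT that is DAYS days old.
# Usage: backup_path DAYS PROJECT
backup_path() {
    days=$1
    project=$2
    if [ "$days" -eq 0 ]; then
        # the most recent copy has no numeric suffix
        printf '/var/backups/wmi/%s\n' "$project"
    else
        # older copies carry their age as a suffix: wmi.1, wmi.2, ...
        printf '/var/backups/wmi.%s/%s\n' "$days" "$project"
    fi
}
```

For example, backup_path 2 lessema.wikimedia.it prints /var/backups/wmi.2/lessema.wikimedia.it.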

Note that each sub-directory can be accessed only by its dedicated user.

For example, the user intreccio MUST be the only one able to write to this location:

/var/backups/wmi/intreccio.wikimedia.it

Likewise, the user intreccio MUST NOT be able to read or write old copies:

/var/backups/wmi.1/intreccio.wikimedia.it
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.3/intreccio.wikimedia.it

Filesystem policy

The filesystem rule is the standard one in Unix-like systems: give as few privileges as possible.

Here is a summary of the main filesystem pathnames:

{| class="wikitable"
! Path
! owner:group
! Permissions
! Description
|-
| /var/backups/wmi*/
| root:root
| 755
| Everyone should be allowed to list its sub-directories to see which backups are available.
Note: You may be allowed to list sub-directories, but you must not be allowed to access them by default.
|-
| /var/backups/wmi*/project
| project:project
| 750
| The user project must be the only one allowed to access its sub-directory.
|}

Note: the location /var/backups/wmi is automatically rotated to /var/backups/wmi.1 and so on, and the oldest copy is automatically deleted. Permissions are preserved.
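
One possible way to verify that a path matches this policy is sketched below. It assumes GNU stat (coreutils), as found on GNU/Linux; the function names perm_summary and check_policy are hypothetical helpers, not existing tools on the server:

```shell
# Print "owner:group mode" for a path, for example "root:root 755".
# Relies on GNU stat; the -c option is not available in BSD stat.
perm_summary() {
    stat -c '%U:%G %a' -- "$1"
}

# Succeed only if TARGET has exactly the expected owner:group and mode.
# Usage: check_policy TARGET OWNER:GROUP MODE
check_policy() {
    [ "$(perm_summary "$1")" = "$2 $3" ]
}
```

Hypothetical usage: check_policy /var/backups/wmi root:root 755 || echo "unexpected permissions".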

Add a project under the backup umbrella

You need
  1. a good understanding of what data needs to be saved
  2. a good understanding of how to transfer that data (e.g. rsync + SSH)
  3. #Server login
Instructions

In short, you just need to create a directory on server ⚙️ horror and a dedicated user able to read and write in that directory. Then you can push backups to that directory.

Some pseudo-instructions to be executed from server ⚙️ horror to create a new project foo to be added under its backup umbrella:

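USERNAME=foo
PROJECT=fooproject

# create the dedicated Unix user, without a password
sudo adduser --disabled-password "$USERNAME"

# create the project directory and hand it over to that user
# (mode 750, as listed in #Filesystem policy)
sudo mkdir --parents             /var/backups/wmi/"$PROJECT"
sudo chown "$USERNAME:$USERNAME" /var/backups/wmi/"$PROJECT"
sudo chmod 750                   /var/backups/wmi/"$PROJECT"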

The final goal is to execute this command daily from your server foo to push your backups to server horror:

rsync /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject

You can also execute this command daily from server horror to pull data from server foo:

 rsync mysource@myserver:/my/source/path /var/backups/wmi/fooproject
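
The push variant could be wrapped in a small script and scheduled with cron. The sketch below is an illustration under stated assumptions: the function name push_backup, the DRY_RUN switch and the script path in the crontab comment are hypothetical, and the --archive option and trailing slashes are additions not present in the commands above (they make rsync recurse into the source directory and copy its contents):

```shell
# Values taken from the examples above; adjust them to your project.
SOURCE_PATH=/my/source/path
REMOTE_USER=foo
PROJECT=fooproject
DEST="$REMOTE_USER@horror.wikimedia.it:/var/backups/wmi/$PROJECT"

# Push the contents of SOURCE_PATH into the project directory on horror.
# If DRY_RUN is set to a non-empty value, only print the rsync command.
push_backup() {
    ${DRY_RUN:+echo} rsync --archive "$SOURCE_PATH/" "$DEST/"
}

# Example crontab rule on server foo (hypothetical script path), running
# at midnight in the server's local time zone:
#   0 0 * * * /usr/local/bin/push-backup.sh
```

A script installed for cron would end with a call to push_backup. Running it once by hand with DRY_RUN=1 shows the exact command before anything is transferred.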
Schedule time policy

Your backup logic may write to the backup location during these periods:

  • 12:00-23:59 Europe/Rome
  • 00:00-04:59 Europe/Rome

You must not write there during this period, otherwise you may collide with the rotation logic:

  • 05:00-11:59 Europe/Rome
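
A backup script can check the window before writing. The following is a sketch only: the function name is_write_allowed is hypothetical, and it assumes that 12:00 itself counts as allowed, since the allowed window starts at 12:00:

```shell
# Succeed if HOUR (0-23, Europe/Rome) falls inside the allowed write window:
# 12:00-23:59 or 00:00-04:59. Hours 5 to 11 are reserved for rotation.
# Usage: is_write_allowed HOUR
is_write_allowed() {
    hour=${1#0}        # strip one leading zero: "04" becomes "4", "0" becomes ""
    hour=${hour:-0}    # an empty result means hour 0
    [ "$hour" -lt 5 ] || [ "$hour" -ge 12 ]
}
```

Hypothetical usage: is_write_allowed "$(TZ=Europe/Rome date +%H)" || exit 1, placed at the top of the backup script so that it stops when started during the rotation window.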
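Available backup tools
  • rsync
  • rclone
  • mysqldump
  • https://gitpull.it/source/micro-backup-script/ (just a stupid script that encapsulates those above)
  • ...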
Success checklist
  1. your data is saved (by you, or by your new crontab rule) at midnight in /var/backups/wmi/fooproject
  2. your data is automatically rotated to /var/backups/wmi.1/fooproject on the next day

Disaster recovery

You need
  1. an understanding of whether the Unix user pushing backups has been compromised; if so, DISABLE IT IMMEDIATELY and DISABLE ALL SSH KEYS of that user
  2. a good understanding of what data is to be recovered and from what date
  3. check whether the provider has native backups/snapshots (if yes, try to use them, as they may be simpler to restore)
  4. check if there are on-site backups (if yes, try to use them - they may be more up to date)
  5. #Server login
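
Disabling the SSH keys of a compromised user (point 1) could look like the sketch below. It is an assumption-laden illustration: the function name disable_ssh_keys is hypothetical, it assumes sshd reads keys from the default .ssh/authorized_keys and .ssh/authorized_keys2 files, and it does not terminate sessions that are already open:

```shell
# Rename the authorized_keys files of a user so that sshd no longer finds
# them. The keys are kept under a ".disabled" name for later inspection.
# Usage: disable_ssh_keys HOME_DIRECTORY
disable_ssh_keys() {
    ssh_dir="$1/.ssh"
    for name in authorized_keys authorized_keys2; do
        if [ -f "$ssh_dir/$name" ]; then
            mv -- "$ssh_dir/$name" "$ssh_dir/$name.disabled"
        fi
    done
}
```

Hypothetical usage, from a root shell on horror: disable_ssh_keys /home/foo.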
Recovery Instructions
  1. please create a public Task in phabricator:tag/wmit-infrastructure/ to briefly describe the incident, and notify Infrastruttura
  2. do a #Server login
  3. make sure you can become #Root user
  4. explore the filesystem to find the most relevant backup
    Example for latest copy:
    ls -l /var/backups/wmi
    Example for 13 days ago:
    ls -l /var/backups/wmi.13
  5. just use standard utilities to download the needed data
    Example:
    rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz ./my-destination/
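
After downloading a compressed dump such as the one in the example, it can be worth checking that the file is intact before restoring it. A minimal sketch, where the function name verify_dump is hypothetical:

```shell
# Succeed only if FILE is non-empty and passes the gzip integrity test.
# Usage: verify_dump FILE
verify_dump() {
    [ -s "$1" ] && gzip --test -- "$1"
}
```

Hypothetical usage: verify_dump ./my-destination/matomo.sql.gz && echo "dump looks intact". This only checks the compressed stream, not whether the SQL inside is complete.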