Differences between revisions of "Server/horror/Technical documentation"

From Wikimedia Italia.
(more info)
(add random info)
 
(16 intermediate revisions by the same user are not shown)
Line 1: Line 1:
 
{{Server|horror}}
 
{{Server|horror}}
 
This page is the public technical documentation for the server {{Server link|horror}}, dedicated to ''off-site'' backups, useful for a [[#Disaster recovery]].
 
This page is the public technical documentation for the server {{Server link|horror}}, dedicated to ''off-site'' backups, useful for a [[#Disaster recovery]].
 +
 +
== In short ==
 +
 +
In short the server {{Server link|horror}} can ''receive'' an additional off-site backup from other servers. These off-site copies are kept and maintained for multiple days.
 +
 +
Every single server ''pushes'' on server {{Server link|horror}} what should be saved off-site. So, the server {{Server link|horror}} does ''not'' decide what should be saved.
 +
 +
Example of night transfer activity:
 +
 +
<pre>
 +
┌─────────┐
 +
│intreccio│ (push)
 +
└────┬────┘
 +
    ↓
 +
┌─────────┐
 +
│ horror  │ (receiver)
 +
└─────────┘
 +
    ↑
 +
┌────┴────┐
 +
│lessema  │ (push)
 +
└─────────┘
 +
</pre>
  
 
== Authorization ==
 
== Authorization ==
Line 18: Line 40:
  
 
* https://wiki.wikimedia.it/wiki/Infrastruttura
 
* https://wiki.wikimedia.it/wiki/Infrastruttura
 +
 +
== Authorized Users ==
 +
 +
Authorized server operators in {{Server link|horror}}:
 +
 +
* [[:Categoria:Accessi/Server horror/sistemisti]]
 +
 +
List of SSH usernames and users:
 +
 +
* <code>valerio-bozzolan</code> - [[User:Valerio Bozzolan]]
 +
* <s><code>anylink-...</code> - [[m:User:DavideCuteri-WMIT]]</s>
 +
 +
If the people above are unavailable, contact a superadmin of the service provider and another trusted server administrator to recover access (if necessary, asking the provider's support for help):
 +
 +
* [[:Categoria:Accessi/Fornitore ctb/superadmin]]
  
 
== Server login ==
 
== Server login ==
Line 36: Line 73:
 
If it doesn't work, stop <u>immediately</u> and repeat [[#Authorization]].
 
If it doesn't work, stop <u>immediately</u> and repeat [[#Authorization]].
  
Do not try random attempts or you can be blocked, notified, fired or even sued.
+
Do not try random attempts or you can be blocked, notified, fired or even sued. Your life can be terminated by an AI.
 +
 
 +
== Change user ==
 +
 
 +
If you have a personal [[#Server login]] with enough privileges and you need to change user, use sudo:
 +
 
 +
sudo --login --user=ANOTHER_USER
 +
 
 +
Then you can do anything like that user, for example:
 +
 
 +
crontab -l
 +
 
 +
This is useful if you want to test a specific ''pull'' backup user.
 +
 
 +
== Root user ==
 +
 
 +
The root user should not be used in normal conditions.
 +
 
 +
During an emergency, you can use sudo to add your SSH keys inside the usual position:
 +
 
 +
/root/.ssh/authorized_keys
  
 
== Filesystem overview ==
 
== Filesystem overview ==
Line 43: Line 100:
  
 
  /var/backups/wmi
 
  /var/backups/wmi
 +
/var/backups/wmi/intreccio.wikimedia.it
 +
/var/backups/wmi/lessema.wikimedia.it
 +
/var/backups/wmi/...
  
 
Older copies can be obtained adding a numeric suffix. For example the 2-days-old backups are here:
 
Older copies can be obtained adding a numeric suffix. For example the 2-days-old backups are here:
  
 
  /var/backups/wmi.2
 
  /var/backups/wmi.2
 +
/var/backups/wmi.2/intreccio.wikimedia.it
 +
/var/backups/wmi.2/lessema.wikimedia.it
 +
/var/backups/wmi.2/...
  
 
Note that all sub-directories can be accessed only if you are its dedicated user.
 
Note that all sub-directories can be accessed only if you are its dedicated user.
  
For example all of these are owned by the user <code>lessema</code>:
+
For example the user <code>intreccio</code> MUST be the only one able to write in this position:
  
  /var/backups/wmi/lessema.wikimedia.it
+
  /var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi.1/lessema.wikimedia.it
 
/var/backups/wmi.2/lessema.wikimedia.it
 
/var/backups/wmi.3/lessema.wikimedia.it
 
  
So to get the most recent backup of your project just do something like this:
+
For example the user <code>intreccio</code> MUST NOT be able to read/write old copies:
  
  rsync ''lessema''@horror.wikimedia.it:/var/backups/wmi/''lessema.wikimedia.it'' ./my-destination/
+
  /var/backups/wmi.1/intreccio.wikimedia.it
 +
/var/backups/wmi.2/intreccio.wikimedia.it
 +
/var/backups/wmi.3/intreccio.wikimedia.it
  
Or to download the 3-days-old backup do something like this:
+
== Filesystem policy ==
 
 
rsync ''lessema''@horror.wikimedia.it:/var/backups/wmi.3/''lessema.wikimedia.it'' ./my-destination/
 
 
 
Etc.
 
 
 
== Filesystem policies ==
 
  
 
The filesystem rule is the standard one in Unix-like systems: give as <u>few</u> privileges as possible.
 
The filesystem rule is the standard one in Unix-like systems: give as <u>few</u> privileges as possible.
Line 110: Line 166:
 
<pre>
 
<pre>
 
USERNAME=foo
 
USERNAME=foo
PROJECT=foo.wikimedia.it
+
PROJECT=fooproject
  
 
sudo adduser --disabled-password $USERNAME
 
sudo adduser --disabled-password $USERNAME
Line 120: Line 176:
 
The final purpose is to execute this command daily <u>from your server ''foo''</u> to <u>push</u> your backups to server horror:
 
The final purpose is to execute this command daily <u>from your server ''foo''</u> to <u>push</u> your backups to server horror:
  
  rsync /my/important/pathname foo@horror.wikimedia.it:/var/backups/wmi/foo.wikimedia.it
+
  rsync /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject
  
You can also execute a daily command <u>from server ''horror''</u> to <u>pull</u> your backups from server ''foo''. It's up to you.
+
You can also execute this command daily <u>from server ''horror''</u> to <u>pull</u> data from server ''foo'':
  
=== Available backup tools ===
+
  rsync mysource@myserver:/my/source/path /var/backups/wmi/fooproject
 +
 
 +
; Schedule time policy
 +
 
 +
Your backup logic can write in the backup location in this period:
 +
 
 +
* 12:00-23:59 Europe/Rome
 +
* 00:00-04:59 Europe/Rome
 +
 
 +
You <u>must not</u> write there in this period instead, otherwise you may have collisions with the rotation logic:
 +
 
 +
* 05:00-12:00
 +
 
 +
; Available backup tools
  
 
* rsync
 
* rsync
Line 131: Line 200:
 
* https://gitpull.it/source/micro-backup-script/ (just a stupid script that encapsulates those above)
 
* https://gitpull.it/source/micro-backup-script/ (just a stupid script that encapsulates those above)
 
* ...
 
* ...
 +
 +
; Success checklist
 +
 +
# your data is saved (by you, or by your new crontab rule) at midnight in <code>/var/backups/wmi/''fooproject''</code>
 +
# your data is automatically rotated in <code>/var/backups/wmi.1/''fooproject''</code> in the next day
  
 
== Disaster recovery ==
 
== Disaster recovery ==
Line 136: Line 210:
 
; You need
 
; You need
  
 +
# understanding whether the Unix user pushing backups has been compromised - in that case - DISABLE IT IMMEDIATELY - DISABLE ALL SSH KEYS of that user
 
# a good understanding of what data is to be recovered and from what date
 
# a good understanding of what data is to be recovered and from what date
 
# check if the provider has native backup/snapshots (if yes, try to use them - they may be more simple to be recovered)
 
# check if the provider has native backup/snapshots (if yes, try to use them - they may be more simple to be recovered)
Line 141: Line 216:
 
# [[#Server login]]
 
# [[#Server login]]
  
; Instructions
+
; Recovery Instructions
  
 
# please create a public Task in [[phabricator:tag/wmit-infrastructure/]] to describe the incident shortly, and notify [[Infrastruttura]]
 
# please create a public Task in [[phabricator:tag/wmit-infrastructure/]] to describe the incident shortly, and notify [[Infrastruttura]]
# using [[#Server login]], verify the interested backup location and the required privileges
+
# do a [[#Server login]]
#: Example:
+
# be sure to be able to become [[#Root user]]
#:: <code>ls -l /var/backups/wmi</code>
+
# explore the filesystem to find the most relevant backup
# set a strong password to that user
+
#: Example for latest copy:
#: Example:
+
#: <code>ls -l /var/backups/wmi</code>
#:: <code>passwd ''interested-user''</code>
+
#: Example for 13 days ago:
# from your already-existing device, download the needed data
+
#: <code>ls -l /var/backups/wmi.13</code>
#: Example:
+
# just use standard utilities to download the needed data
#:: <code>rsync ''interested-user''@horror.wikimedia.it:/var/backups/wmi/''interested-project'' ./my-destination/</code>
 
# when you have concluded, disable the password to that user
 
 
#: Example:
 
#: Example:
#: <code>passwd --delete ''interested-user''</code>
+
#: <code>rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz  ./my-destination/</code>
  
 
[[Categoria:Documentazione tecnica|horror]]
 
[[Categoria:Documentazione tecnica|horror]]

Current revision as of 17:19, 16 May 2024

Page related to the server ⚙️ horror

This page is the public technical documentation for the server ⚙️ horror, dedicated to off-site backups, useful for a #Disaster recovery.

In short

In short, the server ⚙️ horror can receive additional off-site backups from other servers. These off-site copies are retained for multiple days.

Each server pushes to server ⚙️ horror whatever should be saved off-site. The server ⚙️ horror therefore does not decide what is saved.

Example of nightly transfer activity:

┌─────────┐
│intreccio│ (push)
└────┬────┘
     ↓
┌─────────┐
│ horror  │ (receiver)
└─────────┘
     ↑
┌────┴────┐
│lessema  │ (push)
└─────────┘

Authorization

Server administrators must be authorized before they can perform a #Server login on the ⚙️ horror backup server. To be authorized:

You need
  1. a good reason
    (for example #Add a project under the backup umbrella)
    (for example #Disaster recovery)
  2. Unix-like sysadmin experience
Instructions

Request access policy:

Authorized Users

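Authorized server operators in ⚙️ horror:

  • Categoria:Accessi/Server horror/sistemisti

List of SSH usernames and users:

  • valerio-bozzolan - User:Valerio Bozzolan
  • anylink-... - m:User:DavideCuteri-WMIT (struck out)

If the people above are unavailable, contact a superadmin of the service provider and another trusted server administrator to recover access (if necessary, asking the provider's support for help):

  • Categoria:Accessi/Fornitore ctb/superadmin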

Server login

Access to the backup server is exclusively via SSH login. No other form of access is offered, since SSH is considered the most secure option. To log in:

You need
  1. #Authorization
  2. SSH experience
Instructions

Just log in via SSH using the username assigned to you during your #Authorization process:

ssh name-surname@horror.wikimedia.it

If it doesn't work, stop immediately and repeat #Authorization.

Do not make random attempts, or you may be blocked, reported, fired or even sued. Your life can be terminated by an AI.

Change user

If you have a personal #Server login with enough privileges and you need to change user, use sudo:

sudo --login --user=ANOTHER_USER

Then you can act as that user, for example to list its scheduled jobs:

crontab -l

This is useful if you want to test a specific pull backup user.

Root user

The root user should not be used in normal conditions.

During an emergency, you can use sudo to add your SSH keys to the usual location:

/root/.ssh/authorized_keys

Filesystem overview

You can explore the filesystem only after #Server login. All recent backups are here:

/var/backups/wmi
/var/backups/wmi/intreccio.wikimedia.it
/var/backups/wmi/lessema.wikimedia.it
/var/backups/wmi/...

Older copies can be reached by adding a numeric suffix. For example, the 2-day-old backups are here:

/var/backups/wmi.2
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.2/lessema.wikimedia.it
/var/backups/wmi.2/...
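
The naming rule above can be sketched as a small shell helper. This is only an illustration: the function name `backup_path` is hypothetical, and it assumes that the suffix `.N` always means "N days old", as in the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of the server setup).
# Print the directory holding a project's backup from N days ago.
# 0, or no second argument, means the most recent copy.
backup_path() {
    project="$1"
    days="${2:-0}"
    if [ "$days" -eq 0 ]; then
        printf '/var/backups/wmi/%s\n' "$project"
    else
        printf '/var/backups/wmi.%s/%s\n' "$days" "$project"
    fi
}
```

For example, `backup_path lessema.wikimedia.it 2` prints the location of the 2-day-old copy of that project.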

Note that each sub-directory can be accessed only by its dedicated user.

For example, the user intreccio MUST be the only one able to write to this location:

/var/backups/wmi/intreccio.wikimedia.it

For example, the user intreccio MUST NOT be able to read or write old copies:

/var/backups/wmi.1/intreccio.wikimedia.it
/var/backups/wmi.2/intreccio.wikimedia.it
/var/backups/wmi.3/intreccio.wikimedia.it

Filesystem policy

The filesystem rule is the standard one in Unix-like systems: give as few privileges as possible.

Here is a summary of the main filesystem pathnames:

Path | owner:group | Permissions | Description
/var/backups/wmi*/ | root:root | 755 | Everyone should be allowed to list its sub-directories, to see the latest available backups. Note: you may be allowed to list sub-directories, but by default you must not be allowed to access them.
/var/backups/wmi*/project | project:project | 750 | The user project must be the only one allowed to access its sub-directory.

Note: the location /var/backups/wmi is automatically rotated to /var/backups/wmi.1 etc., and the oldest copy is automatically deleted. Permissions are preserved.
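
The policy above can be spot-checked with a short shell function. This is a minimal sketch, not an official audit tool: the name `check_backup_dir` is hypothetical, and it assumes GNU coreutils `stat` (the `-c` format option), as found on typical Linux servers.

```shell
#!/bin/sh
# Hypothetical helper (not part of the server setup).
# Compare a directory's owner:group and octal mode with the expected values.
# Relies on GNU coreutils `stat -c`; BSD/macOS stat uses different options.
check_backup_dir() {
    dir="$1"
    expected_owner="$2"   # for example root:root
    expected_mode="$3"    # for example 755
    actual="$(stat -c '%U:%G %a' "$dir")" || return 2
    if [ "$actual" = "$expected_owner $expected_mode" ]; then
        echo "OK   $dir ($actual)"
    else
        echo "FAIL $dir: expected $expected_owner $expected_mode, found $actual"
        return 1
    fi
}
```

For example, `check_backup_dir /var/backups/wmi root:root 755` checks the top-level rule, and `check_backup_dir /var/backups/wmi/fooproject foo:foo 750` checks a project directory.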

Add a project under the backup umbrella

You need
  1. a good understanding of what data needs to be saved
  2. a good understanding of how to transfer that data (e.g. rsync + SSH)
  3. #Server login
Instructions

In short, you just need to create a directory on server ⚙️ horror and a dedicated user able to read and write in that directory. Then you can push backups to that directory.

Some pseudo-instructions, to be executed on server ⚙️ horror, to create a new project foo under the backup umbrella:

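# Unix user that will own the backups, and the name of its project directory
USERNAME=foo
PROJECT=fooproject

# Create the dedicated user without a password
sudo adduser --disabled-password "$USERNAME"

# Create the project directory and hand it over to the dedicated user
sudo mkdir --parents             /var/backups/wmi/"$PROJECT"
sudo chown "$USERNAME:$USERNAME" /var/backups/wmi/"$PROJECT"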

The final purpose is to execute this command daily from your server foo to push your backups to server horror:

rsync /my/source/path foo@horror.wikimedia.it:/var/backups/wmi/fooproject

You can also execute this command daily from server horror to pull data from server foo:

 rsync mysource@myserver:/my/source/path /var/backups/wmi/fooproject
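
The push variant could be wrapped in a small script like the following. This is only a sketch under assumptions: the function names and the `BACKUP_HOST` variable are hypothetical, and the `--archive` option is added here because rsync skips directories unless `--archive` or `--recursive` is given.

```shell
#!/bin/sh
# Hypothetical wrapper (names are illustrative, not part of the server setup).
BACKUP_HOST="${BACKUP_HOST:-horror.wikimedia.it}"

# Print the rsync destination for a given backup user and project.
push_destination() {
    user="$1"
    project="$2"
    printf '%s@%s:/var/backups/wmi/%s/\n' "$user" "$BACKUP_HOST" "$project"
}

# Push a local path to the backup server over SSH.
# --archive recurses into directories and preserves permissions and timestamps.
push_backup() {
    src="$1"
    user="$2"
    project="$3"
    rsync --archive "$src" "$(push_destination "$user" "$project")"
}
```

A daily crontab rule on server foo could then call something like `push_backup /my/source/path foo fooproject`.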
Schedule time policy

Your backup logic may write to the backup location during these periods:

  • 12:00-23:59 Europe/Rome
  • 00:00-04:59 Europe/Rome

You must not write there during the following period, otherwise you may collide with the rotation logic:

  • 05:00-11:59 Europe/Rome
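
A backup script can guard itself against the forbidden period with a check like this one. It is a minimal sketch: the function name `in_write_window` is hypothetical, it works at hour granularity only, and it assumes that the hour starting at 12:00 belongs to the allowed period, as in the first list above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the server setup).
# Succeeds (exit status 0) when writing to the backup location is allowed.
# Argument: hour of day (00-23) in Europe/Rome; defaults to the current hour.
in_write_window() {
    hour="${1:-$(TZ=Europe/Rome date +%H)}"
    hour="${hour#0}"     # normalise "08" to "8"
    hour="${hour:-0}"    # a plain "0" became empty above; restore it
    # Allowed: 12:00-23:59 and 00:00-04:59
    [ "$hour" -ge 12 ] || [ "$hour" -lt 5 ]
}
```

For example, a push script could start with `in_write_window || exit 0` so that it silently does nothing while the rotation may be running.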
Available backup tools
Success checklist
  1. your data is saved (by you, or by your new crontab rule) at midnight in /var/backups/wmi/fooproject
  2. your data is automatically rotated to /var/backups/wmi.1/fooproject the next day

Disaster recovery

You need
  1. to understand whether the Unix user pushing backups has been compromised: in that case, DISABLE IT IMMEDIATELY and DISABLE ALL SSH KEYS of that user
  2. a good understanding of what data is to be recovered and from what date
  3. check whether the provider has native backups/snapshots (if yes, try to use them: they may be simpler to restore)
  4. check whether there are on-site backups (if yes, try to use them: they may be more up to date)
  5. #Server login
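
Disabling the SSH keys of a compromised backup user (point 1 above) could look like the following sketch. It rests on assumptions: the function name `disable_ssh_keys` is hypothetical, the keys are assumed to live in the OpenSSH default files under the user's home directory, and on the real server it would have to run as root. The account itself can additionally be locked with `usermod --lock --expiredate 1 USER`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the server setup).
# Rename a user's authorized_keys files so that new key-based logins fail.
# Argument: the home directory of the user pushing backups.
disable_ssh_keys() {
    home_dir="$1"
    # OpenSSH reads both files by default (AuthorizedKeysFile setting).
    for name in authorized_keys authorized_keys2; do
        file="$home_dir/.ssh/$name"
        if [ -f "$file" ]; then
            mv -- "$file" "$file.disabled"
            echo "disabled $file"
        fi
    done
}
```

The renamed files are kept rather than deleted, so the keys can still be inspected during the incident analysis.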
Recovery Instructions
  1. please create a public Task in phabricator:tag/wmit-infrastructure/ to briefly describe the incident, and notify Infrastruttura
  2. do a #Server login
  3. be sure to be able to become #Root user
  4. explore the filesystem to find the most relevant backup
    Example for latest copy:
    ls -l /var/backups/wmi
    Example for 13 days ago:
    ls -l /var/backups/wmi.13
  5. just use standard utilities to download the needed data
    Example:
    rsync root@horror.wikimedia.it:/var/backups/wmi.13/intreccio.wikimedia.it/daily/databases/matomo.sql.gz ./my-destination/
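
To find which rotations still hold a copy of a given project (step 4 above), a loop like this one may help. It is a sketch only: the function name `list_copies` and its optional base-directory argument are hypothetical, and it simply checks that the project directory exists in each rotation.

```shell
#!/bin/sh
# Hypothetical helper (not part of the server setup).
# List every rotation that contains a directory for the given project.
# The base directory defaults to /var/backups; the glob matches wmi.1, wmi.2, ...
list_copies() {
    project="$1"
    base="${2:-/var/backups}"
    for rotation in "$base"/wmi "$base"/wmi.*; do
        if [ -d "$rotation/$project" ]; then
            printf '%s\n' "$rotation/$project"
        fi
    done
}
```

For example, `list_copies intreccio.wikimedia.it` prints one line per available copy, which can then be inspected with `ls -l` before downloading anything.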