Equinix Move: Backup Server #3749

Closed
ryanaslett opened this issue Jun 3, 2024 · 15 comments

Comments

@ryanaslett
Contributor

sub issue of #3597

Mnx has created the packages that allow us to have a large disk instance for our backup server.

I've created a new home for backups at mnx, but I currently lack the 'infra' privileges to be able to provision it properly.

@mhdawson
Member

mhdawson commented Jun 6, 2024

@ryanaslett I confirmed with @richardlau that you have infra level access, let us know if there is anything else needed to proceed.

@ryanaslett
Contributor Author

ryanaslett commented Jun 12, 2024

Progress report:

  • Backup Server provisioned
  • Dependencies (rsnapshot) installed
  • Backup scripts installed and modified to reflect the new Ubuntu home (previously on smartos15)
  • crontab modified, currently disabled
  • Firewall config adjusted on ci, ci-release, node-www to allow backups to connect
  • nodejs_build_infra.pub and nodejs_build_backup private key deployed, and known_hosts primed
  • preliminary run of the daily job currently running, static files on nodejs-www backed up
  • rsync older data from current backup server (older periodics, archive folder, etc)

I have a question about the static data backups:

On nodejs-www there are two subdirectories, one with nightly builds and one with v8-canary builds. These are pruned by a script to save space:

# - For anything over 2 calendar years ago, retain only the build dated first of the month
# - For anything in last 2 calendar years but not the last two months, retain date numbers ending in 1
# - Keep everything from the last two months.
#
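The retention rules in those comments could be sketched roughly as follows. This is not the actual prune script, only one possible reading of its comments: it assumes "the last two months" means the current and previous calendar month, "last 2 calendar years" means this year and last year, and "date numbers ending in 1" means days 1, 11, 21 and 31. The function name `keep_build` is hypothetical.

```python
from datetime import date


def keep_build(build_date: date, today: date) -> bool:
    """Return True if a build dated build_date would be retained (assumed reading)."""
    # Whole calendar months between the two dates, ignoring the day of month.
    months_old = (today.year - build_date.year) * 12 + (today.month - build_date.month)
    if months_old < 2:
        # Assumption: "last two months" = current and previous calendar month.
        return True
    if today.year - build_date.year < 2:
        # Assumption: "last 2 calendar years" = this year and last year.
        # Keep days whose number ends in 1 (1, 11, 21, 31).
        return build_date.day % 10 == 1
    # Older than that: keep only the build dated the first of the month.
    return build_date.day == 1


today = date(2024, 6, 12)
print(keep_build(date(2024, 5, 3), today))   # True: within the last two months
print(keep_build(date(2024, 4, 3), today))   # False: recent year, day does not end in 1
print(keep_build(date(2022, 3, 11), today))  # False: older builds keep only day 1
```

Under this reading roughly one build in ten survives for the recent years and one per month beyond that.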

All of the nightly and v8-canary builds that were pruned from nodejs-www are still on the backup server, going back 8 years.

The /backup directory on the current backup server holds about 5.1 TB of data, of which 3.7 TB is the nightly/v8-canary builds. The pruned versions of those directories on nodejs-www are reduced to 1 TB.
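As a rough worked example of what those figures imply, the arithmetic below estimates the backup size if the unpruned builds were swapped for the pruned set. It uses only the approximate numbers quoted in this comment, expressed in GB with decimal units assumed; the function name is hypothetical.

```python
# Approximate sizes in GB, taken from the figures quoted in this comment.
TOTAL_BACKUP_GB = 5100       # /backup on the current backup server
UNPRUNED_BUILDS_GB = 3700    # nightly + v8-canary as held on the backup server
PRUNED_BUILDS_GB = 1000      # the same directories as pruned on nodejs-www


def size_with_pruned_builds(total: int, unpruned: int, pruned: int) -> int:
    """Backup size if the unpruned builds are replaced by the pruned set."""
    return total - unpruned + pruned


projected_gb = size_with_pruned_builds(TOTAL_BACKUP_GB, UNPRUNED_BUILDS_GB, PRUNED_BUILDS_GB)
saved_gb = TOTAL_BACKUP_GB - projected_gb
print(projected_gb, saved_gb)  # 2400 2700
```

So carrying forward only the pruned builds would bring the total to roughly 2.4 TB, a saving of about 2.7 TB.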

So, I'm wondering what the policy/intent is for keeping daily builds that far back. It's a tremendous amount of data, and now would be a good opportunity to not carry it forward if possible.

@richardlau
Member

So, I'm wondering what the policy/intent is for keeping daily builds that far back. It's a tremendous amount of data, and now would be a good opportunity to not carry it forward if possible.

The "policy" was that we didn't delete anything. The pruning was introduced to manage the space on nodejs-www (which has less available space than backup); before the pruning was introduced we just kept bumping to larger disks.

I know some collaborators have been asking about nightlies (cc @Uzlopak) but I'm not sure they need builds going all the way back to 8 years.

@ryanaslett
Contributor Author

Given that we won't have enough room on the new backup machine to house all of that data, and that the likelihood of needing an urgent restore of it is presumably low, I propose that we stash the historical data temporarily on DigitalOcean Spaces (their S3 equivalent) until we get confirmation of a retention policy. I presume establishing one will take longer than the timeline for getting off Equinix allows.

@mhdawson
Member

Are you saying that we pruned what is served through www but on the backup server we never deleted what was pruned from the www server?

If so, I think that since we've never been asked to restore any nightlies from the backup server, having the backup server mirror what is available on www would make sense.

@nodejs/build any objections/concerns to that?

@mhdawson
Member

And on the specific suggestion of stashing the data temporarily somewhere else, if needed, until we agree: +1 to that, as @ryanaslett suggested.

@ryanaslett
Contributor Author

ryanaslett commented Jun 24, 2024

Are you saying that we pruned what is served through www but on the backup server we never deleted what was pruned from the www server?

Yes, exactly. The scripts appear to append new data to the backup server, but do not do a full synchronize to delete anything that no longer exists on the www server.
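The difference between the two behaviours can be sketched with plain sets. This is a toy model, not the backup scripts themselves; the build names are hypothetical. With rsync, the mirroring behaviour corresponds to what its `--delete` option provides.

```python
from typing import Set


def append_only_sync(source: Set[str], backup: Set[str]) -> Set[str]:
    """Add anything new from the source; never remove files from the backup."""
    return backup | source


def mirror_sync(source: Set[str], backup: Set[str]) -> Set[str]:
    """Make the backup match the source, dropping files the source no longer has."""
    return set(source)


# Hypothetical build names, for illustration only.
backup = {"build-a", "build-b", "build-c"}
source_after_prune = {"build-a", "build-c", "build-d"}  # build-b pruned, build-d new

print(sorted(append_only_sync(source_after_prune, backup)))
# ['build-a', 'build-b', 'build-c', 'build-d']
print(sorted(mirror_sync(source_after_prune, backup)))
# ['build-a', 'build-c', 'build-d']
```

In the append-only case the pruned build stays on the backup indefinitely, which is how the backup grows well beyond the size of the source.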

@mhdawson
Member

@ryanaslett thanks for confirming. Unless anybody objects I think the right answer is probably to only transfer the data which is on the www server.

@ryanaslett
Contributor Author

The new mnx.io backup server is online, and populated with everything that is on the old backup server, with the exception of the static daily builds that are trimmed by the prune.sh script that runs on nodejs-www.

Those files were sent from the old backup server to a pair of R2 buckets on Cloudflare for the time being (until we get confirmation that they can be deleted).

I'd like to get either confirmation or a +1 to now decommission the old backup server and return it to Equinix.

@mhdawson
Member

mhdawson commented Jul 9, 2024

@ryanaslett how long has the backup server been online, and are there any log files for the rsyncs that we can sniff test to see data being sync'd?

I trust it's correct but a few sniff checks here and there would be good as well.

@ryanaslett
Contributor Author

@mhdawson It's been online and running parallel backups for a couple of weeks now. I offset the cron on it by 8 hours to not collide with the existing backup server process.
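Offsetting a cron schedule like this amounts to shifting the hour field. The helper below is a hypothetical sketch, not the actual crontab: it handles only a single numeric hour field, and the example cron line is made up for illustration.

```python
def shift_cron_hour(cron_line: str, offset_hours: int) -> str:
    """Shift the hour field of a five-field cron schedule, wrapping past midnight."""
    fields = cron_line.split()
    minute, hour, rest = fields[0], fields[1], fields[2:]
    if not hour.isdigit():
        raise ValueError("only a single numeric hour field is handled in this sketch")
    new_hour = (int(hour) + offset_hours) % 24
    return " ".join([minute, str(new_hour), *rest])


# Hypothetical cron entry; the real schedule is not shown in this thread.
print(shift_cron_hour("0 0 * * * /usr/bin/rsnapshot daily", 8))
# 0 8 * * * /usr/bin/rsnapshot daily
```

Note that the sketch does not adjust day-of-week or day-of-month fields when the shift wraps past midnight.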

The static data synced over from nodejs-www appears to be staying in sync (nightly builds):

Backup Server:
root@infra-mnx-ubuntu2204-x64-1:/data/backup/static/dist/nodejs/nightly# ls -d1 v*|wc -l
558

Nodejs-www Server:
root@infra-digitalocean-ubuntu1604-x64-1:/home/dist/nodejs/nightly# ls -d1 v*|wc -l
558
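Matching counts are a good sign, though two listings can have the same length and still differ in content. A stricter check compares the names themselves. The sketch below assumes the two directory listings have already been collected into Python lists; the names used are hypothetical.

```python
from typing import Dict, Iterable, List


def compare_listings(source_names: Iterable[str], backup_names: Iterable[str]) -> Dict[str, List[str]]:
    """Report names present on one side but not the other."""
    source, backup = set(source_names), set(backup_names)
    return {
        "missing_from_backup": sorted(source - backup),
        "extra_on_backup": sorted(backup - source),
    }


# Hypothetical listings with equal counts but different contents.
www = ["v-example-1", "v-example-2", "v-example-3"]
bak = ["v-example-1", "v-example-2", "v-example-4"]
print(len(www) == len(bak))        # True
print(compare_listings(www, bak))
# {'missing_from_backup': ['v-example-3'], 'extra_on_backup': ['v-example-4']}
```

Two empty lists in the result would confirm that the directory names match exactly, not just in number.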

The periodic weekly/monthly backups were synced from the old backup server to the new backup server, and the dailies were allowed to run.

Strangely, there's a monthly anomaly on each server:

The new backup server seems to be missing the June monthly backup:

root@infra-mnx-ubuntu2204-x64-1:/data/backup/periodic# ls -la
total 84
drwxr-xr-x 21 root root 4096 Jul  9 08:11 .
drwxr-xr-x  5 root root 4096 Jun 17 18:29 ..
drwxr-xr-x  7 root root 4096 Jul  9 08:11 daily.0
drwxr-xr-x  7 root root 4096 Jul  8 08:04 daily.1
drwxr-xr-x  7 root root 4096 Jul  7 08:03 daily.2
drwxr-xr-x  7 root root 4096 Jul  6 08:01 daily.3
drwxr-xr-x  7 root root 4096 Jul  5 08:07 daily.4
drwxr-xr-x  7 root root 4096 Jul  4 08:07 daily.5
drwxr-xr-x  7 root root 4096 Jul  3 08:01 daily.6
drwxr-xr-x  7 root root 4096 May 18 00:16 monthly.0
drwxr-xr-x  7 root root 4096 Apr 27 00:20 monthly.1
drwxr-xr-x  7 root root 4096 Mar 31 00:00 monthly.2
drwxr-xr-x  7 root root 4096 Mar  3 00:05 monthly.3
drwxr-xr-x  7 root root 4096 Jan 27 23:57 monthly.4
drwxr-xr-x  7 root root 4096 Dec 31  2023 monthly.5
drwxr-xr-x  7 root root 4096 May  1  2016 monthly.6
drwxr-xr-x  7 root root 4096 Apr  3  2016 monthly.7
drwxr-xr-x  7 root root 4096 Jun 29 08:00 weekly.0
drwxr-xr-x  7 root root 4096 Jun 22 07:59 weekly.1
drwxr-xr-x  7 root root 4096 Jun 12 00:12 weekly.2
drwxr-xr-x  7 root root 4096 Jun  2 00:21 weekly.3

And the existing backup server seems to be missing the May backup:

[root@3a355104-c5d6-405f-863b-9ce5948ba77b /backup/periodic]# ls -la
total 381
drwxr-xr-x 21 root root 21 Jul  9 00:29 .
drwxr-xr-x  5 root root  5 Dec  9  2016 ..
drwxr-xr-x  7 root root  7 Jul  9 00:29 daily.0
drwxr-xr-x  7 root root  7 Jul  8 00:13 daily.1
drwxr-xr-x  7 root root  7 Jul  7 00:04 daily.2
drwxr-xr-x  7 root root  7 Jul  6 00:17 daily.3
drwxr-xr-x  7 root root  7 Jul  5 00:32 daily.4
drwxr-xr-x  7 root root  7 Jul  4 00:20 daily.5
drwxr-xr-x  7 root root  7 Jul  3 00:10 daily.6
drwxr-xr-x  7 root root  7 Jun  2 00:21 monthly.0
drwxr-xr-x  7 root root  7 Apr 27 00:20 monthly.1
drwxr-xr-x  7 root root  7 Mar 31 00:00 monthly.2
drwxr-xr-x  7 root root  7 Mar  3 00:05 monthly.3
drwxr-xr-x  7 root root  7 Jan 27 23:57 monthly.4
drwxr-xr-x  7 root root  7 Dec 31  2023 monthly.5
drwxr-xr-x  7 root root  7 May  1  2016 monthly.6
drwxr-xr-x  7 root root  7 Apr  3  2016 monthly.7
drwxr-xr-x  7 root root  7 Jun 30 00:01 weekly.0
drwxr-xr-x  7 root root  7 Jun 23 00:03 weekly.1
drwxr-xr-x  7 root root  7 Jun 16 00:04 weekly.2
drwxr-xr-x  7 root root  7 Jun  8 23:59 weekly.3

I believe that's because the new backup server is one week behind in the cycle, and it will eventually propagate.
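For context, rsnapshot fills a higher interval by promoting the oldest snapshot of the next lower interval rather than by taking a fresh copy. The simplified model below illustrates that promotion; it is a sketch, not rsnapshot's implementation. The retention counts (4 weekly, 8 monthly) are inferred from the listings above rather than read from the configuration, and the model ignores the gaps real rsnapshot can leave when nothing is available to promote.

```python
from typing import Dict, List

Snapshots = Dict[str, List[str]]  # level name -> snapshot labels, newest first


def run_lowest(snaps: Snapshots, level: str, keep: int, label: str) -> None:
    """Lowest interval: take a fresh snapshot and drop anything beyond `keep`."""
    snaps[level] = ([label] + snaps[level])[:keep]


def run_higher(snaps: Snapshots, level: str, lower: str, lower_keep: int, keep: int) -> None:
    """Higher interval: promote the oldest snapshot of the lower level if it is full."""
    promoted: List[str] = []
    if len(snaps[lower]) == lower_keep:
        promoted = [snaps[lower].pop()]  # e.g. weekly.3 becomes monthly.0
    snaps[level] = (promoted + snaps[level])[:keep]


# Labels are the directory dates from the new server's listing above.
new_server: Snapshots = {
    "weekly": ["Jun 29", "Jun 22", "Jun 12", "Jun 2"],
    "monthly": ["May 18", "Apr 27", "Mar 31", "Mar 3",
                "Jan 27", "Dec 31 2023", "May 1 2016", "Apr 3 2016"],
}
run_higher(new_server, "monthly", "weekly", lower_keep=4, keep=8)
print(new_server["monthly"][0])  # Jun 2
print(new_server["weekly"])      # ['Jun 29', 'Jun 22', 'Jun 12']
```

Under this model, the new server's weekly.3 (Jun 2) would become its monthly.0 at the next monthly run, which is the same snapshot the old server already shows as monthly.0.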

@mhdawson
Member

@ryanaslett I believe backups of Jenkins were being put there as well. Do you have the list of things you set up to back up to the new server?

@ryanaslett
Contributor Author

@mhdawson The periodic data contains all of the Jenkins ci and ci-release data.

Everything under the /backup folder on the Equinix backup machine is being backed up into the /data/backup folder on the new mnx.io backup server.

It includes

  • /backup/archive that has some nodejs.org weblogs from 2016
  • /backup/static/benchmark that appears to be an old sqldump from 2016
  • /backup/static/dist/iojs - includes some historic nodejs builds/releases from 2015
  • /backup/static/dist/libuv - includes all the libuv releases
  • /backup/static/dist/nodejs - contains all of the nodejs releases, plus nightly builds and nightly v8 builds.
  • /backup/periodic - contains all of the Jenkins builds, and everything in /etc/ for ci.nodejs.org, ci-release.nodejs.org and nodejs.org. It also has iptables and www-logs.

I have duplicated the cron scripts and updated the backup scripts (#3823; I hadn't yet created that PR).

I have made a backup of /root from the old server in a subfolder of root on the new server.

I didn't find anything else on the old backup server outside of the /backup directory, whether in the scripts, in the documentation, or by traversing the filesystem.

@mhdawson
Member

@ryanaslett thanks for the details. I'm +1 on letting the old backup server go. @nodejs/build does anybody have any remaining concerns? If not, a +1 to confirm would also be good.

@ryanaslett
Contributor Author

The backup server has been removed from Ansible, and removed from the Equinix account. Huzzah!


3 participants