The third administration: Welcome, dev & issue reports

User avatar
Starks Hayter
lost in subnet
Posts: 1
Joined: 12 Oct 2023 17:32

Re: The third administration: Welcome, dev & issue reports

Post by Starks Hayter »

Hey folks. I saw Submachine Legacy coming out on Steam in the next few days and got a blast from the past. I joined the forum back in the Rewolucje days and spent a good bit of time here, and made my little mark in Submachine Universe with a theory. I know the Pastel Stories forum went down a few years back, but I didn't know that anyone had gone to the effort to archive any of it. It's great to see some of it is still around, as well as some familiar names.

I hope the last decade or so has been kind to everyone.
User avatar
Sublevel 113
layer restorer
Posts: 16579
Joined: 11 Dec 2012 20:23

Re: The third administration: Welcome, dev & issue reports

Post by Sublevel 113 »

Welcome back, Starks!

Enjoy the Legacy! :D
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: The third administration: Welcome, dev & issue reports

Post by admin3 »

Seems we had a spambot wave over the past few days. Always fun to have to do something urgently for once. Most accounts and their posts should be pruned, and I've rotated the registration questions.

I should probably also look into another version update and moving to more stable hosting, but no news on that yet. Keep an eye out for scheduled downtime.
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: The third administration: Welcome, dev & issue reports

Post by admin3 »

Some news on that.

I've got new hosting mostly set up; I just need to test everything and figure out email, so that people will still be able to sign up.

Speaking of email, it turns out that, since I first set up our current hosting solution, the screws have tightened further on self-hosted email. Gmail now bounces anything the forum sends, and plausibly other major providers do as well. That will need to be fixed at the same time; I'm looking into it.
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: [Downtime scheduled for 2024-03-09 18:00-20:00 UTC] The third administration: Welcome, dev & issue reports

Post by admin3 »

Update on that. Email and the like is done and tested, so now comes the process of migration.

The domain names (https://www.pastelland.net, https://www.pastelland.com) still refer to the old server at the time of writing. In the coming days I'll point them at the new server, a tiny but sufficient DigitalOcean droplet currently running as a reverse proxy, forwarding all connections to the old one. This should be a seamless transition.

Then comes the step of actually moving over the forum content. This too I've tested, but the "true" transfer will require a small amount of downtime, which I've scheduled for Saturday evening UTC, as per the title. If there are no surprises the forums will be back to proper cloud hosting after that.
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: [Migration postponed] The third administration: Welcome, dev & issue reports

Post by admin3 »

Hey, you know what? The old hosting setup lasted for four years; it can last another two weeks.

I'm reasonably confident that the new hosting setup is stable, now that the limited-RAM issue has been identified and (provisionally) addressed, but I was probably going to roll the forum back to its freshly-migrated state once more just to be safe (in part because there was some odd behavior wherein the forum would re-close when I configured the email -- probably just a harmless artifact of the migration). With the memory limits in mind, though, I should probably optimize the new setup some more before taking it into production, so I may as well spin the old server back up until that's done. This should take a few weeks at most. The new server will stay in place as the frontend, so we're back to the status quo ante migration. Email, and therefore new registrations, will remain unavailable for however long this takes.

The one benefit of the fact that the forums are mostly archival nowadays is that these repeated rollbacks cause no great data loss. Enclosed below are the only three messages made since migration.
admin3 wrote:[2024-03-09 22:17 UTC]

That was exciting.

Quick recap for posterity, all times UTC:
  • At 18:00 I disabled the boards at our old hosting solution. I then thought, "Wouldn't it be nice if the frontend were set to serve a nice, clear '503 unavailable' message explaining the downtime?".
  • By 19:30 I had managed to set this up; I really hadn't used nginx before, and it took way more fiddling with settings than I expected.
  • At 20:20 I'd done most of the actual work of uploading/migrating, but the connection between the frontend reverse-proxy and the backend webserver turned out to require more setup than expected, so I formally prolonged the downtime to 21:00.
  • At 20:40 the boards were essentially up and running, but while I was at it I wanted to set up new HTTPS certificates using the script I'd created for this.
  • At 20:55, as I was restarting with the new certificates, the server melted. I still have no real idea what happened there, and I'll have to dig through the logs for clues when I have the time, but CPU use shot to 100% and the server became unresponsive. I had to cycle power for the 'ocean droplet to wake it up, but there have been no signs of trouble after that.

    Then it was just a matter of re-doing the file/database migration (just to be safe), which went fine, and re-acquiring the certs, which went mostly fine save for the catch-22 that the frontend needs the certificates to start while certbot needs the frontend to pass Let's Encrypt's automated challenges. This could be circumvented by first getting the < *.pastelland.net > certificate (wildcard, checked by DNS instead of HTTP), starting the server with that, then running the rest of the checks (which use HTTP to avoid collision with the DNS challenges).
  • Finally, by 21:55 I felt confident re-declaring the board re-opened.
There should be no further disruptions, unless the mysterious CPU spike comes back to haunt us. I've also checked that email is, in fact, functional again -- or at least Gmail accepts mail from the new transmission system, and presumably other major providers will too -- so password resets and new signups should be possible for the first time in however long it was broken.
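For reference, the "503 unavailable" idea from the recap can be sketched in a few lines of Python. This is not the nginx configuration used on the frontend (that isn't shown in this thread), just a minimal stand-in for the same behaviour. The names `maintenance_response` and `MaintenanceHandler`, the message text, and the hard-coded end time are all made up for illustration.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler


def maintenance_response(now: datetime, until: datetime):
    """Build (status, headers, body) for a planned-downtime reply.

    Hypothetical helper: Retry-After is given in whole seconds and
    never goes below zero, even if the downtime has overrun.
    """
    retry_after = max(0, int((until - now).total_seconds()))
    body = (
        "The boards are down for scheduled maintenance until "
        f"{until:%Y-%m-%d %H:%M} UTC.\n"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        "Retry-After": str(retry_after),
    }
    return 503, headers, body


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answers every GET with the 503 page (illustrative only)."""

    # Assumed end of the downtime window; adjust as needed.
    until = datetime(2024, 3, 9, 21, 0, tzinfo=timezone.utc)

    def do_GET(self):
        status, headers, body = maintenance_response(
            datetime.now(timezone.utc), self.until
        )
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8080), MaintenanceHandler).serve_forever()` (with `HTTPServer` imported from `http.server`, and the port chosen arbitrarily) should serve the page. The `Retry-After` header is the part that tells well-behaved clients and crawlers when to come back.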
admin3 wrote:[2024-03-09 22:28 UTC]

Naturally, the first thing that happens after I post that is that the database dies spontaneously, with no apparent error message.

That's ... pretty hard to debug? It's not a nice crash either, some tables in the database are reported as "crashed" when I start it back up. Had to re-post the above message because I re-uploaded the pre-migration DB contents, to be safe.

Not sure how I'll deal with this at the moment. The forum probably shouldn't be run while this problem is potentially in effect, but I don't want to just take it down indefinitely. Just ... don't post anything while I make up my mind.
admin3 wrote:[2024-03-10 03:29 UTC]

Alright, third iteration of this message.

Seems we've basically been running into RAM limits, which -- I really should have seen coming, but oh well. I've added some swap space, we'll see tomorrow if that helps.
The last one was, of course, preceded by two other versions of the same message, all claiming to have identified potential solutions and being immediately proven wrong as they were mercifully lost in the database rollback after the next crash. The main highlight was probably the promise of (more frequent) backups (than usual) to ameliorate any future crashes, followed immediately by a backup attempt causing the next crash.
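For anyone running a similarly small server, the RAM-limit situation from the quoted message can be spotted ahead of time with a quick check of `/proc/meminfo` on Linux. Below is a minimal sketch, not the monitoring actually used here; the function names and the 64 MiB threshold are arbitrary assumptions for illustration.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value}.

    Values are the leading integers on each line (kB for most fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_report(info: dict, min_available_kb: int = 65536) -> dict:
    """Flag low available memory and missing swap.

    The 64 MiB default threshold is an assumption, not a recommendation.
    """
    return {
        "low_memory": info.get("MemAvailable", 0) < min_available_kb,
        "has_swap": info.get("SwapTotal", 0) > 0,
    }
```

On a real machine, `memory_report(parse_meminfo(open("/proc/meminfo").read()))` would give the current state; a result with `low_memory` true and `has_swap` false is roughly the condition in which a database process is liable to be killed without leaving an error message of its own.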
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: [Downtime scheduled for 2024-03-23 & --03-24, 18:00-21:00 UTC] The third administration: Welcome, dev & issue report

Post by admin3 »

Blocking out two three-hour segments this weekend (2024-03-23 and 2024-03-24, both 18:00-21:00 UTC) for migration take 2, to be on the safe side.
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: [Migration in prod.] The third administration: Welcome, dev & issue reports

Post by admin3 »

Apparently not safe enough, since I only got started on this today, with an hour left on the downtime clock. An hour and three quarters later, I believe things are up and running like they should.

Migration itself was fairly routine, being mostly the same steps as last time, and I've run rudimentary checks that nothing is obviously broken. We have, of course, been fooled before, so I'm only declaring the board provisionally re-opened, for the time being. Like last time I'll trust it to be stable once a few days have passed without incident. Until I do, don't post anything you don't expect to be undone in a sudden rollback.
Admin account of Ancient Crystal, current forum host.
User avatar
admin3
Site Admin
Posts: 33
Joined: 01 Jan 1970 01:00

Re: The third administration: Welcome, dev & issue reports

Post by admin3 »

That's two weeks' uninterrupted uptime, so I'm declaring the board formally re-opened.

I should also report that we may have had a data breach. Specifically, there was a backup dated early 2021 sitting around in an exposed directory, due to (say the line, Bart) faulty web server configuration. My mistake. Now, those backups have lengthy randomized names which an attacker would have to guess, meaning it's reasonably unlikely that anyone could have accessed the file to begin with -- except this one didn't look properly random (in fact, it was largely 1234... repeating), and I've no idea whether that's a known fault an attacker could have used. Even a forum this size does get hit with a lot of spambots, but I'm not sure any of them probe for breaches that much in depth. The logs don't go far enough back for me to check.
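To illustrate what a "properly random" backup name would look like, here is a minimal Python sketch using the standard `secrets` module, plus a crude check for the kind of 1234... pattern described above. This says nothing about how the forum software actually names its backups or why that one file came out the way it did; `random_backup_name`, `has_sequential_digits`, and the `.sql.gz` extension are hypothetical.

```python
import secrets


def random_backup_name(prefix: str = "backup", ext: str = ".sql.gz",
                       nbytes: int = 16) -> str:
    """Return a filename containing nbytes of cryptographic randomness.

    16 bytes gives a 32-character hex token, i.e. 128 bits to guess.
    """
    return f"{prefix}_{secrets.token_hex(nbytes)}{ext}"


def has_sequential_digits(name: str, min_run: int = 4) -> bool:
    """Heuristic: True if name contains min_run ascending digits in a row.

    Counts runs like 1234 or 8901 (wrapping 9 -> 0). A genuinely random
    name can trip this by chance, so treat a hit as a prompt to look
    closer, not as proof of a fault.
    """
    run = 0
    prev = None
    for ch in name:
        if ch in "0123456789":
            digit = int(ch)
            if prev is not None and digit == (prev + 1) % 10:
                run += 1
            else:
                run = 1
            prev = digit
        else:
            run = 0
            prev = None
        if run >= min_run:
            return True
    return False
```

Random names only help if the directory cannot be listed, and keeping backups outside the web root altogether avoids the guessing question entirely.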

As with any database breach, password hashes were potentially exposed.
Admin account of Ancient Crystal, current forum host.
User avatar
Sublevel 113
layer restorer
Posts: 16579
Joined: 11 Dec 2012 20:23

Re: The third administration: Welcome, dev & issue reports

Post by Sublevel 113 »

Umm... Would it be wise to force a password change for everything and everyone? Just in case?