Maintenance log
lemmy-2023-09-03
What were we doing?
We were migrating the database server to our new infrastructure.
Why were we doing this?
This will help us reduce some of our operating costs while improving performance and resiliency.
How long did this take?
The maintenance window was for one hour to complete at .
The migration was completed at ; however, the site was running slowly, so we continued investigating the cause.
It took an hour to resolve the issue, and we finished 40 minutes over the original one-hour window at .
Timeline
- It begins!
- Blåhaj Lemmy entering maintenance mode (5m behind schedule)
- Shut down Lemmy server (5m behind schedule)
- Begin database dump (4m behind schedule)
- Database dump completed (1m behind schedule)
- Import into new database started (1m behind schedule)
- WAL files filled up data volume
- Clean up and continue
- Import complete (7m behind schedule)
- Switch Lemmy server back on (5m behind schedule)
- Exit maintenance mode (5m behind schedule)
- Maintenance complete (5m behind schedule)
- Investigating slowness (15m left in window)
- Possible fix: Performing analyze on tables (5m left in window)
- Found error: not enough slots available... researching (5m overtime)
- Updating Lemmy config to increase db pool size [5 -> 30] (35m overtime)
- …
- Maintenance complete (40m overtime)
— Ada & Kaity, The Blåhaj Team