Downtime

Extended API and Website Downtime Due to Database Issue

Oct 06, 2023 at 05:16pm UTC

Affected services

vanillo.tv

Vanillo API

Resolved
Oct 07, 2023 at 12:12am UTC

The issue has been resolved. Thank you for your patience.

Updated
Oct 06, 2023 at 06:06pm UTC

UPDATE: We have successfully recovered all of our database data, so no user data was lost in this process. Vanillo will be back up as normal once we fully recover our Kubernetes clusters and bring them back to full operation. Thanks for your patience!

Created
Oct 06, 2023 at 05:16pm UTC

Sometime within the last 24 hours, our database clusters suffered a catastrophic failure. The root cause has not yet been confirmed.

Per our preliminary investigation, the cause appears to be a bug in CloudNativePG, the Kubernetes operator we use for our PostgreSQL database. This bug has caused all of our database clusters to stop working entirely, with no currently identified way to recover from the error or get the database files back.

We are currently looking for ways to resolve the situation. The only usable backup is currently nineteen days old, as our automated backup system also appears to have failed around that time because of this bug.

For now, the website will remain down while we work out what our next step should be. We apologize profusely for the downtime, but unfortunately, this failure was beyond our control. We are working as fast as we possibly can to resolve this, and we hope to find a way to recover our most recent data rather than being forced to use an old backup.