GitLab database incident

For your information:

Lessons learned:

  • Engineers should get more sleep

  • Restore strategy is more important than backup strategy

  • Test backup plans. If we don't test backups, we don't have them. We must recheck backup/restore plans on a fixed schedule: monthly, quarterly, or yearly

  • Always be careful: double- or triple-check any command run with sudo

  • Change terminal PS1 format/colors to make it clear whether you’re using production or staging

    • Red for production

    • Blue/green for staging

  • Show the full hostname in the bash prompt for all users by default (e.g., db1.staging.gitlab.com instead of just db1)
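
The two prompt-related lessons above can be sketched in a few lines of bash. This is only a minimal illustration, not the configuration GitLab actually uses. The function names env_for_host and build_ps1 are hypothetical, and the rule that staging hosts contain ".staging." in their name is an assumption based on the example hostname above. Any host that does not match is treated as production, so an unrecognized name falls back to red.

```shell
# Hypothetical helper: classify a host by its fully qualified name.
# Assumption: staging hosts contain ".staging." in their FQDN; anything
# else is treated as production so that unknown hosts fail safe (red).
env_for_host() {
  case "$1" in
    *.staging.*) echo "staging" ;;
    *)           echo "production" ;;
  esac
}

# Build a PS1 string: bold red for production, bold green for staging.
# \[ and \] mark the color codes as non-printing so line editing stays aligned.
build_ps1() {
  local host="$1" color
  if [ "$(env_for_host "$host")" = "production" ]; then
    color='\[\e[1;31m\]'   # bold red
  else
    color='\[\e[1;32m\]'   # bold green
  fi
  # \u = user, \H = full hostname (\h would cut it at the first dot), \w = cwd
  printf '%s%s' "$color" '\u@\H\[\e[0m\]:\w\$ '
}

# Fall back to the plain hostname if the FQDN cannot be resolved.
PS1="$(build_ps1 "$(hostname -f 2>/dev/null || hostname)")"
```

To apply this to all users by default, the snippet could be sourced from a system-wide shell startup file (for example a file under /etc/profile.d/ on many Linux distributions) rather than from each user's ~/.bashrc. Note that \H prints whatever hostname the system is configured with, so the machine itself must be set to its full name for the prompt to show it.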
