GitLab database incident

For your information:

Lessons learned:

  • Engineers should get more sleep
  • Restore strategy is more important than backup strategy
  • Test backup plans regularly. If we don't test backups, we don't have them. We must recheck backup/restore plans monthly, quarterly or yearly
  • Always be careful: double- or triple-check any command run with sudo
  • Change terminal PS1 format/colors to make it clear whether you’re using production or staging
    • Red for production
    • Blue or green for staging
  • Show the full hostname in the bash prompt for all users by default (e.g. db1.staging.gitlab.com instead of just db1)
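
The two prompt suggestions above could be combined roughly as follows. This is only a sketch, not GitLab's actual configuration: the helper name prompt_color is hypothetical, matching on ".staging." is an assumption taken from the example hostname, and treating every other host as production (red) is a deliberately cautious default. It also assumes the machine's hostname is set to its fully qualified name, since \H only prints what the system reports.

```bash
# Sketch for a shared bash startup file (where it lives is up to you).
# prompt_color is a hypothetical helper: it maps a hostname to an ANSI color code.
prompt_color() {
  case "$1" in
    *.staging.*) printf '32' ;;  # green for staging
    *)           printf '31' ;;  # red for production and for anything unrecognized
  esac
}

# \H expands to the full hostname (\h would stop at the first dot).
# \[ and \] mark the color escapes as non-printing so line editing stays aligned.
PS1="\[\e[1;$(prompt_color "${HOSTNAME:-}")m\]\u@\H\[\e[0m\]:\w\\$ "
```

With this in place, a shell on db1.staging.gitlab.com would show a green user@db1.staging.gitlab.com prompt, while any host without ".staging." in its name shows red, so an unknown machine is treated as production until proven otherwise.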