So … I was cleaning up our eu-central-1 region when I accidentally destroyed the entire region!!! SHIET. All I wanted to do was delete a few nodes in a cluster. Instead of running terraform destroy on a specific target, I ran the command without any options. That was my fault, BUT Terraform is smart enough to ask for confirmation before performing a destroy. Unfortunately, out of muscle memory, I just typed yes, which I often do when I know exactly what I want to destroy …
I realized something was wrong when I didn't get the prompt back as quickly as I normally do after a simple destroy. At that point I was thinking to myself, “did shit just hit the fan?” Looking at the output on the screen confirmed it!
If you work long enough in operations, this will happen to you eventually… ¯\(ツ)/¯
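One way to make this a little less likely is to wrap the destroy command so that it refuses to run without an explicit target. The snippet below is only a sketch of that idea, not something from my actual setup: the function name safe_destroy is made up, and the resource address aws_instance.node in the comment is a placeholder.

```shell
# Hypothetical guard: refuse a bare "terraform destroy" that has no -target.
# safe_destroy is an illustrative name, not part of Terraform itself.
safe_destroy() {
  has_target=0
  for arg in "$@"; do
    case "$arg" in
      # Accept both "-target=ADDR" and "-target ADDR" spellings.
      -target=*|-target|--target=*|--target) has_target=1 ;;
    esac
  done

  if [ "$has_target" -eq 0 ]; then
    echo "refusing: no -target given, this would destroy everything in the state" >&2
    return 1
  fi

  # Terraform still shows its own plan and asks for confirmation here.
  terraform destroy "$@"
}

# Example (placeholder resource address):
#   safe_destroy -target=aws_instance.node
```

Running terraform plan -destroy with the same -target first gives a preview of what would be removed, which is a good moment to actually read the output before typing yes.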
The trick is not to panic: apply what you know and solve the issue as soon as possible. Lucky for me, I had built the entire region via Terraform (which was also how I managed to destroy everything in it so quickly :P). It also helped that this region was still new. The clusters are production clusters, but they weren’t taking live traffic yet.
What I ended up doing was:
- Notified the team about what had happened.
- Once the entire region was “destroyed,” I just re-ran terraform apply and let it rebuild everything.
- That’s it!!! Confirmed everything was back and updated the team! :D
As the saying goes: “If you haven’t taken down prod, you’re not a true ops!”