Can your operations survive the “toss test”?
I’ve heard this a couple of times and it still makes me chuckle. The best part is that it is actually a great litmus test. If anyone knows who came up with this, let me know.
How to tell if you’ve automated enough of your operations…
1. Grab any machine, rip it out of the rack, and throw it out of the window. Can you automatically re-provision your systems and return the affected application services to their previous state in minutes? (No cheating by failing over to a standby cluster or alternate facility.)
2. Grab any engineer and throw him or her out of the same window. Can your operations proceed as normal?