Checklists: the most unsexy way to save millions

Damon Edwards / 

The New Yorker has a great article on the success of using checklists to tame extremely complex systems.

The primary example used in the article is intensive care units in hospitals. Anywhere you see the term “intensive care” substitute “data center” and anywhere you see a name of a medical procedure substitute the name of a technical procedure and the lessons are essentially the same.

What are the lessons?

1. Where checklists have been formalized and rigidly enforced (as a means of documenting and enforcing best practices), millions of dollars have been saved and many deaths (the ultimate “system outage”) have been avoided.

2. The concept of checklists is so simple and unsexy that their awesome saving power is often overlooked. Admit it, your inner geek yawns just thinking about checklists.

How can checklists immediately improve IT operations?

First, agree on your best practices and document them. Second, strictly enforce the rule that all operations activities must follow those procedures. Third, record the completion of each step of the procedure for troubleshooting and analysis.
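
The three steps above can be sketched in a few lines of Python. This is only an illustration, not a tool from the article: the `Checklist` and `StepRecord` classes, the procedure name, and the step names are all hypothetical, and the lambdas stand in for real checks. The procedure is documented as an ordered list, the runner refuses to skip ahead past a failed step, and every step's outcome is recorded with a timestamp for later analysis.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass
class StepRecord:
    """One line of the audit trail: which step ran, how it ended, and when."""
    name: str
    ok: bool
    finished_at: str


@dataclass
class Checklist:
    """A documented procedure: an ordered list of (step name, action) pairs."""
    name: str
    steps: List[Tuple[str, Callable[[], bool]]]
    log: List[StepRecord] = field(default_factory=list)

    def run(self) -> bool:
        """Run the steps in order, record each one, and stop at the first failure."""
        for step_name, action in self.steps:
            ok = bool(action())
            stamp = datetime.now(timezone.utc).isoformat()
            self.log.append(StepRecord(step_name, ok, stamp))
            if not ok:
                return False  # enforcement: later steps never run out of order
        return True


# Hypothetical deployment procedure; each lambda stands in for a real check.
deploy = Checklist(
    name="deploy-web-app",
    steps=[
        ("take backup", lambda: True),
        ("drain traffic", lambda: True),
        ("install release", lambda: False),  # simulated failure
        ("restore traffic", lambda: True),
    ],
)

succeeded = deploy.run()
for record in deploy.log:
    print(record.finished_at, record.name, "ok" if record.ok else "FAILED")
```

The log is the useful part: when a change goes wrong, it shows exactly which step was the last to complete.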

Sounds like such common sense, doesn’t it? If it is, then why do most IT operations fail at implementing such a simple culture of orderly change management?

Book Review: The Visible Ops Handbook

Damon Edwards / 

Operational excellence in IT always seems to be an elusive goal. The attempts you’ll see often range from “magic bullet technology” projects that rarely deliver on expectations to the addition of crushing bureaucracy that is quickly circumvented and rendered ineffectual.

With these thoughts in mind, I was leery when I picked up a copy of The Visible Ops Handbook. Wow, was I ever surprised. The Visible Ops Handbook is a compact and highly effective prescription for achieving operational excellence. It won’t get you all the way to the promised land, but it will send you down the path on solid footing.

The approach is not about implementing new technology. It’s not about ivory tower bureaucracy. The Visible Ops Handbook is about bringing reliability, accountability, and predictability to your operations through a commonsense process that doesn’t require heroic discipline or unrealistic political capital to implement.

Who should buy this book? The short answer is “everyone”. For a longer answer I’ll borrow a passage from the book’s introduction:

  • Organizations that have change management processes, but view these processes as overly bureaucratic and diminishing of productivity. There must be more to change management than bureaucracy, good intentions and scarcely attended meetings.
  • Organizations where, deep down, everyone knows that people circumvent proper processes because crippling outages, finger-pointing, and phantom changes run rampant.
  • A “cowboy culture” where seemingly “nimble” behavior has promoted destructive side effects. The sense of agility is all too often a delusion.
  • A “pager culture” where IT operations believes that true control simply is not possible, and that they are doomed to an endless cycle of break/fix triggered by a pager message at late hours of the night.
  • An environment where IT operations and security are constantly in reactive mode with little ability to figure out how to free themselves from fire fighting long enough to invest in any proactive work.
  • Organizations where both internal and external auditors are on a crusade to find out whether proper controls exist and to push madly for implementing new ones where they are not in place.
  • Organizations where IT understands the need for controls, but does not know which controls are needed first.

Yes, they are talking about you.

It’s a short read (100 pages including several appendices), so buy one for everyone in your department. Available in paperback from Amazon or as a PDF from itpi.org.

*Note: the full title is “The Visible Ops Handbook: Implementing ITIL in 4 Practical and Auditable Steps”. In my opinion the fact that ITIL is in the title is a bit misleading. There are some sidebar discussions that draw connections between the Visible Ops process and ITIL, but this is a book about how to succeed in operations first and foremost. I suspect the ITIL connection was made for marketing reasons. Don’t let it taint your opinion before you read the book.

Identi.ca: a bellwether for new open source models?

Damon Edwards / 

Twitter’s legendary outages have driven a significant number of influential bloggers and pundit types to the new identi.ca service. Identi.ca is interesting not just because it’s a Twitter clone but because it’s an Open Software Service.

An Open Software Service is defined as a service:

1) Whose data is open as defined by the open knowledge definition with the exception that where the data is personal in nature the data need only be made available to the user (i.e. the owner of that account).

2) Whose source code is:
   1. Free/Open Source Software
   2. Made publicly available.

For the majority of users of consumer services like Twitter, Facebook, or GMail, whether or not the service is open probably seems inconsequential. However, when it comes to enterprise web services this could be a very interesting trend. The open data part is likely a non-starter, but the open source aspect opens some interesting doors.

Run it yourself, have someone run it for you, or even more interesting… some combination of the two. The opportunities for real innovation under this model are fascinating. In traditional open source software, “adding value” generally meant you added special features or provided timely code updates for a fee. Under this new model, “adding value” is all about the managed services and network effects that you can provide to end users.

I’ll be eagerly watching identi.ca to see if they can make a going concern out of being a service provider when anyone can run the service for themselves.

If a CDM, SMI-S, CMDB, DASH, CMI falls in Santa Clara… will anyone hear it?

Damon Edwards / 

I’ve been asked a number of times lately if I’m going to the Management Developers Conference taking place this November in Santa Clara. The title sure sounds like something right up my alley. Well, at least it did until I checked out the agenda. In a nutshell, this is a conference of vendors talking about the latest in management standards efforts.

I’ve commented before on how the vendor-backed standards efforts are out of touch with what is going on in the IT trenches. This is yet another example. What happened to the concept that standards start with consensus on the ground and then mature from there? The ivory tower approach has been a historical failure in our space. Why not surprise us all and try a different approach this time?

In every enterprise that contends with more than a handful of servers and applications you’ll find that there’s a “management developer” of some sort. Amazingly, software vendors and standards bodies appear to have little regard for those developers’ opinions when they meet in their ivory towers to dictate the next round of vendor sports standards. How many of those real management developers would feel that this conference or these standards efforts (or any of the previous failed efforts that these current efforts are repeating) are relevant to their day-to-day lives?

Built for operations (update 1)

Damon Edwards / 

We’ve previously touched on the trend of operations having an impact on application architecture. Up to this point, the shift towards being “built for operations” has manifested itself as subtle organic changes that differed from organization to organization. If you stand back far enough, you can see it as an unmistakable trend, but there hasn’t been a common driving force.

The rise of Amazon Web Services, specifically EC2, is a remarkable force that could result in a sea change in the average developer’s assumptions. For example, why do you need persistent local storage on any one machine? In EC2, if you shut off a machine you lose everything on it that isn’t part of the template image used to instantiate it. I can’t get that instance back but I can instantly launch a dozen clones from the same “birth state”. Whoooah… that’s just a little bit different now, isn’t it?

Local writes are lost? Servers are completely built from templates? Launch fully operational clones with the push of a button? The implications of these three simple concepts alone are enough to blow a lot of people’s mental gaskets.
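
Those three ideas can be made concrete with a toy in-memory model. To be clear, this is not the AWS API: `TemplateImage`, `Instance`, and `launch` are hypothetical names invented for this sketch, and the file paths are made up. It only models the behavior described above, where each instance starts from the template's “birth state”, local writes vanish when the instance is shut off, and any number of identical clones can be launched on demand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TemplateImage:
    """Immutable "birth state" shared by every instance launched from it."""
    name: str
    files: Tuple[Tuple[str, str], ...]  # (path, content) pairs baked into the image


@dataclass
class Instance:
    image: TemplateImage
    disk: Dict[str, str] = field(default_factory=dict)
    running: bool = True

    def write(self, path: str, content: str) -> None:
        """A local write: it lives only on this instance's disk."""
        self.disk[path] = content

    def terminate(self) -> None:
        """Shutting the machine off discards everything on its local disk."""
        self.disk.clear()
        self.running = False


def launch(image: TemplateImage, count: int = 1) -> List[Instance]:
    """Start `count` instances, each with its own copy of the image's files."""
    return [Instance(image, dict(image.files)) for _ in range(count)]


image = TemplateImage("web-v1", (("/etc/app.conf", "port=8080"),))

first = launch(image)[0]
first.write("/var/log/app.log", "request served")  # not part of the template
first.terminate()                                   # the log is gone for good

clones = launch(image, count=12)                    # a dozen identical clones
print(len(clones), clones[0].disk)
```

Anything that has to outlive an instance therefore needs to be baked into the template or stored somewhere off the machine, which is exactly the assumption shift described here.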

Everyone gives Amazon props for cheap on-demand infrastructure hosting. Perhaps Amazon should get a bit more credit for pushing the art of systems architecture and management forward in a very public and massively appealing way.

Looking for “cross-over people” at O’Reilly Velocity 08

Damon Edwards / 

In the comments section of a previous post, Berkay nicely summed up why it is so difficult to solve the development to operations problem:

“There are very few people that have the crossover skills. Developers who have operations experience/knowhow and operations people who have development/deployment experience is rare. Further there are organizational silos enforcing this divide.”

The observed gap between development personnel and operations personnel is a subject we’ve touched on before. Much of the success of running an efficient business based on online services depends on closing this gap.

The first step to closing the development to operations gap is getting everyone talking and establishing a common vocabulary. Any event that promotes these types of discussions is a good thing in our book.

O’Reilly’s new conference, Velocity, should be a good forum to hold these conversations. Alex and I will be attending Velocity and will also be participating as exhibitors (with ControlTier). The bulk of the conference agenda is focused on infrastructure design and management rather than application deployment and management, but the presence of both is a good sign.

If you are attending the conference, we look forward to meeting you. If you aren’t registered yet, you can use the code “vel08js” for 20% off.
