Following up from yesterday’s post.
Many of us technicians like to understand complex things and learn about the roots of hard problems.
We are also very well versed in technology, how it works – how to make machines and advanced sophisticated designs. But at our jobs we work in the context of the business. It’s the business outcome that is the important context to always keep foremost in our minds. Well, at least it should.
“We have to grasp not only the Know-How but also ‘Know Why…’”, – Shigeo Shingo (Toyota)
From time to time, the business comes to us asking for help to reach a desired outcome. The outcome might be to launch a new product, add new features to the web site, anticipate a bunch of new customers, open a new market, or shorten lead time for customer requests.
So, we technicians put on our thinking caps and get excited because we are going to get to make something.
“What’s the best solution?”, we ask ourselves. “It better cover all contingincies for changing scale or future concerns”, say the far thinkers. “It’s a big project with lots of moving parts”, exclaims the project manager.
What often happens? We technicians get mired in how we solve the problem, using all the tricks up our sleeves, relishing in our know how. We know how that goes. We say: “Meta-object protocols will make our solution infinitely extendable, even dynamically! This inheritance hiearchy design coupled with good component composition will ensure the software architectured is correctly multi layered.” Blank stare and reply from the business manager: “What?” We say: “Oh it’s superior technology”
Admit it. We’ve all been there saying or listening to this kind of stuff.
What stands between us and our goal is the complexity of the solution.
“The most dangerous kind of waste is the waste we do not recognize.” – Shigeo Shingo (Toyota)
Complexity is a killer. We already live in a dynamic, fast paced world and the business is always changing. Our work world is complex so why make our solutions complex? Complex solutions are fragile, hard to roll out and adopt, time consuming to fix and extend. Complex designs just make life harder. Does the business outcome depend on any of this complexity in the solution? If not, we are gold plating and are just creating a form of waste.
So, how do we avoid building the complexity? Here’s some good rules to live by:
1. Stay focused on the “Know Why”.
Be clear you understand the desired outcome the business expects. Are you sure about it? Do others agree with your interpretation. If not, you’ll be alone defending your choices or you might suffer living with your own decisions.
2. Don’t fall prey to your “Know How”.
Watch out for the inclination to create advanced/sophisticated/over-engineered designs and implementions. This just adds time, risk, and money to get the work done.
3. Be disciplined.
Building a new solution and migrating from the “old way” of doing things to the new way is by definition a transformation. The new way will require its own new “Know How” and will not be perfect/complete/usable the first time (or the 2nd, 3rd… time). If moving to the new solution is painful, you have to stay disciplined and keep iterating to make the solution less painful. Stay focussed on the outcome the business cares about and expects from you as your guiding principle.
4. Do the simplest thing.
Choosing products and tools that are inherently simple to use, require little Know How from everyone. Tool shininess gets the heart rate up because new is fun but this isn’t what the business cares about.
You know the old saying, the shortest path between two points is a straight line.
Back before the hectic end of the year I was interviewed by HP’s Discover Performance newsletter and online magazine. The questions were about applying DevOps thinking inside enterprises can enable the pace of innovation without increasing risk.
Below is the interview in full. If you like this interview, I recommend signing up for the Discover Performance newsletter. They routinely have good articles on interesting and relevant topics and avoid injecting too much self-serving HP bias (a difficult task for enterprise funded content!).
DevOps expert Damon Edwards discusses why Ops should neither resist innovation nor be a scapegoat when things go wrong.
Innovation is a mantra in business, one that the CIO hears more and more. As IT leaders feel pressure to be more responsive, faster moving, and more innovative, Operations leaders worry that their mission—the smooth, steady delivery of high-quality IT services—may be jeopardized by rushed experimentation.
Damon Edwards, co-founder of IT consultancy DTO Solutions, has spent more than a dozen years working on web operations from both the IT and business angles. A major DevOps proponent, he recently posted about “using DevOps to turn IT into a strategic weapon.” Discover Performance reached out and asked him to talk about how Operations leaders—and IT executives in general—should approach innovation.
The (completely achievable) goal, he says, aligns IT goals with business goals by “removing all of the bottlenecks, inefficiencies, and risks between a business idea (the ‘ah-ha!’) and a measurable customer outcome (the ‘ka-ching!’).”
DE: “Disconnect” has become somewhat of an ugly euphemism inside corporations. Unfortunately it’s become code for “I’m right and you’re wrong.” In reality, a “disconnect” is usually just two people operating and making assumptions based on differing definitions. As a result, you get unfortunate infighting between people who, in all other ways, both desperately want the company to succeed.
Talk to folks in the technology roles and they tend to see innovation as being synonymous to invention. There is a rich legacy of invention in the technology world. Getting your name on a patent was an ultimate trophy. Much of the myth and lore of tech and geek culture is built on a love of tinkering and invention. Now contrast that to what you see when you visit the business folks. They see innovation as the application of new ideas to create value for their current customers and to attract new customers. Unfortunately, now that you can win a patent for what is essentially a business idea, the invention/innovation distinction is even more muddled.
If executives let the two core parts of the company operate under completely different definitions, you’re bound to have conflicts and gridlock. You have to make it clear what innovation is and isn’t for your company, and how you’re going to measure its impact.
DE: There is always some level of risk with innovation because you’re operating in the unknown. You don’t know if the customers will respond. You don’t know if the response will be what you want or expect. When the revenue and health of a company are tied to getting a large number of favorable responses, there is risk.
But I should be clear that innovation, especially on the web, should be low risk from a technology perspective. You reach your customers through standard interfaces and over standard protocols. We know how to deploy safely 20 times a day. We know how to scale services to hundreds of millions of users. We know how to manage petabytes of data. If you’re running a web company, your innovation risk should almost exclusively be on the business end, not the technology end.
DE: Again, I’d ask what went wrong organizationally that put the Ops team in a position of risk. Was the business asking for something that was never done before and needed some never-thought-of-before technology to work? Doubtful. Did the developers change their underlying framework or introduce new technology that wasn’t properly vetted or Ops didn’t know how to handle? Possible. Did Ops upgrade or switch a technology component? Also possible.
My point is that, while Ops is the common scapegoat, the problem often started somewhere else and likely had nothing to do with the business being more innovative. So Ops gets blamed—which is like blaming the canary for the gas in the coal mine—and in response Ops starts saying no all the time. Suddenly “innovation” is the bad guy when it really had nothing to do with it.
DE: Innovation is a numbers game because, like most things in life, business has a countdown clock. If you’re a startup, it’s a simple clock. It’s the amount of cash left in the bank. If you are an established company, it’s a bit more complicated. It could be how long until a competitor beats you to the punch. It could be how long the CEO gives you to meet a goal. The point is, one way or another, you have a finite amount of time to absolutely delight the hell out of your customers by figuring out what they want and delivering exactly that to them.
You don’t control the clock and you don’t control the customer—what do you control? You can control the number of chances you get to delight the customer before the clock runs out. That’s where DevOps comes in. DevOps aims to remove all of the bottlenecks, inefficiencies, and risks between a business idea (the “ah-ha!”) and a measurable customer outcome (the “ka-ching!”). When you remove all of that, you get a lightning-fast and highly reliable service delivery pipeline that spans from the edge of Development all the way to the datacenter. That allows you to run more experiments, get faster feedback, and take more “shots at the prize.”
DE: There was a movie called “Charlie Wilson’s War” that had a great line between Tom Hanks, playing a U.S. Congressman, and Philip Seymour Hoffman, playing a CIA agent. Hoffman asks, “Why is Congress saying one thing and doing nothing?” Hanks replies, “Well, tradition mostly.”
All jokes aside, tradition is a powerful thing and hard to break. Tradition, or “what we’ve always done,” in IT is no different. There was a thread on Slashdot just this past month that asked whether developers should be allowed to deploy their own applications. You should have seen the outcry. The sheer number of commenters who shot down the idea as pure heresy was shocking. And the richest part of all of their denunciation was that the mob said over and over that the idea would “never work at a real company with real revenue at stake.” I thoroughly enjoyed sending that to John Allspaw, who runs all of technology at Etsy, and Jesse Robbins, who was in charge of risk and disaster planning for operations at Amazon. Etsy does over $600 million of transactions per year, and Amazon does about $50 billion in revenue. In both companies, developers are the ones who deploy and own the uptime for their own code. John’s reaction to the thread was a simple yet priceless one: “OMG.”
DE: We have a saying that we use a lot at DTO Solutions: “Moving to the cloud without changing your processes is just expensive and complicated Hosting 2.0.” The cloud gives you a new abstraction layer that provides all sorts of benefits in the form of flexibility and speed. But to take advantage of those benefits, you first must change your application lifecycle and operating procedures. Furthermore, you have to revisit the architecture and deployment model for your applications. Often you’ll find that the choices that were made in the past were based on outdated ideas like the need for hardware conservation or to fit a monolithic codebase into a waterfall project delivery cycle. The conditions have changed, so companies need to rethink how and why they do things within the context of the new conditions.
The cloud removes all sorts of infrastructure barriers that makes moving at a faster pace even possible. DevOps addresses the process and cultural issues. Agile addresses the software development process issues. Customer Development and The Lean Startup remove the business process issues. You add it all up and you are ready for your organization to move at speeds that you never thought were possible.
I recently made a couple of additional videos about the curiosity that is the Rerun project. You can find them below.
The conventional wisdom on shell scripts is that… well… “shell scripts suck”. But why? Shell as a language is extremely powerful and useful but shell scripts can quickly become unwieldy when trying to use amongst a team or in long-lived operations. But what if you had a framework that solved the team-level problems and the lacking of standardization while letting you use the full power and familiarity of shell? Enter Rerun, a simple tool that turns your favorite shell scripts into modular automation that has standardized options handling, command line completion, documentation generation, and a built-in test framework. Suddenly shell scripts don’t suck so bad anymore.
Why am I so interested in Rerun? Because I’ve seen Rerun have a positive effect on a very real human problem: In most non-startup organizations, the DevOps divide is made worse by a mismatch of skills, tools, and technologies.
It’s common for the Ops team to have used a tool like Puppet to automate server config and image building. But when it comes to app deployment and config, the various app teams don’t have the Puppet skills or motivation to follow suit. So each app team picks their own tooling or glue language. Of course, this just confuses Ops and makes their lives more difficult. Sometimes there will be a centralized release team (often now awkwardly rebranded as the “DevOps Team”) that will attempt to pick their own solution. But, neither Dev nor Ops ends up bring happy with the choice and the “DevOps Team” is now the bottleneck in the middle. Lots of noble DevOps intentions die in scenarios like this.
The effect of Rerun is that everyone can now come to the table on equal footing and use shell as their lingua franca. They can learn to collaborate using a simple framework for the “glue” that holds things together (of course, Ops still builds server images using a config management tool and Dev still builds their apps the way they want to). The built-in documentation generation and test automation framework makes handoffs easier. Everyone knows the simple command and options interfaces, but can also read each others code if need be (it’s just shell scripts, after all). Once you get everyone engaged and contributing to bridging the DevOps Gap, you can collaboratively start to look to other newer, specialized solutions.
I have to get Rerun’s creator, Alex Honor, to do a full post on Rerun. In the meantime you might find these videos interesting:
Video 1: Chuck Scott gives a tour of how he uses Rerun to turn his “keeper scripts” into reusable, standardized, test-driven automation
Video 2: Group discussion with Anthony Shortland, Lee Thompson, Chuck Scott that looks at an example of a DevOps toolchain automated with Rerun
Need a simple and self-contained* way to automate the full lifecycle of a Jenkins instance (install, uninstall, manage plug-ins, manage jobs, etc.)? Anthony Shortland shows how he gets it done with Rerun.
(*Why simple and self-contained? Many reasons… the company-wide adoption of full config management solution is proceeding at uneven pace, the need to use a lowest-common denominator language so you can have simple handoffs, you want to avoid “religious” tool wars, you need a very small footprint, you need it to be totally portable, …. and the list goes on)
Here is where you can find the Jenkins Rerun module:
Minimizing time to market and getting faster feedback from customers are primary concern for businesses who want to stay competitive. You need to be able to go from a business idea to a customer-facing running service as quickly, reliably, and effortlessly as possible. This as a flow of work that crosses many organizational silos.
Where does this flow often bog down? Handoffs. Whether the handoffs are within a team (e.g. Dev to Dev) or between teams (e.g. Dev to Ops), there is always the need to pass work from one stage of the lifecycle to the next.
At DTO Solutions, our clients are often already aware that they have flow problems when they ask us to for help. When we use techniques like Value-Stream Mapping to learn how the work flows, handoff problems are prominent forms of waste that jump off of the page. The diagram below uses pie charts to highlight the relative time lost due to difficult handoffs during the product life cycle.
What are common reasons for difficult handoffs?
Handoff problems affect organizations, both big and small. Obviously, one answer to solving handoff problems is to minimize them. But if you are in an organization larger than just a handful of people, that just isn’t a realistic option. To decrease time to market and enable fast feedback, you are going to have to roll up your sleeves to solve the handoff problems.
Where are good places to start making handoffs smoother?
Here are a few of the top fixes that we find important for solving handoff problems at their source:
In subsequent posts, we’ll address each one of these fixes.