
Archive for the ‘DevOps’ Category

Deployment management design patterns for DevOps

Alex Honor

If you are an application developer, you are probably accustomed to drawing from established design patterns. A system of design patterns can play the role of a playbook, offering solutions based on combining complementary approaches. Awareness of design anti-patterns can also help you avoid future problems arising from typical pitfalls. Ideally, design patterns can be composed together to form new solutions. Patterns also provide an effective vocabulary for architects, developers, and administrators to discuss problems and weigh possible solutions.

It’s a topic I have discussed before, but what happens once the application code is completed and must run in integrated operational environments? For companies that run their business over the web, the act of deploying, configuring, and operating applications is arguably as important as writing the software itself. If an organization cannot efficiently and reliably deploy and operate the software, it won’t matter how good the application software is.

But where are the design patterns embodying best practices for managing software operations? Where is the catalog of design patterns for managing software deployments? What is needed is a set of design patterns for managing the operation of a software system in the large. Design patterns like these would be useful to anyone who automates these tasks, and they would be a boon to tool developers who have adopted the “infrastructure as code” philosophy.

So what are typical design problems in the world of software operation?

The challenges faced by software operations groups include:

  • Application deployments are complex: they are based on technologies from different vendors, are spread out over numerous machines in multiple environments, use different architectures, and are arranged in different topologies.
  • Management interfaces are inconsistent: every application component and supporting piece of infrastructure has a different way of being managed, both in how components are controlled and in how they are configured.
  • Administrative management is hard to scale: as the layers of software components increase, so does the difficulty of coordinating actions across them. This is especially hard when the same application can be set up to run in a minimal footprint in one environment and to support massive load in another.
  • Infrastructure sizes differ: software deployments must run in environments of different sizes. Infrastructure used for early integration testing is smaller than that supporting production. Infrastructure based on virtualization platforms also introduces the possibility of environments that can be re-scaled based on capacity needs.

Facing these challenges first hand, I have evolved a set of deployment management design patterns using a “divide and conquer” strategy. This strategy helps identify minimal domain-specific solutions (i.e., the patterns) and how to combine them in different contexts (i.e., using the patterns systematically). The set of design patterns also includes anti-patterns. I call the system of design patterns “PAGODA”. The name is not really important, but as an acronym it can mean:

  • PAtterns GOod-for Deployment Administration
  • PAckaGe-Oriented Deployment Administration
  • Patterns for Application and General Operation for Deployment Administrators
  • Patterns for Applications, Operations, and Deployment Administration

Pagoda as an acronym might be a bit of a stretch, but the image of a pagoda strikes me as a picture of how the set of patterns can be combined to form a layered structure.

 

Here is a diagram of the set of design patterns arranged by how they interrelate.

 

The diagram style is inspired by a great reference book, Release It!: the anti-patterns are colored red, while the design patterns that mitigate them are in green.

Here is a brief description of each design pattern:

| Pattern | Description | Mitigates | Alternative names |
|---|---|---|---|
| Command Dispatcher | A mechanism used to look up and execute logically organized, named procedures within a data context, permitting environment abstraction within the implementations. | Too Many Tools | Command Framework |
| Lifecycle | A formalized series of operational stages through which the resources comprising application software systems must pass. | Control Hairball | |
| Orchestrator | Encapsulates a multi-step activity that spans a set of administrative steps and/or other process workflows. | Control Hairball, Too Many Cooks | Process Workflow, Control Mediator |
| Composable Service | A set of independent deployments that can be assembled together to support new patterns of integrated software systems. | Monolithic Environment | Composable Deployments |
| Adaptive Deployment | The practice of using an environment-independent abstraction along with template-based automation that customizes software and configuration at deployment time. | Control Hairball, Configuration Bird Nest, Unmet Integration | Environment Adaptation |
| Code-Data Split | The practice of separating the executable files (the product) from the environment-specific deployment files, such as configuration and data files, to facilitate product upgrades and co-resident deployments. | Service Monolith | Software-Instance Split |
| Packaged Artifact | A structured archive of files used for distributing any software release during the deployment process. | Adhoc Release | |
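
To make the Command Dispatcher pattern concrete, here is a minimal sketch in Python. The command names, context keys, and procedures are my own invention for illustration, not part of the pattern catalog: a registry maps logical command names to procedures, and each procedure receives a data context so implementations stay abstracted from any particular environment.

```python
# Hypothetical sketch of the Command Dispatcher pattern: named procedures are
# looked up in a registry and executed within a data context that abstracts
# away environment-specific details. All names here are illustrative only.
from typing import Callable, Dict

# The data context: environment details that procedures should not hard-code.
context = {"env": "staging", "app_root": "/opt/myapp", "service_port": 8080}

registry: Dict[str, Callable[[dict], None]] = {}

def command(name: str):
    """Register a named procedure in the dispatcher's lookup table."""
    def wrap(fn: Callable[[dict], None]):
        registry[name] = fn
        return fn
    return wrap

@command("app:start")
def start(ctx: dict) -> None:
    print(f"starting service on port {ctx['service_port']} in {ctx['env']}")

@command("app:stop")
def stop(ctx: dict) -> None:
    print(f"stopping service in {ctx['env']}")

def dispatch(name: str, ctx: dict) -> None:
    """Look up a procedure by name and execute it within the data context."""
    if name not in registry:
        raise KeyError(f"unknown command: {name}")
    registry[name](ctx)

dispatch("app:start", context)  # one front-end, many named procedures
```

The point is that the operator sees a single, uniform command interface rather than one tool per technology, which is exactly the Too Many Tools exposure this pattern mitigates.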

The anti-patterns might be even more interesting, since they represent practices with definite disadvantages:

| Anti-Pattern | Description | Mitigated by | Alternative names |
|---|---|---|---|
| Too Many Tools | Each technology and process activity needs its own tool, resulting in a multitude of syntaxes and semantics that must each be understood by the operator, and making automation across them difficult to achieve. | Command Dispatcher | Tool Mishmash, Heterogeneous Interfaces |
| Too Many Cooks | A common infrastructure must be maintained by various disciplines, but each uses its own tools to effect change, increasing the chances for conflicts and overall negative effects. | Control Mediator | Unmediated Action |
| Control Hairball | A process that spans activities across various tools and network locations is implemented in a single piece of code for convenience, but turns out to be very inflexible, opaque, and hard to maintain and modify. | Control Mediator, Adaptive Deployment, Workflow | |
| Configuration Bird Nest | A network of circuitous indirections used to manage configuration, intertwined like a labyrinth of straw in a bird nest. People often construct a bird nest in order to provide a consistent location for an external dependency. | Environment Adaptation | |
| Service Monolith | Complex integrated software systems end up being maintained as a single opaque mass, with no one understanding entirely how it was put together, what elements it comprises, or how they interact. | Code-Data Split, Composable Service | House of Cards, Monolithic Environment |
| Adhoc Release | The lack of standard practices and distribution mechanisms for releasing application changes. | Packaged Artifact | |
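
As an illustration of how Adaptive Deployment heads off the Configuration Bird Nest, here is a small, hypothetical sketch: one environment-independent template is specialized at deployment time from per-environment data. The template keys and environment names are invented for the example.

```python
# Hypothetical sketch of Adaptive Deployment: a single environment-independent
# template is customized at deployment time from a per-environment dictionary,
# instead of maintaining a nest of hand-edited, per-environment config files.
from string import Template

CONFIG_TEMPLATE = Template(
    "db_host=$db_host\n"
    "db_pool_size=$pool_size\n"
    "log_level=$log_level\n"
)

ENVIRONMENTS = {
    "qa":         {"db_host": "qa-db.internal",   "pool_size": 5,  "log_level": "DEBUG"},
    "production": {"db_host": "prod-db.internal", "pool_size": 50, "log_level": "WARN"},
}

def render_config(env: str) -> str:
    """Produce environment-specific configuration from the shared template."""
    return CONFIG_TEMPLATE.substitute(ENVIRONMENTS[env])

print(render_config("qa"))  # same template yields a different file per environment
```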

 

Of course, this isn’t the absolute set of deployment management patterns. No doubt you have discovered and developed your own. It is useful to identify and catalog them so they can be shared with others who will face scenarios already examined and resolved. Perhaps the set offered here will spur a greater effort.

 

 

6 Months In: Fully Automated Provisioning Revisited

Damon Edwards

It’s been about six months since I co-authored the “Web Ops 2.0: Achieving Fully Automated Provisioning” whitepaper along with the good folks at Reductive Labs (the team behind Puppet). While the paper was built on a case study about a joint user of ControlTier and Puppet (and a joint client of my employer, DTO Solutions, and Reductive Labs), the broader goal was to start a discussion around the concept of fully automated provisioning.

So far, so good. In addition to the feedback and lively discussion, we’ve just gotten word of the first independent presentation by a community member. Dan Nemec of Silverpop gave a great presentation at AWSome Atlanta (a cloud-computing-focused meetup). John Willis was kind enough to record and post the video.

I’m currently working on an updated and expanded version of the whitepaper and am looking for any contributors who want to participate. Everything is being done under the Creative Commons (Attribution – Share Alike) license.

The core definition of “fully automated provisioning” hasn’t changed: the ability to deploy, update, and repair your application infrastructure using only pre-defined automated procedures.

Nor have the criteria for achieving fully automated provisioning changed (a sketch of the first criterion follows the list):

  1. Be able to automatically provision an entire environment — from “bare-metal” to running business services — completely from specification
  2. No direct management of individual boxes
  3. Be able to revert to a “previously known good” state at any time
  4. It’s easier to re-provision than it is to repair
  5. Anyone on your team with minimal domain-specific knowledge can deploy or update an environment
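
As a rough sketch of what that first criterion implies: the provisioning run is driven entirely by a machine-readable specification, never by hand-typed commands. The spec format and bootstrap steps below are invented placeholders, not any real tool’s format.

```python
# Hypothetical sketch: an environment is provisioned only from a specification.
# The spec format and the bootstrap commands are invented for illustration.
import json
import subprocess

SPEC = json.loads("""
{
  "environment": "qa",
  "nodes": [
    {"role": "web", "count": 2, "bootstrap": ["echo install-os", "echo apply-config"]},
    {"role": "db",  "count": 1, "bootstrap": ["echo install-os", "echo restore-data"]}
  ]
}
""")

def provision(spec: dict) -> None:
    """Walk the specification and run every bootstrap step; no manual steps."""
    for node in spec["nodes"]:
        for i in range(node["count"]):
            for step in node["bootstrap"]:
                # shell=True is used only for brevity in this sketch
                subprocess.run(step, shell=True, check=True)
            print(f"provisioned {node['role']}-{i} in {spec['environment']}")

provision(SPEC)
```

Because the spec is the single source of truth, re-running it against a fresh environment satisfies the “re-provision rather than repair” criterion as well.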

The representation of the open source toolchain has been updated and currently looks like this:

 

The new column on the left was added to describe the kind of actions that take place at the corresponding layer. The middle column shows each layer of the toolchain. The right column gives examples of existing tools for each layer.

There are some other areas that are currently being discussed:

1. Where does application package management fall?
This is an interesting debate. Some people feel that all package distribution and management (system and application packages) should take place at the system configuration management layer. Others think it’s appropriate for the system configuration management layer to handle system packages and for the application service deployment layer to handle application and content packages.

2. How important is consistency across lifecycle?
It’s difficult to argue against consistency, but how far back into the lifecycle should the fully automated provisioning system reach? All Staging/QA environments? All integrated development environments? Individual developer’s systems? It’s a good rule of thumb to deal with non-functional requirements as early in the lifecycle as possible, but that imposes an overhead that must be dealt with.

 
3. Language debate
With a toolchain you are going to have different tools with varying methods of configuration. What kind of overhead are you adding because of differing languages or configuration syntaxes? Does individual bias towards a particular language or syntax come into play? Is it easier to bend (or, some would say, abuse) one tool to do most of everything, rather than use a toolchain that lets each tool do what it’s supposed to be good at?

4. New case study
I’m working on adding additional case studies. If anyone has a good example of any part of the toolchain in action, let me know.

Tools are easy. Changing your operating culture is hard.

Damon Edwards

Did you ever notice that our first inclination is to reach for a tool when we want to change something? What we always seem to forget is that web operations, as a discipline, is only partially about technology.

The success of your web operations depends more on the processes and culture your people work within than it does on any specific tooling choices. Yet, as soon as you start talking about changing/improving operations the conversation quickly devolves into arguments about the merits of various tools.

We see this repeatedly in our consulting business. Time after time we are called in to do a specific automation project and wind up spending the bulk of the effort as counselors and coaches helping the organization make the cultural shift that was the real intention of the automation project.

This article from InfoQ on how difficult it is to get development organizations to adopt an agile culture is a superb encapsulation of the difficulty of cultural change. Switch the word “development” to “web operations” and switch “Agile” to any cultural change you want to make, and the article still holds up.

This condition really shouldn’t be a surprise to any of us. After all, how much time do we really spend, as an industry, discussing cultural, process, and organizational issues? Compare the number of books and articles written about the technology part of web operations with the number written about the people part. The ratio is abysmal, especially when you compare it to other types of business operations (manufacturing, finance, service industries, etc.).

UPDATE: The Agile Executive brings up the point that tools are valuable for enforcing new behavior. I definitely agree with that… but I still maintain that adopting new tools without a conscious decision to change behavior and culture is, more often than not, an approach that will fail.

Automated Infrastructure enables Agile Operations

Alex Honor

“Agile” has been applied to such unanticipated domains as enterprises, start-ups, investing, etc. Agile encompasses several generic, common-sense principles (e.g., simple design, sustainable pace, many incremental changes, action over bureaucracy), so the desire to bestow its virtues on all kinds of endeavors is understandable.

But why contemplate the idea of Agile Operations? Why would Agile Operations even make sense?

Let’s start by playing devil’s advocate. Some of the Agile principles appear to contradict well-established and accepted systems administration goals, namely stability and availability. Traditional culture in operations leans towards risk-aversion and stasis in an attempt to assure maximum service levels. Many operations groups play a centralized role serving multiple business lines and have evolved to follow a top-down, command-and-control management structure that wants to limit access to change. From their point of view, change is the enemy of stability and availability. With stability and availability as the primary goals of operations, it’s easy to see where the skepticism towards Agile Operations comes from.


The call for Agile Operations has initially been driven by product development groups that employ Agile practices. These groups churn out frequent, small improvements to their software systems on a daily basis. This difference in change management philosophy has been the cause of a growing clash between development and operations. The clash intensifies when the business wants to drive these rapid product development iterations all the way through to production (even 10+ times a day).


So, if operations is to avoid being a bottleneck to this Agile empowered flow of product changes, how can they do it in a way that won’t create unmanageable chaos?

To apply Agile to the world of operations, one must first see all infrastructure as programmable. Rather than seeing infrastructure as islands of equipment that were set up by reading a manual and typing commands into a terminal, one sees infrastructure as a set of components that are bootstrapped and maintained through programs. In other words, infrastructure is managed by executing code, not by directly applying changes manually at the keyboard.
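
To make that concrete, here is a deliberately tiny, hypothetical sketch of the idea: desired state is declared in code, and running the program converges the machine to that state idempotently, so running it twice changes nothing the second time. Real tools such as Puppet do this declaratively and at far larger scope; the paths and file contents here are invented.

```python
# Hypothetical sketch of infrastructure managed as code: desired state lives
# in a program and is converged by running it, not by hand-edits at a terminal.
# Paths and contents are invented for illustration.
import os

DESIRED_FILES = {
    "/tmp/demo/app.conf": "port=8080\nworkers=4\n",
    "/tmp/demo/motd.txt": "managed by code -- do not edit by hand\n",
}

def converge(desired: dict) -> None:
    """Idempotently bring each file to its desired content."""
    for path, content in desired.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        current = None
        if os.path.exists(path):
            with open(path) as f:
                current = f.read()
        if current != content:
            with open(path, "w") as f:
                f.write(content)
            print(f"changed {path}")
        else:
            print(f"unchanged {path}")

converge(DESIRED_FILES)  # safe to run repeatedly; same end state every time
```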


Replacing manual tasks with executable code is the crucial enabler for sharing a common set of change management principles between development and operations. This alignment is truly the key first step in allying development and operations to support the business’s time-to-market needs. This shared change management model also facilitates a few additional beneficial practices:

  • Shared code bases: Store and control application and infrastructure code in the same place so both dev and ops staff have clear visibility into everything needed to create a running service.
  • Collaborative configuration management: Application and infrastructure configuration management code can be jointly developed early in the development cycle and tested in development integration environments. Code and configuration become the currency between dev and ops.
  • Skill transfer: App and ops engineers can transfer knowledge about the inner workings of the runtime application system and develop skills around tooling to maintain them.
  • Reproducibility: Reproducing a running application from source and a build specification is vital to managing a business at scale. (http://www.itpi.org/home/visibleops.php)

While some may argue that “Agile” in its entirety does not completely apply to the world of operations, an automated infrastructure based on principles like code sharing as a form of collaboration between dev and ops is a sound basis to enable business agility.

 

10+ Deploys Per Day: Dev and Ops Cooperation at Flickr

Alex Honor

 

The Flickr guys, John Allspaw and Paul Hammond, gave an entertaining and validating presentation at O’Reilly Velocity (slides).

 

The talk began with a brief description of how Flickr’s technology group enabled the business to deliver features and update their online service frequently (10+ deploys per day), but it really turned out to be a success story about how dev and ops can align and work together without falling into age-old cross-organizational conflicts.


Here are a few (paraphrased) quotes:

  • Ops’ job is to enable the business (not to keep the site stable and fast).
  • The business requires change… so lower the risk of change through tools and culture. Make more frequent, but smaller, changes… through push-button (and hands-off) deployment tools.
  • Plan fire drills to make sure everyone (junior guys included) knows how to solve production problems, because failure will happen.
  • Ops who think like devs. Devs who think like ops.

The talk really boiled down to two ingredients that enable close dev and ops collaboration (tools + culture):

Tools

1. Automated infrastructure

2. Shared version control

3. One step build and deploy

4. Feature flags (see the sketch after this list)

5. Shared metrics

6. IRC and IM robots
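
Since feature flags may be the least familiar item on that list, here is a minimal, hypothetical sketch of the idea: new code ships to production switched off, and a flag turns it on for a controlled slice of users without a redeploy. The flag store and percentages are invented for illustration.

```python
# Hypothetical sketch of a feature flag check: code for a new feature ships
# dark and is switched on per-flag (and per-user slice) without a redeploy.
FLAGS = {"new_photo_page": {"enabled": True, "percent": 10}}

def flag_enabled(name: str, user_id: int) -> bool:
    """Enable a flag for a deterministic slice of users."""
    flag = FLAGS.get(name)
    if not flag or not flag["enabled"]:
        return False
    return user_id % 100 < flag["percent"]  # e.g., 10% of users

if flag_enabled("new_photo_page", user_id=42):
    print("render new photo page")
else:
    print("render old photo page")
```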

 

Culture

1. Respect

2. Trust

3. Healthy attitude about failure

4. Avoiding blame

 

 

I think for some, the real validation was hearing that it’s just as much about making a cultural shift as it is about choosing and using the right kind of tools. Anybody who has worked in the trenches will, of course, realize that.

 

 

Continuous Deployment Really Means Continuous Testing

Damon Edwards

On Twitter and on web operations focused blogs, the concept of Continuous Deployment is a topic that is gaining momentum. Across our consulting clients, we’ve also seen a significant uptick in discussion around the concept of Continuous Deployment (some calling it “Agile Deployment”).

The extreme example of Continuous Deployment that has sparked the most polarizing discussions comes from Timothy Fitz’s posts on doing production deployments up to fifty times per day.

While it’s a fascinating read, many people for whom the essay is their first exposure to the idea of Continuous Deployment overlook the real value. The value is not how Fitz gets code all the way into production on a sub-daily basis. The value is in achieving a state of continuous automated testing.

If you understand the concept of “the earlier you find a bug, the cheaper it is”, the idea of continuous testing is as good as it gets. Every time a build executes, your full suite of unit, regression, user/functional, and performance tests is automatically run. In a mature operation this could quite literally mean millions of automated tests being executed every day. As your application development makes even the smallest moves forward, the application is being rigorously tested inside and out.
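
As a sketch of what that loop looks like in practice, consider a hypothetical post-build hook that runs each test suite in turn and halts the pipeline on the first failure. The suite commands and directory layout below are assumptions for illustration, not a prescription.

```python
# Hypothetical sketch of the "continuous testing" core of Continuous
# Deployment: every build triggers the whole test pyramid, and any failure
# stops the pipeline before the artifact moves forward. Commands are invented.
import subprocess
import sys

TEST_SUITES = [
    ("unit",       "python -m pytest tests/unit"),
    ("regression", "python -m pytest tests/regression"),
    ("functional", "python -m pytest tests/functional"),
]

def run_suites() -> bool:
    """Run each suite in order; report and stop at the first failure."""
    for name, cmd in TEST_SUITES:
        print(f"running {name} suite ...")
        if subprocess.run(cmd, shell=True).returncode != 0:
            print(f"{name} suite failed; halting the pipeline")
            return False
    return True

if __name__ == "__main__":
    sys.exit(0 if run_suites() else 1)
```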

Another common misconception is that Continuous Deployment means that human-powered QA cycles are a thing of the past or are somehow less important. This belief is probably a byproduct of those extreme practitioners of Continuous Deployment who are doing hot deployments to production after every build. In most business scenarios there is not much benefit to continuous production deployment. The value of a human-powered QA team sensing if the look, feel, and functionality match the requirements can’t, and shouldn’t, be overlooked.

Most of our consulting clients just aren’t interested in sub-daily deployments to live production environments. What they want out of Continuous Deployment is to have a constant state of broad automated testing and an always up-to-date QA environment for human-powered testing and business review.

Beyond deploying a broad suite of automated testing tools, you need Fully Automated Provisioning: it provides the linchpin that makes Continuous Deployment a reality.
