dev2ops

Use the word “process” and confusion ensues

Damon Edwards / June 10th, 2008

In my last post about the River of News for Monitoring concept, process played a central role. In various conversations I’ve had since writing that post, it’s become clear to me that the word “process” is a tricky and overloaded word. There are lots of processes whirling through an enterprise. There are business processes (customer transactions, supplier transactions, human resources, finance, etc.). There are application processes (pretty self-explanatory). Then there are IT processes. My view of IT processes is that they are the actions that transform your IT assets and their related environments. (Note: obviously you could use an ITIL-like definition born from a standards body at this point, but I’m looking for a simple set of buckets to use without causing further debate)

There are plenty of tools that track and examine business processes. There are plenty of tools that track and examine application processes. However, when it comes to IT processes, the available tooling is quite thin. Sure, there are tools (e.g. ticketing systems, bug trackers, approval workflows, etc.) that track the HUMAN aspect of your IT processes, but they give you very little visibility into what actually took place at the system or application level. To make matters worse, under their fancy dashboards, most of these systems rely almost entirely on a human to tell them what was done or observed. In today’s complex and highly distributed environments, it’s almost impossible to get an accurate picture of what really took place using these faulty or outdated techniques.

Skeptical? Give the status quo a test. Walk into any sizable IT operations and ask them to do 2 things:

1. Show you all of the deployment activity, with the context that those activities occurred in, that took place in [pick some slice of their environment] between [pick two arbitrary points in time]. This doesn’t mean things like “Bob said he completed the steps of this process” or “Jane said she ran this set of scripts”. I’m referring to evidence of the actual technical activities that took place across the boxes.

2. Show you the entire lifecycle of [pick an arbitrary application package], from when it was built and packaged to all of its deployments throughout Dev, QA, and Prod environments.

I would be shocked if they had this information readily available. In most cases, they couldn’t produce this information at all. For many companies these systems are their source of revenue generation, their “factory floor” if you will, and they can’t answer these simple questions. In any other industry this would be wholly unacceptable.

This is the situation we are working on changing.

IT Operations Needs a “River of News”

Damon Edwards / June 2nd, 2008

Over time, user interfaces for a class of tools tend to settle on a common paradigm. User expectations and vendors’ copycat nature form a self-correcting loop that settles somewhere around a comfortable middle. For a fun example, try to spot the differences between the Yahoo and AOL homepages. User interface paradigms in the systems management world have followed this same herd mentality.

Monitoring has always been the dominant feature of systems management, so it is of little surprise that the flavor of monitoring has dominated the user interfaces and dashboards of the major enterprise systems management tools. Show me my things. Show me the state of my things. Show me some rollup numbers about my things and their states. For answering questions about things (usually servers) and the state of those things, this classic monitoring paradigm tends to work quite well.

But this point of view falls short in an area that is gaining importance: visibility into IT operations processes. As systems become increasingly distributed and IT operations moves from a back-of-the-house function to a revenue-producing core competency, visibility into process is all the more important. Who has carried out what actions? What actions were performed on what machines? What’s the history of a set of packages as they have moved from development to production? How can your organization know when a complex update process has been completed? How do you know when changes have been made outside of approved change windows? The inventory and state paradigm of traditional monitoring tools doesn’t help you much when you are asking these kinds of questions.

A new kind of monitoring paradigm is needed to answer these questions. The most promising concept I’ve seen as of late comes not from the systems management world but rather from the blogosphere. Dave Winer first popularized the concept of a “River of News” and the concept has applicability here. Simply put, Dave describes the flow of information that comes into his feed reader to be more like a river of information going by his door than a set of messages being delivered to his mailbox.

The activity that takes place within IT operations can similarly be likened to a river. Today, most of that river of activity takes place in the shadows and little, if any of it, is captured by a central system. Usually, the only way to find out about these events is by word of mouth or person-to-person email chains. Anyone who is more than 1 or 2 degrees of separation away from the event is usually flying in the dark.

So how would a river of news style tool for enterprise systems management processes work?

1. You need to create the river. All of the scripts and tools you use to build your deployment artifacts and manipulate your environments need to dump events into the river. As a side benefit, creating the river is an incentive to enforce the rule that changes must only be made through change management tooling.

2. You need to create filters for the river. Filters allow you to view the river from a certain point of view (like a package, user, node, environment, etc.) or only view events that happened between certain points in time. You’re going to need both common views and the ability to set up and share ad-hoc views.

3. You need to set up notifications. You can’t watch the river at all times and, unlike the blogosphere, there are some pieces of news that you just can’t miss. You need to be able to set traps that watch for events or a series of events that match particular patterns. When those traps are hit you then have to make sure the right people are notified and sent the relevant information.

4. You need to introduce management dashboards and auditing reports. Keep the senior managers, bean counters, and compliance auditors happy and your life will be happier.
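To make the first three steps concrete, here is a minimal Python sketch of a river with filters and traps. It is only an illustration under stated assumptions: the `Event` fields, the `River` class, and its method names are hypothetical and are not taken from ReportCenter or any other ControlTier tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One entry in the river. Field names are illustrative assumptions."""
    timestamp: datetime
    user: str
    node: str
    environment: str
    package: str
    action: str


class River:
    """Hypothetical in-memory river of operations events."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._traps: List[Tuple[Callable[[Event], bool], Callable[[Event], None]]] = []

    def emit(self, event: Event) -> None:
        """Step 1: scripts and tools dump their events into the river."""
        self._events.append(event)
        # Step 3: every trap gets a look at the event as it flows by.
        for matches, notify in self._traps:
            if matches(event):
                notify(event)

    def view(self, start: Optional[datetime] = None,
             end: Optional[datetime] = None, **fields: str) -> List[Event]:
        """Step 2: filter by time window and by exact match on any event field."""
        result = []
        for event in self._events:
            if start is not None and event.timestamp < start:
                continue
            if end is not None and event.timestamp > end:
                continue
            if all(getattr(event, key) == value for key, value in fields.items()):
                result.append(event)
        return result

    def trap(self, matches: Callable[[Event], bool],
             notify: Callable[[Event], None]) -> None:
        """Step 3: register a pattern and the callback that notifies someone."""
        self._traps.append((matches, notify))
```

With something like this in place, `river.view(package="storefront-1.2")` would return the recorded history of one package across environments, and `river.trap(lambda e: e.environment == "prod", page_on_call)` would call a (hypothetical) `page_on_call` function for every production event. A real tool would need persistent storage and pattern matching across a series of events, which this sketch leaves out.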

This whole idea is still a pretty fresh one. In upcoming releases of ControlTier’s ReportCenter, we are going to be introducing these features and seeing where they take us. Any and all comments or suggestions are more than welcome.

One step closer to Datacenter WYSIWYG?

Alex Honor / May 21st, 2008

One of the great opportunities that came from being a JavaOne exhibitor was the chance to meet a variety of people: everyone from business managers to day-to-day developers. Many had a story to tell about their dev-to-ops problems.

One visitor to our booth introduced me to two acronyms I hadn’t heard before, but they certainly reflect the philosophy that drives the development of our open source software projects. When I was guiding him through the demo of our Workbench CMDB tool, he pointed out that this was the “WIRSB” aspect of our tool. Next, when I showed how our execution framework and automation modules were driven by the CMDB data, bringing the environment in line with the model state, he exclaimed that this was the “WIRI”.

WIRSB? WIRI? These were two acronyms that were new to me and I stopped and asked him what those meant.

WIRSB: What It Really Should Be
He explained how a tool that changed the state of an application – either its topology, configuration, or runtime state – should be based on an integrated model. This model should include information about the machines, software packages, essential configuration settings, as well as, a model for how the change will actually be done. It’s the specification for how things should be.

WIRI: What It Really Is
With the specification described by WIRSB in hand, one can imagine that management tools should be able to compare the current reality to the idealized form described by the specification. Differences represent cases of non-compliance. Additionally, it should be the job of other management tools to transform the environment to match the specification.

The ultimate goal is a minimal difference between the WIRSB and WIRI pictures at any point in time. If you look at emerging next generation solutions, they each seem to incorporate the paradigm of WIRSB vs WIRI to some extent. Solutions that don’t adopt this paradigm will eventually become scaling bottlenecks that really do slow down innovation.
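The comparison between the two pictures can be sketched in a few lines of Python. This is a simplified illustration, assuming both WIRSB and WIRI can be flattened into a dictionary of settings per node; the function name `diff_state` and the data layout are hypothetical, not part of Workbench or any real CMDB.

```python
from typing import Any, Dict, Tuple

State = Dict[str, Dict[str, Any]]  # node name -> {setting: value}


def diff_state(wirsb: State, wiri: State) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Return {node: {setting: (expected, actual)}} for every mismatch.

    A setting missing from one side is reported with None on that side.
    """
    drift = {}
    for node in sorted(set(wirsb) | set(wiri)):
        expected = wirsb.get(node, {})
        actual = wiri.get(node, {})
        mismatches = {
            key: (expected.get(key), actual.get(key))
            for key in sorted(set(expected) | set(actual))
            if expected.get(key) != actual.get(key)
        }
        if mismatches:
            drift[node] = mismatches
    return drift
```

An empty result means WIRSB and WIRI agree. Each entry in a non-empty result is a case of non-compliance that either a person or an automation tool would then have to resolve.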

Of course, anyone familiar with model-driven architecture will perceive this WIRSB/WIRI idea as the same concept. What I found encouraging is that this thinking is penetrating into the operations world, too. Perhaps we are seeing an evolutionary step away from traditional administration tools that too often assume a two dimensional static state, or the stone axes hurriedly made in house.

Datacenter WYSIWYG
Maybe the end state is a graphical tool that allows a service operations manager to define the WIRSB model. The management tools would then be driven by the model to transform, maintain and monitor the environment, always ensuring that WIRSB == WIRI.

Stone Axes — the tale of the secret development effort going on at your company right now

Alex Honor / May 7th, 2008

UPDATE (7/9/08): Welcome SYS-CON readers!

Talk to any seasoned application developer who’s about to embark on a new web application project and you can bet on the following: they have decided on or have narrowed down their choice on an application development framework. The use of some kind of framework is taken for granted. Why do application developers rely on and use frameworks? Because frameworks provide the necessary scaffolding that allows the developer to focus on just the business logic.

Talk to engineers and administrators that manage the online service of a SaaS or eCommerce business, and you see quite a different picture. Indeed, you’ll even find a different set of assumptions. This group, the one that keeps the service running, writes and uses custom scripts and tools to get their job done. This “stuff” is software, and it is crucial for keeping the business online. Unfortunately, this software is almost always invisible to the business owners. And, you can also bet on the following: there isn’t a framework, and there IS a whole lot of scaffolding being reinvented. When I say framework, I’m not referring to EMS frameworks, the ones that include agents on each host for monitoring and (ahem) “management”. What I’m talking about here are the scripts the engineers and administrators write, to automate the online service operations. These are done without an underlying framework. This body of management scripts ultimately boils down to business logic, the logic that governs the delivery of the service operation.

For those that may not be aware of or don’t have first hand experience in the SaaS or eCommerce world, it turns out there really are two software development efforts going on. First, there is the software effort everybody knows about – the one the business and product owners focus on – making the software enabling the business model. Second, there is the effort producing all the stuff that enables the delivery of the business, in the form of an always-on online service. This software lets the operations team keep the service updated, maintained and available. Unfortunately, this second body of software is often taken for granted, seems to occur behind the scenes and is almost never subject to the same attention and rigors as the business model software.

The ops folks who work in the trenches appreciate how important their roles and scripting are to the business. But oftentimes, they don’t identify themselves as developers, nor do they see their homegrown tools as “real” software, but rather as just expedient, simple time savers. The trouble is that this collection of scripts and tools is important to keeping the business running and should be seen as business software. The authors and users of these management scripts appreciate this and do envision better designed, implemented, and tested code, since their jobs depend on that code performing reliably. Like any software, requirements change and the software must change with them. What once started off as a simple script turns into a monolithic, inflexible morass of code.

More often than not, the most used and relied upon programming construct is extremely rudimentary and crude: a loop that iterates over a set of host machines, executing some sequence of system commands. As the script’s importance increases, and the requirements drive its evolution, you also see the need to include logging, notification, security, modularity, data, configuration, etc… Wait. Doesn’t this sound like stuff frameworks usually provide?
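For readers who haven’t seen one of these scripts, the rudimentary construct looks roughly like the Python sketch below. It is a hypothetical illustration, not code from any particular shop: the function names are made up, and it assumes the system `ssh` client is installed and key-based login is already configured.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def run_via_ssh(host: str, command: str) -> int:
    """Run one command on one host using the system ssh client; return its exit code."""
    return subprocess.run(["ssh", host, command]).returncode


def run_everywhere(hosts: Iterable[str], commands: List[str],
                   run: Callable[[str, str], int] = run_via_ssh) -> Dict[str, List[int]]:
    """The classic ops loop: for each host, run each command in order."""
    results: Dict[str, List[int]] = {}
    for host in hosts:
        codes: List[int] = []
        for command in commands:
            code = run(host, command)
            codes.append(code)
            if code != 0:
                break  # skip the remaining commands for this host after a failure
        results[host] = codes
    return results
```

Notice everything that is missing: no record of who ran it, no notification when a host fails, no access control, and the host list and commands are whatever the caller happened to pass in. Those gaps are what a framework would be expected to fill.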

So what the service operations team needs is a new breed of framework, a domain-specific framework that helps them develop “management applications”. Like traditional application development frameworks, it should include features like:

logging: Log the activity that changes the environment or affects the online service

notification: Forward events to email or monitoring stations

security: Control access to only specified actions

modularity: Allow packaging functionality so it can be reused

data: Enable the design and maintenance of a data model

configuration: Allow the management software to be externally configured so it can be used across environments

A domain-specific framework should also include:

distributed execution: Provide network abstraction so large sets of hosts can be controlled and coordinated

development tooling: Something like an IDE for developing management applications

packaging and deployment infrastructure: It’s a distributed world so the framework should include the ability to package new code modules and deploy them to where they need to go

canned functionality: There should be a set of common utilities that can be used as a starting point to create new management applications

multi-language support: certain languages are better suited to different problems so the framework should allow the definition of new components in multiple implementation languages
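As a rough illustration of how such a framework could supply the first few features so that script authors don’t have to reinvent them, here is a small Python sketch. The `managed_action` decorator and the `AccessDenied` exception are hypothetical names invented for this example; they are not part of any existing framework.

```python
import logging
from functools import wraps
from typing import Callable, Optional, Set

log = logging.getLogger("ops")


class AccessDenied(Exception):
    """Raised when a user is not allowed to run an action."""


def managed_action(name: str, allowed_users: Set[str],
                   notify: Optional[Callable[[str], None]] = None):
    """Wrap a plain function with access control, logging, and notification."""
    def decorator(func):
        @wraps(func)
        def wrapper(user: str, *args, **kwargs):
            # security: control access to only specified actions
            if user not in allowed_users:
                log.warning("%s was denied access to %s", user, name)
                raise AccessDenied(f"{user} may not run {name}")
            # logging: record the activity that changes the environment
            log.info("%s started %s", user, name)
            result = func(*args, **kwargs)
            log.info("%s completed %s", user, name)
            # notification: forward the event to whoever is listening
            if notify is not None:
                notify(f"{user} completed {name}")
            return result
        return wrapper
    return decorator
```

A script author would then write only the business logic, for example a `restart(node)` function decorated with `@managed_action("restart", allowed_users={"jane"}, notify=send_email)`, where `send_email` is any callable that accepts a message string. Distributed execution, packaging, and the data model are out of scope for a sketch this small.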

Operations doesn’t need to undergo a paradigm shift but rather alignment with their application developer brethren by using a development framework of their own. The business also needs to acknowledge this second, yet equally important software project: the one that keeps the service going.

Jing Project… the “Scrum” of project communications?

Damon Edwards / April 23rd, 2008

One of my favorite ways that I’ve heard the Scrum methodology described is that it is project management designed to complement how people and human nature actually work, not how we wish they would work.

The Jing Project brings that same philosophy to project communications.

When you are working with distributed teams, information sharing (the collection of tribal knowledge, if you will) is paramount. Groups that share knowledge effectively and seamlessly flourish. There is a long and infamous list of failed projects that died for no reason other than poor communication.

How do managers try to enforce effective and timely information sharing? Usually through a knee-jerk reaction of increasingly formalized documentation requirements. The medium of communication is effective, but the barrier to communicating is so high that it actually encourages less communication.

How do engineers end up sharing information? Telephone calls, IM, and yelling over the cube wall where possible. The barrier to communicating is very low but the medium is absolutely lousy for gaining effective tribal knowledge.

Is there a happy medium that is practical and realistic? Enter the Jing Project. The technology is nothing new, but its subtle combination and ease of use hints at a communication method that could change how organizations share technical and procedural knowledge.

I think it’s only fitting that I let a short screencast explain how Jing works.

Of course, Jing is a bit bare bones and is lacking killer features like being able to attach files that live with your videos or link response videos together. But any way you look at it, the implications for the future are clear.

Lesson learned: Keep it simple… but not stupid!

Damon Edwards / March 25th, 2008

In most areas of life you don’t get a redo. But luckily, when it comes to software all you need is time and motivation.

The development effort that has gone into the ControlTier automation platform thus far has been gratifyingly leveraged by some of the largest e-commerce and SaaS providers out there. But while we were out there earning a reputation for getting the impossible done, we took our eye off the ultimate goal: To make the ControlTier automation tools useful and accessible to organizations of all sizes and complexity.

Being too focused on solving the biggest and thorniest problems can lead you to forget that you also have a responsibility to make the small to medium problems trivial to handle. After all, in our field, most people’s day-to-day lives are consumed with an overwhelming amount of minor to moderate tasks. If you don’t focus equal attention on the “small stuff” you are just throwing up artificial barriers to adoption and making it more difficult to reach the people you are trying to help.

Ok, so lesson learned. Now what are we doing about it?

The first thing we’ve done is take the distributed automation framework that provides ControlTier’s foundation and completely rework the code to make it a standalone tool that comes with useful utilities built right in (the tool is called CTL). The second thing we did was provide users with an option to skip the environment and application modeling that was required to get going with ControlTier tools. While, yes, it is ControlTier’s modeling capabilities that make it such a powerful and flexible solution, it’s tough to get your head around it when all you want to do is plug in the framework and automate some simple tasks.

With the release of CTL, users now have the option of writing standalone command modules that require no modeling. As a solution grows in complexity, scale, and variability that same user can piece-by-piece take advantage of the full modeling and context management features as appropriate. This way we’ve managed to make the “easy stuff” trivial and straightforward while still giving you the full power needed to tackle the big problems.
