Configuration Management for Continuous Delivery (part 1)

Avoid cognitive explosion, then take away the waterfall you first thought of

Continuous delivery is about quickly and repeatedly assembling successive versions of an evolving product design, ensuring its availability in a `user-ready' state at all times. The evolving versions should be able to co-exist unproblematically, and the procedural overhead of managing them should be as light as possible. The speed of the transformation pipeline depends on all of its parts, with a simple physics of flow.

Some confusion has welled up about the role of configuration management in these processes. Is it just for building the infrastructure, or is it also for helping the delivery? Is it (or is it not) part of the dynamics of application development? My own view has always been that it is: configuration has a dynamical role to play; indeed, I often use the term `maintenance' rather than `build' for that very reason.

In fact, Continuous Delivery is just a scaled version of the convergent approach to configuration management (advocated since the early days of CFEngine, and now also enshrined in Puppet and Chef), applied not only to live machine resources, but to live software resources.

Give me end-state, not start-image

The configuration debate is sometimes framed as a choice between convergence (coming to a desired end-state) and congruence (building a result from a blank state). Convergence is like fixing the outcome and computing the route (like a GPS finder); congruence is about repeating a recipe in a sequence of known steps to massage a system into shape (what W.E. Deming calls tampering). This dichotomy is foreign to the majority of developers, who mainly think in terms of the latter. The latter is what we are taught in college (imperative programming and flow charts!), but I have argued for the former (and indeed have championed this in CFEngine) because the latter does not properly address error-correction (noise) in the pipeline. And, in fact, most IT people do know about convergence, through the command-line tool make.

Make turns the naive sequence of to-dos upside down into a goal-oriented series of cascading waterfalls to an end-state or `build product'. If you find yourself stuck in the middle of the cascade for some reason, another round of execution will pick up where you left off and deliver the promised result, without having to go all the way back to the beginning. That is fault-tolerant continuity in action.
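The resumable cascade can be sketched in a few lines. This is a minimal, hypothetical make-like builder (the target names and build graph are invented for illustration, not taken from any real project): each target promises an end-state, prerequisites are satisfied first, and anything already in its promised state is skipped, so an interrupted build picks up where it left off.

```python
# A minimal sketch of make's goal-oriented cascade (hypothetical targets).
# Each target promises an end-state; building resumes wherever it left off.

def build(target, rules, done):
    """Build `target` by first ensuring its dependencies, skipping
    anything already in its promised end-state (recorded in `done`)."""
    if target in done:            # already converged: nothing to do
        return
    deps, action = rules[target]
    for dep in deps:              # cascade: satisfy prerequisites first
        build(dep, rules, done)
    action()                      # one step toward the end-state
    done.add(target)

# Hypothetical build graph: the product depends on two parts.
log = []
rules = {
    "part_a":  ([], lambda: log.append("compile part_a")),
    "part_b":  ([], lambda: log.append("compile part_b")),
    "product": (["part_a", "part_b"], lambda: log.append("link product")),
}

done = {"part_a"}                 # suppose we were interrupted after part_a
build("product", rules, done)     # picks up where we left off
print(log)                        # only the remaining steps run
```

Only `part_b` and the final link are executed; the work already done is not repeated, which is the fault-tolerant continuity described above.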

Desired-state building

What CFEngine's convergent end-state, and later promise theory, were able to do was to redefine a change process as the assurance of a system of fixed outcomes (targets), based entirely on data (i.e. the description of the absolute state rather than a sequence of relative transformations). This is a model-based approach: you design an outcome as a number of promises and allow some promises to depend on the pre-existence of others. If parts change or get broken, the whole can self-repair just the relevant parts from any initial state of repair, and converge or cascade towards a desired end-state (the `product state'). This is exactly what make tries to do.
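The idea of desired state as pure data can be made concrete with a toy repair loop. This is a sketch only (not CFEngine syntax; the resource attributes are invented): the desired outcome is a dictionary of promises, and the agent repairs only the promises that are not currently kept, from whatever initial state it finds.

```python
# Sketch of convergent, data-driven repair: desired state is pure data;
# the agent only touches what diverges. Attribute names are hypothetical.

desired = {"owner": "web", "mode": "0644", "running": True}  # the promises

def converge(actual, desired):
    """Repair only the promises that are not currently kept."""
    repairs = []
    for key, value in desired.items():
        if actual.get(key) != value:   # divergence detected
            actual[key] = value        # repair toward the end-state
            repairs.append(key)
    return repairs

state = {"owner": "root", "mode": "0644", "running": False}  # any initial state
first = converge(state, desired)    # repairs only what was broken
second = converge(state, desired)   # second pass: nothing left to do
print(first, second)
```

Running it twice shows the convergent property: the first pass repairs `owner` and `running`, the second finds nothing to do, exactly as make finds nothing to rebuild when the product is up to date.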

This makes more sense than starting from nothing every time. In a factory pipeline, if the system is suspended for some reason, we don't throw away all the half-built cars and start over again; we make the changes required and pick up production where we left off. This is because we know what we are making, and can test when we've arrived at that state (what I call the GPS approach to building verifiable outcomes). We do not have to start from the beginning and there is a well-defined end state.

The advantage of this model-based approach is that, no matter how many changes and rebuilds you make, you always follow an efficient and targeted route to the planned design state, and as soon as you see a change to some part you can initiate the remainder of the pipeline without delay. You ensure that your changes are compatible with the desired outcome. Thus one of its side-effects is to increase throughput: a major point in the argument for continuous delivery.

Tampering versus convergence

Avoiding conflicts of interest

There is a possible conflict of interest here, though. As a product company you want stability, and not to waste everyone's time upgrading for every minor bug-fix. That suggests not too many version releases. But you also need to fix bugs and issues frequently. That suggests you should make many releases. To solve this conundrum the industry has learnt to use version branching, so that one may pursue the illusion of making many changes without conflict. But here we can point out a basic lesson of promise theory: just because every version is promised doesn't mean anyone has to promise to use any of them: upgrading is voluntary. So there do not have to be many branches. A single branch will do, as long as the continuous process converges.

Don't forget the intention pipeline

For presumably cultural reasons, we are conditioned to focus on the building process, but the very first part of the pipeline is the evolving design itself, or what we intend to build. Given a new intended outcome, we know that we can converge to a reliable product without waste or conflict, as long as there are only weak dependencies between the components (promise theory).

But when we make different versions of the same thing, i.e. newer versions that should be compatible, drop-in substitutes for old versions (hopefully with improvements), how do we know the design itself will converge to the same fitness for purpose, with forwards and backwards compatibility?

In other words, can we make intentions continuously compatible too?

The common understanding of how to do this is to commit changes of intent to a completely separate model of desired state, identified as a `version branch', which is a single `world of thought'. But this leads to a branching into `many worlds', and an ever-increasing split-brain problem. By branching, we seem to avoid the problem of later changes destroying earlier ones, leading to a scenario like Deming's funnel experiment. But we need changes to converge together to preserve compatibility with the desired outcome.

Planning to a desired state makes a lot of sense. In practice, however, there is a little bit of Monte Carlo searching involved in developing software: trial and error is an efficient method of solving problems, which is fine as long as there is running error correction. Thus the Continuous Delivery philosophy is to avoid too much branching and correct continuously, according to the larger goal (Shannon error-correction at the human level). It is just basic maintenance, very like the CFEngine approach to continuity of operational state. It is fine as long as error correction is so fast that consumers can't really see it happening. This applies to configurations just as much as it applies to software. So, if we are making infrastructure only to support the continuous delivery of a product, we can still apply the same principles to the infrastructure.

Modularity or atomicity in containers enables convergence

Modularity, or putting semantics into fixed buckets, helps convergence to a predictable end state (think hash table or Bloom filter). If your software is really a bunch of independent promises, then each can converge independently of every other, without interference. Even the order doesn't matter, and parallel factories can keep their promises to build the parts independently, for later assembly. Arranging for this parallelized design is what distributed orchestration is supposed to mean, though it is not always what is meant by the industry today.
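The order-independence claim can be checked directly. In this sketch (the promise names are hypothetical), three independent promises each fix their own bucket of state; applying them in every possible order produces exactly one end-state.

```python
import itertools

# Sketch: independent promises commute -- applying them in any order
# converges to the same end-state. Names here are hypothetical.

promises = {
    "package": lambda s: s.__setitem__("package", "installed"),
    "config":  lambda s: s.__setitem__("config", "deployed"),
    "service": lambda s: s.__setitem__("service", "running"),
}

results = set()
for order in itertools.permutations(promises):
    state = {}                        # fresh start for each ordering
    for name in order:
        promises[name](state)         # keep each promise independently
    results.add(tuple(sorted(state.items())))

print(len(results))   # 1: every ordering reaches the same end-state
```

All six orderings collapse to one outcome, which is what licenses the parallel factories: no sequencing, no interference, just independent convergence.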

We are seeing the branching mistake made again in operations today, with the practice of branching into a new virtual machine or container with a `golden image', as a way to simplify complex configurations. The idea is to never bring the changes back together if we can avoid doing so. But this does not take version compatibility into account at the consumer-experience level. Stuffing complexity into a box so you can't see it, or placing the explosion behind blast doors, does not really solve the problem.

Branching seems to bring tidy encapsulation, but it also leads to a proliferation of things (a cognitive explosion), which seems an easy quick-fix up front but becomes increasingly expensive and error-prone down the line, as one is forced to merge to a compatible set of intentions. A simpler approach is the use of model-based pattern configurations to identify convergent outcomes from the beginning. Model-based tools, like CFEngine, allow you to move from model to model in independent waterfall cascades, without sacrificing backwards compatibility. While you might be able to power through `many worlds' of golden images by brute force, using model-based configurable worlds has a lower cognitive cost.

In terms of the convergence to desired state (sometimes incorrectly called idempotence), what we have is a method for stability, not only of end-state but of end-to-end processes, through pre-conditions, whether under this branching and merging or directly in a single timeline. When every change version is backwards compatible (maintaining the set of compatible promises), you are converging, not breaking earlier versions; and all changes reach a desired asymptotic end-state, whether you are explicitly modular or not. It is a convergence of convergences, a different kind of waterfall. Many worlds collapse into one. This is also what continuous delivery tries to do at the multi-component software level.

Model-based convergence is the friend of continuity

The answer to rapidly changing virtualization and branching is thus not the fixed library of golden images, which are merely cached decision trees. They only push the problems ahead of you.

Both intent and execution (semantics and dynamics) have to be continuously convergent. It pays, then, to think through what promises you want to keep rather carefully, because it is the desired end-state that makes this possible. If you are changing the design in a non-backwards compatible way, you most likely start enacting Deming's funnel, chasing after relative changes, and possibly not converging at all - instead, flying off into the void and never arriving at an improved state, because you are constantly changing the measuring stick for success *.

Branching doesn't matter as long as it quickly converges to known semantic buckets.

Today, we frequently imagine that programmability itself brings consistency. It really doesn't. If you fix the start-state and apply deltas, there is no guarantee of arriving error-free at a desired end-state. You have to turn it around into a convergent goal, as make taught us.
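The failure mode is easy to demonstrate. In this sketch (the `replicas` setting and the numbers are invented for illustration), a recorded delta assumes a particular start-state; when the real system has drifted, replaying the delta lands in the wrong place, while converging on the declared goal lands correctly from any start.

```python
# Sketch contrasting the two approaches: replaying recorded deltas against
# an assumed start-state versus converging on a declared end-state.

def apply_deltas(state, deltas):
    """Congruence: replay relative changes; correct only if the real
    start-state matches the one the recipe assumed."""
    for key, increment in deltas:
        state[key] = state.get(key, 0) + increment
    return state

def converge_to(state, goal):
    """Convergence: drive each value to its target, whatever was there."""
    state.update(goal)
    return state

goal = {"replicas": 3}
deltas = [("replicas", 2)]      # recorded against an assumed start of 1

drifted = {"replicas": 2}       # but the real system has drifted
print(apply_deltas(dict(drifted), deltas))  # overshoots the goal
print(converge_to(dict(drifted), goal))     # arrives at the goal
```

The delta replay gives 4 replicas instead of the intended 3; the convergent form gives 3 regardless of drift. That is the difference between a cached decision tree and a GPS.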

Think not infrastructure as code, but infrastructure (and code) as documentation (of intended outcome). A simple self-healing build system. CFEngine, for example, far from being the legacy tool some claim it to be, is still a reference implementation for distributed, orchestrated, data-driven processes with model-based policy. It needs no separate `orchestration engine'. Its continuous operation becomes the continuously unfolding delivery pipeline to product.

PART 2: how do we accomplish convergence technically?

* This happens with all difference methods: they are fundamentally unstable things. I learned this first in finite-difference methods for solving differential equations (like the kind that get space probes to Mars), but it applies in more mundane cases. Large changes are more likely to stray from the desired path than small ones, hence Newton invented calculus. That's why continuous delivery methods emphasize many small changes rather than big project rocket launches.