(2006) Configuration Management, Models and Myths

Mark Burgess

Originally published in Usenix ;login: 2006-2007
  1. Part 1: Cabbage Patch KISS
  2. Part 2: Babel, babble, toil and grammar
  3. Part 3: A Shocking Lack of Ad-hocracy
  4. Part 4: There's no I/O without U

Part 1: Cabbage Patch KISS

When I was eighteen and fresh out of school, I worked for a couple of summers as a gardener in the well-to-do houses and estates of the English countryside, together with a small team of hands. Each sunrise, we were transported at breakneck speeds around the veins of the Midlands in a rusting station wagon (appropriately known as `estate cars' in the U.K.) by my boss Ken, a soon-to-retire adventurer with a British Air Force moustache and a love of music and the outdoors. We would discuss the great Russian composers as we clipped away at hedges and tilled over flower beds.

As workers, we were somewhat human in our modus; it was occasionally precarious work, and not altogether skilled or efficient, especially after the lunch breaks during which we would ritually consume as many rounds of beer at the local pub as there were workers on that particular day (a matter of honour, and not something that I could conceivably survive in my present). These were `Bruckner episodes', as Ken noted, ever trying to get me to listen to the rather mediocre composer, whose work he referred to as `traffic-light music'. I later learned this meant that just when you thought it was about to finally go somewhere, it would stop, dither and then attempt to start again. Quite.

I believe I learned several good lessons about maintenance from my stint in the English Country Gardens. The first was "Always let your tool do the work", as Ken used to point out, with a nudge and a wink and then a guffaw. In other words, know how to use the tools of the job rather than breaking your back with manual labour (and various carnal interpretations).

The second was about careful execution. Gardening is nothing if not a strenuous job. It seems straightforward, everything under control, until you mistakenly cut a prize flower in two, or fall foul of a mixup. "Oh, I thought she said do destroy the garden gnome", enshrined amongst the taskforce as: "Always read the destructions." On one occasion, enthusiastic over-zealousness was cut short when a friend of mine stepped backwards onto a rake (in classic Tom and Jerry style), and emptied a wheelbarrow full of earth into the client's newly filled swimming pool. This was not policy, but we like to think of it as one of those inevitable once-in-a-lifetime (or once-a-workday) freak accidents. Yes, we were naive to believe otherwise.

Gardening was, I think, my first exposure to the idea of configuration management, i.e. the routine maintenance of everything from the most sublime stately garden to the most ridiculous cabbage patch. Size is not important; the same errors are made in spite of scale. It is very much about seeing how the result of extensively planned and orchestrated order generally falls foul of both growing weeds and the misguided hands of the reputedly infallible maintainer. (In spite of the best-laid plans of pubs and men, anything that can go wrong is not necessarily to be blamed on a pint of Murphy's.)

In this series I want to talk about configuration management in system administration from the perspective of a researcher in the field: this is my cabbage patch, a topic that I have been interested in for some fifteen years now. It is also a fascinating theoretical and practical problem, which has been overshadowed in recent years by tool-talk and XML incantations. Yet there is much to be said about this topic from a research point of view, and all those eager to rush out and build their own tool should pause to take heed of what has been learned. It's not about tools; it is about principles and assumptions, just as it is about knowing the limitations and learning to deal with them. So, I want to dangle a carrot of rationality in front of you, to turn the discussion away from the cosmetics of tools back to the science of the problem.

What is a configuration?

As with any Geek Tragedy we start by introducing the dramatis personae, and speak the names of our dæmons to show that we do not fear them. We begin with the notion of the configuration itself. This seems perhaps to be a rather obvious concept, but as we know from bitter experience, assuming the obvious to be uninteresting is the best way to avoid seeing the wood for the trees. So what is configuration about? Here are some definitions for configuration, which I found rustling through the weeds:

  • "The appearance formed by the arrangement and relation of parts"
  • "An arrangement or layout".
  • "Functional or physical characteristics of hardware/software as set forth in technical documentation" (from a software document).
  • "The specific assemblage of components and devices that makes up the components of a complete system."

Of these, I like the first and the last the best. It is clear that configuration has to do with identifiable patterns, and how the pieces fit together to make a whole. In the last definition we also mix in the idea of a system --- i.e. that a pattern might actually perform a role within some functional mechanism.

What is configuration management?

The IEEE Glossary of Software Engineering Terminology (Standard 729-1983) defines: "Configuration Management is the process of identifying and defining the items in the system, controlling the change of these items throughout their life-cycle, recording and reporting the status of items and change requests, and verifying the completeness and correctness of items".

This definition is a software engineering definition, but it captures the main features of what we understand by host or network configuration management. I like especially that the definition talks about the process surrounding the configuration (i.e. management) and not merely the definitions or the tools used. In particular the idea of change management is included as part of the process, as is verification which implies some kind of monitoring.

Often system monitoring is separated from the idea of configuration implementation. Possibly this is because it was originally assumed that implementation would always be executed by humans. Elaborate monitoring systems have been devised, but these are often `read only'. As we shall see later in the series, this is a key challenge to be met.

State and configuration

In computer science, we talk a lot about states. The idea of a state is part of the fundamental theory of computation, and most of us have an intuitive idea of what is meant by the concept. So can we relate configurations to states? What these ideas have in common is that they are all alternative values of a variable quantity, like a register. A configuration is somehow a pattern formed by a number of state variables. Let's recap the different ideas of state to give the devils their names.

The simplest idea of state is that of a scalar value, or a single item. For example, file attributes in Unix are simple register-like values. When we write chmod 664 file-object we are defining the state of the mode register for the file object. The state of the register belongs to the set {000,001,002,...,776,777}, which is called the alphabet of the state. We think of each value as a separate symbol in an alphabet.

When the possible values the variable can take are limited, we speak of a finite number of states. An algorithm which uses and manipulates the values is then known as a Finite State Machine.

In a dynamical system, change is represented by changes of state. In other words, certain variables or registers change their values from one symbol in the alphabet to another. The different state values can be shown as locations in a graph (see fig. 1), and transitions can be drawn as labelled arrows, where the labels remind us of the cause of the transition (i.e. the operation which was executed that resulted in the change). Most of us remember the state diagrams for process execution in the kernel, labelled by names like "ready", "waiting", "dispatched", "idle", "zombie" etc.
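To make the idea concrete, here is a minimal sketch of such a finite state machine, written in Python. The state names echo the process life-cycle above, but the particular transitions and event names are invented for the illustration; they are not a faithful model of any real kernel scheduler.

   # A toy finite state machine for a process life-cycle.
   # States and events are illustrative assumptions only.
   TRANSITIONS = {
       ("ready", "dispatch"):     "dispatched",
       ("dispatched", "wait_io"): "waiting",
       ("waiting", "io_done"):    "ready",
       ("dispatched", "exit"):    "zombie",
   }

   def step(state, event):
       # Return the next state; undefined events leave the state unchanged.
       return TRANSITIONS.get((state, event), state)

   state = "ready"
   for event in ["dispatch", "wait_io", "io_done", "dispatch", "exit"]:
       state = step(state, event)
       print(event, "->", state)

The machine needs only a fixed, finite amount of memory: the current state.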

Transitions between the states occur as a result of different actions. Sometimes we control those changes, and sometimes we merely observe the changes. This is an important thing to remember, as it is often assumed that once a value has been set, its state remains fixed until we alter it. But that is not true due to the external environment of a system. We might plant the seeds, but they grow all by themselves thanks to rain and sun.

What about more complicated `containers' than scalar registers? Well, the permissions of more contemporary file systems have more complicated attributes than those of the Unix filesystem. Access Control Lists are lists of scalar values, whose state can include lists of user identities to be granted access. The system process table is a list of state information, split into different columns, i.e. formed from different categories or types of information. These are embellishments on the idea of a list. Okay, so we need lists, but these are not much more complicated than scalars.

Text files and databases, on the other hand, reach a new level of complexity. Text files are ubiquitous for holding configuration information on Unix-like operating systems. Databases are the other universal approach for storing values.

The contents of files and databases are not of a fixed length, so we cannot think of them as mere registers. If they are ASCII encoded files, then we can say that they are composed from a finite alphabet but that is not the same as having a finite number of states. File size is only limited by memory.

The most pressing feature of files is that they have structure. Many Unix files have a line-based structure, for instance. XML files, which are rampantly popular, do not have a structure based on lines but on a parenthetical grammar. Most programming languages are not reliant on starting a new line for each new statement either -- their formatting is also based on a scheme of statements organized in nested parentheses.

What is important about the structure of information is not what it looks like, but how easy or difficult it is to understand the information in these structures -- not just for humans, but for computer algorithms too. Any tool or scheme for determining the content of a file object or database must deal with the real complexity of the structure used by that file object to store and represent information.

Let's abstract this idea to get a better idea of what we mean. With structured state information, we are not just deciding the colour of a single rose for our garden, but planning the landscaping of the entire estate. We cannot expect a single spade to deal with the complexity of this garden. The tools that create and maintain such a structure need considerably more intelligence. This idea has been studied at length in computer science, because it is a problem of far-reaching and general importance.

Patterns - or variations of state

As a state changes, either in time or across different locations or objects, it forms a pattern. It paints a picture or sows a garden. The pattern might be so complex that it seems inconceivably random, or it might form some kind of simple and recognizable structure. The structure might be best comprehended as one dimensional, two dimensional,...as you see, there is any number of characterizations we might use to accurately describe the pattern.

Arguably the most basic question we can ask is: is the pattern formed from discrete symbols like individual flowers or entries in a database, like the specification in figure 2.

     xxxxxxx-------------
     xxxxxxx-------------
     xxxxxxx-------------
     xxxxxxx-------------
     --------------------
     --------------------
     --------------------

or is it a continuous variation, like the curve of the path through the rolling mounds or the changing level of the system load-average (see figure 3)?

This choice between discrete and continuous points to a fundamental dichotomy in science: the competition between reductionism and holism. That is to say that discrete values generally emerge from trying to break things down into atomic mechanisms, while continuous values come about by stepping back for a broader, approximate view. Variables that try to capture average behaviour, for instance, vary approximately continuously (as in figure 3). The theory for describing patterns in these categories is rather different in each case, which explains why intrusion detection and anomaly detection are so difficult.

Patterns are everything to us, not just in configuration management, but in the management of anything. We organize things according to patterns. Patterns are how we understand when things are the same or different. There are patterns in time (repetition, scheduling) and there are patterns in space (disk management, directory management and cluster management). When we manage things poorly it is often because we have no good way of describing, and therefore measuring, the patterns of the resources that we need to deal with. Indeed, it is a paradox that beings that are so good at recognizing patterns are often quite inept when it comes to utilizing them. This is odd indeed, because humans are excellent language processors, and language is the principal way of describing discrete patterns. The study of computer `languages' is the study of patterns. Continuous patterns are an essential part of our visual and auditory recognition. The language of these is the calculus of differential mathematics.

What discrete and continuous patterns have in common is that patterns of either kind are difficult to identify. Our brains are incomprehensibly successful at identifying patterns -- so much so that we see patterns even when they are not there -- but finding algorithms for recognizing and even classifying patterns, and associating meanings with them (semantics), can be among the most difficult problems we know.

Does that mean we should not try? Is management a waste of time? Clearly it is not. Much has been accomplished throughout history by our willingness to forego complexity and employ simple patterns that we can more easily comprehend. Of course, it involves some sacrifice, but it offers greater predictability and hence reliability. It is an important lesson that the real intellectual achievement is to simplify a problem to its core essence -- anyone can make something more complicated. Indeed, science or natural philosophy is about looking for suitably idealized approximations to complex patterns, not about wallowing in detail. In configuration management too, we must forego complexity to achieve reliable management. This is a theme I'll be discussing in the coming weeks.

COM:POSTscript

At risk of turning this into a Bruckner episode, we must leave it there for this time, before the fruits of this batch get us embroiled in a jam. In the next part of the series I want to talk about the ways in which patterns can be classified by algorithms. This is an important step towards the automation of configuration management. We'll see why the only real tool we have for matching patterns symbolically is the regular expression, and why intrusion detection systems are trying to solve a hopeless task -- identifying who is a pod amongst the gardeners!

Gardens can be impressive tiered sculptures full of botanical variety, jungles rich in diversity, or scrapyards full of junk. It's the same with our computer configurations. Which is easier to understand? Which is better to live with? These are not easy questions to answer, but they are amongst the questions we shall try to address in the coming issues.

Part 2: Babel, babble, toil and grammar

Time to put the administrative house in order? Then you are going to need a way of describing that house. Configuration management, as we discovered in part 1 of this series, is the management of resource patterns. If you can't communicate a configuration pattern, you certainly can't have someone create it, verify it, or maintain it. So, while there are clearly many ways to build your house of cards, you will need to learn the language of patterns if you want to make a bunch of them exactly alike.

Parentheses (and more parentheses)

Call me not a linguist or a poet by training; my roots were nurtured in that country on Earth with surely the worst reputation for knowing foreign languages (worse even than the United States). Still, I am both uncharacteristically intrigued and betaken by language.

(What? "Betaken"? Not a word? Hmm.. stay tuned!)

These days I live in Oslo, in the southern part of Norway, but I started my life in the North West of England (ironically, the part of England whose culture and language were "impregnated and borrowed" by Vikings from Norway in the early medieval period). The inheritance of that invasion is still there for the observant eye and ear.

I lived not far from a charming creek called Beckinsdale (Modern Norwegian: Bekk i dal, meaning `stream in valley'). People still call their children "bairns" in that part of the world (Modern Norwegian: barn, for child). There are many examples of such glossal cross-pollination in the English language, which has gone from being part of a single Germanic tongue to one of many distinct but related ones. In fact, the languages of Old English and Old Norse were so alike that the Vikings and their victims probably understood each other quite easily. Today language scholars are often at pains to determine from which of them certain English words came.

So let's move on from glossary to grammar. In dialect, I recall verb-endings that sounded perfectly natural to me: "we've meeten, eaten, beaten, moven, proven" -- surely older forms of verb endings than the modern English "we've met, eaten, moved, proved". (The endings sound somehow more Germanic, though I am just guessing (I was reminded of them on playing Deep Purple's song "Space Truckin'", where Ian Gillan sings: we've meeten all the groovy people...)) It is odd that "eaten" alone has survived in the U.K. (occasionally "proven" -- gotten, on the other hand, which has survived in the U.S., is strictly forbidden in the U.K., and yet the derivatives "forgotten" and "begotten" are standard). Clearly the rumours of English grammar have been greatly exaggerated!

What of "betaken" - why is this not a word? It clearly fits the grammatical forms and endings. One even says idiomatically "I am rather taken by that" (preferably with a butler-like inflection) and, of course, there is a similar word "betrothed" which is in the dictionary. In Modern Norwegian it is indeed a word (betatt) and it means exactly "rather taken by", so I hereby define the word "betaken". And who can stop me?

Indeed, language changes regularly and we are inventing new words all the time, using recognizable patterns. It tends to move from complicated constructions towards simple regular patterns. If you examine which verbs are still irregular (or strong) in a language, it is those verbs which are most commonly used (e.g. "to be"). There is a simple reason for this: we only remember the irregularities if we are using them all the time, i.e. if they are strong enough to resist change. In other cases we forget the "correct" form and (regularize|regularise) according to some simple syntactic pattern. Anyone who has seen the British TV show "Ali G" will know from his parodical dialect that there are parts of the U.K. where even the verb "to be" is losing to regularization: "I is, you is, he is..., innit". (Prizes for guessing the last word's origins.)

In fact we add and change word endings willy-nilly: in the U.S. my least favourite word at the moment is "provisioning" (which I like to call "provisionizationing"), although "de-plane" is way up there (it surely means picking passenger aircraft out of the fur of a cat). These are particularly nasty examples of "verbing" and "nouning" -- especially American phenomenonizationings. In the U.K., people have an odd habit of saying "orientated" instead of "oriented", fearing possibly that the latter has something to do with a cultural revolution of cheap shoes, or harks back to a country they never managed to "civilise". Or, perhaps simply by being so orientitilated that they feel they must.

At any rate, while there are definite patterns to be seen, clearly human language is driven by populism and natural selection, not by logic or design.

The Chomsky hierarchy

So much for human language. It seems to have very little to do with structure or reliability -- qualities we are certainly looking for in system administration. So let's get formal.

In the passages above, I broke several rules of writing, and made you (the reader) work harder than is generally allowed in modern literature. I served a plethora of parenthetical remarks and digressions. I am guessing that you have noticed these (and that you had no trouble in parsing them) but that they were a little annoying. You had to work slightly harder to understand what I have written. Of course, I was making a point.

The theory of discrete patterns, such as houses of cards or the last episode's flower beds, is the theory of languages. It was initiated by researchers including Noam Chomsky in the late fifties and sixties. For discrete patterns, with symbolic content, it makes intuitive sense that discrete words and their patterns might be a good method of description; but when we get to continuous patterns, like the curving of a landscape, what words describe the exact shapes and infinite variations of form? For that we need a different language: continuous (differential) mathematics, which we shall not have time to mention in this episode.

The theory of formal languages assumes that a discrete pattern is formed from an alphabet of symbols, shapes, colours, etc., like a pack of cards; patterns are then classified by measuring the complexity of the simplest mechanism or computer program that could generate the pattern. The classes of patterns are called formal grammars. Their classifications and corresponding state-machines are as follows:

  • Regular languages (Finite Automata, or finite state machines)
  • Context free languages (Push-down Automata)
  • Context sensitive languages (Non-deterministic Linear Bounded Automata)
  • Recursively enumerable languages (Turing machine)
The syntax of a language is a list of all its legal sentences. Lists are fine, but they are not very helpful to us: we have trouble remembering things by brute force, so we try instead to remember by identifying the repeated patterns and turning them into rules. These pattern-rule templates are called grammars. The simplest grammars are the regular grammars, and the patterns they represent can be modelled by a simple pattern matching language: regular expressions.

Regular expressions

All Unix users have met (or meeten) regular expressions. They are a well known and powerful way of matching text strings. The implementations we know from Unix are stylized enhancements of what are known as regular expressions in language theory.

A language is said to be regular if it can be constructed from some alphabet of symbols and satisfies a few basic rules. Let us suppose that we have an alphabet A which contains a finite number of symbols. Those symbols could be alphabetic, alphanumeric, numeric, glyphs, flowers (as in part 1), or any arbitrary collection of denumerable symbols. The rules are these:

  • The empty string and each symbol in the alphabet are regular expressions
  • If E1 and E2 are regular expressions, then so is E1E2, i.e. the concatenation of the two, e.g. expressions "b", "e", "be", "taken" and "betaken".
  • If E1 and E2 are regular expressions, then so is the union of the two (i.e. allowing a choice between alternative expressions). This is written with the vertical bar `|' in most implementations, e.g. we have (met|meeten) ...
  • If E is a regular expression then so is E* (repeated instances). Expressions "provision", "ization", and "ing" generate "provisionizationingizationingingingization", etc ad lib.
  • Nothing else is a regular expression.

The Kleene star (*) is a shorthand for the concatenation of zero or more instances of members of a set or expression. This is the parsimonious form of regular expressions. We'll not delve into implementations for now.
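For the curious, the three constructions (concatenation, union and the Kleene star) can be demonstrated with Python's re module, using the word-games from earlier in this article. This is only a sketch; real implementations add many conveniences beyond the parsimonious core.

   import re

   # Concatenation: "be" followed by "taken" gives "betaken".
   assert re.fullmatch(r"betaken", "be" + "taken")

   # Union: either alternative is accepted.
   assert re.fullmatch(r"(met|meeten)", "meeten")

   # Kleene star: zero or more repetitions of a sub-expression.
   word = "provision" + "ization" * 3 + "ing"
   assert re.fullmatch(r"provision(ization|ing)*", word)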

Languages in configurations

There has been a lot of talk about "configuration languages", as tools for sorting out Unix systems: cfengine, LCFG, now Puppet etc. Rumour has it, I wrote one of these myself. But don't let this talk of language trick you back into thinking about these tools. Rather, notice that the very problem of configuration itself involves language -- because it is about describable patterns.

For example, Unix file permissions form the simplest kind of regular language. If we take the octal representation, they consist of scalar states of constant length and a fixed alphabet consisting of the following "symbols":

Q = {0,1,2,3,4,5,6,7}
It is easy to represent this as a language: it is simply the union of all the possible strings of three symbols, i.e., if we ignore the foibles of Unix syntax, the entire language is simply written
000|001|002|003|004|...|776|777
This is all very well, but so what?

The significance of regular expressions for configuration policy is that there is a provable equivalence between regular languages and finite state machines, i.e. the simplest kind of algorithms, using a fixed amount of memory. This means that regular strings are relatively easy to parse, identify and understand. This, at least partly, accounts for their ubiquity in computer software where pattern matching is required.
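As a trivial illustration, the entire union written above collapses into a single compact pattern that a finite state machine can check in one pass. Here is a sketch in Python (the regex syntax is the familiar stylized variety, not the parsimonious theoretical form):

   import re

   # All 512 octal Unix modes, 000|001|...|777, as one compact pattern.
   mode = re.compile(r"[0-7]{3}")

   assert mode.fullmatch("664")        # a legal mode string
   assert not mode.fullmatch("888")    # 8 is not in the alphabet {0..7}
   assert not mode.fullmatch("66")     # wrong length, so not in the language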

Regular expressions occur in editors, where searching and replacing is required, in intrusion detection and spam detection software, in all kinds of policy languages, on the Unix command shell (as "globbing"). They are a central part of Perl, a language designed for pattern extraction (though Perl is not a regular language). Today, no computer toolbox is complete without a regular expression library.

Bring on the toil (parentheses again)

In spite of the multifarious uses for regular expressions, they are only the lowest level of sophistication in the Chomsky hierarchy. The computer languages we are most familiar with for programming or mark-up are almost all context free languages. Such languages can only be approximated with finite memory. They contain nested parenthetic structures that require an extensible stack to process. Here, for instance, are some examples of languages that use parentheses to identify information by type:

  1. <account>                          
       <uname>User1</uname>
       <passwd>x7hsk.djt</passwd>
       <uid> 100 </uid>
    ...
    </account>
    
  2. ( account (uname User1) (passwd x7hsk.djt) ... )
    
    

If the level of parenthetic nesting in a grammar is not large, we can always simulate common cases of context free languages by treating fragments of them as regular expressions with balanced pairs of brackets (as anyone who has written a simple parser will know). This is useful because it means that a simple finite state machine can make a good approximation to interpreting the string -- and this is much cheaper than employing a full solution.

However, to ensure full generality one must go beyond regular language tools and enter the realm of stack-based tools like Yacc and Bison for context free grammars. Each level of the Chomsky hierarchy grows in its computational complexity (it costs us more to parse parenthetic remarks (as you (no doubt) experienced in my introduction)). The most general patterns require a full Turing Machine (a computer with infinite memory) to solve.
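To see why nesting forces us up the hierarchy, consider the simplest possible check: are the parentheses in a string balanced? The sketch below uses a depth counter, a degenerate stack for a single bracket type; the point is that the memory needed grows with the nesting depth, which is exactly what a finite state machine cannot provide. The example strings are just the account record from above.

   def balanced(text):
       # Track nesting depth: a degenerate stack for a single bracket type.
       depth = 0
       for ch in text:
           if ch == "(":
               depth += 1
           elif ch == ")":
               depth -= 1
               if depth < 0:
                   return False       # a closing bracket with no partner
       return depth == 0

   print(balanced("( account (uname User1) (passwd x7hsk.djt) )"))  # True
   print(balanced("( account (uname User1) ) )"))                   # False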

The trouble with this next level of computation is that it is a drastic step. It requires a whole new level of sophistication and toil to master, in modelling, describing and understanding.

In configuration management, we want to use higher-grammatical patterns to design, classify and maintain structures that are context free, or harder. The structures might be inside files, in packet streams, distributed around a network, or inside a database, etc. The difficulty of going beyond finite state automata partly explains why pattern recognition systems (like Network Intrusion Detection Systems), which obviously need to deal with parentheses (e.g. TCP SYN and FIN), generally do not record such state, but rather rely on regular expression rules applied as fragments. This is "doable" if not optimal.

Data types and bases

In configuration management we meet information in a variety of forms. Lists of values are common. Line-based configuration files are ubiquitous in Unix. Windows has a simple key database in its registry. What kinds of languages do these data form?

  • Scalar permissions are regular languages.
  • Lists of regular objects are also regular.
  • A line-based text file is a list and is hence regular.
  • Text files containing higher grammars, such as XML, are context free.
Relational databases have been used to store data almost since computing began. They add a new twist to the idea of languages, namely that the words one forms from the basic alphabet of a language (and sometimes even the symbols of the alphabet) can be classified into types. Consider figure 1.

Fig 1: Some tables in a relational database.

The figure shows the basic idea of a relational database. Certain types of data are grouped together in tables or records. Such data structures have eventually ended up in programming languages too, in the form of records, structs and now even object oriented "classes". The main point of putting information into a predictable structure is that one imposes a linguistic discipline on the data. The tables are simple parentheses around a number of regular language items that are given names. In the first table we have a string (which is a regular object) with the name `base path', a `regex' which is a new kind of table or parenthetic grouping, and an age which is yet another parenthetic grouping. The `regex' has two regular members: a regular expression (which is a string, and is hence also a regular object) and a label (string or number) which is regular. Similarly, `Age' consists of a list of three regular objects.

A relational database is therefore a context free language. SQL is a query language that uses regular expressions embedded in a table model to locate data in the database (which has its own context-free language pattern). We cannot escape from languages or these basic pattern ideas in configuration management. They recur at all levels.
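To see how typing imposes this discipline, here is a sketch of the nested record described above, written as Python data classes. The field names follow my description of figure 1; they are illustrative only and do not correspond to any real database schema.

   from dataclasses import dataclass

   @dataclass
   class Regex:
       expression: str      # a regular expression: itself a regular object
       label: str           # a name for the sub-pattern

   @dataclass
   class Age:               # three regular members; the names are assumed here
       days: int
       hours: int
       minutes: int

   @dataclass
   class FileRule:
       base_path: str       # a plain string: a regular object
       regex: Regex         # a nested "table": one level of parenthesis
       age: Age             # another nested grouping

   rule = FileRule("/var/log", Regex(r".*\.log", "logfiles"), Age(7, 0, 0))

Each nested field is a parenthesis around further regular items, which is why the whole structure is context free rather than regular.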

Data types are a powerful idea. They allow us to distinguish between seemingly equivalent patterns of data and therefore open up a range of flavours or colours to the flowers in our garden. This is the purpose of having tables in relational databases: we can group together objects into comparable clusters. Syntactically, all objects of the same type have the same basic structure and are therefore comparable, i.e. they form the same sub-pattern.

Markup

A problem with databases is that they are not very transparent -- they can only be read with special tools, so it is hard to see the structures in data in an intuitive way. This is not a problem in computer programming languages where class hierarchies are written down in ASCII form. For many, the answer to this problem has been to adopt XML, a generic markup representation for context free data structures, making the best of both worlds. Not only does XML offer a standardized encoding of a context free structure (with generalized parentheses), it claims to make it parsable by humans as well as machines. (Let us say that the rumours of its human-readability have been greatly exaggerated.)

Every pair of tags in a markup language like HTML or XML makes a data-type out of the parenthesized region.

The <adj>quick</adj> brown <noun>fox</noun> <verb>jumps</verb> over the lazy dog.
The current adoration of XML has no real significance as far as problem solving goes. It does not solve any real problem, but it is interesting that the trend in system design has been to move away from regular line-based data, as is traditional in Unix and DOS, towards context free data. This opens the door to much greater complexity, with attendant consequences that we shall consider as the series progresses.
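As a small demonstration that the nesting really is machine-parsable, here is a sketch using Python's standard xml.etree library to read the account record from the earlier example (with the elided fields left out). The parser must keep a stack of open tags, which is what distinguishes this from line-based data.

   import xml.etree.ElementTree as ET

   doc = """
   <account>
     <uname>User1</uname>
     <passwd>x7hsk.djt</passwd>
     <uid> 100 </uid>
   </account>
   """

   account = ET.fromstring(doc)
   for field in account:
       print(field.tag, "=", field.text.strip())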

Revolution or Regex?

Toil, work and difficulty relate to grammars, or patterns of patterns, rather than to symbols or words. Noah Webster, as a slap in the face to the British, rewrote the spelling of American English as a political act after the revolution. (No doubt my own spellings "colour", "flavour" etc. have been magically transformed into American "color" and "flavor" by the editors.) The adaptation has almost no consequence (except to annoy self-righteous Brits immensely); many readers hardly even notice the change. Had Webster altered the grammar of the language, there would have been serious trouble. But the fact is that, while he obscured some of its etymology, the basic patterns of the language did not change, and therefore even the most obtuse of colonialists can still read American (although Canadians do seem totally confused about how they are supposed to spell).

The patterns that we are able to discuss and represent are key to mastering the problem of configuration management. Many system administration and management tools try to force users into doing either what the tools can do, or what is considered manageable. By asking users to limit the complexity of their configurations they plump for a reasonable strategy that strives for predictability. This might be all right in practice, for the time being, but if we are going to fully understand the problem, we must go beyond quick fixes. The challenge for any theory of configuration lies in describing what people really do, not in trying to force people to do something that is easy to understand.

In the next part of this series, I would like to run through some of the data models that have been applied to the problem of system management and ask: how can we measure their complexity, and why are none of them ever really used?

Part 3: A Shocking Lack of Ad-hocracy

In his 1970 bestselling book Future Shock[1], writer Alvin Toffler predicted the demise of bureaucracy. Toffler was a writer emerging from the 1960's, on the tail-end of the hippy revolution. They were going to make the world right, optimism was in the air, and everyone saw the pace of technological change as a force for good. Today, we are less enamoured by progress and have fallen back into a stagnant economic tumble-drier of selling and consuming that seems to have no vision or direction. Perhaps that is why Toffler's vision of the demise of bureaucracy never really came about?

Toffler predicted that centralized power-structures, with their rigid procedures for decision-making and management, designed for a slower age, an age of little change, would collapse under their own sluggishness -- buckling under the force of a cultural and technological deluge. Bureaucracies would be replaced by lean, mean decision machines, guided by simple principles, and so agile that they would win over traditional leviathans like mammals pulling tongues at the sauropods. Moreover, people like me, working in government organizations, would be freed from the slavery of application-report-archive, to live productive lives full of choice and measured reflection. He called this state of affairs ad-hocracy. Toffler wrote:

"Faced by relatively routine problems, ["Man"] was encouraged to seek routine answers. Unorthodoxy, creativity, venturesomeness were discouraged...rather than occupying a permanent, cleanly-defined slot and performing mindless routine tasks in response to orders from above, ["Man"] must [now] assume decision-making responsibility -- and must do so within a kaleidoscopically changing organizational structure built upon highly transient human relationships."

That is what Toffler said about the human workplace in 1970. This well-meaning sermon has admittedly not taken the human world by any great storm, as we attest from experience (though, if we are being fair, it has indeed made inroads). What I find ironic is that we are now reliving an almost identical discussion in a different sphere. Today, we are struggling to accept the same wisdoms in the area of computer management. It will take the next two parts of this series to do this subject justice.

Strategies in the War against Tera

What is a good strategy or algorithm for computer management? Few would argue against the idea that the sheer size of systems today practically necessitates automated tools (recall Ken's law: always let your tool do the work). Certainly I believed this in 1993 when I started writing cfengine, and today IBM certainly believe it and flag it with their Autonomic Computing initiative. Toffler pointed out that automation does not necessitate production-line thinking, in which one mass-produces identical copies -- a world in which one can have any colour as long as it's black. On the contrary, he argued that ``As technology becomes more sophisticated, the cost of introducing variations declines.''

But in the management of the information technology itself, we are still hearing about "ways to mass produce 1000 workstations all identical from a common source" -- golden master servers that are to be worshipped by hundreds, perhaps thousands of clones. Ad-hocracy is not the default doctrine in computer administration.

Ever since the late 1980s, the telecommunications companies have had their own vision of computer resource management, borrowing from tried and trusted inventory systems, for warehouse and personnel management, and trying to modify them to cope with the computing age. In time they borrowed ideas from software engineering, like object oriented database models.

Industry standards organizations like the renamed Telemanagement Forum (TMF) and Internet Engineering Task Force (IETF) have continued to develop models for managing computing equipment that are essentially bureaucratic. What they perhaps failed to anticipate was the pace at which the technology would develop (something akin to the rate at which device drivers have to be written on PCs). Trying to keep up with the schema-centric definitions for all new products has led to a classic Tortoise versus Achilles race between the development of new technology and the struggle to document the growing zoological inventory. (Cisco's IOS is surely the winner of this race.)

For the telecoms, Operational Support Systems (OSS) and Business Support Systems (BSS) were the order of the day. The idea was simply to document every device and human procedure exhaustively in a huge database so that help-desk staff would be able to see an overview. Later came tools that could interact with the devices via a "management console" in order to write certain values to routers and switches, and even workstations and PCs. The legacy of these approaches is still with us; they cling on to life, even today, in the largest corporations, but they are still wailing (or yawning) from their tar pits.

The complexity of those systems is legendary. No SANE engineer, in her right GHz CPU, would seriously try to build such a monster. Yet, in the wake of these support systems, designed for the telephone era, the same knowledge engineers attempted to create the new generation of forms and processes that would manage the computing age. Amongst today's species are:

  • SNMP/MIB - a hierarchical table-based data structure (the Management Information Base) that is mapped into a linear set of machine-readable identifiers. The values associated with these identifiers are simply read or pushed into place by the SNMP read/write protocol. The algorithmic complexity is very low. The data complexity is a simple regular approximation to a context-free language.
  • SID - Shared Information and Data model. This is an information model that is used in both NGOSS and DEN-ng. It includes services and organizational containers in an object-oriented framework.
  • CIM - Common Information Model. An information model that provides an exhaustive replacement for MIB.
  • NGOSS: the TMF describe this as a "comprehensive, integrated framework for developing, procuring and deploying operational and business support systems and software". It includes the SID and eTOM standards. It is a complete organization map.
  • DEN(-ng) - Directory Enabled Networks. This is a model that is complementary to SID. It focuses on modelling network elements and services, using an interpretation of policy based management. DEN-ng products and locations are subsets of the SID.
I challenge readers to look up these data models on the web to see just how complex they truly are. The DEN-ng and SID initiatives are trying to move away from a MIB-like catalogue of device attributes to an overview of an organization and its resources. In particular, the notion of services is an important addition.

Even equipped with these big guns for pattern description, and having the most eager blue-collar beavers to register all of this information, the efforts of these engineers ultimately seem to have fallen on deaf ears. No one really seems to want these systems -- not even their key designers. Why? As the Soviet Union or European Union or even the State of the Union will testify, bureaucracy is too expensive.

What's on the Yellowbrick Road?

The data models above have sufficient linguistic complexity to describe the patterns we would expect to manage in an organization, just as we predicted in the last issue's episode, but something is wrong. Toffler's warning is ringing in our ears. We seem to be missing a vital part of the story. Configuration management is not merely about brick-laying and form-filling.

Configuration management (a pretty low level animal in the administrative phylogeny) has become the topic du jour in the Unix world, perhaps because it is a technological problem, which tech-folks love. But it is not the beginning or end of any story that we really care about. We have no real interest in what the configuration of a system looks like. What we really care about is how to represent the goals of our organizations and applications using patterns, in order to lead to a predictable pattern of behaviours. This leads us to a hypothesis which, as far as I know, has not been convincingly proven:

Hypothesis: There is a direct association between a "correctly configured computer" and a "correctly behaving computer", where "correct" means "policy or specification compliant".

The essence of this hypothesis is shown in fig. 1. It is not just a matter of configuring a computer, but of solving the problem of achieving the correct behaviour. Configuration is a static thing; behaviour is a dynamic consequence, but not a fully predictable one.

Fig 1. The stages from policy to behaviour. A story quickly forgotten?

There are three parts to the story (see fig. 1):

  • Planning the intended behavioural policies for all parts of a system.
  • Mapping this to a configuration that can lead to the correct behaviour.
  • Implementing the change in configuration reliably.
How do we know that we can complete this manifesto? Is it doable? If so, can it be done reliably? Well, in 2003, I proved a limited version of this hypothesis[2], showing only that it is possible to define the meaning of "policy" in terms of configuration changes so as to lead to predictable behaviour on average. This is not quite the same as the hypothesis above; what it says is that there is a restricted language that maps directly to behavioural consequences, so if we restrict ourselves to that, we are okay. The part about "on average" is general, and it says that no configuration management scheme can guarantee that a host will always be correctly configured, unless the machine is never used.

You say tomato and I say ... semantics

According to the first two parts of this series, we have a reasonable account of how to manage patterns of data (with or without the monstrous data models that pepper the procedures with structural complexity). These procedures might be messy, but they are essentially just bureaucratic spaghetti, somewhat irrelevant to the deeper issues.

If we fix a bit string, like a file mode, using a numerical value, there is little ambiguity in the procedure. It seems like a straightforward problem to write some configuration to a computing device: this is like stamping out molds from a production line (e.g. chmod 755 filename). It is straightforward, easy, uncomplex -- like SNMP without MIB. Any complexity lies in the patterns themselves -- in the coding of the instructions, and in understanding what the behavioural consequences of these changes are. Thus, this is not where the problem lies.

But now consider the expression of policy itself. If, on the other hand, we wish to describe an operation in terms of a high level procedure (e.g. "InstallPackage(ssh)") then this is no longer straightforward, because it is describing the configuration coding only at a medium level, not all the way down to the bits. This is like saying: "Make me prettier!" It is not a uniquely defined or reproducible goal. Someone might say that it is their policy to make you prettier, but you cannot guarantee their behaviour from this assertion. (You might trust them more if they told you what end result they were going to guarantee -- see below.)

If we take only such a shell of a pattern, like InstallPackage, there can be several (even many) inequivalent ways of defining the internal procedures within the language of the low level configuration. Consider the following two interpretations of an InstallPackage command, which are inspired by real examples.

InstallPackage(foo)
   Check dependencies
   Check if package README exists 
   if (!exists)
     copy package
     unpack
     run local script

InstallPackage(foo)
   Check if existing binary is executable
   if (!exists_and_executable)
      Check dependencies
      Copy packages
      unpack all
      copy files to /usr/bin

The resulting patterns are described and implemented in terms of language syntax, as we have already noted, and computing is obsessed by syntax today -- but if the complete syntax is missing from the explanation, the call for InstallPackage is meaningless. Several of the big data models above boast a specification written in the Unified Modelling Language (UML), which is based on an object orientated syntax, i.e. hierarchical class structures. Thus it is fundamentally built as a bureaucracy of types. Moreover, XML has become the bureaucratic memo-paper of choice. XML is no more than an empty syntax "desperately seeking semantics".

This is pretty much what happens in configuration management tools. By attempting to be user-friendly and high-level, many configuration tools sacrifice operational clarity for human readability. If we try to define configuration in terms of vague high level precepts like this, then it is like trying to tell a story this way:

"A man (motion-verb) into a (drinking-place-noun) and (communication-verb) a drink..."

We can fill in many alternatives which lead to grammatically correct sentences -- which obey a pattern language that is recognized by our system. But the patterns all mean quite different things, or perhaps nothing at all. There is no clear way to say what we meant was that a man walks into a bar...

If we are to successfully govern systems, either externally or autonomically, we need to be able to complete the chain from top level goals, to a clear and reproducible set of operations, to a definite configuration that leads to predictable behaviour. This is not an impossible task, but it is far from guaranteed.

How to say what you mean

At the 2001 cfengine workshop (later followed up by Paul Anderson and opened to a wider community, becoming the configuration management workshop), a discussion almost became an argument. My friend Steve Traugott, bless him, told me I was wrong. Thunderclaps sounded, screams were heard. Tempers were enraged. In the meantime, Alva Couch and I were quietly interested in Steve's point as others were doing battle over it. I thought: clearly I was not wrong, I am surely never wrong... and yet Steve pressed his point, which has since been studied in detail by Alva Couch and which I have come to understand better as I have pondered the matter using different reasoning. Of course, neither of us was wrong, but importantly something was learned.

The matter concerned two design strategies that have been discussed for constructing configuration management schemes:

  • We specify the final state and leave it up to the program to figure out the details of getting there.
  • We specify the starting point and a specific programme of steps to take.

For reasons we won't go into yet, these were labelled "convergence" and "congruence" respectively. To borrow Alva Couch's terminology, we can rather refer to these as post-condition based and pre-condition based specifications, respectively.

Ultimately, I believe that the first of these is preferable for a number of reasons, including parsimony, consistency and aesthetics (stay tuned), but the real difficulties associated with configuration management are present in both cases. They cannot be avoided simply by choosing one.

In both cases there is the matter of how it is possible to change from the old state to the new state. Suppose a computing device is in a state that is not consistent with policy. We require a procedure, whether that means a static bureaucratic procedure or a lean, mean entrepreneurial procedure, to fix it. In the first case (post-condition), we define this procedure generically, like a template, once and for all (we define what we want to get out of "make me prettier"). In the latter, we define the procedure in each case, making it potentially inconsistent. Steve said we can still achieve consistency by always starting from a known state and following a precise chain of preconditioned actions, meaning that if a computer gets messed up, you wipe it clean and start over. This is a reasonable approach to take if one thinks in production-line terms about configuration management, but this is not my vision.
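To make the contrast concrete, here is a toy sketch of the two strategies applied to a trivial operation: making sure a single line is present in a configuration file. The file path, the line and the function names are all invented for this illustration; they are not taken from cfengine, isconf or any other real tool.

   CONFIG = "/tmp/example.conf"           # invented path for the sketch
   WANTED = "PermitRootLogin no"

   def congruent_build():
       # Pre-condition style: start from a known state and replay fixed
       # steps in order. If a step's assumption fails, the recipe is suspect.
       with open(CONFIG, "w") as f:       # step 1: wipe to a known state
           f.write("# generated\n")
       with open(CONFIG, "a") as f:       # step 2: append in a fixed order
           f.write(WANTED + "\n")

   def convergent_ensure():
       # Post-condition style: describe the desired end state and repair
       # only the difference, whatever state the file happens to be in.
       try:
           with open(CONFIG) as f:
               lines = f.read().splitlines()
       except FileNotFoundError:
           lines = []
       if WANTED not in lines:
           with open(CONFIG, "a") as f:
               f.write(WANTED + "\n")

   convergent_ensure()   # safe to repeat: a second run changes nothing
   convergent_ensure()

The convergent version can be applied to a machine in any starting state; the congruent version buys its consistency by insisting on the known starting state that it creates for itself.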

Mass production undone

Production-line factory thinking requires a chain of pre-conditions. When you create a chain of operations that depends on previous operations, each step is preconditioned on what came before. If one step fails to be implemented, all subsequent steps fail ("I'm sorry sir, I can't make you prettier, your nose is in the way.").

Fair enough -- we just have to figure out how to get it right without getting stuck. That might be possible, but in fact it is harder than in the post-conditional case, because the compositional complexity of the approach has to be dealt with in one go, whereas it only has to be dealt with for one operation at a time with post-conditions. But the real problem with pre-conditions is that the approach fails to easily support a wide variety of different adaptations. It takes us back to Toffler's fear of the totalitarian-commie-nightmare of mass production of a single unvarying model.

Oddly, in system administration, many still worship the totalitarian gods of mass production. The god of small things, to paraphrase Arundhati Roy, is still being trampled by the heavy boots of bureaucratic thinking.

Suppose we assume that the postcondition model is possible (cfengine uses this approach, so it works at least in some limited capacity). Then we can (at least try to) never base an operation on something that came before, so that the order no longer matters and only the final state is significant. Now, while this approach is achievable in principle, it is also beset with problems. Its chief selling points are

  • Consistent semantics.
  • Specification of final state is often simpler than specification of the steps needed to get there.
  • You do not have to wipe out a machine if something goes wrong; the system can adapt in real time.
Its main problem is a residual ordering ambiguity caused by creation and deletion, and by competitive adaptation.

Black boxes and closures

The inner workings of bureaucracy are generally opaque, but for reliable administration this is not necessarily a bad thing. Black boxes are a mainstay in computing because they hide inner complexity and also protect inner details from outside corruption.

As Alva Couch and his students have pointed out, the computer science black-box notion of closure gives us a level of predictability, by locking out the environment that generally confounds predictability. This is the same environment that can screw up chains of preconditions, as Alva's work has taken some pains to model in detail. The trouble is, while closure is easily implemented for things like database transactions, it is quite difficult to implement in the area of system administration, because systems are constantly being exposed to the environment by uncontrollable backdoors. Moreover, they often share operational state (routing tables, databases etc) that breaks open closures.

The story of order-independent operations is also rather non-trivial and is based on a very low-level approach to operational semantics. Cfengine has focused (some think too much) on this approach, and hence often fails to provide higher level expressivity, which other projects like Luke Kanies' Puppet are trying to remedy (hopefully keeping the low level semantics intact). Paul Anderson has long told me that he sees cfengine as a low level language that one compiles down to. This seems sensible to me. In the meantime I am developing a more precise theoretical model for these low-level semantics that will eventually be incorporated into cfengine 3, lately together with Alva Couch.

Even if a configuration is reachable without any ordering problems, there are some features of behaviour that depend on the order. This has to do with the fact that creation and deletion are catastrophic, state-destroying operations that break commutativity on present day operating systems. It is conceivable that one could build an operating system that did not have this property, but it would be quite difficult. A fair approximation would not be too hard to build, however, so we could have commuting operations and the order of procedures would be entirely irrelevant to the final state.
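Here is a toy illustration of the point, using a Python dictionary as a stand-in for a filesystem (the operation names are invented for the sketch): attribute-setting operations commute with one another, but deletion and creation do not commute with them.

   def set_mode(fs, name, mode):          # an attribute operation
       if name in fs:
           fs[name]["mode"] = mode

   def set_owner(fs, name, owner):        # another attribute operation
       if name in fs:
           fs[name]["owner"] = owner

   def delete(fs, name):                  # catastrophic: destroys state
       fs.pop(name, None)

   def create(fs, name):                  # creates a fresh default object
       fs.setdefault(name, {"mode": "644", "owner": "root"})

   a = {"motd": {"mode": "600", "owner": "mark"}}
   b = {"motd": {"mode": "600", "owner": "mark"}}
   set_mode(a, "motd", "644"); set_owner(a, "motd", "daemon")
   set_owner(b, "motd", "daemon"); set_mode(b, "motd", "644")
   assert a == b        # attribute operations commute

   c = {"motd": {"mode": "600", "owner": "mark"}}
   d = {"motd": {"mode": "600", "owner": "mark"}}
   set_mode(c, "motd", "755"); delete(c, "motd"); create(c, "motd")
   delete(d, "motd"); create(d, "motd"); set_mode(d, "motd", "755")
   assert c != d        # deletion and creation destroyed the earlier change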

The king is dead: long live the laissez-faire army

Human beings have a remarkable capacity to view the world in terms of subordination, and system administrators are no exception. You'd think we'd all done military service, or were trying to establish ourselves as king or emperor by conquering fourth-world tribes of disorganized computers and sending them for Pygmalion execution lessons on how to behave in The Kingdom of the Data Centre.

In the 1990s, as telephonic empires were crumbling, small-business entrepreneurship invaded their turf and took computer management in a different direction. Small furry businesses started making it up as they went along, thanks to new tools like small computers, Unix, and the ad hoc solutions of Windows and Macintosh. With an excitement for progress rather than control, mammals evolved and dinosaurs were left floundering. The Unix world has bothered itself little with the data models we mentioned above. Cfengine, Isconf, LCFG, and of course every site's home-brew scripts have been much more ad hoc in their approaches -- almost devil-may-care informality. And yet they work. Why?

Toffler made an important insight in his book, an insight that it is appropriate for us to relearn. His point was this: in the 1960's, still scared of the looming presence of communism, it was assumed that the industrial age, the rise of technology, meant a mass-produced future in which everything was the same -- there was no variation, no choice, just an overwhelming amount of factory produce, because the duplication of a fixed pattern was marching to the tune and beat of industrial nations' sternest baton.

What Toffler realized was that better technology allows one to manage more variety, greater diversity and importantly greater choice. We do not have to fear diversity. What, after all, is the point of information technology if not to manage the complex array of specially tailored blueprints? What is the reason for improving management of productivity if not to cater for the whims and desires of minorities and special interests?

The weight of bureaucratic constraint needed just to maintain a large information model is overwhelming. It is too slow. If you are attached to a fleet of steel balls by a cat's cradle of elastic bands, your best career choice is not acrobat.

A bearable lightness of being

There is a myth that, if you do not control something, the result will be chaos. There is a belief that if you do control something, its behaviour will be in accordance with your wishes.

I believe that there is some linguistic confusion at the heart of this debate. The word we want is not "control", because that is a word of hubris. You can tame a horse but you will never control it. There is a world of difference between control and management. Toffler pointed out the answers in 1970. We are fighting the wrong battles.

"Rising novelty renders irrelevant the traditional goals of our chief institutions...Acceleration produces a faster turnover of goals. Diversity or fragmentation leads to a relentless multiplication of goals. Caught in this churning, goal-cluttered environment, we stagger, future shocked, from crisis to crisis, pursuing a welter of conflicting and self-cancelling purposes."

The real measure of intellectual achievement is to take something complex and make it simpler -- by suitable abstraction. Anyone can make excruciating syntax, an exhaustive list or a database of every possible detail. There is absolutely no evidence that tight bureaucracy leads to greater predictability. What can lead to predictability is clearer semantics -- perhaps with a lighter touch.

In the next episode, I want to dispel a related myth: why centralization is not the necessity that has generally been implied.

Reference

[1] Alvin Toffler, Future Shock, Random House (1970).
[2] Mark Burgess, On the Theory of System Administration, Science of Computer Programming 49 (2003), 1-46.

Part 4: There's no I/O without U

Trade and communication have been in partnership since symbiosis emerged from evolution's curiosity shop, as a trick for smuggling contraband into the sum of the parts. This partnership is central to the management of systems. Communication pervades everything, from the question a user asks at the help desk, to the data received from a router, to traffic-flow statistics, to the implementation of configuration operations performed by software -- there is an exchange of messages about state, about intention and about change. Computer management is also increasingly about trade, for all this communication is not done on a whim or out of curiosity; it has a direct value to us in terms of time, money or service. However you look at it, the world of networks is the world of commerce.

When two parties wag their tongues or dance their dances, they are both altered by the exchange. When more parties are involved, there is a cumulative effect which spreads out along dendritic paths, binding the squawking flock together and forming a network. The trails blazed by those conversational patterns fashion the resulting behaviour of all parties in the network. This wave of influence is the essence of management, whether it spreads like a diplomatic envoy or like a forest fire.

In this piece, I hope to persuade you to re-examine your beliefs about centralized, authoritative management in computing (or elsewhere). I want to argue that, in our modern world (surfing its way on the rising wave of free-market economics), we need to rethink the tradition of hierarchical, centralized governance in guiding the behaviour of systems. It's a familiar song: delegation and decentralization are not only desirable but inevitable if we are to cope with the rate at which we have to (re)configure and repair systems in a vibrant and adaptive network, built on the economic foundation of trade and services.

From a cat's cradle, like a bat out of hell

This is the network age, an age in which webs of communication are accelerating our technological and economic development. What is a network? We overload this most important word with a plethora of meanings. Even in the limited domain of computer science, its meaning is not clear. Sometimes we mean the cable that joins the computer to the wall; sometimes we mean the infrastructure that enables communication between computers, including the routing and the switching; sometimes we implicitly mean the protocols that are spoken over these channels of copper and air; and sometimes we mean the abstract collection of computers themselves that are connected by the infrastructure (a social network of interaction formed between human and computer).

A network is `simply' a device or construct which joins many things or places together. The first technological networks were roads and sewers -- many thousands of years B.C. Even before that, humans formed tribes, linking individuals together in structures of about thirty. But we should not be fooled into thinking that networks are human creations. In nature, networks are everywhere: crystals and molecules are held together by networks of inter-atomic bonds. The structures of these networks determine the large-scale properties of the substances they form. In biology it is not genes that generate the complexity of the living world, but rather the networks of interconnections formed from the proteins that the genes encode. Our bodies are ripe with networks, as Arab scholars discovered and drew in exquisite detail during the first millennium: networks for blood transport, nerve signals, immunity, etc. Biology is a testament to the successful cooperation of multiple communicating parts.

Face-Centred Squareness

But as humans, we are frightened by complexity, even as we embrace it. The structures we configure purposely into networks, i.e. their topologies, are simple-minded. As they grow beyond our designs, we fret like worried parents over the consequences of this growth and try to protect systems with firewalls and other barriers. Recall Alvin Toffler's comments about industrialization from the last episode of the series.

It is no coincidence that published maps of the Internet look like snowflakes or leaves (see the images famously generated by Bill Cheswick at AT&T) -- it is not that the Internet appears biological; the point is that these bio-structures are themselves networks. This is what networks look like when they emerge in the natural world purely as a result of mutually beneficial interaction. When engineers design networks they look quite different -- like stars and trees. Humans only build in this "organic" way when things are "out of control", i.e. when they are no longer designed by an engineer but "grow", following an economic principle of development rather than a regulated one. Why not? If these structures are so successful in nature, why don't we build like this? Perhaps we are small-minded.

Humans love to build centralized and hierarchical structures, whether in industries, governments or armies. Possibly there is a social-anthropological reason for this (perhaps an expert, on reading this, will tell me the answer): there is evidence to show that people have evolved to work in groups of around thirty at the most. Once this size is exceeded, they tend to break up into sub-groups which cluster around a new leader. Another reason might be cultural: the structure of families, in which a family clusters around a dominant male (as attested to by the unfailingly nauseating and predictable references to "my father" or "daddy, daddy" -- never "my mother" or "my family" -- at every tearful moment in American TV and film).

But there is a fascinating phenomenon going on here. Even if the size of our attention is limited, there is nothing obviously programmed into us that says how these groups must be organized -- or is there? Well, roll up! Be amazed by our human propensity (perhaps desire) to subordinate ourselves before an authority figure. We behave like a bureaucracy of sheep in uniform. I always think of the scene from Monty Python's Life of Brian in which Brian tells the crowd: "Don't listen to me -- you have to think for yourselves", to which the crowd cheers in unison: "Yes, yes! We must think for ourselves!"

Why command hierarchies? There is no evidence to suppose that such a structure is better than any other. It is somewhat "natural", perhaps, like a dividing river or a branching tree, but it is far from robust. The tree structure has a certain clarity to it, but also great fragility. A tree is about not reproducing the same alternative twice. A branching tree is a clean, economical and logical separation of concerns, but the same properties also make it a structure of minimum redundancy, and therefore quite fragile and blooming with bottleneck inefficiency.

Fig 1. From centralized star topology, to hierarchical centralization, to decentralized mesh topology.

Create like a God, command like a King, and work like a Slave

In terms of overhead, we can see why subordination (i.e. centralization) is appealing -- if everyone follows a single master (Fig 1a), then there are only N-1 agreements instead of the N(N-1)/2 needed for full pairwise peering in a community of N agents. And yet public keys have shown us that we can form peer-to-peer collaborations with only N agreements and a little skill. What about consistency?
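
To put numbers on it: with N = 100 agents, the star of Fig 1a needs 99 agreements, a full pairwise mesh needs 4950, and the public-key style arrangement needs about 100 -- the same order as the star.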

Every piece of knowledge must start from somewhere. That means there is a source and a direction from which the information spreads. If there is only one source of information, then it must be consistent. Hence star-like topologies are perfect for local consistency. Q.E.D. If we move up a scale, then coordinating local communities according to a common policy can also be done from a single source, hence star-hierarchies solve the problem (Fig 1b). Tradition wins the day.

So, fine, centralization is sufficient, if we assume that the chief of the network can handle the burden -- but is it necessary? The odds seem to be in favour of centralization. But is this good engineering? Let's look at this dipolar list:

  • Centralization (single source) versus delegation
  • Top down versus bottom up
  • Hierarchical versus peer to peer
  • Data normalization versus data-mining

What if we look at survivable networks, like biological organisms? Biological "devices" evolved only to survive in a changing environment. How? Through redundant, distributed networking -- the opposite of centralization. Even though a few of our major organs are singular (brain and heart), there is still redundancy built in: we are amazed by stories of how people who have suffered brain injuries learn, for the most part, to reroute their brain functions, thanks to the phenomenal inbuilt redundancy. The single points of failure (spinal cord, heart, etc.) are still our greatest weaknesses, and these things limit our growth.

But surely all this redundancy and variation is much too expensive to maintain! I quote Toffler once again: ``As technology becomes more sophisticated, the cost of introducing variations declines.'' Recall that the fear Toffler spoke of in industrialization was precisely that mass-production would lead to an inflexible lack of choice in a market -- that you could have any colour as long as it was black. Well, he also argued that this was nonsense once you have technology, because that is when you really can afford to make things cheaply.

So do we see any such technologies diversifying the playing field for configuration management? Indeed, we have various levels of automation tools, from SNMP-based Tivoli and OpenView to policy-template configuration tools like Cfengine, LCFG and Pikt, and the web-based interfaces provided by a variety of operating systems. Although SNMP is now widely regarded as a failure (even by the IETF) for everything with the possible exception of monitoring, the template-based tools, albeit imperfectly, enable great variability in mass-produced environments. The largest cfengine installations, for instance, run into the tens of thousands of hosts on a single site, some with large variations from laptops to supercomputers. Clearly variation-management is no longer a real issue for the technology. Let's see how it comes about.

Speak to me in many voices, make them all sound like one

The first universal theory of symbolic communication was pioneered by Claude Shannon in the 1940s. He turned the idea of communication into a science and implicitly solved the issue of maintenance at the same time. His theory of communication over a noisy channel is one of the classics of electrical engineering (or information science, take your pick). It makes the important point that you cannot escape from the problem of signal noise in a system. All systems contain noise (uncontrolled variations) in some manner or form. If a message is communicated in a noisy environment it becomes unclear, rough and grainy, and the chance of it being understood and obeyed is much smaller (as the pony said to the ventriloquist: "I'd love to talk to you, but I'm afraid I'm a little horse").

Symbolic (digital) communication was the basic technology that enabled electronic networking and computation. After this, the ideas of queueing and packet switching brought us from the fast train-track communication of the telephone network to the automobile diversity of the Internet protocol. Networks now relay communications between peers by encapsulating any kind of message within a single lingua franca of IP (more or less).

The messages might now all sound like a single language, but they carry more diversity than ever before. By deregulating the centralized structure of the telecoms, and by deregulating the single-source content using the open (emergent) standard of IP, diversity has grown into a commerce of communication with a tolerable level of variation. Remarkably, a plethora of standards has converged into one, just as kids who are left to dress without school uniforms converge on jeans, T-shirts and a small number of basic themes. The lesson here is that when you deregulate something, you might actually end up with greater uniformity than before, because people lose interest in fighting for supremacy -- they become content to live and prosper in their own niches.

In previous issues, I talked about the structure of patterns in a configuration. A network too is a configuration, which can be laid out as a formal language. The hierarchical tree-structure we are used to in a context-free language (like XML) has no a priori superiority to the more random structure of peer-to-peer relationships. Hierarchies possess relative computational simplicity, meaning that they are cheap to parse or build by linear computation, but they are only marginally more expressive than the regular grammars that correspond to peer relationships.

Why would we think that a context-free, military hierarchy was the best solution to management? Perhaps because the alternatives are currently too hard for us to fully understand. Grammars that use parentheses make it easy to put things in boxes, and this is a comfortable way of marking out territory and assigning responsibility. But in practice we do not even use deep hierarchies in the organization of network patterns -- usually only two levels: master and slave hosts. That depth of pattern grammar can easily be built as a regular language -- it is "faux" hierarchy, more about limitation than structure. It requires only that each slave in the network promise to follow the instructions of a master, which in linguistic terms is just a prefix (see the sketch below). This is perhaps a clue that what we really value is perceived cheapness rather than subordination. We just think that hierarchy must be cheaper than a less structured pattern, although the theory of languages says otherwise.
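
As a toy illustration of that point (the pattern and notation here are mine, not any tool's syntax), the flat master/slave arrangement is captured by an ordinary regular expression -- a prefix followed by repetition -- with no need for the parenthesis-matching power of a context-free grammar:

    import re

    # "One master prefix, then any number of slaves" is a regular language.
    two_level = re.compile(r"^master( slave)+$")

    print(bool(two_level.match("master slave slave slave")))   # True
    print(bool(two_level.match("slave master")))               # False

    # Only arbitrarily deep, balanced nesting -- master(master(slave)slave)... --
    # would force us up to a context-free, XML-like grammar, and flat data-centre
    # patterns rarely need that power.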

Selling your soul at the crossroads

In the past few years, users and researchers alike have come to realize the economic limitations of hierarchical regulation. A hierarchy implies a set of bottlenecks and barriers, of permissions and subordinations, but users have been given the power to compute and, by George, they do! Service-oriented computing has arrived to stay. It is direct, it is valuable to individuals and it is subordinate to no one.

Technology is no longer the plaything of governments and governing boards for strategic purposes; it is in the hands of ordinary folk who simply want to trade. This desire to trade, to exchange information and services in mutually beneficial ways, is what has driven the Internet into a state of biological complexity. It is a new technological symbiosis that enables our society to move to a new level of cooperation, one that can handle groups bigger than thirty.

What of the cost of this ad hoc, symbiotic organization? Price is clearly a subjective matter in this story, and the cost of management depends somewhat on your particular skills (Toffler: as technology becomes more sophisticated...). Today these currencies for management are in flux, and centralization is being displaced by peer services, through the web and through file-sharing software. Could it be that planned structure will be supplanted by an organically (economically) grown emergent structure?

Well, this might all sound like a dream from some 1970s biological material, but don't reject the thought without seeing the wood in the trees: just because a network was not designed does not mean that it is less functional than one that was. Just because behaviour emerges from the cradle of economic self-interest (symbiosis) does not mean that it is less predictable than a military operation. All life and society emerged this way -- and we do very nicely, thank you.

So, in case you thought I had forgotten about configuration management in this daydream about networking, let's tie together the floating pieces in our Article-Area-Network by seeing how we can have the best of both worlds: predictability and freedom, personal safety and opportunistic self-interest.

Autonomous meditation: the state of standing still

Shannon's model of the noisy channel applies equally to computers talking to themselves. Self-interest begins with the correct functioning of the individual. What easier way to maintain system state than to have it chant that state over and over again until it works harmoniously?

The passage of time brings to bear on systems many influences over which we have no control. Developers of computer systems have made the frequent error of viewing configuration management only as change management (as in a transaction system like a database). It is a bit like believing that the weather is really an air-conditioner with a nice neat knob to switch it on and off.

Self-maintenance is communication, if you see a computer system as being in a constant state of meditation, repeating at each moment a mantra which we can call its state. We would like it to chant a message that agrees with our policy for its state. By repeating the message, one reinforces it. This metaphor describes the idea of autonomous configuration management.

From the previous articles, we can think of the state of the computer as a string of configuration and operational characteristics, drawn from an alphabet of symbols. This can be coded in any imaginable way, e.g. suppose it is "ABHEKSYGHETFDH...", where A means something like "chmod 644 /etc/passwd", and so on.

After a while, corruption of the state message due to run-time interactions, meddlesome users and network connections (i.e. noise) could lead to this state-message being garbled, changing some of the symbols into others. Such a change must be corrected by reiterating the actual policy, e.g. "ABCD" -> "ABXD" -> "ABCD". Just like a message over a noisy channel, we have to correct these errors.
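
Here is a toy sketch of that error-correcting view, in Python (the policy string and its symbols are invented purely for illustration): maintenance is nothing more than re-asserting the policy symbol wherever the observed state has drifted away from it.

    POLICY = "ABCD"   # the reference message: one symbol per desired property

    def correct(observed):
        # Re-assert the policy symbol wherever noise has garbled the state,
        # and leave compliant symbols alone.
        repaired = []
        for want, have in zip(POLICY, observed):
            repaired.append(want if have != want else have)
        return "".join(repaired)

    print(correct("ABXD"))   # -> "ABCD": the garbled symbol is repaired
    print(correct("ABCD"))   # -> "ABCD": repeating the policy does no harm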

The convergent operations we mentioned in the last article deal with this problem nicely. In operational language, a single operator is the instigator of one kind of change, which we write

O q = q'

to mean that an operation applied to a state q leads to a transition to a new state q'. But rather than thinking about a transition from one state to a new state, think of this as error correction. A "convergent" operator is a message that tells any state to transform into a policy-compliant state:

C (Any state) --> (Policy-state)

The policy state is said to be a fixed point of the policy operator, since once you get there, you stay there. So, in terms of this language, all we need to do is to repeat the entire policy over and over again, like a never-ending mantra, with a separate operator for each independent kind of change:

C1C2C3 ... Cn (Any state) --> (Policy-state)

Alva Couch called this the Maelstrom property in his LISA paper from 2001. The importance of this property is that a computer can simply chant its policy message to itself, applying strings of these operational messages, and this will ensure that it is always in the error-corrected, policy-compliant state. This is not science fiction. The approach was originally developed for and used in cfengine, in a limited form, and something like it is now being used in NETCONF and some other configuration management technologies designed to replace SNMP.
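
A minimal Python sketch of what such convergent operators look like (the three operators and their attribute names are my own illustrative inventions, not cfengine primitives): each one maps any state onto the policy-compliant value of its own aspect, so chanting the whole list repeatedly converges on, and then idles at, the fixed point.

    def C1(state):
        return dict(state, passwd_mode="644")     # fix a file mode

    def C2(state):
        return dict(state, sshd_running=True)     # ensure a daemon is running

    def C3(state):
        return dict(state, tmp_purged=True)       # garbage-collect temporary files

    def chant(state, rounds=3):
        # Repeat the entire policy like a mantra; any starting state converges.
        for _ in range(rounds):
            for C in (C1, C2, C3):
                state = C(state)
        return state

    drifted = {"passwd_mode": "666", "sshd_running": False, "tmp_purged": False}
    fixed = chant(drifted)
    assert chant(fixed) == fixed    # the policy state is a fixed point
    print(fixed)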

Though I speak with the tongues of convergent fixed-point operators

There is, of course, a danger in emphasizing the role of communication, language or error correction too much. In the network management communities, people often get stuck by confusing basic ideas with the technologies that communicate them. It would be no surprise to us that our leaders failed to govern the country if they held the view that management was the same as SNMP. Ideas generalize implementations; they should not be limited by them.

In the case of SNMP there was a conscious decision made by the IETF to avoid complexity in the protocol. This regulation, in turn, led to an explosion of complexity in the data structures (MIBs). You cannot suppress noise by trying to pretend it is not there. What fixed-point semantics offer us (at least in the cases where it has been possible to implement such a thing) is the guarantee that a repeated message will be easy to implement, rather than a Mission Impossible.

What I am proposing here is that it can be sufficient to manage device internals individually, while allowing networks to trade freely. The result will not necessarily be "out of control" in a bad way, it will regulate itself autonomically.

Five farthings, say the bells of St. Martin's

For managing the configurations of computers, networks are not essential. It is clearly possible to administer changes and repairs to a completely isolated, stand-alone computer, either manually or with the aid of automation, "one-on-one". However, the network opened the door to both collaboration and interconnection, and therefore to the possibility of managing systems remotely. This is only true, however, if such collaboration has an economic justification.

We can disconnect any computer from a network and take over the management of the device, and no one can stop us; hence the idea that computers are "controlled" from outside is only a convenient fiction. They are controlled only insofar as they want to be. It is an act of voluntary cooperation to allow oneself to be managed by an external authority. Because we are so used to bowing to authorities, network management researchers find this idea almost impossible to accept.

Messages of different types have a different value to us. Just being associated with a service provider in a BGP peering agreement can be worth a lot of money -- kudos to the peer that earns respect and hence the promise of future profit. Association is a social currency that is worth something -- not necessarily money. We have to learn to recognize the different forms of currency in play in economic cooperation. The face of commerce is changing.

Why would anyone talk about trade if they could control everything, nail everything down? You only have to compare the state of dictatorships with that of democracies to answer that one. The authoritarian regime might argue, you need me for:

  • Offloading and convenience.
  • Specialist knowledge.
  • Separation of concerns.
  • Mutual advantage.

But all of these things can, in fact, be obtained from your neighbour, and today they are used as the primary reasons for outsourcing to companies that are considered to be subordinate, not superordinate, authorities.

Don't we need a law-maker, an authority to govern? In contract law it is observed that the threat of litigation in an authoritative regime is essentially insignificant in determining whether the terms of a contract are upheld or broken. The principal factor that determines whether or not people break the law is the potential loss in economic value to the participants in the contract. Hmm. Think about that one.

Haggling over the price

It is possible to describe policy patterns and structures graphically using a theory of promises. We have been developing this theory in Oslo for the past two years. It allows us to study these matters of economically motivated cooperation. Promise theory tells us that we do not have to abandon personal autonomy to have distributed cooperation.

The economics of system administration change with time. At the beginning of the 1990s, when I developed cfengine, there was a particular need for competitive garbage collection in systems -- simply to keep them alive. Disks had finite size and the memory of the system was limited. This shaped the functions that were built into the configuration engine. Today, this recycling task has (for the moment) been de-emphasized, as we have wider margins in modern systems. Tools produced today place more emphasis on installation, as if our fossil reserves of storage were infinite, or they ignore the presence of unnecessary processes, as if the heat produced by these unnecessary computations will not cost us dearly in the long run (either through increased electricity bills or the melting of the polar ice caps). The only thing that will apparently convince us to be more careful today is our confused paranoia over "security" (whatever that means to us).

Today "over-provisioning" (over-provisionizationing) allows us to swagger richly through the data centre paying little attention to the waste. This too is purely a matter of economics. We currently have no incentive to try to improve, but the time will come again when this bountiful cretaceous era of information diversity meets its tertiary boundary, and our systems will once again have to deal with loads that tax their resources to extinction. Limits will be reached and we shall be operating once again on the edge where the balance of trade crucial to the survival of the species.

Please Sir, ISO want to buy your BS enabled ITILity!

So what of the tail end of this tale? Services. The service paradigm has arrived to stay, in business, in commerce and certainly in computing. This paradigm has all but wiped out the conventional notion of system administration in the eyes of network research and development and of the telecom service providers. To them, Unix is an application service and Windows is something to be updated over a network.

Businesses are gearing up for service management with the Information Technology Infrastructure Library (ITIL) -- standards of good practice for service delivery management that were developed by the British government in the late 1990s and have grown into a ubiquitous standard of practice in business today. The manuals and documents for this standard are still sold at great expense, and a summary was published as British Standard BS15000 and now also as ISO20000. These documents mention configuration management, although they say nothing about how (or indeed whether) it can be implemented. They merely recommend that a software-engineering style of versioning be used for the management of all kinds of configuration data. This is not the same story I have been telling in these articles, but it is a higher-level aspect of it.

It is probably possible to describe any enterprise as a number of interacting services, trading with one another for mutual advantage. Once the advantage disappears, the enterprise falls apart. ITIL, and its bureaucratic rival NGOSS/eTOM (enhanced Telecom Operations Map), help managers to see some aspects of good governance, so that service providers will be able to document their organizations and the associated feel-good behaviours, but they speak only of the quality of the process surrounding the service (a sort of managerial version of the meditation discussed above). They say nothing about practical implementation or the technical challenges.

In this series I have tried to show that the matters of system configuration and policy-guided behaviour have a technical basis that is sometimes ignored. The future of low-level configuration management (in the sense of computer governance) lies with automation, while the high-level behaviour of interaction lies in commerce. There is much research to be done in this area. We must understand the science and the economics of this, not merely the tradition and the doctrine -- and that science is the science of communication. Communication, in turn, requires language: there must be a language to express the changes, desires and policies of our self-regulating systems. Only when all of these pieces of the puzzle are in place can we say that we have fully understood configuration management.