In Search of Science Behind `Complexity'
Deconstructing infamous and popular notions of Complexity Theory
[Part 1][Part 2][Part 3]
Complexity theory, as it is often called, has caught the popular imagination in recent years, with many authors claiming to write about the subject from a layman's perspective, applying it to all manner of problems from business to system design. But is there a science behind the bravado?
Complexity and self-organization are common words that anyone with a little authority can adopt---but they do not always represent the body of theory that has become `complexity theory'. The value of scientific concepts lies in their having a specific technical meaning, independent of who uses them, and preferably based on measurement.
In this essay, I try to go back to basics and briefly summarize what complexity science actually says, and why it might be reasonable in some cases to draw broader conclusions about complexity in human-machine systems.
Personal note: Back in the mid 1980s, when I was contemplating a PhD, the chair of Physics at Newcastle University in England, P.C.W. Davies, showered me with a collection of the latest writings about the New Physics of Complexity (for publication in a book he was editing called The New Physics), hoping to interest me in a thesis on this topic. After studying the papers at length, I eventually turned down his offer in favour of the more concrete Quantum Field Theory. A PhD is hard enough when the rest of the world knows what the subject is about, but `complexity' was something nascent, and fascinating, that as yet had no particular coherence. Later, when I was elected to the rank of Professor in Computer Science, one of the panel judges was connected to the Santa Fe Institute, judging my work on human-computer systems as computing not physics. I did not quite manage to escape its clutches! Today, however, `complexity' is far more widely known, having grown from the success of James Gleick's book Chaos, the first of many, and if we have not understood it any better, then at least we are less shocked by it. Today, no one denies the existence of collective phenomena.
Origins of complexity - the history of information
Complexity theory emerges from the study of dynamics in systems, where unexpected and unpredictable effects arise as a result of interactions between the system's parts. Examples include the polar opposites chaos and self-organization. The apparent universality of `complexity' as a cross-disciplinary theme derives from the universal importance of information and scale in all kinds of scenarios.
The concepts around complexity grew out of the statistics and combinatorics of Boltzmann's thermodynamics in the 19th century (where the roles of scale and equilibrium were appreciated), as well as the analysis of species that came from Darwinism (where environmental selection pressures were appreciated). Boltzmann was amongst the first to quantify the relevance of information and combinatorics to explain bulk properties of large-scale systems, and distinguishable system configurations, while Darwin saw the heuristic mechanisms at work in biology.
Previously, most physical phenomena had been easily isolated from their surroundings, only weakly coupled and hence more or less free to evolve from an initial state to a final state. But now a new class of systems was being discovered, in which the defining equations of the best mathematical models were non-linear, and the boundary information (from the surrounding environment) became an actual input to the future evolution of the system's behaviour, just as important as the initial state.
These empirical phenomena, exhibiting what I'll call `technical complexity', were probably first noticed in the fluid dynamics of the 19th century, and particularly the study of weather. Unlike models of other physical phenomena, the Navier-Stokes equations describing fluid flow were known to be non-linear, which brought feedback from the fluid on itself. This seemed to have a connection with certain unusual behaviours. (Henri Poincaré's solution of the 3-body problem in Newtonian mechanics is also often mentioned as one of the precursors to the identification of so-called deterministic chaos.) So-called emergent phenomena like turbulence, and the magical self-organizing cellular structures of Bénard convection, were known to appear spontaneously under certain conditions, governed by a dimensionless ratio called the Reynolds number. However, it was not until the 1960s that the full importance of these unusually `complex' behaviours began to be understood.
Turbulence was a phenomenon, at one time, never experienced by humans, but it is now taken for granted by anyone who flies or sails. It is an erratic motion that we would now call chaotic. Conversely, Bénard convection is highly regular and organized (now called self-organized). As turbulence (caused by the aircraft's motion through the air) buffets the aircraft itself, its trajectory is altered in a non-linear way, by Newton's third law. This is completely deterministic in principle, but not easily predictable, because information from the surroundings is mixed into the evolution continuously, expressing the non-linearity. These two examples show the hallmarks often associated with non-linear behaviour, and thus both irregular and unusually-regular phenomena are seized upon as evidence of underlying `complexity'.
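Nothing here requires randomness. As a minimal sketch of deterministic chaos (using the logistic map, a standard textbook example rather than anything drawn from fluid dynamics), two trajectories that start almost identically end up bearing no resemblance to one another, even though every step is fully determined:

```python
# Deterministic chaos in miniature: the logistic map x -> r*x*(1-x).
# Every step is fully determined by the previous one, yet a tiny
# difference in the initial state is amplified until the two
# trajectories become completely uncorrelated.

def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.4)
b = logistic_trajectory(0.4 + 1e-9)  # perturbed by one part in a billion
# Early on the trajectories agree closely; after a few dozen
# iterations they have diverged beyond recognition.
```

The starting value 0.4 and the perturbation size are illustrative choices; at r = 4 the map is in its fully chaotic regime.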
The subject of complex (literally `intertwined') effects of feedback grew throughout the 20th century, with an almost exponential trajectory of its own. Following the second world war, attention turned to smart machinery, computers and robotics, as witnessed by Asimov's tales, where feedback used as regulation and control was now treated as a tool, not just a phenomenon. William Ross Ashby, writing in the late 1940s about cybernetics, credits Sir Ronald Fisher, a bio-statistician and time-series scholar who mathematicized Darwinian evolution, as the originator of the concept of complex behaviour in the 1920s --- showing an awareness of developing ideas. Warren Weaver, popularizing Claude Shannon's information theory, contemporaneously described two forms of complexity: disorganized complexity, and organized complexity. It was in the 1940s that interest in the subject began to have very practical technological consequences, and the subject was launched in earnest. This is closely associated with the understanding of information[1,2,3].
Already here there is scope for confusion. Is it the system that is complex, or its resulting behaviour under certain conditions? Clearly it is possible for a system to appear structurally complex (e.g. a flower), but it is also possible for simple structures to exhibit complex motions (e.g. Brownian motion). There, the complexity latent in the system is concealed within strong couplings. The two do not have to be manifest together. Some authors try to distinguish these by calling them complicated and complex respectively; however, a complicated system is also a complex result, and often imprints that complexity back onto its behaviours. This distinction is not clear. Perhaps the terms temporal and spatial disorder/complexity would be better choices.
Consider some unusual collective, cooperative behaviours in physics. Superconductivity, Bose-Einstein condensation, and lasing are collective effects, which could also be called emergent, but they do not seem particularly complex. Quite the contrary: they are stable equilibrium phenomena. Similarly, laminar flow of fluid in a pipe, or formation-flying of aircraft, are not complex, though they emerge in tandem with their surroundings, whereas turbulent flow or stationary flocking behaviour of birds are---yet all might be called emergent, adaptive, and cooperative behaviours. Thus another dimension of complexity seems to be unstable, non-equilibrium behaviour. The Nobel Laureate Ilya Prigogine made a career out of studying non-equilibrium physics. What we are seeing here is regions of stability and regions of instability in systems. One of the important questions in science is: how do we map out these regions or phases of behaviour?
A related subject area known as Complex Adaptive Systems has emerged alongside these more natural phenomena. It is related to computation. This includes the study of agents, including ants and swarms. `Complex' refers to the presence of many interacting agents that modify their own environmental boundary conditions continuously. `Adaptive' refers to the idea that the behaviour of the agents is governed by the competition between exploratory freedoms and selection pressures (as in evolution). In one context, an agent might behave differently than in another. Finally, `system' suggests that the agents somehow work in concert.
It is not difficult to call Complex Adaptive Systems a science, as observation plays an important role, but it is difficult to define it theoretically and concisely, without alluding to non-linearity. This leaves it an easy target for abuse. To go beyond heuristics, it is challenging to measure the feedback in a controlled environment, and few principles have come out of it. Its modelling is more closely related to evolutionary game theory, or the study of economic games. However, rule-based automata (sing. automaton) have been studied as computational machines, and as models of the processes of living systems[22,17,20]. Cellular automata are dumb agents whose trajectories are based on rules about surrounding agents, so the agents are both the trajectories and one another's boundary conditions. Non-linearity is implicit, though hard to see: because the agents apply the rules globally in synchrony, each agent's behaviour depends on the state of its neighbours, and they---in turn---depend on the behaviour of the agent itself in the recent past.
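To make that circularity concrete, here is a minimal sketch of an elementary one-dimensional cellular automaton, using Wolfram's rule-number convention. The choice of rule 30, the grid width, and the wrap-around boundary are all illustrative assumptions on my part:

```python
# Elementary cellular automaton: each cell's next state depends on
# itself and its two neighbours, so every agent is simultaneously a
# trajectory and part of its neighbours' boundary conditions.

def step(cells, rule=30):
    n = len(cells)
    out = []
    for i in range(n):
        # Wrap-around neighbours, so every cell has a left and a right.
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        pattern = (left << 2) | (centre << 1) | right   # a number 0..7
        out.append((rule >> pattern) & 1)               # look up that bit of the rule
    return out

# Start from a single live cell and watch structure unfold step by step.
row = [0] * 31
row[15] = 1
for _ in range(5):
    row = step(row)
```

All agents update in synchrony, which is exactly why the feedback loop described above is unavoidable: each new row is a function of the whole previous row.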
Tracking causation in such a system would require an exponentially growing amount of memory, like snapshots in software version control (which might allow you to `roll back' in the sense of a data transaction, or what physics calls a closed system). A statistical way of saying this is that they are not Markov processes. Thus, while all evolutionary processes forget their past states, in linear systems it doesn't matter, because one could reverse engineer them, or `roll them back', to undo a sequence of incremental changes; but in non-linear systems the mixing and merging of information makes it like trying to un-bake a cake (see part 2 of this essay). Software engineers know how `version control' becomes more `out of control' when multiply branched sources get merged.
The lesson here is that both the nomenclature and the origins of complexity are somewhat complex themselves! One need only witness the plethora of pages about complexity on Wikipedia, as writers stake their own claim to a version of the truth as they see it.
Name dropping on complexity
Some of the important contributors to complexity (in a complex order):
- Navier-Stokes (equation) - the basic mathematical model of fluids, from which chaos first emerged.
- Osborne Reynolds - fluid mechanics, as in Reynolds number, which governs the onset of chaos.
- Edward Lorenz - meteorologist, and originator of the famous Lorenz equations and butterfly effect (chaos).
- John von Neumann - early work in information theory, economics and game theory, quantum theory, and computation.
- Claude Shannon - Informational entropy - density of information per unit message
- Norbert Wiener - Brownian motion and feedback systems, control and signal theory, leading to cybernetics.
- William Ross Ashby - Formalizer of Wiener's cybernetics for non-digital systems
- Andrei Kolmogorov - mathematician who defined information and complexity.
- Stephen Wolfram - early pioneer in cellular automata (after von Neumann) - later famous for Mathematica
- Noam Chomsky - definition of languages and grammars and early algorithmic complexity, classification by automata
- Robert Axelrod - multi-agent systems and competitive game theory.
- Murray Gell-Mann - particle physicist (Nobel laureate) and early pioneer in complexity.
- James Gleick - author, wrote a popular book in the 1980s on Chaos.
- Stuart Kauffman - researcher at the Santa Fe institute, studying biological self-organization, NK model. "Life begins on the edge of chaos", where the creative exploration powers of complex trajectories around a space of outcomes can just flip over into meta-stability long enough to allow a result to survive. This is a model of evolution as a search algorithm.
- Ilya Prigogine - pioneer of self-organization around non-equilibrium systems.
- Benoit Mandelbrot - mathematician who discovered fractals and self-similarity.
- Eric Bonabeau, et al. - swarm behaviour and `swarm intelligence'.
- W. Brian Arthur - Complexity in economics.
- Mitchell Feigenbaum - Discoverer of the Feigenbaum constants in the bifurcation route to chaos.
- Santa Fe Institute - a character in its own right, and one of the strongholds of complexity research.
Concepts relating to complexity
- Automaton - a rule-based processing machine, e.g. computer program or biological cell.
- Agent - any part of a system that has a capacity to act, change, or select.
- Degrees of freedom - the capacity of a system to change, e.g. 3-dimensions are three degrees of freedom. This is the starting point for any mathematical description of system behaviour (physics).
- Constraint - a limitation on the freedoms experienced by an agent in a system.
- Equation of motion - the constraint that describes how systems move or behave in a particular trajectory.
- System trajectory or orbit - a solution to the equation of motion, i.e. the set of degrees of freedom that are compatible with system constraints.
- Fitness landscape (potential) - a value function of the degrees of freedom which selects preferred behaviour.
- Chaos - unpredictable system trajectories, usually non-linear and associated with period doublings or bifurcations in phase space.
- Non-deterministic - trajectories are not completely governed by a simple equation of motion, but probabilistic outcomes.
- Deterministic chaos - an unpredictable trajectory governed by a definite equation of motion, with non-linear sensitivity to boundary conditions.
- Attractor - an orbit or trajectory to which system behaviour tends over time.
- Strange attractor - an orbit that describes deterministic chaos
- Evolution - the change in a trajectory over time
- Propagation - tracing a trajectory from one location to another
- Information - A pattern in the degrees of freedom on a system
- Entropy - A measure of the information density in a region of space (see note in the box below)
- Order - A regular pattern in the agents of a system (low information density)
- Disorder - A pattern of high information density - interpreted as `irregular' or even random.
- Graph - the mathematical name for a network
- Phase (transition) - a system state in which the agents have a characteristic level of organization. A transition marks a change in the basic structure of system constraints.
- Long range order - the persistence of a pattern over a long range, implying some correlation or cooperation of agents.
- Catastrophe - the mathematical term for a sudden or discontinuous change
- Self-organizing - a system in which order arises spontaneously
- Coarse graining - the digitization of a system at a lower resolution (like pixelization) eliminating information.
Note added: When scientists and technologists talk about entropy and information, it can be confusing, because physicists and computer scientists define information differently. Information in physics is about the uniqueness of an experiment within an ensemble of experiments, so it measures the uniformity of statistical patterns across many episodes (external time). Information in Shannon's theory of communication is about the uniformity of states within a single episode, thus it measures the uniformity of patterns within a single experiment (over internal time). This is why information scientists would say that information is entropy, and why physicists say that it is the opposite of entropy!
How do we define and measure complexity?
The term complexity pops up as a theme, but to what extent is this just a banner for a collection of different phenomena? If complexity is a phenomenon, then it must be observable and characterizable. Ultimately these measures can be interpreted as measures of information density, such as how much net information is injected into a region of space.
Although I don't know of any consensus on this issue, my opinion is that complexity should be viewed as a characterization of a system's behaviour. Some trajectories are simple and some are complex. I say this because a trajectory is a kind of information stream (a narrative), whose complexity can be characterized and computed by the definitions below.
A system is usually too large a thing to characterize (except for trivial flowcharts without decision boxes). A system can have regimes of simple or complex behaviour. A good example is fluid mechanical systems (see the video visualization in part 2 of this essay), where these regimes can be predicted precisely using a scale parameter called the Reynolds number. Another is regular behaviour and thrashing in shared-information systems.
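For illustration, the Reynolds number is just a dimensionless ratio of inertial to viscous forces. The numbers below (water in a 2 cm pipe) are illustrative values of my own, not figures from the text:

```python
def reynolds_number(density, velocity, length, viscosity):
    """Re = rho * v * L / mu -- a dimensionless ratio of
    inertial to viscous forces in a fluid."""
    return density * velocity * length / viscosity

# Illustrative values: water (rho ~ 1000 kg/m^3, mu ~ 0.001 Pa s)
# flowing at 1 m/s through a 2 cm diameter pipe.
re = reynolds_number(density=1000.0, velocity=1.0, length=0.02, viscosity=0.001)
# re = 20000, far above the ~2300 value usually quoted for the onset
# of turbulence in pipe flow: a turbulent (complex) regime.
```

The same fluid at a few millimetres per second would give Re in the hundreds, i.e. the laminar (simple) regime, which is exactly the sense in which a single dimensionless parameter maps out the regimes of behaviour.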
Here are some of the main candidates for measures of complexity:
| Measure | Description |
|---|---|
| Message length or traced trajectory (information) | The total size of information describing a behaviour or trajectory, e.g. a circular orbit is simple, as it is described by its radius; a random walk is complex, because it requires many points to characterize it fully. |
| Shannon entropy (intrinsic information density) | The smallest description of a behaviour, compressed into bits per unit message (a dimensionless ratio). |
| Kolmogorov complexity | The size of the smallest computer program that could generate a behaviour (compute its trajectory). This is closely related to the entropy. |
| Algorithmic complexity | Various classes like P, NP, PSPACE, LOGSPACE, etc., measuring the time or memory required to execute a behavioural algorithm on a reference computer. |
| Grammatical complexity | The Chomsky grammars, classified by the type of automaton required to parse or generate a grammar. |
| Gell-Mann effective complexity | A summary of measures reflecting irregularity. |
| Self-dissimilarity | The opposite of fractal regularity, e.g. a random walk. |
| Logical depth (a theory by Charles Bennett) | An algorithmic measure used in quantum cryptography. |
| Network average node degree | The average connectivity of a graph or web. |
| Degree of hierarchy (Herbert Simon) | Based on sub-systems, hierarchy and near-decomposability. |
| Hierarchy depth | The number of sub-parts in a hierarchy or branching process. |
| Thermodynamic depth | A list of all the events of an evolutionary process. |
| Statistical complexity (Jim Crutchfield & Karl Young) | The minimum amount of information needed to optimally predict the behaviour of a system (Peter Grassberger speaks of effective measure complexity). |
| Fractal dimension (Mandelbrot) | A measure of the effective degrees of freedom needed to span a trajectory. |
| Hurst exponent | A measure of the self-similarity in a time-series. |
| Lyapunov exponent | The rate of divergence of a dynamical orbit or trajectory, i.e. a measure of loss of control in a deterministic computation. |
Although some of these measures are equivalent, others are completely different. In practice, the Shannon entropy and Kolmogorov complexity are useful as measures of the variability in a signal. They are, in fact, measures of the shape of frequency distributions for behaviours, hence they are universally applicable.
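Kolmogorov complexity is uncomputable in general, but a general-purpose compressor gives a crude upper bound on it, a common practical proxy rather than part of any formal definition above. The sketch shows that an ordered signal admits a far shorter description than a disordered one:

```python
import random
import zlib

def description_size(data: bytes) -> int:
    """Compressed length in bytes: a rough upper bound on the
    Kolmogorov complexity of the data."""
    return len(zlib.compress(data, 9))

ordered = b"ab" * 500                 # a highly regular 'trajectory'
random.seed(0)                        # seeded, so the example is repeatable
disordered = bytes(random.randrange(256) for _ in range(1000))

# The regular signal compresses to a handful of bytes; the
# pseudo-random one hardly compresses at all.
```

Both signals are 1000 bytes long, so the difference in compressed size is purely a difference in internal regularity, which is the point of the measure.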
What does it mean that a system is hard to describe? Relative to what? In science, only dimensionless ratios matter in the characterization of behaviour, because they represent relative comparisons of scale. For instance, the size of my feet is not intrinsically important, but the size of my feet relative to yours could be. Fractals are well popularized examples of something scale invariant, and can be characterized completely by such dimensionless numbers. Hence fractals are simple, even though they make beautiful and complicated patterns. It is when scaling symmetry is broken, and comparative ratios change, that information floods into a system at its boundaries, changing behaviour. Imagine having one foot suddenly twice the size of the other.
Shannon entropy is actually a dimensionless measure of sameness in a message or trajectory. This is related to the order or disorder of the message. It measures the density of information per message length. The more a message contains non-repeating symbols or values, the greater its entropy. If it repeats itself, it can be fully characterized without the additional repetition.
Shannon defined entropy to be a measure of how uniform, or spread-out, the distribution of different symbols or components is, and this was later shown to agree with earlier definitions from Boltzmann thermodynamics, underlining the deep connection between equilibrium and information. One extreme case is minimum entropy, in which there is only one category that everything falls into, e.g. a single direction, a single musical note, a repeated character.

So in physics we say that a system is ordered, or low entropy and hence low complexity, if all countable occurrences of something are tidily composed of only one flavour of state, or perhaps just a few.

The more different cases, flavours or alphabetic categories we find, the higher the entropy. The other extreme is maximum entropy, where everything is divided equally to fill the maximal number of different categories, e.g. a trajectory changing direction without a repeated pattern, a `random walk', or a garbled message. This is what we mean by `random'. A message or trajectory with this statistical behaviour would be considered maximally complex (random), given the alphabet of degrees of freedom.

This is the interpretation of entropy in the sense of passwords. A high entropy password is one that uses the maximum possible range of symbols (evenly, without favouritism) in the available alphabet, making it harder to guess in the sense that an attacker has an equal probability of finding any character for each letter of the password (the number of possible guesses grows exponentially with each additional character).
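These extremes are easy to compute. The sketch below calculates the Shannon entropy, in bits per symbol, of a few messages; the sample strings are my own illustrations:

```python
import math
from collections import Counter

def shannon_entropy(message):
    """Bits per symbol: how evenly the symbol distribution is spread."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

shannon_entropy("aaaaaaaa")   # 0.0 -- one category only: minimum entropy
shannon_entropy("abababab")   # 1.0 -- two symbols, used evenly
shannon_entropy("abcdefgh")   # 3.0 -- eight symbols, used evenly: maximal for this length
```

The same function applied to a password measures exactly the evenness of symbol use described above: `"aaaaaaaa"` scores 0 bits per character, while a string drawing evenly on a large alphabet scores near the maximum log2 of the alphabet size.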
The problem in applying these measures is that they assume a system can be (and has been) decomposed into a constant set of states, with a certain resolution (coarse graining), which might not be a natural decomposition, e.g. in convection cells.
Information theory predicts maximum-entropy solutions to distribution functions, under constraints, to be the natural outcome of systems at equilibrium, or after long times. Some of these distributions are well known.
Wed Feb 18 13:27:45 CET 2015

I am most grateful for the advice of some reviewers on how to present the points in this series. Thanks to James Urquhart, Katri Saarikivi, Jabe Bloom, Esko Kilpi, and John Allspaw for comments. They are naturally freed from any responsibility for what is written.
Food for thought: ask yourself why we build monuments and logos that look like the histogram above, i.e. as simple singular forms with low information density. We always associate maximum significance or meaning (semantics) with low information content, not with high information content. Then we can add the meaning ourselves.
Popular Book References
 Norbert Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine (1948)
 Shannon and Weaver, The Mathematical Theory of Communication (1948), reprinted 1949
 William Ross Ashby, An Introduction to Cybernetics (1956)
 Arun V. Holden, Chaos (1986)
 P.C.W. Davies, The New Physics (1989)
 Thomas Cover and Joy A Thomas, Elements of Information Theory (1991)
 Stuart Kauffman, The Origins of Order (1993)
 Ilya Prigogine, The End of Certainty: Time, Chaos and the new laws of nature (trans. La Fin des Certitudes) (1996)
 Robert Axelrod, The Complexity of Cooperation, Agent-based models of competition and collaboration (1997)
 D. Dasgupta, Artificial Immune Systems and Their Applications (1999)
 Eric Bonabeau, Marco Dorigo, and Guy Teraulaz, Swarm Intelligence (1999)
 Stuart Kauffman, Investigations (2000)
 Steven Johnson, Emergence (2001)
 Albert Laszlo Barabasi, Linked (2002)
 P.C.W. Davies and Henrik Gregersen, Information and the Nature of Reality (2010)
 Mark Burgess, In Search of Certainty (2013)
[++] Melanie Mitchell, Complexity - A Guided Tour (2009) - a nice overview somewhat different to mine
 J. Von Neumann, The General and Logical Theory of Automata. (1951)
 Seth Lloyd, Measures of Complexity - a non-exhaustive list
 Murray Gell-Mann and Seth Lloyd, Effective Complexity
 Stephen Wolfram, Universality and Complexity in Cellular Automata
 Cynefin framework for sense-making of complex knowledge
 Cellular automaton papers by S. Wolfram
 Notes on Melanie Mitchell: defining and measuring complexity
 How Can the Study of Complexity Transform Our Understanding of the World?
The nature of causation in complex systems (George Ellis)
John Allspaw, Translations Between Domains: David Woods
Richard Cook, How Complex Systems Fail (Medicine)
Combining complexity with narrative research.
Risk, and human factors
 J. Tainter, The Collapse of Complex Societies (1988)
 D. Dorner, The Logic of Failure (1989)
 J. Reason, Human Error (1990)
 S. Dekker, Drift into Failure (2011)