The Brain Horizon

At what scale should an individual become a society?

As technologies evolve, the pendulum often swings between the paradigms of centralized and distributed control, in all areas of human life, from government to work. Rhetoric polarizes supporters of each approach into opposite camps. Surely centralization doesn't scale? Surely centralization is the only way? But what does this really mean? I like to think about this in terms of brains and societies.

[Talk based on this essay]

Brains and Societies

A brain model is one in which there is a centralized controller that reaches out to a number of sensors and actuators, through a directly connecting network. The intent is to control the network, by thinking faster than the typical processes in the network itself. If it cannot work significantly faster than the network it controls, the brain is just along for the ride, an irrelevant passenger (like an appendix). Thus a brain model is about separation of timescales. Patently, this is how brains work in animals: the brain is both localized (inside the head), and acts as a controller for the nervous system. The nervous system connects it to the rest of the body, for sending and receiving pulses of communication. Not all of the body's systems are managed in this way, of course, but the intentional aspects of our behaviour are.

A brain model is a push-based signalling model. Signals are transmitted from a common command-and-control centre out to the provinces of an organism, and messages are returned, in the same way, to allow feedback. We are so familiar with this approach to controlling systems that we nearly always use it as the model of control. We are used to controlling the world in this way, and the technologies we make are often designed like extensions of our bodies, to be moved, like a limb, with the push of a lever or the pull of a string. We try to extend this sense of control to govern societies too, from the top down, using police or military power as a nervous system. We form feudal and company hierarchies, with a brain at the top and actuators at the bottom. This controller/dictator model has served most societies around the world for millennia (the Western rhetoric around freedom and democracy is much overstated). The advantage of this model is that we can all understand it. It feels like an extension of our bodies. But there is one problem: it doesn't scale favourably to `large size'.

The alternative to a brain model is what I'll call a society model (assuming a society that is not a central dictatorship, police state, or military junta). I imagine that it has a number of autonomous parts (perhaps called institutions or departments, communities or towns, depending on whether the clustering is logical or geographical), which are loosely coupled, without a single central controller that covers all aspects. There is specialization into different controller `organs', and there is redundancy. There can be cooperation or independence between the parts. There are places where certain information gets centralized (libraries, governing watchdogs, etc.), but the parts are capable of surviving independently and even reconfiguring without a central authority.

We say (loosely) that societies scale better than brain models, because they have no obvious single processing bottleneck or point of failure, and can form local cells that interact only weakly at the edges, trading or exchanging information, but not dragging one another down in a crisis. If one connection fails, a cell does not necessarily become cut off from the rest, and it has sufficient autonomy to reconfigure and adapt. There is no need for a path from every part of the system to a single Rome-like location, which then has to cope with the load of processing and decision-making by sheer brute force.

Brute force can be a successful strategy for operating an organism, but when it fails, it fails catastrophically (think of a collapsing high-rise, compared to a house). When a society fails, it is a slow process of decay.

What does it mean for a system to scale?

Engineers like to bluntly declare `that won't scale', using gut feelings based on experience. What they generally mean is that there are likely to be unspecified problems as they attempt to increase the size of a system, by increasing its workload.

In physics, there is a specific (essentially `economic') meaning to scaling: physics is the economics of energy, or economics is the physics of money, whichever suits you best. It is about whether a system continues to perform in measurably the same way as it changes size. The relationship between these accounting units (of energy or money) and scale stems from the behaviour of spatial topologies or networks. Computer systems, for instance, only vary over a few orders of magnitude, but if we look at natural science, we can measure properties from just a few things to billions of billions of things, and it turns out that there are often predictable scaling patterns.

In biology, a well-known result is the `75%' scaling rule, or economy of scale. Since Kleiber (1932), it has been known that the metabolic rate B (the energy used per unit time) of most animals and plants follows a scaling `law':

B ~ M^0.75

where M is the body mass. This is called sub-linear scaling (an exponent less than 1). To put it simply, if you double the size of an organism, its energy need grows by only a factor of 2^0.75 ≈ 1.7, not 2, but you slow down and get less out of the energy. Think of a fly versus an elephant. Most other rates scale with quarter powers: heart rates and reproductive rates fall as M^-0.25, while lifetimes grow as M^+0.25 [1]. In other words, the bigger you are, the slower you are, but the longer your life span, and the more total energy you use, though still less than your proportional investment. Size has a diminishing return on investment, and the energy savings are not enough to justify the lower return. Thus there is a maximum size to organisms. These patterns have their explanations in the properties of networks.
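
To make the quarter powers concrete, here is a minimal sketch in Python (the doubling factor is the only input; the exponents are the ones quoted above):

    # Sketch of Kleiber-style quarter-power scaling.
    # B ~ M^0.75 (metabolic rate), rates ~ M^-0.25, lifetimes ~ M^+0.25.

    def scaling_ratio(mass_ratio, exponent):
        """How a quantity changes when body mass is multiplied by mass_ratio."""
        return mass_ratio ** exponent

    r = 2.0  # double the body mass
    print(f"metabolic rate: x{scaling_ratio(r, 0.75):.2f}")   # ~1.68, less than double
    print(f"heart rate:     x{scaling_ratio(r, -0.25):.2f}")  # ~0.84, slower
    print(f"lifetime:       x{scaling_ratio(r, 0.25):.2f}")   # ~1.19, longer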

How does this help us to explain information systems? Let's consider another example, closer in character to information systems: the scaling behaviour of cities. This brings up a new issue. Whereas biology is dominated by the dynamics of networks, cities have an additional aspect to consider: a characteristic of human intent, which I call semantics. When semantics play an important role, the behaviour can be qualitatively different.

Studies at the Santa Fe Institute, by L. Bettencourt, G. West, and many others, apply the physics notion of scaling to cities [2]. They have shown that there are two kinds of networks in competition in a city: an infrastructure network (as in biology, with economies of scale), and a social or semantic network with a surprising positive return on investment.

So, as a city grows, you need fewer petrol stations, less road length, and fewer electrical lines per capita. These are economies of scale (at about 85%), similar but not identical to biology (human-machine infrastructure is qualitatively different). However, the meaningful (semantic) outputs, in wages, patents, disease, GDP, waste, policing, and crime, all exhibit super-linear scaling (at about 115%), because cities bring people into closer contact, enabling functional or semantic payoffs that would not otherwise be possible. In biology, semantics are locked into DNA, which is wrapped in the same cells, so creativity only happens through sexual reproduction. In a city, creativity is happening everywhere, but it can be focused at central hubs (see Smart cities and clouds).
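
To see what these exponents imply per head, here is a small sketch (the 0.85 and 1.15 exponents are the approximate values quoted above; the absolute units are arbitrary):

    # Sketch: sub-linear infrastructure vs super-linear socioeconomic scaling.
    # A quantity Y ~ N^beta has per-capita share Y/N = N^(beta - 1).

    def per_capita(N, beta):
        return N ** (beta - 1)

    for N in (10_000, 100_000, 1_000_000):
        print(f"pop {N:>9,}: infrastructure x{per_capita(N, 0.85):.3f}, "
              f"socioeconomic x{per_capita(N, 1.15):.3f} (relative units)")

    # Bigger cities need less infrastructure per head, but produce
    # (and suffer) more per head.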

Competition between social semantic networks (superlinear) versus resource networks (sub-linear) drives the growth of cities. In biology, dynamics limit semantics, but in cities semantics optimize dynamics by changing the nature of the interactions! The semantics and the dynamics scale differently.

What does this mean for information infrastructure and other mixtures of dynamic and semantic behaviours? Companies and computer operating systems are like very small cities, in a sense, but they show different behaviours than cities. Again, semantics dominate, but without the dynamic robustness of size. While cities rarely die, companies and organisms die all the time, suggesting that robustness lies more in size than in density.

In Computer Science, one thinks about scaling as the linear throughput response of a load-handling queue (see Gunther's Universal Scalability Law, or Amdahl's Law). However, other scaling ideas relating to semantics, analogous to those in physics and biology, have been proposed in promise theory [3].
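
For reference, Gunther's law has a closed form: relative capacity C(N) = N / (1 + s(N-1) + kN(N-1)), where s measures contention and k measures coherency (crosstalk) costs. A minimal sketch, with invented parameter values:

    # Gunther's Universal Scalability Law: C(N) = N / (1 + s*(N-1) + k*N*(N-1)).
    # s = contention (serialization), k = coherency (keeping everyone consistent).

    def usl(N, s, k):
        return N / (1 + s * (N - 1) + k * N * (N - 1))

    s, k = 0.05, 0.001  # invented for illustration
    for N in (1, 8, 32, 128, 512):
        print(f"N={N:>4}: relative throughput {usl(N, s, k):.2f}")

    # Throughput peaks near N = sqrt((1 - s)/k) (~31 here), then declines:
    # the coherency term is the cost of a 'brain' keeping global state consistent.
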
Does centralization play a role in scaling?

A city packs effort and communication into a small space, bringing human reason together in a creative way, like a brain. The city can spread its innovative gains to outlying areas, elevating the region around it, like an organism. Is it the centralization itself that brings about this benefit, or merely close proximity? There are two things at work here:

  • Density (proximity) within a city or brain enables interior networks to quickly share information and reason, with positive benefits.
  • Exterior transport networks connect this central `intelligence' directly to where it might be applied.

One interesting comparison, at the level of societies, can be seen in recent history. The Chinese economic reform of the past thirty years offers a rare and fascinating insight into (de)centralization at a social level. Europe's rapid industrialization over recent centuries, which quickly surpassed and came to dominate the much older and more technologically advanced Chinese civilization, has been attributed to the greater centralization of European towns and cities, where the viral spread of innovation arising from urban living led to rapid advancement. Much of this happened through competition in warfare (`social networks' of a sort), in contrast with China's widely distributed, peaceful, self-sufficient, and agrarian society.

In China, logical central regulation of industry, by the communist party, contrasted with the decentralization of its populace. Central edict was slowly relaxed by Deng Xiaoping. It has been claimed that the decentralized way of life established under Mao enabled China to weather several local crises during the attempt to build industry under the central control of the government. The recent growth of China is a complex interplay of centralized regulation and decentralized robustness. Later, more direct control was ceded by the government to allow the Chinese economy to grow unfettered [4]. The scaling laws mentioned above are consistent with this explanation, though the story is obviously more complicated than my simplistic rendition.

What this suggests is that centralization of intent is not necessarily related to centralization of resources, and the natural world around us mixes these extremes in a pragmatic way to optimize costs.

The argument for centralization - intelligence as a service

How should we use this information to scale and understand human-technology systems?

What a brain gains from bringing information together is the ability to make quick comparisons. All the information is at hand (proximity). Once you are able to compare quickly, you can also make decisions. This is what we think of as intelligence, and it is compelling. The question is, can we transmit the results of those decisions to the entire system fast enough to make a difference? A promise that is kept too late is not kept at all. If all the information is close by (you don't need to collate information across a wide area, or what you need to know can be cached locally), then decentralization is a superior strategy that avoids the diminishing returns of the transport scaling cost.

What this boils down to is that intentional control can be viewed as a fault model, in which adjustments are like repairs towards a desired state. The Mean Time To Repair (MTTR) is made longer by the need to transport state over distance, so transport scaling plays a role in throttling the size of a system.
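
As a toy model of that throttling (my own sketch; the times are invented), a centralized repair loop pays the transport cost twice, once for state in and once for decisions out:

    # Toy MTTR comparison: centralized vs local repair loops (invented times, seconds).

    def mttr_centralized(detect, transport_one_way, decide, apply):
        # detect locally, ship state to the brain, decide, ship the decision back, apply
        return detect + transport_one_way + decide + transport_one_way + apply

    def mttr_local(detect, decide, apply):
        # detect and repair in place, using locally cached policy
        return detect + decide + apply

    print(mttr_centralized(0.1, 0.5, 0.01, 0.1))  # 1.21 s: transport dominates
    print(mttr_local(0.1, 0.05, 0.1))             # 0.25 s: slower decision, faster repair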

Different networks can use different strategies on a `need to know' basis. DNA activation has all its information locally, so it is distributed and autonomous. It decouples from the brain. The immune system is mobile and can find its information and act on it locally, so it is distributed and autonomous. It is largely decoupled from the brain.

Many of the technologies we build are remote-control dashboards for a human homunculus to interact with, or semi-automatic controllers that do some remote regulation from a central place. The problem with transporting data is that it has a high cost (think of the diminishing returns on size in biology and cities). You don't want to transport very much data, else it would take too long to react and make decisions. This makes the transport channel (rather than the brain itself) a bottleneck that imposes a maximum rate of response.

The other problem with transporting all the data is that you bring it to a finite decision-making resource: the brain itself has a finite speed, and can only read at a certain rate. Add to that the time it takes for signals to be sent, and you can calculate the maximum size and reaction time of the organism the brain can handle. A society can embrace a "big data" problem in a parallelized, distributed sense, and process it as distributed state without any bottleneck of communication or processing.
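
A back-of-the-envelope sketch of that calculation (all numbers invented):

    # Capacity horizon of a central brain (invented numbers).
    C = 100_000    # events/second the brain can process
    r = 10         # events/second emitted by each sensor
    print(f"maximum sensors a single brain can serve: {C / r:,.0f}")

    # A society of k autonomous cells scales the horizon linearly,
    # with no shared bottleneck:
    k = 50
    print(f"society of {k} cells: {k * C / r:,.0f} sensors")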

Centralization costs the brute force needed to receive all the information (dynamics), and the requisite diversity or knowledge base to speak the language of all your sensors and actuators (semantics), i.e. having a central government that can understand all the nuances of the districts without succumbing to generalization. If your brain needs a `device driver' for every replacement sensor cell, you must either prevent the independent evolution of the distributed sensors and actuators, or keep updating your brain software to match their evolution; either way, you've amplified the congestion of the bottleneck. The greater the variety of semantics you have to support in your brain (the more kinds of promises), the more of a burden it becomes to support environmental diversity.

So there are two scaling problems in centralization:
  • The central capacity to collate and process decision-making with diverse semantics.
    The study of cities suggests that this favours centralized proximity.
  • The mean time to communicate results to where they are needed.
    The study of biology and cities suggests that this favours decentralization.

Is a brain the only way to build reactivity, or is there also reactive intelligence in a distributed society of autonomous agents? Reptiles and invertebrates grant their parts a greater degree of autonomy. If you cut a worm in half, it continues as two worms. Perhaps the balance between centralization and decentralization has more dimensions than we think, but it always boils down to two key forces: dynamics and semantics.

Finite speed of communication - and CAP conjecture

All signals are transmitted with a finite speed. That has several implications.

  • We can only respond as fast as information travels.
  • There is an event horizon beyond which we cannot see at all, because signals do not have time to reach us.

This basically summarizes the CAP conjecture, and the limits of information availability and its consistency. Our ability to form and maintain relationships (knowledge) with remote parts depends on them being close at hand (proximity). Long distance relationships don't work as well as short distance ones!

If messages take longer to send and receive, then an organism must react more slowly, so scaling up size means scaling down speed, and vice versa. Large animals like whales and elephants are slower than smaller creatures like insects. The speed of impulses in our bodies is some six orders of magnitude slower than the speed of light, so we could build a very big whale (or information infrastructure) using photonic signalling.

The centralized approach used in many technologies today is a modern luxury, afforded by high-bandwidth, reliable networking, where the overhead and reliability of signalling are favourable to this strategy. Yet simple physics tells us that there has to be a limit to the size of such an animal, and the speed of light cannot be improved upon, so physical reach will be limited. Fortunately, our planet is only about 0.04 vacuum light seconds in diameter, so a signal's journey time is a fraction of a second on a good day. Centralized solutions can be fast enough on a planetary scale, but they might not be safe.
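
The numbers are easy to check with a sketch (the fibre speed and route length are rough approximations):

    # Rough planetary signalling budget (approximate constants).
    c = 299_792                 # speed of light in vacuum, km/s
    diameter = 12_742           # Earth's diameter, km
    half_circumference = 20_037 # farthest surface-to-surface route, km
    fibre = 2 / 3               # light in optical fibre travels at roughly 2/3 c

    print(f"diameter in light seconds: {diameter / c:.3f}")           # ~0.043
    one_way = half_circumference / (c * fibre)
    print(f"antipodal one-way (fibre): {one_way * 1000:.0f} ms")      # ~100 ms
    print(f"antipodal round trip:      {2 * one_way * 1000:.0f} ms")  # ~200 ms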

Paralysis by hardware and software - single points of failure

A fragility of a brain model is that signals can be cut off. If you sever an animal's spinal cord, it is pretty much paralyzed as a sensor network. Its separate respiratory system keeps it alive in a process sense. Even without a physical break, if the `device driver' software for talking to the external world becomes outdated because the brain can't keep up, there will be a virtual disconnection. One can of course rely on redundancy to make systems more resilient, but then you are basically starting down the path to a society model. The pressure to commit to the full transition will come from economic imperatives.

Engineers are often taught to think in terms of controllers, encouraged towards centralization by habit. This is a natural problem-solving reaction for an engineer, thinking: how can I get involved in this process? Give me sensors and state that I can calculate with and manipulate. Engineers want to insert their own logic, if not their bare hands, into processes and be part of the system.

Architects and town planners, on the other hand, have to think differently. They want to design systems that will stand on their own merits, with all of the cooperative parts necessary, and all the continuous relationships internal to the structure. Brute force allows us to avoid confronting the issue of robustness, by trading effort for sound design. The tradeoff means that, when the system fails, it is likely to do so in a catastrophic way.

Trade-offs in centralization - some promise theory

At what point do we need reactive intelligence (as a service)? Let's examine some of the promises that agents can keep, and dissect these models according to promise theory principles. Promise theory does not say anything about centralization per se, but we can find some simple conclusions by following the principles.

Two basic tenets of promise theory are:

  • Every agent promises only its own behaviour.
  • Independently changing parts of the system exhibit separate agency and should be modelled as separate agents.

A brain `service', whether centralized or embedded in a society, promises to handle impositions thrust upon it by sensors, and to respond to actuators with its own impositions. It also promises to process and make decisions, thus any brain will benefit from speed and processing power. Any agent can handle up to a certain maximum capacity C impositions per second, before becoming over-subscribed. Thus every agent is a bottleneck to handling impositions. Horizontal expansion (like a society) by parallelization handles this in a scale-free manner. Vertical expansion (like a central brain) has to keep throwing brute force, capacity and speed at the problem. Moore's law notwithstanding, this probably has limits.

Borrowing a simple truth from queueing theory, it is now easy to see how to define a horizon for a centralized brain model, in a scale-free way: it is the queueing instability, when the rate of incoming impositions reaches the average rate of processing them. A society will scale more or less linearly until the cost and limits of communication flatten its growth.
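
A minimal sketch of that instability, using the textbook M/M/1 queue as a stand-in for the brain (the mean response time R = 1/(mu - lambda) diverges as arrivals approach capacity):

    # Queueing horizon: an M/M/1 server's mean response time R = 1/(mu - lambda)
    # diverges as the arrival rate lambda approaches the service rate mu.

    def response_time(lam, mu):
        if lam >= mu:
            raise ValueError("unstable: the queue grows without bound")
        return 1.0 / (mu - lam)

    mu = 100.0  # impositions/second the brain can process (invented)
    for lam in (50.0, 90.0, 99.0, 99.9):
        print(f"load {lam / mu:6.1%}: mean response {response_time(lam, mu) * 1000:8.1f} ms")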

Specialization implies regions of different agency, with all kinds of implications, such as learning, etc. Here promise theory simply tells us that these should be understood as separate agents, each of which promises to know how to do its job. Those regions could be decentralized, as in societal institutions, or they could be specialized regions of a central brain (as in Washington DC or the European Parliament). A centralized brain makes it easier (and faster) for these institutional regions to share data, assuming they promise to do so, but it doesn't place any limits on what is possible.

Only observation and correlation (calibration) require aggregation of information in a single `central' location. Control and dissemination can be handled in layers of detail:

  • Micro-management and central authority - fragile, scales by brute force
  • Policy guidance and local autonomy - robust, scales through caching (see the sketch after this list)
  • Complete autonomy with no sharing - lacks intelligence of shared state
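
A hypothetical sketch of the middle layer, `policy guidance with local autonomy' (all names here are invented): an agent refreshes cached policy from the centre occasionally, but decides and acts on local state alone:

    # Hypothetical sketch: cached policy, local decisions (all names invented).
    import time

    class Agent:
        def __init__(self, fetch_policy, refresh_seconds=300.0):
            self.fetch_policy = fetch_policy      # rare, centralized communication
            self.refresh_seconds = refresh_seconds
            self.policy = fetch_policy()
            self.fetched_at = time.time()

        def act(self, local_state):
            # Refresh the cached policy only occasionally; never block on the centre.
            if time.time() - self.fetched_at > self.refresh_seconds:
                try:
                    self.policy = self.fetch_policy()
                    self.fetched_at = time.time()
                except Exception:
                    pass  # centre unreachable: keep acting on the cached policy
            # The decision itself is made locally, at local timescales.
            return "repair" if local_state["error_rate"] > self.policy["threshold"] else "ok"

    agent = Agent(fetch_policy=lambda: {"threshold": 0.05})
    print(agent.act({"error_rate": 0.10}))  # 'repair', decided without asking the centre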

What we must understand about systems is that they are governed by timescales at each individual point. Each independent agent measures time according to its local environmental processes. What seems fast in one context is slow in another. In each case individually, the optimization we seek is: do I have time to wait for a signal, and do I have access to the intelligence to respond well? The impact of these decisions on semantics and dynamics is what is at stake.

We can summarize:

  Brain (centralized)                         | Society (decentralized)
  --------------------------------------------|------------------------------------------
  Easy to understand                          | Harder to understand
  Easy to trust                               | Harder to trust
  Direct causation by serialized signalling   | Spontaneous, emergent parallel causation
  Global information ("God's eye view")       | Local information ("local observer view")
  Sense of hands-on control                   | Shepherded control
  Chance to determine a global optimum        | Local optimum
  Push thinking (imposition)                  | Cooperative thinking (promise)
  Long signal time overhead                   | Short signal time overhead
  Slow reactivity over wide area              | Fast reactivity over small area
  Quick adaptation to global state            | Quick adaptation to local state
  Catastrophic failure modes                  | Attrition failure modes
  Fragile architecture (strong coupling)      | Robust architecture (weak coupling)
  Congestion bottleneck                       | No congestion bottleneck
  Calculable uncertainty                      | Calculable uncertainty
  Scale dependent (fixed by brain capacity    | Scale independent
  and signalling speed)                       |
  Depends                                     | Depends
  Possibility to exploit comparisons          | Exploit parallelism in situ
  (assumes availability and low latency)      |
  Hidden treasure, X marks the spot           | Here be dragons

Centralization does not bring determinism

The arguments presented for centralization are often based on a mistaken gut feeling that it brings determinism to system control, because it feels `hands on'. If centralized systems are more deterministic than decentralized ones, it is because they focus intent in a simplistic way, and are thus easier to understand, not because they are more certain of outcome. The problem we face in engineering is that we must deal with indeterminism, whether we like it or not. Often that benefits from distribution in time and space. The same arguments may be applied to any kind of system, and even to the process of creative problem solving. Eventually, as scale increases, you will have to confront the diminishing returns of centralization and learn to go `society', either by subdividing into shards or into specialized services. If you go the shard route, you are creating multiple organisms. If you go the route of a service ecosystem, the scaled society becomes just a brain, at a larger and slower scale.

Companies like Google, who wield massive computational resources, are re-centralizing and going the brute-force route with their SDN strategies. They have the speed, expertise, and capacity to make this work for longer than most others, but they are still vulnerable to catastrophic failure modes. To avoid those failures, information scope has to be limited by the formation of specialist cells. Today we see this in shards and service-oriented architectures.

How to choose?

Organisms with centralized brains and organs evolved for a reason. Brains are quick-thinking, with a position of analytical speed and superiority over the domains they influence. Centralized specialization can make organisms smarter and more adaptable, because the controller is small and agile. If such organisms or systems are quick on their feet, they can avoid danger by modelling and adapting.

A society, on the other hand, adapts slowly as a whole towards an equilibrium, but has resilience and can quickly fight off a localized problem, without needing the permission of a brain. Societies scale by embedding policy (like DNA in cells, or norms and rules) so that all of the parts know how to react independently.

Engineers are sometimes irrationally afraid of the kind of emergent behaviours that rule societies, because one does not have the tactical superiority of a speed advantage; but the very systems that support us every day all exhibit emergent behaviour: agriculture, the weather, the atmosphere, the Internet, society itself, and indeed our brains themselves. Predictability is all about timescales. We organisms will die one day, but society can live on. What does that say about the way we should design information infrastructure?

Tue Jul 22 08:59:54 CEST 2014, Revised Tue Dec 23 12:15:57 CET 2015

[Original text]

References

[1] Banavar JR, Moses ME, Brown JH, et al. A general basis for quarter-power scaling in animals. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(36):15816-15820. doi:10.1073/pnas.1009974107.
[2] L. Bettencourt, The Origins of Scaling in Cities (2012)
[3] M. Burgess, Spacetimes with Semantics (II), Scaling of agency, semantics, and tenancy, arXiv:1505.01716 [cs.MA] (2015)
[4] W. Hutton, The Writing on the Wall: China and the West in the 21st Century (2007)

How CFEngine exploits a decentralized "society" model - see my Velocity presentation.