Virtualizing virtualization to scale infrastructure to an Internet of Things

The curious case of the missing abstraction

(Shortcut to paper)

The concept of virtualization (simulated abstraction) is as old as computing itself. Lately, virtual machines and process containers have stolen much of the attention, but all this seems like a tiny fraction of the issues we face in comprehending and managing information infrastructure at scale.

The purpose of abstraction is to make generalizations, in which we overlook unimportant differences to see what is of common value. But we are quite selective about where we draw our virtual lines.

Why, for example, would a collection of uniform container resources, like computational hosts (virtual or not), be addressed differently from a collection of database records, or storage units? Why would we need entirely incompatible descriptions for allocating and using them?

Our failure to find a common understanding of these different exemplars leads to all kinds of peculiar transducing (aliasing) technologies (kernels, DNS, ARP, VLANs, cloud-stacks, container allocators, filesystems, etc.) to allocate one resource in the guise of another. Witness the many colours of network protocol used to describe the exact same network. Now, imagine describing a design in a policy language, storing it in a central database, distributing it to different locations and then implementing that image on the canvas of the infrastructure itself. Today, this involves countless technologies and a ridiculous number of incompatible representations.

A network of many Things

Our generation will experience an information infrastructure every bit as complex and pervasive as life itself. It will be both partitioned and networked, mobile and stationary, embedded and de-localized, everywhere and nowhere, machine and human. It will interact with human processes and narratives. We shall count and measure it in nanoseconds, transactions, versions, projects, technology epochs, and perhaps even lives. Space and time are at the heart of the challenge.

Clearly, scaling such an infrastructure is not merely about managing large numbers of boxes. It is not even about `big data'. It is also about speed, mobility and longevity, i.e. the scales of time and space. Now add to that purpose and motivation (tenancy, utility, function, etc).

I believe we need a new set of abstractions to understand the challenges, and so I am starting a new project to explore the possibilities to that end. As always, in my view of the world, I am starting at the bottom.

Combining semantics and dynamics

There are different ways one could approach this, but one approach, which respects the scaling, is to completely de-humanize the human aspects of infrastructure. This keeps ad hoc issues within bounds, and limits the intrinsic non-scalability of semantics (in this work so far, one sees how this arises). This is the same approach taken by resource allocators: concentrate on the boxes, not what is inside them. Dynamics before semantics.

In my popular account of the science of infrastructure, In Search of Certainty, I discussed how we can go about unifying our understanding of the two aspects of information technology: semantics (purpose, meaning, how things are qualitatively interpreted) and dynamics (the speed, size and measurable quantities) -- indeed, how we must do so to fully grasp the problems that lie ahead. For example, take resource sharing:

  • Dynamics: Multi-process sharing.
  • Semantics: Multi-user, multi-purpose sharing (multi-tenancy).

To see these two aspects as related (they are formally independent but not orthogonal, to use a math-phrase), would allow them to be treated on a par. This alone would favour the understanding of a world of smart pervasive infrastructure, where human needs and technology collide all the time.

Modelling (or dehumanizing) the human aspects is a sensible way to approach quantitative engineering of systems in a unified way.

Over the past decade, I have described the mathematical tools to address these various aspects of the challenge in Analytical Network and System Administration and Promise Theory respectively, as well as spending some years understanding the long overdue perspective of Knowledge Management. From these works, tools have emerged that build on the principles (and existing technologies have been better understood in terms of them). But principles are more important than ephemeral implementations. Today, scale and scalability (apologies to Jane Austen) dominate our concerns, sometimes in ways that we don't even realize, as they extend beyond the leaky boundaries of any private infrastructure.

A new angle, based on space and time

So, I want to propose a new angle for exploring abstractions, building on all of the above. It offers a foundation for this scaling, and unifies the different aspects to address a number of basic questions for networking, computation, storage and knowledge.

  • How do we organize resources and processes?
  • How do we find resources and trace movement?
  • How do we interpret resources and understand narratives?

In this new approach (which integrates much well-known theory), our current understanding of virtualization can itself be virtualized away, or alternatively highlighted as a special concern, just as we like. We find no interest in `boxes' as identifiable things anymore; they become resources just like bytes and words. More importantly, we can understand the limitations of location, consistency and time for planning and prediction.

Virtualizing space and time (again)

Space and time are two things we understand quite well, perhaps not at an abstract level, but at least intuitively. I posted some familiar scales in an earlier blog. As I discussed in In Search of Certainty, the physics of materials is a good starting point to understand the spatial relationships in modern infrastructure, with or without current ideas about virtualization.

Here are some examples of things we may call space and time in information infrastructure:

Container    Ad hoc networks    Releases
Host         Mobile devices     Job completions

Descriptions of space and time have been studied extensively over many centuries and there is a wealth of rich abstractions to build on. This is the approach I've used to get started. It has already revealed some flawed thinking in datacentre designs.

Although we are used to thinking of spacetime as an empty and immutable backdrop on which the real stuff of the world takes place, its description is inextricably linked with what goes on inside it; so much so that one can easily analyze solids, liquids and gases using the same formalism and language.

It turns out that, if we consider places and times as discrete and finite in number, then we are actually forced to look at both spacetime and what goes on inside it as the same thing. Using Promise Theory to build up the usable resources and properties at different locations, one ends up with the notion of a semantic space. The following would then be different kinds of material spacetime:

  • Uniform computational fabrics, like clusters
  • Datacentre network fabrics, e.g. Clos, BCube
  • Mobile devices interacting with smart infrastructure
  • An Internet of Things interfacing with a fixed cloud
  • Collaborative human-computer systems (like Netflix micro-services)
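
To make the idea of a semantic space a little more concrete, here is a minimal sketch in Python. It is only an illustration and not the formalism in the notes: it assumes that a space can be represented as a set of agents whose promises play the role of both adjacency and local properties. The names Promise, SemanticSpace, neighbours and locate, and the host and capability labels, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Promise:
    """One agent (promiser) promises something (body) to another (promisee)."""
    promiser: str
    promisee: str
    body: str


@dataclass
class SemanticSpace:
    """A space built only from agents and the promises they make."""
    promises: set = field(default_factory=set)

    def promise(self, promiser, promisee, body):
        self.promises.add(Promise(promiser, promisee, body))

    def neighbours(self, agent):
        """Adjacency is not a backdrop: it exists only where it is promised."""
        return sorted(p.promisee for p in self.promises
                      if p.promiser == agent and p.body == "adjacent")

    def locate(self, body):
        """Find the agents that promise a given property or capability."""
        return sorted({p.promiser for p in self.promises if p.body == body})


# Hypothetical example: three hosts, two adjacency promises, two capabilities.
space = SemanticSpace()
space.promise("host1", "host2", "adjacent")
space.promise("host2", "host3", "adjacent")
space.promise("host1", "any", "storage")
space.promise("host3", "any", "storage")
space.promise("host2", "any", "compute")

print(space.neighbours("host1"))   # ['host2']
print(space.locate("storage"))     # ['host1', 'host3']
```

In this sketch the same structure answers both "what is next to what?" and "what is where?", which is the sense in which the space and the things inside it are treated as one description.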

Once one has a description that clarifies the equivalence of space itself with a map of knowledge, then we can choose to view smart infrastructure or a thinking space as an index of its own capabilities, or we can construct separate indices and coordinates to locate places within it effectively.

Perhaps shockingly, we discover that network IP and MAC addresses do not give us a coordinate system for a network. This leads to a plethora of `routing' structures on top of network protocols to repair that deficiency. The question then becomes: how can we use what we know about spacetime to lay out smart coordinates to describe what is there? This is how we'll be able to locate resources most effectively.
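
A small sketch can illustrate why addresses are not coordinates. The topology and addresses below are invented for the example (two racks joined by a single spine switch); the point is only that the numerical gap between two IP addresses says nothing about how far apart the hosts are in the network.

```python
from collections import deque
from ipaddress import IPv4Address

# Hypothetical topology: two top-of-rack switches joined by one spine.
LINKS = {
    "10.0.0.1": ["tor-a"],
    "10.0.9.7": ["tor-a"],
    "10.0.0.2": ["tor-b"],
    "tor-a": ["10.0.0.1", "10.0.9.7", "spine"],
    "tor-b": ["10.0.0.2", "spine"],
    "spine": ["tor-a", "tor-b"],
}


def hops(src, dst):
    """Breadth-first search: number of links on a shortest path."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in LINKS[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # unreachable


def address_gap(a, b):
    """Difference between two IPv4 addresses read as integers."""
    return abs(int(IPv4Address(a)) - int(IPv4Address(b)))


# Numerically adjacent addresses, but in different racks.
print(address_gap("10.0.0.1", "10.0.0.2"), hops("10.0.0.1", "10.0.0.2"))  # 1 4
# Numerically distant addresses, but in the same rack.
print(address_gap("10.0.0.1", "10.0.9.7"), hops("10.0.0.1", "10.0.9.7"))  # 2310 2
```

In a true coordinate system, nearby coordinates would imply nearby places. Here the ordering is reversed, which is why extra routing structures are needed to recover locality.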

A model of general semantic spaces

To begin this effort on the right footing, I am appealing to a mathematically fluent audience. I apologize for this: it's the way I believe one must think about problems to develop robust concepts. The notes linked to below review some existing mathematics, and lay out the fundamentals of a new description incorporating Promise Theory (without too many examples) on which I later intend to build more technologically motivated examples. I have tried to ask fundamental questions to get at deep issues. Not all of them will end up being useful, perhaps.

Given its depth and technical difficulty (even in this semi-rigorous form), I want to just open up for discussion early before trying to explain it in detail (this could easily be a PhD project, perhaps two). The notes are here:

In the coming months and years, I hope this will unravel for a wider audience, and I hope others will help by building on these ideas, or replacing them with something else. The simple essence of the idea is to address the three practical questions mentioned above, at any scale.

MB Oslo Thu Nov 20 15:42:14 CET 2014