Notes from the USENIX/LISA Knowledge Management Workshop

I recently chaired a Knowledge Management workshop at LISA 2010. I have said on several occasions that Knowledge Management is one of the central challenges of the next decade. Below is a brief summary of the workshop.

Knowledge - the new challenge for system administration

Knowledge Management is probably the single greatest challenge for system administrators today, but one of the least represented in terms of resources and tools. Organizations waste time and money every year reinventing wheels because there is no effective knowledge transfer between IT workers.

Knowledge management covers a variety of issues, including understanding system specifications, relationships between system dependencies, version control of system changes, strategies for distilling information from logs and monitoring feeds, and more.

The LISA 2010 Knowledge Management Workshop had 18 attendees and discussed the scope and techniques for knowledge management in system administration.

Why Knowledge Management?

The division of labour between humans and automation systems is making system administration increasingly about knowledge management. As machines take over the grunt-work, humans are left with a more strategic job. Knowledge management applies to:

  • Knowing the job you have to do. (Training, redundancy of personnel, etc.)
  • Knowing the state of the system you work in. (Having good intel and reconnaissance.)

There are lots of reasons for wanting to know systems well, as well as knowing the job better. Feature creep and growth occur in systems because we don't know them well enough. It is common to reinvent wheels from ignorance. Often that ignorance is self-inflicted.

It was agreed that the information deluge is growing every year. We need increasing self-discipline to keep up with it, or else we must try to simplify matters and get rid of some of it.

An important reason to get control of knowledge is the quick turnover of staff. This creates a significant overhead for an organization, both as a brain drain and in retraining. It is a risk to the continuity of an organization. Organizations and systems can also stagnate as a result of power plays by certain problem-individuals who claim `ownership' of certain areas. What happens when these people go on holiday?

Someone asked: why do sysadmins equate management with coding? This is reminiscent of the DevOps movement, but is that really management, or a way to keep sysadmins `precious' in their jobs by locking companies into their expertise?

What is knowledge anyway?

What do we mean by knowledge? We need to know this if we're going to manage it. This is an interesting question in and of itself. Many of us stockpile gigabytes of data, but is it any use? When does data become knowledge?

Most people would agree that the mere possession of raw data does not constitute knowledge. So accumulating bytes does not make us wiser. It is probably more than information that is written down. Knowledge seems to come in different categories and subject areas. We often call these categories `metadata', but this is really just a convention. When we make taxonomies and hierarchies to map out knowledge, we are just creating one out of many possible viewpoints -- like one special spanning tree of the network of relationships between the topics.
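The spanning-tree picture can be made concrete with a small sketch. The topic names and their relationships below are hypothetical, chosen only for illustration. The point is that the same network of related topics yields a different taxonomy depending on which topic we pick as the root.

```python
from collections import deque

# Hypothetical topic network: each topic lists the topics it relates to.
TOPICS = {
    "backup": ["storage", "security"],
    "storage": ["backup", "monitoring"],
    "security": ["backup", "monitoring"],
    "monitoring": ["storage", "security"],
}

def taxonomy_from(root, graph):
    """Return a child -> parent map: one spanning tree, found by breadth-first search."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        topic = queue.popleft()
        for neighbour in graph[topic]:
            if neighbour not in parent:
                parent[neighbour] = topic
                queue.append(neighbour)
    return parent

# Two viewpoints on the same network of relationships.
by_backup = taxonomy_from("backup", TOPICS)
by_monitoring = taxonomy_from("monitoring", TOPICS)
```

Both trees cover every topic, but they file the topics under different parents. Neither is the `right' hierarchy; each is one viewpoint among many.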

What about information that we don't agree with? Is knowledge merely information that we either might, could or do believe in? In other words must knowledge be validated to distinguish it from junk? How do we distinguish signal from noise? Think of monitoring systems. Many sysadmins stare at line graphs that wiggle up and down -- do they learn anything real from this, or does it just consume their time?

Another definition of knowledge might be information we believe we can use. That makes it valuable, and introduces the idea that we can make a value judgement about certain information.

Understanding is an important issue, but do we really have a good understanding of what it means to understand something? One answer to this is that understanding means that we have a model for it. Generally people feel they understand something when they feel they know how to categorize the information, or know how it fits into a story. Understanding means that we can put information into a box and that we trust that choice.

Trust raises an interesting question. Trust is something we build through familiarity. Do we perceive knowledge merely as information we have a relationship with? Ultimately we are trying to cope with our uncertainties about the world. Mark once described science as `uncertainty management'. It's about being able to quantify our confidence in knowledge.

What are the goals of KM?

So, if we are to get something out of managing knowledge, what is it? There are a number of steps in any form of communication:

  • Acquire knowledge.
  • Retain knowledge.
  • Spread knowledge.

Then we should move from facts to understanding. That means we frame what we've learned in a model that `explains' things. Part of that is about recognizing patterns and replacing `instances' with `categories' or sets that match `patterns', e.g. we replace `Sarah', `Jane' and `Gina' with `women' in one context, and with `actresses' in another.
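As a rough sketch of replacing instances with categories, the snippet below finds every category that covers a given set of instances. The category table is hypothetical: `Ada' and `engineers' are added purely for contrast and are not from the workshop. Several categories can match the same instances, and it is the context that decides which one we use.

```python
# Hypothetical category memberships, for illustration only.
CATEGORIES = {
    "women": {"Sarah", "Jane", "Gina", "Ada"},
    "actresses": {"Sarah", "Jane", "Gina"},
    "engineers": {"Ada"},
}

def covering_categories(instances, categories=CATEGORIES):
    """Return the names of all categories that contain every given instance."""
    wanted = set(instances)
    return sorted(name for name, members in categories.items() if wanted <= members)

# The same three instances fit more than one pattern.
candidates = covering_categories(["Sarah", "Jane", "Gina"])
```

Here `candidates` holds both `actresses' and `women'. The data alone cannot choose between them; that choice comes from the context in which we are speaking.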

Norms shape our identification of patterns and our choices of words. Every piece of knowledge has a cultural context that we have to be sensitive to.

Approaches to KM

What can we do to cope with knowledge? Clearly we can write stuff down. But this is no good unless people read it. Often we have to encourage this -- we only start reading when it is already too late. It's necessary to foster a culture of continuous, regular interaction with knowledge, else it becomes just an archive.

A challenge is that most people don't write well in the first place, so getting people to write things down doesn't always help. Wiki projects tend to die off. We need pedagogical or didactic skills to engage people! Next there is editing. We put stuff in and take stuff out of a knowledge bank. Both processes involve trust.

Often a division forms between writers and readers (consumers). There is a trust relationship here too: do you trust the author? (Think OPEN SOURCE!) Because knowledge is context sensitive, it is sometimes irrelevant or even outdated. When should we forget information? When is misplaced certainty about knowledge actually harmful?

The way we compress lists of instances into general categories is to look for patterns. That means we need to be on the lookout for general rules and patterns.

What is a model?

Model-based configuration, model-based management -- the term model is coming back. Theory is sometimes unpopular, but there is `nothing more practical than a good theory' (we have no certain knowledge of who said this first -- Lewin, Maxwell, or Einstein).

A model is a compressed form of information that allows us to see simple patterns in a sea of instances. We model (as per Francis Bacon) by classification of types -- those 20 boxes with sticks underneath are a pattern, let us call them horses! Start by choosing a level of detail, or approximation. We can't model everything, so there is always a trade off. Ultimately we have to live with approximation and uncertainty.

Challenges - lost in the web

Getting information in and out of a system can involve barriers and costs that discourage maintenance and use of knowledge. How easy is it to insert relevant information? How easy is it to get into the knowledge index or repository (ingress)? How easy is it to extract an answer? How quickly do you get out of the knowledge index or repository (egress)?

Search technologies provide some help to scalability of knowledge models, but they are quite poor. They don't handle semantic relationships well. For that we have Topic Maps and RDF technologies, but these are also in their infancy.

One trick that teachers use is to develop a `mythology' for a subject, i.e. a set of plausible or even true stories about a subject. We are story-tellers before writers.

Robin Dunbar pointed out that there are limits to human cognition. We can only have a close relationship to about 5 things. We can have a working relationship with about 30 things or people, and we can only be acquainted with about 150. The `Dunbar numbers' are cognitive limits that we have to work around.

One way to build stories is to form goals and intentions. These are navigational aids for knowledge. They provide direction, or storyline. Taxonomies and hierarchies also imply a direction of `breaking down knowledge' from a root concept -- but as we know from Object Orientation, you can make fatal modelling mistakes with trees. If two related things end up in completely unrelated categories through modelling mistakes, you have to unpick the whole thing and start over again.

Will Knowledge Management make experts redundant?

Knowledge Management requires experts in order for it to work. It needs those people who can see patterns and tell stories. In his book `The Third Wave', Alvin Toffler argues that in the agricultural and industrial ages humans were replaceable, but in the information age humans are more valuable again, because they are doing something more uniquely human.

Miscellaneous points

  • We have fallen in love with the Wiki, but these seldom work because no one is interacting with them often enough to keep the knowledge fresh and accurate.
  • Config data does not belong in a Wiki; it needs a CMDB. (Why? Because this is a special model?)
  • How do we do Quality Assurance on knowledge? Where do you draw the line between good enough and not good enough?
  • A simple idea: expiry of documents in a Wiki or document system. Items should expire unless tagged as ok. Authors should get notification that an article is old. Flag the person who needs to fix it. This is about trust.
  • Dealing with `ownership' of materials. Successful knowledge requires the participation of the reader, not just the author, as both are stakeholders in the relationship. It is not really about ownership, but responsibility. (Think of the Open Source model, or science.)
  • Intellectual property becomes relevant when you publish information. Who is allowed to update the info? Who has the right? Can you make a process `chain of evidence' to preserve original intention, and maybe ownership?
  • Is there a difference between discovered information and creatively-originated info?
  • Peer review has always been seen as a strength for science, but we increasingly see it degenerate into cliques and corruption. How can this be avoided? There need to be ethical principles.
  • What are the economics of knowledge? How much are we willing to pay to develop and retain a knowledge culture?
  • Cultural changes around us mean that old knowledge no longer serves the purpose it used to. We need to update, like having new editions of books, encyclopaedias etc.
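
The document-expiry idea from the list above is simple enough to sketch. This is only a minimal illustration under stated assumptions: the `Article` record, the one-year review interval and the sample articles are all hypothetical, and a real Wiki would need its own hook for sending the notifications.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_AGE = timedelta(days=365)  # assumed review interval, not from the workshop

@dataclass
class Article:
    title: str
    author: str
    last_reviewed: date
    tagged_ok: bool = False  # someone has confirmed the article is still valid

def expired(article, today, max_age=MAX_AGE):
    """An article expires when it is older than max_age and nobody has tagged it as ok."""
    return not article.tagged_ok and today - article.last_reviewed > max_age

def notifications(articles, today):
    """Return (author, title) pairs for every expired article, i.e. who to flag."""
    return [(a.author, a.title) for a in articles if expired(a, today)]

# Hypothetical Wiki contents.
wiki = [
    Article("DNS setup", "alice", date(2009, 1, 10)),
    Article("Backup policy", "bob", date(2009, 3, 1), tagged_ok=True),
    Article("Mail relay", "carol", date(2010, 9, 15)),
]
to_flag = notifications(wiki, today=date(2010, 11, 8))
```

Only the old, untagged article is flagged. The old article that was tagged as ok and the recent one are both left alone, so the cost of keeping knowledge trustworthy falls on the person best placed to fix it.
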
Cfengine notes on Knowledge Management