Organising Knowledge Revisited

I first reviewed Organising Knowledge in 2007, when the book was new. It still has a lot to say – even 11 years later. When I wrote my last review, I was just beginning with knowledge management, information architecture, and organizational effectiveness. Since then, I’ve done many projects and read many books. In preparation for a new information architecture course I’m developing, I wanted to go back and revisit the book to make sure that I didn’t miss any of Patrick Lambe’s insights. This time I read it on Kindle – which means I’ve got more extensive notes – and my writing style has changed quite a bit in the last 11 years.

The Six Pack

Terrifying. That’s the word that most new pilots use to describe their first solo landing experience. Practice makes perfect, they say. (Though Ericsson has some other requirements in Peak.) When you strap yourself into an airplane for the first time and look at the instrument panel, you see a collection of vaguely familiar instruments that convey what you need to know about the plane and what’s happening, but which instruments do you really need to focus on?

Pilots are taught to scan six key instruments: attitude indicator (sometimes called artificial horizon), turn-bank indicator, directional gyro, altimeter, vertical speed indicator, and airspeed indicator. Collectively, these instruments tell you most of what you need to know about flying the aircraft – and provide a double check should an instrument fail during flight.

The first time I landed the plane with the instructor on board, I can remember all the variables I was trying to compute. Airspeed, altitude, rate of descent, pitch, engine throttle position, flap position, radio calls, and maneuvering were just some of the thoughts running through my head. Over time, some of these things became automatic. I didn’t have to think about them. Experience had taught me how to “feel” the aircraft and when I could ignore some of the data coming at me – and when I needed to pay attention. Gary Klein would say that I started to build a model of the airplane (see Sources of Power). It wasn’t an academic model; it was a tacit model that I could call upon to simplify what my explicit, rational mind had to process.

That’s the way I feel about the process of building taxonomies and organizing knowledge. I can see all the variables, but I’ve learned which ones I need to pay attention to and which ones are the most important. It’s in that context that I approach this revisit. With experience, I’ve learned some of the things that I can ignore.

Planning for Failure

There are three very disconcerting things when you’re flying. First, when your engine quits, you start sweating. The propeller is more than the big fan in the front to cool you off. If you lose your engine in a single-engine aircraft, you’re going to be coming down. The good news is that essentially the plane is a powered glider. Unless you’re in a critical phase of flight (takeoff or landing) losing an engine isn’t that big a deal.

The second disconcerting time is when you’re in clouds, and your brain is telling you one thing, and the instruments are telling you another. Many pilots have been killed by spatial disorientation and failing to listen to their instruments. Perhaps the most famous is John F. Kennedy, Jr. The good news is if you learn that your brain will lie to you, and, collectively, the instruments won’t, you’ll consciously override your feelings and fly by the instruments. (See Incognito for more on how our brain makes up things.)

The final disconcerting time is when an instrument fails. You are mid-flight, and, without warning, one of your instruments starts “drifting” from the correct answer. Whether it’s the attitude indicator starting to indicate a gentle turn to the left, or you notice that your directional gyro starts to spin like you’re in the Bermuda triangle, it’s sure to get your attention. Luckily, any single instrument failure can be identified and eliminated from your scan by cross-checking other instruments. Planning for failure is designed into aircraft. That’s why losing an instrument is just disconcerting.

Facets serve this purpose in organizing your content. Though we think of filing away our information in a single large hierarchical tree, with one level followed by the next, invariably creators and consumers will get stuck and be unable to properly file or find content in its correct spot. Using facets allows users to file and find content, because when one facet fails, generally others are effective.

Finding Facets

Ideally when you’re looking for facets, you’ll find a set that are mutually exclusive (said differently, orthogonal), unambiguous, and complete. That is, it would be great if the facets you find can help you accurately and completely define the content that you hope to file and find.

Defining Dimensions

Before getting into how to define effective facets, it’s important to understand what they are. In practical terms, facets are aspects of the information to be organized. Each characteristic of the item can be said to belong to the set of values in the facet. For each facet, then, an item has a value – a metadata value. Organising Knowledge defines metadata (plural) as “a collection of structured information about a document or piece of content.”

Taken in singular form, this means that a piece of metadata is structured information about the document – structured along facets. To create organization along facets, you’ll necessarily be capturing this structured data. Almost universally this is done as a column or field stored with the document, with a field name similar or identical to the name of the facet. The values of the metadata field or column can either be fixed – as in a controlled vocabulary – or open. For instance, a facet that describes color may be a fixed set of colors – like those in a box of crayons. An open facet might be one like weight or height, where the responses fall along a continuum rather than in a fixed set of values.
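
To make the fixed-versus-open distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Document class, the COLOR_VOCABULARY set, and the find helper are hypothetical names, not anything from the book or from a particular content management system.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary: the fixed set of values for the color facet.
COLOR_VOCABULARY = {"red", "green", "blue"}


@dataclass
class Document:
    title: str
    color: str        # controlled facet: value must come from COLOR_VOCABULARY
    weight_kg: float  # open facet: any non-negative number along a continuum

    def __post_init__(self):
        # Reject values outside the controlled vocabulary at filing time.
        if self.color not in COLOR_VOCABULARY:
            raise ValueError(f"{self.color!r} is not in the color vocabulary")
        if self.weight_kg < 0:
            raise ValueError("weight_kg must be non-negative")


def find(documents, **facets):
    """Return the documents whose metadata matches every facet value given."""
    return [
        doc for doc in documents
        if all(getattr(doc, name) == value for name, value in facets.items())
    ]
```

With something like this, find(docs, color="red") filters on one facet, and adding a second keyword argument narrows along another. The same metadata columns that support retrieval are also what you would aggregate for analysis.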

Facets are an important boundary-spanning object, because they can be used for “big data”-type analysis of content and other analytical techniques beyond the simple retrieval of information.

Following Footsteps

The first part of the path is easy. You can follow the footsteps of others who have walked the path before you. You can start with the father of faceted classification, S.R. Ranganathan. A librarian in India, he proposed that the existing classification systems, like the venerable Dewey Decimal System, were insufficient. His ideas were a footnote in the annals of library science until electrons ruled and atoms were ancient. It’s a different world to organize digital information, where copies are easy to make, and findability is the key issue. In a traditional book library, you organize one literal thing; in the electronic world, everything is substantially more fluid.

Ranganathan’s facets were personality, matter, energy, space, and time. He proposed that these facets would allow you to describe most things – and make them findable. Lambe uses these and adds subject matter – which agrees with Morville and Rosenfeld in the classic Polar Bear book (Information Architecture for the World Wide Web). Morville and Rosenfeld also include price – which makes sense when you consider the eCommerce perspective of their work. Richard Saul Wurman – the father of the term information architecture – adds alphabet and hierarchy (or scale).

In aggregate, these give you a great place to start.

Standard Steps

Lambe provides a helpful table (2.1) that relates practical examples of ways of breaking down hierarchies, which I’ve reproduced here:

Taxonomy	Superordinate term	Subordinate term	Relationship term
Military rank	General	Colonel	Power/authority
Biology	Genus	Species	Common evolutionary history
Family	Parent	Child	Genealogical
Vehicles	Car	Steering Wheel	Whole: part
History	1960s	Assassination of JFK	Period: event
Geography	United States	Alaska	Whole: part
Landscape	Mountain	Everest	General: specific
Terrorism	Al Qaeda	Osama Bin Laden	Group: member
Disease control	Infection	Symptoms	Causality, sequence

This table provides several relationships that you can follow from the top-level facets down to more detailed areas of the tree.

Stepping Out

Unfortunately, after these good solid starting points, it’s necessary to get specific to the content that you’re organizing. That requires tuning into an inherent capacity that humans have for breaking information into meaningful chunks. Take a baby and show them a set of dots traveling together, and then have one of the dots fly off out of formation from the others, and the baby will indicate their confusion. Even small children can convert groups of objects into a single object for consideration.

To further the development of the facets, we must look towards the natural groupings that people – particularly the consumers of the proposed facet – are likely already creating. While these categories are rarely perfect when encountered, they often form a solid but flexible framework on which a taxonomy can be created.

Taxonomies

Organising Knowledge defines a taxonomy with three characteristics:

Classification Scheme – It needs to be a way that we organize our world.
Semantic – Meaningful and transparent to the end users.
Knowledge Map – While a map is not a territory, it represents the territory, and so, too, a taxonomy should represent the knowledge territory.

In this context, each facet is a taxonomy. It’s a way of viewing the information, much the way that you can organize ideas in a book by chapter and have a separate organization of the same ideas in alphabetic order in the back of the book – called an index. In many ways, the index at the back of a book is like a thesaurus for the taxonomy. A thesaurus is simply a taxonomy organized alphabetically rather than by subject, the arrangement that suits taxonomies best.

Pragmatic

One would think that a rule-following person might be great at taxonomic work. After all, they’re good with rules. Unfortunately, building a taxonomy takes an odd blend of skills: the rigor to look at the details and the willingness to step back and see the forest. That combination makes it a cognitively difficult feat. Further, taxonomies are necessarily fraught with compromise. Consider the number of items that should be at a single level.

Hick’s law says that, when the chooser knows the order, having all the items in one list is the most efficient way of organizing information. Contrast that with the work of Barry Schwartz in The Paradox of Choice, which explains that anxiety increases with the number of items being chosen from. Even that contrast ignores the primary practical issue with Hick’s law, which is that it relies on a central condition – the users know the order.

This is a semantic slope that’s easy to miss. Is the category for automobile called automobile, car, ride, vehicle, or transportation? Without a complete knowledge of the list, it’s impossible to know which term is used in every case. This is particularly true when we’re considering that, most of the time, the users of our taxonomies today don’t know whether the information they’re looking for exists – much less what the “correct” terms are.
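
One common way to soften the naming problem is to map the terms people actually type onto a single preferred term. The sketch below is only an illustration under stated assumptions: the ENTRY_TERMS mapping and the preferred_term function are hypothetical, and the choice of "automobile" as the preferred term is arbitrary. Broader words like "vehicle" or "transportation" are left out on purpose, since they may belong as parent categories rather than synonyms.

```python
# Hypothetical entry terms, each pointing at one preferred term.
ENTRY_TERMS = {
    "automobile": "automobile",
    "car": "automobile",
    "ride": "automobile",
}


def preferred_term(user_term):
    """Return the preferred term for what the user typed, or None if unknown."""
    key = user_term.strip().lower()  # ignore case and surrounding whitespace
    return ENTRY_TERMS.get(key)
```

A lookup that returns None is useful in its own right: it is a record of a term users expected to find and did not, which is a candidate for the next revision of the vocabulary.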

Conversely, if you create a set of subcategories, you necessarily make some items difficult to find based on their difficulty of being categorized along the facet. Where do you put the color blue-green – under blue or under green? If you have a category for furniture, will the user understand that they need to look there for rugs or lamps?

Additionally, even knowing the right number of subcategories under a given category can be challenging. While much research here points towards The Magic Number 7, that research has been clarified to not mean what people think it does. We do know that as the number of choices increases, anxiety increases, but too few choices at each level produces a very deep click tree, which causes people to become frustrated and abandon their attempted navigation. The deeper the trees go, the more people you’ll lose before they get to the bottom.
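
The trade-off between breadth and depth can be put into rough numbers. The sketch below assumes a perfectly balanced tree in which every category has the same number of subcategories, which real taxonomies never have, so treat it as a back-of-the-envelope estimate. The function name levels_needed is my own.

```python
def levels_needed(total_items, choices_per_level):
    """Clicks needed to reach one of total_items in a balanced tree.

    Assumes every level offers exactly choices_per_level options.
    """
    if total_items < 1 or choices_per_level < 2:
        raise ValueError("need at least 1 item and at least 2 choices per level")
    levels = 0
    reachable = 1  # number of distinct leaves reachable at the current depth
    while reachable < total_items:
        reachable *= choices_per_level
        levels += 1
    return levels
```

Under that assumption, 1,000 items at 7 choices per level take 4 clicks to reach, while 32 choices per level cut that to 2. Fewer clicks come at the price of longer lists to scan, which is exactly the tension described above.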

The pragmatist tries to find a balance between these competing factors. They also find ways to “cheat.”

Polyhierarchy

One of the solutions to the taxonomic problem of which category something belongs in is to leverage polyhierarchies – that is, to associate a single term with multiple parent terms. Tomato can be associated both with fruit and vegetable. (See The Accidental Taxonomist for more.) However, this is an impure solution that breaks the neat and orderly progression of a traditional hierarchy. It’s decisions like these that are difficult for rule followers, because making the hard call means breaking the rules.
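
In data terms, a polyhierarchy just means a term can have more than one parent, so the tree becomes a graph. Here is a minimal sketch of the tomato example. The PARENTS mapping and both helper functions are hypothetical, and the "produce" top-level term is my own addition to give the two branches a shared root.

```python
# Hypothetical term-to-parents mapping; a set of parents allows polyhierarchy.
PARENTS = {
    "tomato": {"fruit", "vegetable"},  # one term filed under two parents
    "fruit": {"produce"},
    "vegetable": {"produce"},
    "produce": set(),                  # assumed top-level term
}


def ancestors(term):
    """Return every term above the given term, following all parents."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, set()):
            if parent not in found:  # shared ancestors are visited only once
                found.add(parent)
                stack.append(parent)
    return found


def children(term):
    """Return the terms filed directly under the given term."""
    return {child for child, parents in PARENTS.items() if term in parents}
```

A user browsing from either fruit or vegetable finds tomato, which is the point of the cheat. The cost is that there is no longer a single path back to the root.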

Journey Not Destination

In the end, you’ve got to live with the consequences of those places where you break the rules. Organization isn’t a one-time thing. The things that you’re organizing change – and grow. That means that organization is a shifting sand that you must continually adjust to. One way to start the journey is to learn more by reading Organising Knowledge.