View from the Rock by Rock de Vocht - Taxonomies: Pay attention!

Aug 09, 2023
Hello – I am Rock de Vocht – welcome to the first in my new series of thought pieces. With topics ranging from deep tech to semantics, and maybe even a little something about music, there’s going to be a lot to talk about! Some of these will be simple thoughts from me – others are conversations with young Rock.

To help my younger self, I have been going back in time as an older, more statesman-like Rock to explain to a younger me some pretty complicated topics that I just couldn’t get my head around when I was more of a pebble than a Rock. It’s just some fun - an opportunity to talk about different ideas, obscure notions and things that float around my head late at night. Feel free to dive in with your own ideas – and of course feel free to disagree. I am sure that I or Little Rock will have an opinion!

This is the first of my Big Rock/Little Rock conversations – if you are dipping your toe into taxonomies… it can certainly take you to another dimension. Now – pay close attention.

Want to learn something about taxonomies? This Little Rock wanted to understand what taxonomies have to do with organising data. When we got onto ‘multidimensional data’, it was kind of mind-blowing (or taxonomy-blowing!). See for yourself…

Little Rock: Hey Old Rock - how’s it hanging? Seen any moss recently?

Rock: Hey - don’t be cheeky! You don’t want me to roll over you, do you?! You’ll see who’s covered in moss.

Little Rock: Only kidding. I just wanted to get your attention. There's something I read about last week - I don’t really get it. It said that data could only be seen in one way and needed to be ordered in a very specific hierarchy.

Rock: Wow - that’s just completely wrong. Data is highly dimensional. A book, for instance, is more than just its content. Other dimensions of a book might include the title, the author, its year of publication…

Little Rock: The colour of its cover? That’s something you might want to search for? It might be relevant to someone.

Rock: Exactly. We deal with data like this every day; data that has different characteristics with different meanings to each person. So at different times we need to access this data through different lenses.

Little Rock: Like if I wanted to search for a report by its author, but you wanted to search for it using publication dates?

Rock: This is what we mean by the dimensionality of the data, and it is this dimensionality that makes artificial taxonomies imposed on your data no more than a stop-gap solution.

Little Rock: But can’t you construct a taxonomy using some of these dimensions, like a book or report’s author and publication date?

Rock: You can, but a taxonomy cannot dynamically change dimensions, and that is its weakness. If you were to construct a taxonomy for a law firm, you might naturally have a top level that represents different clients or cases, maintaining a separation of matters.

You might want to add a second level consisting of evidence and the people involved. But…

Little Rock: We are already running out of room and starting to mix dimensions.

Rock: What would be a better way to do this?

Little Rock: We should be able to focus on a particular dimension of the data just as and when we need it. The dimensions are already in the data, so we should be able to use them. Alternatively, we can use an AI to deduce what dimension of a document to use given a user’s query.
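To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the documents, the field names (`client`, `kind`, `author`, `year`) and the helpers `build_taxonomy` and `find` are all made up for this example. The taxonomy fixes one order of levels up front, while the dimension-based lookup lets you pick the lens at query time.

```python
# Hypothetical law-firm documents; every name and value here is an assumption.
documents = [
    {"title": "Witness statement", "client": "Acme", "kind": "evidence",
     "author": "J. Smith", "year": 2021},
    {"title": "Contract review", "client": "Acme", "kind": "advice",
     "author": "A. Jones", "year": 2023},
    {"title": "Site photographs", "client": "Birch", "kind": "evidence",
     "author": "J. Smith", "year": 2023},
]


def build_taxonomy(docs):
    """Fixed hierarchy: client first, then kind. The order is baked in."""
    tree = {}
    for doc in docs:
        tree.setdefault(doc["client"], {}).setdefault(doc["kind"], []).append(doc["title"])
    return tree


def find(docs, **criteria):
    """Select titles by any combination of dimensions, chosen at query time."""
    return [
        doc["title"]
        for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


taxonomy = build_taxonomy(documents)

# The taxonomy answers the one question it was designed for.
print(taxonomy["Acme"]["evidence"])

# The same data viewed through other lenses, with no restructuring needed.
print(find(documents, author="J. Smith"))
print(find(documents, year=2023, kind="evidence"))
```

Finding everything by one author in the taxonomy would mean walking every client and every kind, because author is not one of its levels. With the flat, dimension-based lookup it is just another filter.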

Rock: And when they aren’t already present, what if they could be dynamically added to your data to further enhance it… something to think about!

So why impose an artificial taxonomy which only limits and hinders our ability to navigate that data? Instead, we should make any of these dimensions available for searching, whether they are categorical values or numerical ranges.

Little Rock: What’s the difference?

Rock: Categorical metadata items take their values from a set of distinct groups. For example, the type of a document, such as a PDF file or a Microsoft Word document.

Numerical ranges apply to monetary amounts, sizes and dates. So, imagine you are looking only for documents created or modified in the last six months.
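The two kinds of dimension can be sketched as two small filters. This is a rough illustration under stated assumptions: the sample documents, the field names `file_type` and `modified`, and the helpers `by_category` and `by_range` are hypothetical, "six months" is approximated as 182 days, and "today" is pinned to a fixed date so the example always behaves the same way.

```python
from datetime import date, timedelta

# Hypothetical documents; the field names are assumptions for this sketch.
documents = [
    {"title": "Annual report", "file_type": "pdf", "modified": date(2023, 6, 30)},
    {"title": "Board minutes", "file_type": "docx", "modified": date(2022, 11, 2)},
    {"title": "Budget forecast", "file_type": "pdf", "modified": date(2023, 1, 15)},
]


def by_category(docs, field, allowed):
    """Categorical filter: keep documents whose value is in a set of groups."""
    return [doc for doc in docs if doc[field] in allowed]


def by_range(docs, field, low, high):
    """Range filter: keep documents whose value lies between low and high (inclusive)."""
    return [doc for doc in docs if low <= doc[field] <= high]


today = date(2023, 8, 9)                      # fixed date, for repeatability
six_months_ago = today - timedelta(days=182)  # approximation of six months

pdfs = by_category(documents, "file_type", {"pdf"})
recent = by_range(documents, "modified", six_months_ago, today)
recent_pdfs = by_category(recent, "file_type", {"pdf"})

print([doc["title"] for doc in pdfs])
print([doc["title"] for doc in recent])
print([doc["title"] for doc in recent_pdfs])
```

The same `by_range` helper works for monetary amounts or file sizes, since it only needs values that can be compared. The filters can also be chained in any order.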

Little Rock: A taxonomy can’t help you there, unless it can be changed to include date ranges?

Rock: Nope. Perhaps the most problematic thing is that taxonomies are absolute and rigid. Once it is set in place, a taxonomy can’t easily be changed. Creating taxonomies and classifying data accordingly takes a lot of effort, so undoing this work will be just as hard.

Little Rock: Frustrating. So if an organisation has taxonomies in place, are they pretty much stuck? Can’t they ever search for the different dimensions in their data?

Rock: That’s not necessarily the case. We can work with data whether or not it has taxonomies applied to it and we can unlock dimensions that aren’t identified within a taxonomy, to enhance its searchability.

So, Little Rock…what did we conclude?

Little Rock: Taxonomies are useless!! Only joking!

There is hope where taxonomies are concerned. If you were setting up a new information system tomorrow, would I recommend you use taxonomies to organise your data? No. If you want to use your data more dynamically even though it is already contained within a system of taxonomies, are you stuck? No!

Rock: Good answer… I would recommend that you start by asking the question: in a perfect world, how would you use your data? From here, you can begin to establish the dimensions of your data: what is already searchable through your taxonomies, what dimensions exist but are not built into the taxonomies (and therefore are not searchable), and what dimensions are not yet present but, if they were, could be very helpful indeed. Work on the basis that anything is possible and, in our data-driven world of fast-paced innovation, it probably is!
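That stock-take can be sketched with a few set operations. This is only one possible way to organise the exercise: the function `classify_dimensions` and the dimension names in the example are hypothetical, chosen to echo the book and law-firm examples from the conversation.

```python
def classify_dimensions(wanted, in_taxonomy, in_data):
    """Sort the dimensions you would like to use into three groups.

    wanted:      dimensions you would use in a perfect world
    in_taxonomy: dimensions your taxonomies already expose for search
    in_data:     dimensions present in the data, searchable or not
    """
    return {
        "searchable_today": sorted(wanted & in_taxonomy),
        "present_but_not_searchable": sorted((wanted & in_data) - in_taxonomy),
        "not_yet_present": sorted(wanted - in_data - in_taxonomy),
    }


# Hypothetical example; all dimension names are assumptions.
result = classify_dimensions(
    wanted={"client", "author", "publication_date", "cover_colour"},
    in_taxonomy={"client"},
    in_data={"client", "author", "publication_date"},
)

for group, dimensions in result.items():
    print(group, dimensions)
```

The third group is the interesting one: those are the dimensions that would have to be added to the data before anyone could search on them.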

About the Author:

Rock de Vocht is Chief Scientific Officer at SimSage and has been working for a long time...

He has over 40 years' experience in IT as a Software Developer, Software Architect, mentor and development team lead in a wide range of industry sectors. Rock holds a Bachelor of Science (with honours in Computer Science) and a Master of Science (in Computer Science) from the University of Auckland, New Zealand.