How to Trust the Black Box that is Data Analytics


Business leaders who once used their instinct to make decisions in the best interest of the business are being asked to take a back seat to a new age of data-driven decision making. The rise of the ‘black box’, meaning data analytic methods and techniques that are increasingly opaque, has become a cause for concern for decision makers. Why? It can be summed up in one word: trust. Business leaders trusted their gut; they don’t know how to trust black boxes.

This trust gap is not confined to the upper tiers of the business or to shiny new black boxes. Individuals at all levels are using data on a daily basis to communicate but they are often asking their audience to implicitly trust them, the data or their analysis.

Let me introduce you to some characters that often appear in meetings where data is intended to educate, enlighten or inspire others to action but where it typically breaks down into conflict:

  • The Dinosaur: We’ve already discussed them. Resistant to change and driven by their gut instinct over everything else.
  • The Rabbit: Typically, someone associated with a department, data source or process that is the subject of discussion. They are likely to know a great deal about their area and are seeking the tiniest hole in your data, analysis or assumptions to undermine its credibility and save their own skin. Their persistence means the meeting agenda goes out the window once a sticking point is identified.
  • The Grenade Artist: A positive meeting is abruptly interrupted when this character pipes up and says, “This might not be important but…I did my own analysis a few weeks ago and I’m pretty sure that there is something wrong…” In one sentence, everything has been blown apart.
  • The Visual Viper: These people don’t like the aesthetics of your presentation and can’t let it go. Much like the Rabbit, if an error is spotted or they can’t agree with what they see, then that’s it, game over.
  • The Silent Dissenter: Walking out arm-in-arm, they undermine the decisions taken by silently ignoring them or, worse, combine this inaction with the “I knew that would happen” moment to pooper-scoop the project into oblivion.
  • The Conjuror: It’s you, the data magician. “I’m telling the story my way and the facts speak for themselves.” Why is no one listening to you? Why can’t they see it your way?

Full confession: I’ve been all of these characters at some point, but more often than not I’ve found myself as the Conjuror, standing there thinking, “Why aren’t they listening to me?”

Ultimately, these characters emerge in our workplaces because of a lack of trust. We can’t actually blame them; they are a manifestation of our human nature. But we do need to address them, otherwise projects will never move forward and decisions will be stressful or, worse, made incorrectly out of desperation. The lack of trust has far-reaching implications.

As discussed in Peter Herrmann and Valérie Issarny’s book Trust Management: Third International Conference, iTrust 2005, “one of the most important aspects of trust is that it enables decision making to happen in situations of doubt and distrust”, which, given the nature of the environment, is incredibly important for cyber security.

Transparency is the key to building trust. Consider for a moment why the characters in our story react the way they do. It’s likely to be the first time that they have been consulted or engaged in the process, and the Conjuror is presenting the results as a fait accompli! Running through the minds of the entire cast is “How could the Conjuror possibly know all the nuances of my business/my department/my data?” The difference between the Grenade Artist and the Rabbit is that the former has been stewing on not being consulted for weeks and has decided to take action, while the latter is only now realising they were absent from the process and is searching for anything to regain control. Therefore, the first steps in building trust in any data analysis are:

  • Collaborate with key stakeholders
  • Engage them frequently to make sure they are on the journey
  • Consider them as partners in the creation of results

After all, it’s much harder to argue against something you were involved in producing! It’s unlikely that you will be able to get everyone involved, as the time and effort required would be too onerous, but bringing some key players along is vital.

If only it were that easy, right? There is still the small matter of getting the stakeholders to agree with the results. Fortunately for us, one factor is frequently cited as the biggest reason that stakeholders lack trust: quality. Quality has a broad scope but can include the following:

  • Accuracy – the degree to which the data agrees with/represents the real-world object or event
  • Uniqueness – the removal of duplicate entities or events
  • Completeness – the expected comprehensiveness of the data
  • Consistency – the degree to which data across different systems reflect the same information
  • Integrity – the validity of a data source and its relationship to others
  • Timeliness – the need for data to be up-to-date and available when expected and needed

There is no universal standard for the above. You need to agree with your stakeholders what level is acceptable for each. What matters most, however, is to share the assumptions that you make along the way and to make them readily available to those interested. This is not something that should be hidden, as hidden assumptions are one of the origins of the ‘black box’.
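
As a minimal sketch of what measuring and sharing might look like, the Python below scores three of the dimensions (completeness, uniqueness and timeliness) against agreed thresholds. The records, field names, threshold values and 30-day freshness window are all hypothetical, chosen only for illustration; your own definitions will depend on what you agree with stakeholders.

```python
from datetime import date

# Hypothetical records; field names and values are made up for illustration.
RECORDS = [
    {"id": 1, "region": "North", "updated": date(2024, 3, 1)},
    {"id": 2, "region": None, "updated": date(2024, 3, 2)},
    {"id": 2, "region": None, "updated": date(2024, 3, 2)},  # duplicate entry
    {"id": 3, "region": "South", "updated": date(2024, 1, 15)},
]

# Thresholds agreed with stakeholders (assumed values, not a standard).
THRESHOLDS = {"completeness": 0.90, "uniqueness": 0.95, "timeliness": 0.70}


def completeness(records, field):
    """Share of records where the field is populated."""
    return sum(r[field] is not None for r in records) / len(records)


def uniqueness(records, key):
    """Share of records that carry a distinct key value."""
    return len({r[key] for r in records}) / len(records)


def timeliness(records, field, as_of, max_age_days):
    """Share of records updated within max_age_days of the as_of date."""
    fresh = sum((as_of - r[field]).days <= max_age_days for r in records)
    return fresh / len(records)


def quality_report(records, as_of):
    """Score each dimension and compare it with its agreed threshold."""
    scores = {
        "completeness": completeness(records, "region"),
        "uniqueness": uniqueness(records, "id"),
        "timeliness": timeliness(records, "updated", as_of, max_age_days=30),
    }
    return {
        name: {
            "score": round(score, 2),
            "threshold": THRESHOLDS[name],
            "ok": score >= THRESHOLDS[name],
        }
        for name, score in scores.items()
    }


if __name__ == "__main__":
    for name, result in quality_report(RECORDS, as_of=date(2024, 3, 10)).items():
        print(name, result)
```

Publishing a report like this alongside the analysis puts the assumptions, such as the 30-day window, where anyone interested can see and challenge them.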

Quality is something that should be measured throughout the entire data processing pipeline, but your audience won’t expect an end-to-end walkthrough, so let’s focus on where they do expect to see it: the visualisation of data. How can you take those upstream measures and assumptions and make the audience aware of their existence?

One of my favourite examples of sharing information about data quality with a user is discussed by Andy Kirk in his ‘The little of visualisation design’ blog series and relates to a chart from Gapminder.

Read Andy’s blog for a full overview but essentially in the lower right-hand corner of the chart there is a red warning triangle labelled ‘Data Doubts’. When clicked, a text box is presented which informs the user that they should be cautious about the completeness and accuracy of the data used in the chart and gives the user the ability to get more information. It acknowledges that there are errors or uncertainty in the data. This is a technique that would neutralise many of the arguments our cast of characters might want to throw our way.

On the same chart is another feature, which helps the user contextualise the metrics (or analysis) they are viewing. Next to the axis label (itself a basic but vital feature in helping users understand what they are seeing) is a question mark symbol. Clicking it brings up a text box that describes the metric in more detail. I haven’t discussed the need for explainability of algorithms (and I won’t today), but being transparent about the assumptions underpinning the analysis further drives up trust in your results.

Finally, Timeliness is an easy one, though often forgotten. It’s pretty easy for a date or date range to accompany your chart to help inform the user of the currency of the data. Imagine how effectively you could neutralise the Grenade Artist if you were able to confidently state that your data is fresher than their data?

You may have noticed that all of the examples here relate to interactive or dynamic content. That’s no coincidence, as this type of experience (including drill-through and dynamic dimension picking) further enhances transparency and increases trust. However, the information discussed can and should be applied to static presentations, where it has the same effect.
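
For a static chart, the same information can travel as a footnote. The sketch below is one hypothetical way to assemble it in Python: the function name, the wording of the footnote and the sample caveat are assumptions for illustration, not a feature of any charting tool.

```python
from datetime import date


def chart_footnote(dates, refreshed, caveats):
    """Build a plain-text footnote: data period, refresh date, known doubts."""
    lines = [
        f"Data period: {min(dates).isoformat()} to {max(dates).isoformat()}",
        f"Last refreshed: {refreshed.isoformat()}",
    ]
    # Only mention data doubts when there is something to declare.
    if caveats:
        lines.append("Data doubts: " + "; ".join(caveats))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical dates and caveat, for illustration only.
    print(
        chart_footnote(
            dates=[date(2024, 3, 1), date(2024, 1, 15)],
            refreshed=date(2024, 3, 10),
            caveats=["region is missing for some records"],
        )
    )
```

Pasting the resulting text beneath a chart in a slide deck gives the audience the currency of the data and its known limitations at a glance.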

I could discuss many more examples of where transparency increases trust, but instead I encourage you to comment or get in touch directly and share things that have worked well for you.

If our cast of characters can feel part of the process, agree on the assumptions underpinning the analysis and share these openly then we are in an environment in which we can feel empowered, take decisions and act with confidence.

After all, a black box is only a black box when you don’t let people see what’s inside.