How to Trust the Black Box that is Data Analytics
Business leaders who once used their instinct to make decisions in the best interest of the business are being asked to take a back seat to a new age of data-driven decision making. The rise of the ‘black box’, data analytic methods and techniques that are increasingly opaque, has become a cause for concern for decision makers. Why? It can be summed up in one word: TRUST. Business leaders trusted their gut; they don’t know how to trust black boxes.
This trust gap is not confined to the upper tiers of the business or to shiny new black boxes. Individuals at all levels use data on a daily basis to communicate, but they often ask their audience to implicitly trust them, their data or their analysis.
Let me introduce you to some characters that often appear in meetings where data is intended to educate, enlighten or inspire others to action but where the discussion typically breaks down into conflict:
- The Dinosaur: We’ve already discussed them. Resistant to change and driven by their gut instinct over everything else.
- The Rabbit: Typically, someone associated with a department, data source or process that is the subject of discussion. They are likely to know a great deal about their area and are seeking the tiniest hole in your data, analysis or assumptions to undermine its credibility and save their own skin. Their persistence means the meeting agenda goes out the window once a sticking point is identified.
- The Grenade Artist: A positive meeting is abruptly interrupted when this character pipes up and says, “This might not be important but…I did my own analysis a few weeks ago and I'm pretty sure that there is something wrong...” In one sentence, everything has been blown apart.
- The Visual Viper: These people don’t like the aesthetics of your presentation and can’t let it go. Much like the Rabbit, if an error is spotted or they can't agree with what they see, then that's it, game over.
- The Silent Dissenter: They walk out of the meeting arm-in-arm with everyone else, then undermine the decisions taken by silently ignoring them. Worse, they combine this inaction with an "I knew that would happen" moment to pooper-scoop the project into oblivion.
- The Conjuror: It’s you, the data magician. “I’m telling the story my way and the facts speak for themselves.” Why is no one listening to you? Why can’t they see it your way?
So how do we bring these characters along with us? Start by involving them:
- Collaborate with key stakeholders
- Engage them frequently to make sure they are on the journey
- Consider them as partners in the creation of results
Then be open about the quality of your data, which can be described along six dimensions:
- Accuracy – the degree to which the data agrees with/represents the real-world object or event
- Uniqueness – the removal of duplicate entities or events
- Completeness – the expected comprehensiveness of the data
- Consistency – the degree to which data across different systems reflect the same information
- Integrity – the validity of a data source and its relationship to others
- Timeliness – the need for data to be up-to-date and available when expected and needed
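
Some of these dimensions can be measured directly and shown alongside a chart. The following is a minimal Python sketch, not a definitive method: the function names (`quality_summary`, `data_doubts_note`), the field names and the sample records are all hypothetical, and the formulas are one simple way to put a number on completeness, uniqueness and timeliness.

```python
from datetime import date


def quality_summary(records, key, required, date_field, as_of):
    """Return simple completeness, uniqueness and timeliness figures.

    All names here are illustrative assumptions, not a standard API.
    """
    total = len(records)
    if total == 0:
        raise ValueError("no records to assess")

    # Completeness: share of required cells that are actually filled in.
    filled = sum(
        1 for r in records for f in required if r.get(f) not in (None, "")
    )
    completeness = filled / (total * len(required))

    # Uniqueness: number of distinct keys relative to the number of records.
    uniqueness = len({r[key] for r in records}) / total

    # Timeliness: age in days of the most recent record at the `as_of` date.
    latest = max(r[date_field] for r in records)
    age_days = (as_of - latest).days

    return {
        "completeness": completeness,
        "uniqueness": uniqueness,
        "latest": latest,
        "age_days": age_days,
    }


def data_doubts_note(summary):
    """Format the figures as a short note to display next to a chart."""
    return (
        f"Data as of {summary['latest'].isoformat()} "
        f"({summary['age_days']} days old). "
        f"Completeness {summary['completeness']:.0%}, "
        f"uniqueness {summary['uniqueness']:.0%}."
    )


# Made-up sample data: one missing value, one blank value, one duplicate id.
records = [
    {"id": 1, "region": "North", "sales": 120, "updated": date(2024, 1, 10)},
    {"id": 2, "region": "South", "sales": None, "updated": date(2024, 1, 12)},
    {"id": 2, "region": "South", "sales": 95, "updated": date(2024, 1, 12)},
    {"id": 3, "region": "", "sales": 80, "updated": date(2024, 1, 15)},
]

summary = quality_summary(
    records,
    key="id",
    required=["region", "sales"],
    date_field="updated",
    as_of=date(2024, 1, 20),
)
print(data_doubts_note(summary))
# Data as of 2024-01-15 (5 days old). Completeness 75%, uniqueness 75%.
```

Accuracy, consistency and integrity are harder to reduce to a single figure because they need a trusted reference or a second system to compare against, so this sketch leaves them out.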
One of my favourite examples of sharing information about data quality with a user comes from Andy Kirk’s ‘The little of visualisation design’ blog series, which looks at a chart from Gapminder.
Read Andy’s blog for a full overview, but essentially, in the lower right-hand corner of the chart there is a red warning triangle labelled ‘Data Doubts’. When clicked, it opens a text box informing users that they should be cautious about the completeness and accuracy of the data used in the chart, and gives them a way to get more information. It acknowledges that there are errors or uncertainty in the data. This technique would neutralise many of the arguments our cast of characters might want to throw our way.
On the same chart is another feature that helps the user contextualise the metrics (or analysis) they are viewing. Next to the axis label (itself a basic but vital feature in helping users understand what they are seeing) is a question mark symbol. Clicking it brings up a text box that describes the metric in more detail. I haven’t discussed the need for explainability of algorithms (and I won’t today), but being transparent about the assumptions underpinning the analysis further drives up trust in your results.
Finally, Timeliness is an easy one, though often forgotten. It takes little effort to show a date or date range alongside your chart to inform the user of the currency of the data. Imagine how effectively you could neutralise the Grenade Artist if you could confidently state that your data is fresher than theirs.
You may have noticed that all of the examples here relate to interactive or dynamic content. That’s no coincidence, as this type of experience (including drill-through and dynamic dimension picking) further enhances transparency and increases trust. However, the same information can and should be included in static presentations, where it has the same effect.
I could discuss many more examples of where transparency increases trust, but instead I encourage you to comment or get in touch directly and share things that have worked well for you.
If our cast of characters can feel part of the process, agree on the assumptions underpinning the analysis and share these openly, then we are in an environment in which we can feel empowered, take decisions and act with confidence.
After all, a black box is only a black box when you don’t let people see what’s inside.