Last month, a few of my colleagues and I ventured down to Austin, TX for SXSW. After attending countless sessions, eating my own weight in tacos and jalapeño-laced food, and walking a little over 30 miles in five days, I’m back in my office mulling over all that I learned. The most important lessons were about common data analysis and visualization mistakes and how to avoid them.
1. Overlooking data quality issues
We’ve all been there. You’re preparing for a presentation, putting the final touches on your visuals, and you discover that something is wrong. Maybe a metric was included twice. Maybe the numbers don’t match up. Maybe the data wasn’t properly scrubbed to begin with. Whatever the cause, don’t be afraid to return to the previous step when you uncover errors in the data.
While frustrating and time-consuming, data quality issues should be regarded with the highest importance and mitigated at all costs. If necessary, build time into the process for these types of adjustments. After all, what are the analysis and visuals worth if the quality of the data itself is poor to begin with?
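As a rough illustration of what building these checks into the process could look like, here is a minimal Python sketch that looks for the problems mentioned above: a metric included twice, missing values, and totals that don't match up. The column names, the sample rows, and the helper functions are all hypothetical, made up for this example rather than taken from any particular tool.

```python
from collections import Counter


def find_duplicate_metrics(columns):
    """Return metric names that appear more than once, in first-seen order."""
    counts = Counter(columns)
    return [name for name in counts if counts[name] > 1]


def find_missing_values(rows):
    """Return (row_index, column) pairs where a value is None or empty."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col, value in row.items()
        if value is None or value == ""
    ]


def totals_match(rows, column, reported_total, tolerance=0.0):
    """Check that a column sums to the total reported elsewhere."""
    actual = sum(row[column] for row in rows if row[column] not in (None, ""))
    return abs(actual - reported_total) <= tolerance


# Hypothetical campaign export: "impressions" was accidentally included twice,
# and one day is missing its impressions value.
columns = ["date", "impressions", "clicks", "impressions"]
rows = [
    {"date": "day 1", "impressions": 1200, "clicks": 30},
    {"date": "day 2", "impressions": None, "clicks": 25},
    {"date": "day 3", "impressions": 900, "clicks": 18},
]

duplicates = find_duplicate_metrics(columns)
missing = find_missing_values(rows)
clicks_ok = totals_match(rows, "clicks", reported_total=73)

print("Duplicate metrics:", duplicates)
print("Missing values:", missing)
print("Clicks match reported total:", clicks_ok)
```

Checks like these are cheap to run before any chart is built, which is usually a better time to find a problem than the night before the presentation.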
2. Fixating on specific relationships
When analyzing any type of data and the relationships within it, remember the difference between correlation and causation. Just because two things happened around the same time does not mean that one caused the other. Both may be driven by a third factor, or the timing may simply be a coincidence.
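To make this concrete, here is a small Python sketch using made-up numbers. It assumes a hypothetical campaign where both ad impressions and sales are higher during a holiday period. Pooled together, the two metrics look strongly correlated, but within each period there is no relationship at all, which suggests the season, not the ads, is doing the work. The data and the `pearson` helper are illustrative assumptions only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: (impressions in thousands, sales in thousands) per day.
regular_days = [(1, 1), (2, 1), (1, 2), (2, 2)]
holiday_days = [(5, 5), (6, 5), (5, 6), (6, 6)]


def correlation(days):
    """Correlation between impressions and sales for a list of days."""
    impressions = [d[0] for d in days]
    sales = [d[1] for d in days]
    return pearson(impressions, sales)


pooled_r = correlation(regular_days + holiday_days)
regular_r = correlation(regular_days)
holiday_r = correlation(holiday_days)

print(f"All days together: r = {pooled_r:.2f}")   # strong correlation
print(f"Regular days only: r = {regular_r:.2f}")  # no correlation
print(f"Holiday days only: r = {holiday_r:.2f}")  # no correlation
```

Splitting the data by a suspected outside factor is only one way to probe a relationship, and it only works if you have thought to record that factor in the first place.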
Additionally, you need to take the whole picture into consideration, not just one piece of information. Here are some examples of other things to consider when conducting analysis of digital advertising data:
- Did we change out creative?
- Did this run at a different time of day/week/month/year than usual?
- Was the work targeted to a different audience? Is this audience typically more engaged?
Lastly, while it’s important to avoid looking at any one metric in isolation, it is equally important not to overeat at the buffet of All You Can Eat Analytics. Piling every available metric into the analysis can bury the insight that actually matters.
3. Cognitive biases
While cognitive biases are inevitable, we must be aware of their potential in order to help minimize their effects. These cognitive biases can come from many sources, but there are two sources in particular worth noting when analyzing data – the analyst delivering the information and the audience receiving it.
For the analyst, cognitive biases can show up as forcibly massaging the data or omitting key factors in an attempt to make the results look better. This is not acceptable if it tells the audience an inaccurate story.
Chris Kerns, VP of Research & Insights at Spredfast, explained in a session at SXSW that a good way to avoid some of these biases is to start with a question that needs to be answered and then use data to answer it, rather than examining the data and trying to discover what questions it can answer. With the latter approach, you are more likely to pick and choose the data that helps build your own case, when in reality the data tells a much different story.
The audience receiving the information can also have cognitive biases that analysts need to take into consideration. One way to account for this is to understand what makes a visualization “good” in the eyes of those receiving the information. Jeffrey Heer, a professor at the University of Washington, explained in an SXSW session on interactive data visualizations that some methods of encoding information are decoded more accurately than others. Graphical perception research, for instance, has found that people compare positions along a common scale (as in a bar chart) more accurately than angles or areas (as in a pie chart).
4. Going through the motions
The last big lesson that I learned at SXSW is to stop just going through the motions. Just because the information was presented a certain way in the past does not mean that it must be done the same way again.
Chris Kerns of Spredfast reiterated several times that when data becomes a burden, it shouldn’t be used, and that data doesn’t need to be included just because it’s out there. Rather than robotically spitting out information because we have it or because it was done that way previously, we should take a step back and evaluate what is important to our audience and how to encode that information most meaningfully. Because different people receive and use information in different ways, it’s important to tailor presentations to the individual audience where possible.
Do you see common pitfalls that you would add to this list? Let me know in the comments below!