We’ve been interviewing some of the best thought leaders in Data Visualization and Analytics. This week, Randy Olson, the community leader for the wildly popular /r/DataIsBeautiful and a frequent contributor with some of the top content, is sharing his insights. He is an AI researcher at the University of Pennsylvania, ushering us into the next era of Artificial Intelligence without the side effects from Skynet. He tweets daily about data analysis and visualization at @randal_olson. Visit his popular personal blog to follow his expertise as a data visualization practitioner.
Data visualization allows stories to be told in a way that can effectively represent massive amounts of data. It’s no wonder that more and more organizations are looking to data visualization to communicate their stories.
People love the ability to consume information in such an engaging fashion. That’s why /r/DataIsBeautiful has become such a popular section of reddit, serving over 2 million unique visitors a month.
In this interview, Randy talks about what data visualization is, why it matters, the components of a quality data visualization, and the outlook for the field.
What does “data visualization” mean to you and how has it served you?
At /r/DataIsBeautiful, a data visualization must meet the following criteria:
- Automatically generatable (as opposed to manually drawn), i.e., infographics are not data visualizations.
- Based on non-visual data. It can’t be a form of image effect or pixel shader.
- Based on real or simulated data. If the image represents one number (pi), sequence (primes), or equation (sin(x)), then it is not a data visualization.
- A mapping of information to a visual property. Text in a table is not sufficient. A data variable must be transformed and mapped onto a visual property such as color, size, or position.
- Made with the intent to communicate data. A music visualization from a media player, while pretty and mesmerizing, doesn’t convey information. You can’t differentiate songs just by looking at the images.
These are criteria established by leading figures in the field, and I couldn’t agree more with them.
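To make the "mapping of information to a visual property" criterion concrete, here is a minimal sketch in Python. It is only an illustration, not something from /r/DataIsBeautiful or from Randy: the `Mark` class, the `linear_scale` and `encode` helpers, the pixel ranges, and the three data rows are all made up for the example. Each row's first variable is mapped to horizontal position, and its second variable is mapped to both circle size and gray level.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """One circle to draw (hypothetical structure for this example)."""
    x: float       # horizontal position in pixels
    radius: float  # circle radius in pixels
    color: str     # hex gray color, e.g. "#828282"


def linear_scale(value, domain, output_range):
    """Linearly map value from domain (lo, hi) onto output_range (lo, hi)."""
    d_lo, d_hi = domain
    r_lo, r_hi = output_range
    if d_hi == d_lo:
        # Degenerate domain: every value maps to the start of the range.
        return r_lo
    t = (value - d_lo) / (d_hi - d_lo)
    return r_lo + t * (r_hi - r_lo)


def encode(rows):
    """Turn (year, count) rows into marks with position, size, and color."""
    years = [year for year, _ in rows]
    counts = [count for _, count in rows]
    year_domain = (min(years), max(years))
    count_domain = (min(counts), max(counts))

    marks = []
    for year, count in rows:
        x = linear_scale(year, year_domain, (0, 400))         # position
        radius = linear_scale(count, count_domain, (2, 20))   # size
        # Color: larger counts get a darker gray (220 = light, 40 = dark).
        level = round(linear_scale(count, count_domain, (220, 40)))
        color = "#{0:02x}{0:02x}{0:02x}".format(level)
        marks.append(Mark(x, radius, color))
    return marks


# Made-up data, purely for illustration.
rows = [(2010, 5), (2011, 15), (2012, 25)]
marks = encode(rows)
for mark in marks:
    print(mark)
```

A table of the same three rows would not meet the criterion; the marks do, because each variable now drives something the eye can compare directly. Any plotting tool could then draw them.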
In layman’s terms, a data visualization to me is an image that transforms data — spreadsheets of numbers, text files, etc. — into insight. Humans are visual creatures and have an innate ability to recognize patterns and trends, but only when they’re presented to us visually. We can’t process massive tables of numbers easily, if at all, so we rely heavily on visualization to teach us what the data shows.
The old adage, “A picture is worth a thousand words,” couldn’t be any more true when it comes to communicating data. Perhaps the best example is Charles Minard’s famous map, which tells the entire story of Napoleon’s disastrous Russian campaign of 1812 with a single visualization.
Entire book chapters have been dedicated to describing that 1812 campaign, yet Minard needed only a single map.
Is data visualization a nice-to-have or a must-have?
I firmly believe that data visualization is a must-have in any business that involves communication. Slides and reports full of text and statistics are boring and easily glossed over, whereas data visualizations can communicate the important information quickly and leave the details to the text.
What are challenges in producing an effective data visualization? What are recurring bottlenecks and pain points?
I think the biggest challenge in producing effective data visualizations is that people tend to focus too much on visualization tools. Most of the requests for advice that I receive from students and professionals are about tools: which ones they should use, or how they can make a certain kind of chart with a particular tool. When it comes down to it, you can make excellent visualizations with any tool, including the much-berated Excel, if you understand the basics of effective visualization design.
What are the first things you look for in a visualization?
When I’m judging a chart, the first thing I look for is whether it tells a clear story. Can the chart communicate the story by itself, without any accompanying text? Visualizing data for the sake of visualizing data is fine for your own purposes, but the chart design has failed if the viewer walks away from the chart without learning something new.
The next thing I look at is whether the right type of chart was used to tell the story. This is when I go through the steps in the chart suggestions guide below.
I’ll note that pie charts, one of the most overused chart types, are really only recommended in one case: displaying shares of a whole.