3 Crucial steps to follow in data visualization

with examples from D3

SangGyu An
CodeX


People understand and interpret things differently, so not everyone takes away the same idea from the same subject. The same is true of data visualization. If you ignore this, you can end up with a visualization that is hard for your audience to understand. To avoid that, follow three steps: know your audience, choose an appropriate visual, and eliminate clutter while drawing attention to your point.

Step 1: Know your audience

The first thing to do before graphing anything is to know your audience. Audiences range from students to project supervisors, and both what you need to tell them and what they look for in a visualization change with the group. To convey your point effectively, you need to emphasize different aspects for different groups and adjust your style accordingly. So rather than designing for a general audience, pick a specific audience and a specific message.

Photo by Jonas Jacobsson on Unsplash

Step 2: Choose an appropriate visual

Once you have decided who you are talking to and what you want to say, it’s time to choose how to convey your message. There are many options, and the topic could go on endlessly, so I will only summarize some common visuals here. You can look up more about each one later.

Simple text

  • Use numbers directly
  • Inevitably lose information from simplifying
  • Different metrics convey different messages → choose based on your message
Key points from a movie data set collected by Daniel Grijalva
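To see how the choice of metric changes the message, here is a minimal Python sketch. The budget values are made up for illustration and are not taken from the movie data set. With one very expensive film in the list, the mean and the median tell quite different stories.

```python
from statistics import mean, median

# Hypothetical film budgets in millions of dollars (illustrative values only).
budgets = [5, 8, 10, 12, 15, 20, 250]

average = mean(budgets)    # pulled upward by the single 250M film
typical = median(budgets)  # the middle value, unaffected by the outlier

print(f"Mean budget:   {average:.1f}M")
print(f"Median budget: {typical:.1f}M")
```

If your message is about a typical film, the median is probably the number to show. If it is about total spending, the mean or the sum may fit better.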

Table

  • Makes the audience read instead of see → not recommended for a live presentation, but can be a good choice for handouts
Table of a movie data set collected by Daniel Grijalva

Heatmap

  • Can direct the audience’s attention with color
  • Using lots of colors could confuse the audience → use color saturation
Heatmap of median value of each genre per year since 1980 (left: color saturation | right: various colors)
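One way to get a saturation-style heatmap is to map each value onto a ramp between a light and a dark shade of a single hue. The sketch below does this with plain linear interpolation in RGB. The function name and the two endpoint colors are my own choices for illustration; in D3 you would more likely use a built-in sequential scale.

```python
def saturation_color(value, vmin, vmax, light=(239, 243, 255), dark=(8, 81, 156)):
    """Map a value in [vmin, vmax] to a hex color between two shades of one hue."""
    if vmax == vmin:
        t = 0.0  # avoid dividing by zero when all values are equal
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# The lowest value gets the lightest shade, the highest gets the darkest.
print(saturation_color(0, 0, 100))
print(saturation_color(100, 0, 100))
```

Because every cell uses the same hue, the audience only has to read "darker means more" instead of decoding a legend of unrelated colors.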

Scatter plot

  • Useful when showing a quantitative relationship
Scatter plot of budget and gross median values of animation films

Line graph

  • Useful when plotting continuous data, such as values over time
  • Use consistent time intervals so you don’t mislead the audience
  • Don’t overlap too many lines, since the audience won’t be able to follow them → use color to highlight your point
Line graph that shows the change in median budget of action films by year
Spaghetti graph of change in median budget / Line graph that highlights action film line
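Before drawing a line graph, it can help to check that the time points are evenly spaced, since a hidden gap makes a slope look steeper or flatter than it really is. This is a small sketch of such a check. The function name is hypothetical and it assumes the time points are numbers such as years.

```python
def has_consistent_intervals(time_points):
    """Return True when consecutive time points are equally spaced."""
    if len(time_points) < 3:
        return True  # zero or one interval cannot be inconsistent
    steps = {later - earlier for earlier, later in zip(time_points, time_points[1:])}
    return len(steps) == 1


print(has_consistent_intervals([1980, 1990, 2000, 2010]))        # evenly spaced
print(has_consistent_intervals([1980, 1990, 2000, 2005, 2019]))  # uneven gaps
```

If the check fails, you can either fill in the missing years or space the x-axis by actual time rather than by position in the list.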

Slope graph

  • Shows relative increase or decrease of different categories
Slope graph that highlights genres that increased their budgets
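A slope graph only needs two values per category, so deciding which lines to highlight comes down to a simple comparison. The sketch below uses made-up budgets and hypothetical function names to pick out the genres whose value went up between the two points.

```python
def budget_changes(start, end):
    """Return {genre: end - start} for genres present at both time points."""
    return {genre: end[genre] - start[genre] for genre in start if genre in end}


def genres_to_highlight(start, end):
    """Return the genres whose value increased, sorted by name."""
    changes = budget_changes(start, end)
    return sorted(genre for genre, change in changes.items() if change > 0)


# Hypothetical median budgets in millions of dollars (illustrative values only).
first_year = {"Action": 30, "Comedy": 20, "Horror": 10}
last_year = {"Action": 45, "Comedy": 18, "Horror": 12}

print(genres_to_highlight(first_year, last_year))
```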

Vertical bar graph

  • Familiar to most of the audience → easy to understand the point
  • Start the axis at a zero baseline so you don’t mislead the audience, since people generally compare the bars’ relative endpoints
  • Leave enough white space between the bars
Bar graph that displays median budget in 2019 by genre
Bar graph with non-zero baseline / Bar graph with too little white space
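A quick calculation shows why the baseline matters. The drawn height of a bar is its value minus the baseline, so moving the baseline changes the ratio the audience sees. The function and the numbers below are hypothetical and only meant to illustrate the effect.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b, given the axis baseline."""
    if min(a, b) <= baseline:
        raise ValueError("both values must be above the baseline")
    return (a - baseline) / (b - baseline)


# Two hypothetical median budgets: 60 and 50 (millions of dollars).
print(apparent_ratio(60, 50))               # zero baseline: bars differ by 20%
print(apparent_ratio(60, 50, baseline=40))  # baseline at 40: one bar looks twice as tall
```

The data did not change between the two calls. Only the axis did, which is why a non-zero baseline can mislead.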

Horizontal bar graph

  • Useful when category names are long

Stacked bar graph

  • Each bar represents 100% → which type of bar graph to use depends on what message you want to convey
  • Using too many colors to represent different categories distracts the audience → use a single color to highlight your point
Stacked bar graph that shows percentage changes in gross by year / Stacked bar graph that highlights action film
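When each bar represents 100%, the raw values for each year have to be converted to shares first. Here is a minimal sketch of that step with a hypothetical function name and made-up gross figures.

```python
def to_percent_shares(values):
    """Convert {category: amount} into {category: share of the total in percent}."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("cannot compute shares when the total is zero")
    return {category: 100 * amount / total for category, amount in values.items()}


# Hypothetical gross by genre for one year (illustrative values only).
gross = {"Action": 50, "Comedy": 30, "Drama": 20}
shares = to_percent_shares(gross)
print(shares)
```

Keep in mind that this normalization hides the totals, so it suits a message about composition rather than about overall growth.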

Visuals that should be avoided

Besides the ones above, there are more visuals you can use. Among them, though, a few are generally avoided because they confuse the audience.

Pie / doughnut chart

  • Hard to tell which segment is larger, because quantitative differences are hard to judge in two-dimensional space
  • Instead, use a stacked bar graph, simple bar graph, or numbers
Pie chart / doughnut chart

3D visual

  • Skews visual perception
  • Introduces unnecessary chart elements
3D bar chart created by James Saunders

Secondary y-axis

  • Takes time to read and to understand what the visual means
  • Instead, label the points directly or separate the graph vertically
Gross change in action and comedy in one line graph with a secondary y-axis
Secondary y-axis split vertically

Step 3: Eliminate clutter & draw attention

Once you have chosen an appropriate visual, keep one thing in mind: every element you present adds to the audience’s cognitive load. So you should remove anything that does not add meaning to your point. Let’s go through the relevant aspects one by one.

Alignment

There are generally three types of alignment: left, right, and center. You might think center-aligned text looks clean, but in most cases left or right alignment is the better choice.

Center aligned title / left aligned title

As you can see, center alignment can’t create a clean vertical line, whereas left alignment can, so it looks cleaner. Left or right alignment is therefore preferable, and the choice between the two should depend on how the other elements in your visualization are arranged.

Another point about alignment is to avoid diagonal elements. When space is limited and there is too much text to fit in, people generally think of two options: shrinking the text or rotating it diagonally. Both make the text harder to read. On top of that, diagonal text draws unnecessary attention because everything else is oriented normally.

Bar graph with diagonal tick labels

So instead of using diagonal text, consider other options. For instance, in a bar graph, you can switch from a vertical bar graph to a horizontal one.
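If you generate charts programmatically, this decision can be made from the labels themselves. The sketch below is only a rule of thumb: the function name is hypothetical and the 12-character threshold is an assumption of mine, so tune it to your font size and chart width.

```python
def choose_bar_orientation(labels, max_chars=12):
    """Suggest 'horizontal' when any label is longer than max_chars, else 'vertical'."""
    longest = max((len(label) for label in labels), default=0)
    return "horizontal" if longest > max_chars else "vertical"


print(choose_bar_orientation(["Action", "Comedy", "Drama"]))
print(choose_bar_orientation(["Science Fiction & Fantasy", "Drama"]))
```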

White space

The next thing to consider is white space. It is often overlooked because people don’t realize how much it matters. But if you compare two versions of the same text, you will see why it deserves attention.

Text with little whitespace

What is your first impression of the text above? Do you think you could easily read information from such a visual? Now take a look at this version.

Text with enough whitespace

Compared to the first text, isn’t it more comfortable to look at?

As you have just experienced with the two text images, white space separates elements and gives the audience a moment to think about what they are looking at. The same idea applies to other visuals, so it’s crucial to include some white space in your design.

Quote image created by UXChoice

Size

This one may seem obvious. Size = importance: a bigger element reads as more important. So how does this apply to a graph?

Graph that has equal size text

Let’s say we have a graph with a title, axis labels, axis ticks, and an extra explanation. Among those, the title and the extra explanation are more important than the axis ticks and labels. So, to show visually that they matter more, it is natural to use a bigger font for them.

Graph that has larger text sizes for relatively important elements

Modifying text

Besides size, you can also highlight text with bold, italics, underlining, and color.

Among those, bold text stands out while adding minimal noise. Italic text also adds minimal noise but stands out less than bold. Underlined text is noticeable too, but it adds more noise than the other two methods.

Bold / Italic / Underline

Another way to highlight text is through color. But since color is one of the most important aspects of visualization, it gets its own section next.

Color

When used properly, color can be the most effective way to grab the audience’s attention. When it isn’t, it only distracts. So the first step, before applying any color to your visualization, is to push everything to the background. This means setting every element to light gray so that nothing grabs attention yet, which also gives greater contrast when you apply color in the next step.

Graph of every element pushed to the background

Then think about your message and what you should highlight to convey it. For instance, if you want to highlight the median gross of animation films in the graph above, you can give that bar a color.

Graph that highlights animation’s bar
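In code, this two-step approach amounts to giving every category the background gray by default and overriding only the ones that carry your message. A minimal sketch follows. The hex values and the function name are my own picks for illustration, and the resulting mapping could be fed to whatever charting tool you use.

```python
BACKGROUND = "#d3d3d3"  # light gray for everything pushed to the background
HIGHLIGHT = "#1f77b4"   # a single accent color (assumed; any one color works)


def highlight_colors(categories, highlighted):
    """Return {category: color}, coloring only the highlighted categories."""
    return {
        category: HIGHLIGHT if category in highlighted else BACKGROUND
        for category in categories
    }


genres = ["Action", "Animation", "Comedy", "Drama"]
print(highlight_colors(genres, {"Animation"}))
```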

But what if you have more than one message to convey? For instance, say you also want to highlight the two lowest-grossing genres in 2019. Do you add multiple colors?

Graph that highlights two points

The main thing to remember is that the more elements you make different, the less any of them stands out, and the audience can feel disoriented. According to Universal Principles of Design, you should highlight at most 10% of your visual. So if you think you are going over that limit, it’s better to repeat your visualization and highlight a different aspect in each iteration. This way you convey your message effectively, and the audience can take in each point without using too much brainpower.

Repeated visuals that highlight different points
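The 10% guideline can be turned into a rough planning step: count the elements in the visual, work out how many you may highlight at once, and spread the remaining highlights over repeated views. The sketch below is one possible reading of the guideline. The function names are hypothetical, and treating each bar as one "element" is my assumption.

```python
def max_highlights(n_elements, limit_percent=10):
    """Number of elements that may be highlighted at once (at least one)."""
    return max(1, n_elements * limit_percent // 100)


def split_into_views(highlights, n_elements, limit_percent=10):
    """Group the highlights so that each repeated view stays within the limit."""
    size = max_highlights(n_elements, limit_percent)
    return [highlights[i:i + size] for i in range(0, len(highlights), size)]


# Three points to highlight in a chart with 12 bars: one highlight per view.
print(split_into_views(["Animation", "Horror", "Drama"], n_elements=12))
```

With 12 bars the limit works out to one highlight per view, so three messages become three versions of the same chart.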

Now, the question is which color you should use. This largely depends on who you are working with. You have to remember that colors evoke emotion, and certain colors can mean different things in different cultures, companies, and situations.

For instance, if you search for “internet speed dial” on Google, you can see that a given color does not always carry the same meaning. Sometimes green stands for fast, and sometimes red does.

Internet speed dial search result on Google

So instead of choosing a color blindly, connect the color to the message you want to convey, as in the graphic from the Guardian below.

Graphic on the world’s glacier loss created by the Guardian

As its legend shows, the Guardian uses red to indicate a decrease and blue to indicate an increase. Given the message of the visual and the nature of glaciers, it isn’t hard for the audience to associate blue with gain and red with loss, and to see that the graphic is highlighting the loss of glaciers.

So there is no single right answer when it comes to color. Instead, it goes back to the first point of this post: you should know your audience and your message before choosing a color.

Lastly, about 1 in 12 men and 1 in 200 women in the world are color blind, and you can’t know beforehand whether anyone in your audience is. So it’s best to keep them in mind and use colorblind-friendly colors. Most commonly, color-blind people have trouble distinguishing red from green. See how red and green look in the deuteranopia and protanopia color wheels. Instead of red and green, it is better to use orange and blue.

Color wheels for different eye visions, created by Coolwinks

However, your client may ask you to use colors other than orange and blue. In that case, you can visit Coloring for Colorblindness to see which combinations work well, or to check how your chosen color combination looks to people with different types of color blindness.

A screenshot of Coloring for Colorblindness

Other considerations

Other than what I mentioned above, there are a few more things to consider.

First, don’t overcomplicate your visual. As mentioned earlier, every element you present adds to the audience’s cognitive load. So instead of making your visual flashy, make it easy to understand. This includes using an action title so the audience gets a head start on your message, using consistent fonts and colors so you don’t confuse the audience, and using straightforward language instead of acronyms.

Second, not every element is equally important. Some should always remain in your visual, while others can be removed or de-emphasized. Which elements to remove depends on your message and your audience’s cognitive load.

Generally, axis ticks aren’t as significant as the actual data points, but we don’t always remove them. For instance, say you want to convey the general trend of your data. In that case, it is a good idea to keep the axes to show the big picture. But when you want the audience to compare specific points, it can be better to remove the axes and label the data points directly.

As another example, the metrics shown on the axes are less significant than the data points. Removing them, however, adds to your audience’s cognitive load, because the audience then has to look outside your visual to find out what the data points mean, and that can distract them. So it depends on the situation.

So when deciding what to remove from a visual, ask yourself: would eliminating this change anything? If it would, how would the audience feel about the change?

References

[1] Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for Business Professionals. John Wiley & Sons, Inc., 2015.

[2] D3 visual examples
