Finding the best chart for your story
It’s September again, and I was reminded of this chart redesign I started a year ago, but never really finished. In September 2023, the rector of the Catholic University of Leuven (KU Leuven) gave his speech for the opening of the academic year. In this speech, he stressed the international role of the university, and I was happy to see that he used a lot of charts to support his arguments.
Unfortunately, the quality of the charts made me a bit sad. Not only were they simply default Excel charts, they also showed that not much thought had gone into them. Chart choices, axes, legends, colors, titles… a lot of things could be improved.
If you’re interested, you can read the full speech on the university website, or you can watch it on YouTube.
One of the crucial figures in the presentation shows the fraction of Belgian versus international PhD students at KU Leuven:
I have quite a few issues with this chart. Some are major errors, some are just tiny changes that would make it much stronger.
- Poor vertical axis choices. The absence of axis title and ticks hides the fact that the vertical axis does not start at zero. For a bar chart, where the data is encoded in the size of the bars, this is a major problem. Our brain tries to compare the data values by comparing the length of the bars, but the differences between the bars are visually exaggerated because a large part of them has been cut off. For example, in the academic year 2022–2023 the orange bar is visually more than twice as big as the blue bar, but the data value is only around 20% bigger.
- Poor signal-to-noise ratio. For every academic year, the sum of both fractions is always equal to 100%, as PhD students can only be Belgian or international. That means that the same information is duplicated in the chart. Simply showing the fraction of international students would suffice.
- No title, caption, or annotation. The data can be understood by looking at the chart, but to understand why this chart is shown, and how it fits in the story, we are forced to read the (very long) text it is part of.
- Sloppy details. I can only imagine how much work was put into writing the perfect opening speech, but unfortunately far less effort seems to have been put into making the charts. The basic Excel color scheme, overlapping labels, a superfluous legend instead of direct labeling… None of these are dramatic, but with a little bit more time and effort, the chart could look a lot better.
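To put a number on the truncated-axis problem from the first point: the drawn height of a bar is proportional to its value minus the axis baseline, not to the value itself. Here is a minimal sketch of that arithmetic. The values (55% and 45%) and the baseline of 40% are made up for illustration and are not the actual KU Leuven figures; `visual_ratio` is just a helper name I picked.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the vertical axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical percentages, chosen only to illustrate the effect.
international, belgian = 55.0, 45.0

# With the axis starting at zero, bar heights match the data.
print(f"axis from 0%:  {visual_ratio(international, belgian):.2f}x")

# With an assumed baseline of 40%, the same data looks far more extreme.
print(f"axis from 40%: {visual_ratio(international, belgian, baseline=40.0):.2f}x")
```

With these example numbers, a value that is roughly 22% bigger is drawn as a bar three times as tall.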
A first attempt to fix the bar chart issue could be to start the vertical axis at zero. Our chart then looks like this:
This chart is more accurate, but to be honest, it is not really what we are looking for. The reason why we are showing this data is to explain that the fraction of international PhD students has continuously increased over the past few years, and has surpassed the fraction of Belgian students. But by zooming out, this increase is much less visually impactful. Even though it is only a few percentage points, it is an important trend and we are looking for a chart that clearly highlights this trend.
The solution for our problem is probably to switch to another chart type. Because the total value of (Belgian PhD students) + (international PhD students) always equals 100%, we could be tempted to use a chart type suited to visualize a part-to-whole comparison. One alternative that comes to mind is a stacked bar chart (rather than the original clustered bar chart):
Hmm, that didn’t really help. In fact, our problem got worse and the growing trend is now even less obvious. That’s because the vertical range has grown from 60% to the full 100%, so changes have become even smaller.
To solve our problem, we need to zoom in on the trend, not zoom out by showing the entire 100%.
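The zooming effect is easy to quantify: the share of the plot height that a change occupies is simply the change divided by the axis range. The sketch below assumes a hypothetical change of 5 percentage points, and a zoomed-in axis running from 40% to 60%. Both are my own illustrative choices, not values read from the charts.

```python
def share_of_plot_height(change, axis_min, axis_max):
    """Fraction of the plot height spanned by a change of `change`
    percentage points on an axis running from axis_min to axis_max."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be larger than axis_min")
    return change / (axis_max - axis_min)


change = 5.0  # hypothetical change, in percentage points

axes = [
    ("clustered bars, 0-60%", 0.0, 60.0),
    ("stacked bars, 0-100%", 0.0, 100.0),
    ("zoomed in, 40-60% (assumed)", 40.0, 60.0),
]
for label, axis_min, axis_max in axes:
    share = share_of_plot_height(change, axis_min, axis_max)
    print(f"{label}: {share:.1%} of the plot height")
```

The narrower the axis range, the more room the same trend gets, which is exactly why we want a chart type that does not force the axis to start at zero.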
A common approach to zoom in while avoiding the problem of an axis that does not start at zero is to switch to a dot plot, which uses the vertical position of dots rather than the length of bars to encode the data, or to a simple line chart. A dot plot could look like this:
This alternative chart type enables us to nicely align the dots vertically, so it is clear that we are comparing them for the same academic years. And we can go one step further if we want to highlight the gap between Belgian and international students first shrinking, and then increasing again. We could visually show the gap by connecting the dots with a line. The result is a connected dot plot, or a so-called dumbbell chart:
This is a much clearer chart for this type of data, clearly emphasizing the changing gap between both categories. However, I’m not really happy with the result. You can see the gap shrinking and growing again, but I have the feeling the direction and evolution could be even more clear. One approach could be to color the gaps according to which category is bigger:
That’s better, but an even more powerful approach might be to switch to a line chart, with colored areas between the two lines:
My apologies for the anticlimax where the simple line chart turns out to be the most clear option…
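Both the colored dumbbell chart and the shaded line chart rely on the same small piece of logic: for every academic year, work out how big the gap is and which category is ahead, and pick the color from that. A possible sketch, independent of any charting tool, could look like the one below. The data, the labels and the color names are all placeholders of mine, not the real numbers or the official KU Leuven palette.

```python
def classify_gaps(labels, belgian, international):
    """For each label, return the gap between the two series
    and which of the two is bigger."""
    rows = []
    for label, b, i in zip(labels, belgian, international):
        if i > b:
            leader = "international"
        elif b > i:
            leader = "belgian"
        else:
            leader = "tie"
        rows.append({
            "label": label,
            "low": min(b, i),    # bottom of the connecting line or area
            "high": max(b, i),   # top of the connecting line or area
            "gap": abs(i - b),
            "leader": leader,
        })
    return rows


# Placeholder colors and data, for illustration only.
colors = {"belgian": "blue", "international": "orange", "tie": "grey"}
labels = ["year 1", "year 2", "year 3"]
belgian = [52.0, 50.0, 47.0]
international = [48.0, 50.0, 53.0]

for row in classify_gaps(labels, belgian, international):
    print(row["label"], row["gap"], colors[row["leader"]])
```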
Now that we have decided which chart type we want to go for, it’s time to clean up the design a little bit, and make some modifications:
- Step away from the default Excel colors (let’s choose some KU Leuven themed colors)
- Remove the legend in favor of direct labeling
- Remove some of the labels to declutter the chart
- Add tick labels on the vertical axis to see the range we’re looking at
- Reduce the size of those circles a bit
Here’s where that leaves us:
That looks nice, but we’re not done yet. As always, it’s a good idea to make sure our key message is loud and clear. In this case, I would do that by adding a title, and maybe a clarifying annotation around the crossover of the two lines. This makes the chart able to ‘live by itself’ — it can easily be shared without needing too much context.
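To position an annotation at the crossover, it helps to know where the two lines actually intersect, which usually falls somewhere between two data points. Assuming the points are connected by straight line segments (as in a plain line chart), the intersection can be found by linear interpolation. The function below is a sketch under that assumption; `find_crossover` and the example values are hypothetical, and the x positions are simple indices standing in for academic years.

```python
def find_crossover(xs, a, b):
    """Return the (x, y) point where series b first catches up with
    or overtakes series a, assuming straight segments between points.
    Returns None if b never overtakes a."""
    for k in range(len(xs) - 1):
        d0 = b[k] - a[k]          # difference at the left point
        d1 = b[k + 1] - a[k + 1]  # difference at the right point
        if d0 < 0 <= d1:
            t = d0 / (d0 - d1)    # position along the segment, 0..1
            x = xs[k] + t * (xs[k + 1] - xs[k])
            y = a[k] + t * (a[k + 1] - a[k])
            return x, y
    return None


# Hypothetical percentages; x positions are just indices of the years.
xs = [0, 1, 2, 3]
belgian = [53.0, 51.0, 49.0, 47.0]
international = [47.0, 49.0, 51.0, 53.0]

print(find_crossover(xs, belgian, international))
```

In this made-up example the lines cross halfway between the second and third data point, at 50%, which is where the annotation would go.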
Purists would (rightfully) claim that we can always go further in reducing the amount of clutter in the chart, and increasing the signal-to-noise ratio. They would probably end up with a chart like this:
Is this a good chart for our data and story? Yes, I think so. Does it have a high signal-to-noise ratio? Absolutely! Do I prefer it over the previous option? Hmm, maybe not. I have the feeling we removed a bit too much:
- Even though the information is duplicated, people might want to compare the international students with the Belgian students. It also makes the crossover clearer.
- I’m a fan of horizontal gridlines, because they give us the ability to have a more detailed look at the intermediate evolution over the years.
- Adding an additional color gives the chart more of a ‘KU Leuven’ branded look and feel, which is quite appropriate for an opening speech by the rector.
In summary, the ideal chart for this data is a boring line chart, but we can maximize the signal and reduce visual noise by making some clever design choices, and paying attention to all the details.
What do you think? Do you agree with my modifications, or not? Any suggestions for further improvements?
Looking for more chart inspiration? I’m the author of ‘Powerful chart’, a hands-on book about the art of creating clear, correct and beautiful data visuals. Read all about it, including the Table of Contents and a sample, at baryon.be/book.