I've never received as many emails from readers as I did for my most recent post on line graphs, analytics, Amazon, Excel, and Tufte, among other things. It turns out that countless consultants, bankers, and analysts still wake up in a cold sweat at the recollection of spending hours formatting graphs in Excel. I opened every last Excel spreadsheet you sent me, apparently we all have one or another lying around as a souvenir of our shared trauma.
[It's fun to hear directly from my readers, I encourage more of it. When so much of online discourse is random strangers performing drive-by violence, or chucking Molotov cocktails at you, it's somewhat old-fashioned and comforting to receive a friendly email. It reminds me a lot of the early days of the Internet, a more idyllic time, when Utopian dreams of the power of this technology hadn't been crushed by the darkness in the souls of mankind.]
Many readers just shared their stories of Excel frustration, but a few offered helpful tips. One class of these was just a recommendation to try alternative programs for charting, like Tableau, ggplot, or D3. I have not had time to play with any of these, though I remember working with Tableau briefly at a company that had purchased a license. I'm not familiar enough with any of these to offer any meaningful comparison to Excel.
I still hope that someone who works on Excel will just upgrade their default charting options because of the sheer number of Microsoft Office users. Most companies don't license Tableau, and many may not have the inclination or time to learn to use ggplot and D3, though the latter, in particular, seems capable of generating some beautiful visualizations. In contrast, I can't recall any company I worked at that didn't offer me a Microsoft Office license if I wanted one.
A few readers mentioned they wrote macros to automate some or most of the formatting tricks I set out in my post. Maybe at some point this will become an open source tool, which I'd love. If I spot one I'll share it here. Excel users are a tight knit community.
A couple readers offered a similar solution to my problem with generating dynamic data series labels, but my favorite, the most comprehensive option, came from a reader named Jeffrey, a finance associate.
It's a clever little hack. Here's how it works:
- Fill in your data table in Excel as normal, with data series in rows, time periods (or whatever the x-axis will be) in columns.
- Add one duplicate row for each data series below those data series in your table, and name each of those the same as the data series above, in order. For example, if your data series are U.S., Japan, and Australia, you now add three more rows, named U.S., Japan, and Australia.
- For these dummy data series, leave all the data cells blank except the last one, which in my example was the year 2014. For those three cells, just use a formula so that that cell equals the corresponding last data point from the corresponding data series above. So, for the dummy data series for Australia, my formula for 2014 just points to the value in the cell for 2014 for the actual Australia data series.
- Now create the chart for your data table, including both the actual and dummy data series. In the resulting line graph, you now have six data series instead of three, but the three dummy data series just have a single data point, the last one, which overlaps the corresponding last point in the actual data series.
- Do all the visual hacks I mentioned in my previous post. When it comes to your three dummy data series, instead of displaying the actual data label, select "Series Name" in the Format Data Labels menu.
That's it! Now you have data labels that will move with any change in the last data point of your three data series.
It's a hack, so it's not perfect. You still have to update your table each time you add to it, moving the dummy pointers one cell to the right. And selecting the data series when it exactly overlaps the actual data series can be tricky. But the thing is, all my old ways of setting up the charts were hacks, too, and this one saves the work of moving data series labels whenever the scale of the graph is changed.
It would be easier if Excel just offered a way to display the data labels and the series name for every data series. Maybe someday? The communication arc of the universe is long, and I like to think it bends towards greater efficiency.
Thanks Jeffrey! Also, thanks to my old coworker Dave who, more than any of my other readers, sends me precise copy-edits to my posts. I wish I were a better editor of my own work, but familiarity breeds myopia.