FiveThirtyEight, data journalism and thinking about the counter-factual

I like the re-launch of FiveThirtyEight. The politics and U.S. elections coverage was excellent reading, but kicking it up a notch to cover sports, economics and a bunch of other stuff is even better. As a reader, with no paywall, it’s hard to see the downside. More quality content on a bigger budget with a broader reach.

As an example, this article, An America without Irish Immigrants, would struggle to find a home on other mainstream news websites. But it’s a quirky way to tie St. Patrick’s Day to immigration policy. Bonus. The article itself reinforces two points that matter in many policy environments: small changes can have big effects over time, and scale matters.

However, I’ve got an issue. The article establishes that about 27.6m people in the U.S. today can be linked to Irish immigration. This is assembled from Census data and is a pretty objective measure. The author points out this is as many people as live in Rhode Island, New Mexico, South Carolina and New York combined, a sizeable chunk of people. This gives the reader an empirical scale for thinking about how the Irish have contributed to the United States. Quantifying the cultural effect is harder than providing a nice number to think about, which means this content is original and helpful. Further, while it is historically incorrect, one can imagine situations where Irish emigrants from the 1820s chose not to travel to the U.S., thereby changing the demographic make-up of the U.S.

What is a much bigger stretch, and what I find a questionable use of immigration statistics, is the way the author asks us to imagine what this would mean for Ireland. An additional 27.6m would increase the population of Ireland by over 500 per cent. We are talking here about historical hypotheticals, which is a strange place to find oneself (albeit a common internet phenomenon). But unlike the U.S. example, you cannot simply imagine these people didn’t arrive. In this case, we are asked to imagine these people never left. This is a major difference, as it ignores emigration to any other country. From immigration history and theory, we know these differences are significant and meaningful.
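To make the arithmetic concrete, here is a rough sketch in Python. The 27.6m figure is the article’s; the 4.6m population for Ireland is my own assumed round number, and the shares of people ending up elsewhere are purely hypothetical values chosen for illustration. Under that assumed population the headline increase works out to roughly 600 per cent, but the number shrinks quickly once you allow for people going anywhere other than staying in Ireland.

```python
# 27.6m is the figure quoted from the article.
IRISH_LINKED_US_MILLIONS = 27.6
# Assumption: a round figure for Ireland's population, not from the article.
ASSUMED_IRELAND_POPULATION_MILLIONS = 4.6


def percent_increase(base: float, added: float) -> float:
    """Percentage by which `base` grows when `added` is put on top of it."""
    return added / base * 100


def added_after_diversion(added: float, share_elsewhere: float) -> float:
    """People left to add if `share_elsewhere` of them ended up in other countries."""
    return added * (1 - share_elsewhere)


# The article's scenario: every one of the 27.6m is added to Ireland.
headline = percent_increase(
    ASSUMED_IRELAND_POPULATION_MILLIONS, IRISH_LINKED_US_MILLIONS
)
print(f"Nobody leaves: {headline:.0f}% increase")

# Hypothetical shares going to Europe, Australia, Canada and so on.
for share in (0.5, 0.9, 1.0):
    remaining = added_after_diversion(IRISH_LINKED_US_MILLIONS, share)
    increase = percent_increase(ASSUMED_IRELAND_POPULATION_MILLIONS, remaining)
    print(f"{share:.0%} go elsewhere: {increase:.0f}% increase")
```

The point is not the specific percentages, which depend entirely on the made-up shares, but that the “never left” scenario is only one end of a range.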

Could the U.S., through small quirks of history, have ended up with 27m fewer people? There is a not insignificant probability. Could Ireland, through small quirks of history, have ended up with 27m more people? No. There is no probability. Those emigrants would have gone to Europe, Australia, Canada. Anywhere except poor, hungry Ireland.

This is not a major flaw. It’s an isolated example in a single article. Yet I think it points to what is really hard about data journalism, and that’s understanding the story behind the data. To date, articles on FiveThirtyEight have been interesting and engaging. I spent more time than I should have looking through their College Basketball coverage. The data, and the counterfactuals you build from the data, have to be worthwhile and show something clear. This happened in talking about the Irish in the United States but was lacking when discussing Ireland itself.

I look forward to reading more, and I’m sure little things like this will improve along the way.


