Thursday, June 23, 2011

PSA: Correlation is not causation

Today I'm going to do a public service announcement: correlation is not causation. How does this relate to healthy habits and weight loss? Stick around!

Journalistic standards tend to fall when authors are reporting on medical findings. I believe that the reason for this is two-fold: 1) journalists aren't necessarily well-versed in science, and 2) saying something is associated with another factor is far less interesting than saying that it causes another factor. The most common problem I see, aside from misinterpreting some results, is that reporters write as though correlation is the same as causation. I saw a headline today that really threw me--"potatoes can add plenty to waistline"--so I had to investigate!

Anyone who has taken a basic statistics or research methods course can tell you that there are three rules for determining causation:
  1. Establish correlation.
  2. Define time relationship.
  3. Eliminate intervening variables.

I'll briefly cover what those mean. The first one is pretty simple; you have to show that two things are related reliably (meaning that your test can be replicated and return the same results) and validly (meaning that your test measures that which you intend to measure). The second one is also fairly simple; you have to demonstrate that item A (the causal factor) precedes item B (the caused result).

The third is slightly more difficult, and it's the one that journalists skip most frequently. Intervening variables include anything else that could have an effect on item B, even though item A may appear to cause the change. My statistics professor gave the best example I know. Let's say you're at a football game. It starts to rain. Everyone opens umbrellas, and the game continues. Then, suddenly, the rate of fumbles increases. After observing this over a number of games, you notice that people always open umbrellas before the fumbles increase. Therefore, you conclude that umbrellas cause fumbles.

Nonsensical, right? We've established 1 and 2, but we haven't eliminated an important intervening variable: the rain! For such a simple example, anyone would confidently say "Wait! You've got it wrong! The two are correlated, but one doesn't cause the other."

A recent study from Harvard University scientists (Mozaffarian, Hao, Rimm, Willett, and Hu) published in the New England Journal of Medicine found that "4-year weight change was most strongly associated with the intake of potato chips (1.69 lb), potatoes (1.28 lb), sugar-sweetened beverages (1.00 lb), unprocessed red meats (0.95 lb), and processed meats (0.93 lb) and was inversely associated with the intake of vegetables (−0.22 lb), whole grains (−0.37 lb), fruits (−0.49 lb), nuts (−0.57 lb), and yogurt (−0.82 lb) (P≤0.005 for each comparison)" [emphasis mine]. Several dozen newspapers and blog sites around the country picked up this important finding and completely distorted it.
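For anyone who likes to see the umbrella example in numbers, here is a minimal sketch of it as a simulation. Everything in it is an assumption made up for illustration (the chance of rain, how often umbrellas come out, the fumble counts, and the names `simulate_games` and `pearson`); none of it comes from real football data or from the Harvard study. Rain drives both umbrellas and fumbles, and umbrellas have no effect on fumbles at all.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_games(n_games=2000, seed=0):
    """Return a list of (rain, umbrellas, fumbles) tuples, one per game.

    All probabilities and averages below are invented for illustration.
    """
    rng = random.Random(seed)
    games = []
    for _ in range(n_games):
        rain = rng.random() < 0.3
        # Umbrellas depend only on rain.
        umbrellas = rng.random() < (0.9 if rain else 0.05)
        # Fumbles (a noisy stand-in for a per-game count) also depend
        # only on rain. Umbrellas never enter this line.
        fumbles = rng.gauss(4.0 if rain else 2.0, 1.0)
        games.append((rain, umbrellas, fumbles))
    return games


def umbrella_fumble_correlation(games):
    """Correlation between umbrella use (0 or 1) and fumbles."""
    umbrellas = [1.0 if u else 0.0 for _, u, _ in games]
    fumbles = [f for _, _, f in games]
    return pearson(umbrellas, fumbles)


games = simulate_games()
overall = umbrella_fumble_correlation(games)
rainy = umbrella_fumble_correlation([g for g in games if g[0]])
dry = umbrella_fumble_correlation([g for g in games if not g[0]])

print(f"all games:   {overall:.2f}")
print(f"rainy games: {rainy:.2f}")
print(f"dry games:   {dry:.2f}")
```

Across all games, umbrellas and fumbles come out clearly correlated. Once you look at rainy games and dry games separately, the correlation should drop to roughly zero, because the rain was doing all the work. That is what "eliminating an intervening variable" looks like in practice.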

Each of those three articles uses terminology to indicate that potatoes "cause," "contributed to," or "led to" weight gain. That's not what the scientists say. It's a subtle difference, but the study found that daily servings of potatoes, particularly fried in some way, were associated with higher weight gain. Everyone in the study gained weight on average, so it's not as though the potato lovers were the only ones packing on the pounds. Journalists are overlooking key intervening variables! Do people who regularly eat some form of potato have other lifestyle habits that might contribute to higher weight gain? The scientists even point out that no single food or drink can be shown to affect weight gain consistently across the board: "Eating more or less of any one food or beverage may change the total amount of energy consumed, but the magnitude of associated weight gain varied for specific foods and beverages. Differences in weight gain seen for specific foods and beverages could relate to varying portion sizes, patterns of eating, effects on satiety, or displacement of other foods or beverages" [emphasis mine].

Furthermore, I'm guessing that the journalists didn't actually look at the results graph. The category of potatoes--which they all like to point out adds 1.28 pounds over four years--includes two subcategories: 1) French fried (3.35 pounds over four years), and 2) boiled, baked, or mashed (0.57 pounds over four years). Viewed that way, French fries had the strongest association with weight gain at 3.35 pounds, far higher than the 1.69 pounds associated with potato chips. Boiled, baked, or mashed potatoes still weren't negatively correlated with weight gain (which would indicate weight loss), but that subcategory ranked 8th of the 23 variables in terms of weight gain, landing it between trans fat (0.65 pounds over four years) and refined grains (0.39 pounds over four years).

I know this has been a long blog post, but it's important to keep the correlation versus causation distinction in mind when we read these stories that could influence us to change our lifestyles. Is it easy to learn to interpret statistics to find meaningful results? No, it's certainly not. Is it worth the effort not to mislead the American public? Yes, I absolutely think it is.


  1. Honestly, I think most journalists aren't... quite sharp enough for this stuff. I have my degree in journalism, and they don't teach this stuff in school -- generally, you don't learn specialty reporting at all. There are science reporters, of course, but in the case of more "mainstream" studies, you get run-of-the-mill journalists writing about them. And, yes, both they and their editors want the most salacious headline possible, to sell newspapers (or, nowadays, get clicks).

    I've always been clued into causation & correlation because I'm a HUGE NERD and when we studied it in my HS psychology class I was like OMG COOL and did more reading. I love stuff like this, because it's like a logic puzzle -- read the misleading headline, then the article, then click through to the source and see how poorly the journalist interpreted the study.

    So I think it's part willful misunderstanding for the sake of selling news, and part lack of understanding of the nuances and subtleties of scientific study. With weight related studies, in particular, it gets RIDICULOUS because everyone wants the headline that trumpets OMG X, Y, Z MAKES YOU FAT.

  2. @CurvyNerd I'm always surprised when people don't learn specialties; then again, as a library science major, we couldn't possibly have enough grad courses to cover each library for which we could ever have the opportunity to work (law, engineering, transportation, etc.).

    "[I]t's like a logic puzzle -- read the misleading headline, then the article, then click through to the source and see how poorly the journalist interpreted the study." That's exactly what I do every time I see an article that gives me pause. Librarians are trained to do information literacy, so hopefully I can spread the word through this blog.