The saying “Correlation does not imply causation” has been echoed throughout statistics classrooms and science hallways worldwide, and is seemingly uttered at least once at every major scientific conference. In short, it means that just because two variables are correlated (i.e., associated with each other), it does not follow that one causes the other. But too often, especially in observational health research, correlations between diseases, biomarkers, risk factors and lifestyle practices are interpreted as causative, potentially impacting health policy, practice and interventions without solid evidence.
Why does this happen? Well, sometimes a correlation can appear when there are just too few data points under review. If these data points happen to line up by chance, a relationship appears. Take, for example, the relationship between lemon imports and highway fatalities. Data from a five-year period (i.e., just five data points) suggest a strong, inverse association between highway fatality rates and lemon imports from Mexico. But clearly, this is a coincidental association, perhaps mediated by additional factors. Sometimes, however, correlations occur even within larger, seemingly robust datasets. A great new website, Spurious Correlations, illustrates some very humorous examples of large-scale federal database comparisons. For example, the divorce rate in Maine correlates with per capita U.S. consumption of margarine. Butter is apparently better as far as marriage is concerned! Other spurious correlations suggest that total U.S. ski resort revenue correlates with the number of people who died by becoming tangled in their bedsheets. As you can see if you visit the website, these examples are so ludicrous that no one would suggest public policy or social programs be designed to target one variable in order to improve the other.
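To get a feel for how easily a handful of data points can line up by chance, here is a minimal simulation sketch. It is not taken from any of the datasets mentioned above: the function names, the 0.8 cutoff for a "strong" correlation, and the choice of normally distributed random numbers are all assumptions made purely for illustration. It generates many pairs of completely unrelated series and counts how often they nonetheless look strongly correlated.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def share_of_strong_correlations(n_points, n_trials=5000, threshold=0.8, seed=0):
    """Fraction of trials in which two INDEPENDENT random series of length
    n_points show |r| >= threshold (hypothetical cutoff for 'strong')."""
    rng = random.Random(seed)
    strong = 0
    for _ in range(n_trials):
        xs = [rng.gauss(0, 1) for _ in range(n_points)]
        ys = [rng.gauss(0, 1) for _ in range(n_points)]  # unrelated to xs
        if abs(pearson_r(xs, ys)) >= threshold:
            strong += 1
    return strong / n_trials


if __name__ == "__main__":
    for n in (5, 50):
        share = share_of_strong_correlations(n)
        print(f"{n} data points: {share:.1%} of unrelated pairs have |r| >= 0.8")
```

Under these assumptions, with only five data points a strong-looking correlation should turn up in something like one in ten unrelated pairs, while with 50 data points it should essentially never happen. Five years of lemon imports is exactly the kind of sample where chance alone can do the work.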
Where correlations become truly problematic, though, is in our interpretation of and response to findings that have the potential to change behavior and knowledge. For example, a 2009 study looking at the relationship between the number of marathons run each year and medication use concluded that marathon runners have lower rates of high blood pressure, high cholesterol and diabetes. Why? The data showed that the more marathons an individual ran, the less likely he or she was to use medications for blood pressure, cholesterol and diabetes. While these data have some value for clinicians, they don’t truly depict whether running more marathons is preventive for cardiovascular and metabolic disease. For example, runners may use fewer medications than nonrunners for preventing and treating disease because exercise is an effective alternative to medication. Or, there may be genetic differences between runners and nonrunners that are also associated with disease prevalence. Correlation is not causation.
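The genetic-differences scenario is what statisticians call confounding, and a toy simulation can show how it plays out. The sketch below is entirely hypothetical and uses no data from the 2009 study: it invents a hidden "baseline health" score that pushes marathon running up and medication use down, with no direct link between running and medication at all. The variable names, effect sizes and the use of a partial correlation to adjust for the hidden factor are illustrative assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after adjusting for zs
    (first-order partial correlation)."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def simulate_runners(n_people=20000, seed=1):
    """Made-up population: a hidden health score drives both variables.
    Values are standardized scores, not real counts of races or pills."""
    rng = random.Random(seed)
    health, marathons, medications = [], [], []
    for _ in range(n_people):
        h = rng.gauss(0, 1)                           # hidden confounder
        health.append(h)
        marathons.append(h + rng.gauss(0, 1))         # healthier -> runs more
        medications.append(-h + rng.gauss(0, 1))      # healthier -> fewer meds
    return health, marathons, medications


if __name__ == "__main__":
    health, marathons, medications = simulate_runners()
    print("raw correlation:     ", round(pearson_r(marathons, medications), 2))
    print("adjusted for health: ", round(partial_r(marathons, medications, health), 2))
```

In this made-up world, running and medication use are clearly negatively correlated, yet the association should shrink to roughly zero once the hidden health score is accounted for. The catch in real observational research is that the confounder is often unmeasured, so there is nothing to adjust for.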
In fact, even longitudinal observational studies can be misleading with respect to interpretation and representation. For example, a recent study looking at caloric and fat intake over time in statin vs. non-statin users has generated substantial media coverage. Over the 10-year observational study period, adults on a statin (a cholesterol-lowering drug) increased fat intake and caloric consumption, whereas non-statin users did not. Consequently, body mass index increased to a greater extent in statin vs. non-statin users, leading the authors to subtitle the article: “Gluttony in the Time of Statins?” Anyone reading the media coverage might become concerned that statin drugs, which are life-saving drugs that dramatically reduce the likelihood of cardiac events, contribute to obesity. But of course these data can’t actually distinguish between various cause-and-effect scenarios. Does statin use really increase appetite? Or are statin users simply prone to less healthy diet and lifestyle patterns (perhaps this is why they needed a statin at baseline)? Or do the psychological effects of a prescription lead to a false belief that diet is less important for disease prevention?
The point of all this is simply that before you get worried about lemon imports, bedsheets, and butter consumption, keep in mind that research and interpretation are always more limited by what we don’t know than by what we do.