The Use and Abuse of Statistics

Statistics is wonderful for the way in which it can help us understand the patterns within data, but its widespread use has also sometimes led to widespread abuse. One of the more common ways in which statistics is abused is by causing people to confuse correlation with causation. A good correlation between two variables simply means that there is a moderate to strong association between the two variables, so that we can predict when one of them will increase or decrease simply by knowing whether the other is increasing or decreasing. However, as any first-semester statistics student knows, correlation doesn't imply causation. For example, the chart below shows that there is a moderately strong correlation between the number of pictures that Nicolas Cage makes in a year and the number of people who drown by falling into a swimming pool that year. However, I seriously doubt that Nicolas Cage making a movie causes a person to drown. Correlation does not imply causation.
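To see how easily a coincidental correlation can arise, here is a minimal sketch in Python. It does not use the Nicolas Cage or drowning figures; it simply generates pairs of completely unrelated random "yearly" series and counts how many pairs look moderately to strongly correlated anyway. The function names, the ten-year length, and the 0.6 threshold are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_chance_correlations(num_pairs=1000, years=10, threshold=0.6, seed=0):
    """Count unrelated random pairs whose |r| still reaches the threshold.

    All parameters are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_pairs):
        # Two series drawn independently: neither can influence the other.
        a = [rng.gauss(0, 1) for _ in range(years)]
        b = [rng.gauss(0, 1) for _ in range(years)]
        if abs(pearson(a, b)) >= threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_chance_correlations()
    print(f"{hits} of 1000 unrelated pairs had |r| >= 0.6")
```

With short series like these, some unrelated pairs will clear the threshold purely by chance, so if you search through enough variables you should expect to find a pairing as striking as movies and drownings.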

When we do have a strong correlation between two variables, this is sometimes the result of a third variable that affects them both. When this happens, we call that third variable a confounding variable, and this is another reason why we should not conclude causation when all we know is that two variables are correlated. It just may be that it is the confounding variable which is the real cause of what we have observed. For instance, tropical conditions such as high temperature, high humidity, and high rainfall are positively correlated with the number of cases of malaria, but tropical conditions, in themselves, do not cause malaria. Rather, as conditions become more tropical, the number of malaria-carrying mosquitoes also increases. In this instance, the variables of tropical conditions and mosquitoes are confounded with one another, and if one didn't know better, then one might make the erroneous assertion that humidity causes malaria.
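The effect of a third variable can be sketched with a small simulation. In the Python code below, a hypothetical variable z drives both x and y, while x and y have no influence on each other. The data are synthetic, the names are made up for illustration, and the adjustment step uses the fact that we built the data ourselves and therefore know exactly how much z contributes; with real data that contribution would have to be estimated, for example by regression.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounding(n=5000, seed=1):
    """Return (raw, adjusted) correlations between x and y.

    z is the confounder: it feeds into both x and y. Neither x nor y
    appears in the formula for the other, so there is no causal link
    between them. All values here are synthetic assumptions.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]

    raw = pearson(x, y)

    # Remove z's contribution (its coefficient is 1 by construction),
    # leaving only the independent noise in each variable.
    x_without_z = [xi - zi for xi, zi in zip(x, z)]
    y_without_z = [yi - zi for yi, zi in zip(y, z)]
    adjusted = pearson(x_without_z, y_without_z)

    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate_confounding()
    print(f"correlation of x and y:            {raw:.2f}")
    print(f"after accounting for confounder z: {adjusted:.2f}")
```

In this setup x and y are clearly correlated even though neither causes the other, and the association essentially disappears once the shared influence of z is taken out.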

Below is a very misleading graph from Smart Approaches to Marijuana (SAM), an organization that opposes the legalization of cannabis for recreational use and that also opposes use of the whole plant for medical purposes. The title of the graph is "Youth Use in Colorado Going Up." My biggest problem with this graph is that it appends certain events to the timeline in order to give one the impression that those events are causing what is seen on the timeline. In this instance, we should remind ourselves that absolutely no causal relationship has been established. We could just as easily add an event such as "George W. Bush re-elected" to suggest that Republican presidents cause youth to turn to marijuana! Another problem with this graph is that recreational marijuana was legalized in Colorado in 2012 and went on sale in January 2014, and that period isn't even covered in the graph below. Additionally, notice that we see a drop in teen use from 2011 to 2012, and that the "Decision to allow UNLIMITED # of caregivers" occurs in the middle of an increase in use, not at the beginning. Thus, it's really difficult to conclude from the information presented that medical marijuana is a cause of any increase or decrease in use among youth ages 12 to 17. From a statistical perspective, I find this entire graph unethical.

A few years ago at the website for the White House, I found a publication titled "Marijuana Myths & Facts," and one of the "myths" it contained was the assertion that "marijuana makes you mellow." A link to this document is given below along with links to the two studies it references in order to imply that smoking marijuana causes violence. In the first study, the population being studied was African-American, inner city, lower economic status, young adults who used drugs, and even the author of this study states that the results may not apply to middle-class African-Americans, let alone other populations! Additionally, those individuals in the study who used marijuana the most were also more likely to be involved in drug trafficking. Consequently, it is quite likely that the violent behavior observed was due more to the drug trafficking activities than to the use of cannabis. In the second study, the adolescents who used marijuana more frequently and who exhibited more troubled behavior were also more likely to come from households without two parents and to have moved 2 or more times within the past year. Can you say "confounding variable"? Neither of these studies provides proof or convincing evidence that marijuana use causes an increase in violence. Furthermore, we should not discount the personal testimonies of the millions of users who will readily tell you, "Yes, marijuana does make me more mellow."

In summary, anytime you see a statement that says that marijuana is "associated with" or "linked with" a certain outcome, remember that all that you are given is that a correlation exists, and that, in itself, does not prove that there is a causal relationship between marijuana and the given outcome. Sometimes, there may indeed be a causal relationship, but that is something that has to be established by additional science. Many other times, the relationship between cannabis and the outcome may be either accidental or due to a confounding variable. However, to take correlational relationships and to imply to the general public that they are causal relationships is just extremely dishonest and blatantly unethical.

Another problem that I often see in the media is the frequent reporting that so-and-so had marijuana in their system when they were involved in some incident. What you need to know is that while marijuana may make you high for a few hours, the metabolites of marijuana can remain in your system anywhere from a few days to a month. In other words, just because a person had metabolites in their system when they had a traffic accident, it in no way means that they were high when the accident happened. In fact, they could have last used cannabis almost a month ago! Thus, anytime you hear a report that someone had marijuana in their system when something happened, just ignore it. The current tests for metabolites of THC tell us nothing about the person's state of consciousness at the time of the incident.

And finally, you will sometimes read that there are conflicting studies regarding whether or not cannabis is associated with a certain outcome. When this happens, it's a pretty safe bet that marijuana is not the cause of the outcome in question. For example, if A causes B, then we pretty much expect B to be present anytime A is present. If instead, when A is present, B is present only half the time, then clearly something else is going on, and the conclusion that A is a strong cause of B is definitely in doubt. In this case, more research is needed before any definitive conclusions can be drawn!

Overall, I find many of the government websites to be biased in their assessment of cannabis. They tend to exaggerate the harms, ignore the health benefits, and present associations as causal relations, usually using a phrase such as "marijuana has been linked with ... ." The following Dilbert cartoon, in my opinion, summarizes their biased approach.

There are, however, some exceptions to this bias within the government, and these exceptions are generally found at federal websites that serve as repositories for scientific research. Below are a couple of the better websites I've found.