Thursday, June 20, 2013

Study Concludes that Youth Access Laws Reduce Adult Smoking Rates, Even Though Laws Have Been Shown Not to Affect Youth; The Problems of Confounding and Multiple Comparisons

A new study published online ahead of print in the American Journal of Public Health concludes that state youth access laws (i.e., laws that restrict the sale of cigarettes to minors) significantly reduce smoking rates once these youth become young adults.

(See: Grucza RA, Plunk AD, Hipp PR, et al. Long-term effects of laws governing youth access to tobacco. American Journal of Public Health. Published online ahead of print on June 13, 2013: e1-e7. doi: 10.2105/AJPH.2012.301123.)

The study examined the relationship between respondents' reports of their smoking status at age 18-34 at the time of a national survey conducted during the period 1998-2007 and the status of the youth access law in their current state of residence when they were less than 18 years of age.

The study examined three different outcomes: (1) ever smoking; (2) current smoking; and (3) heavy smoking. It examined results for all respondents, for just males, and for just females. It separately examined nine different types of youth access policies (e.g., bans on vending machines, ID requirements, signage requirements, inspection requirements, and free distribution restrictions). It also examined the effects of combinations of 4 selected policies and of all 9 policies together.

The study found that among males, there was no association between youth access policies and any of the measures of smoking status. Among females, there were significant associations between vending machine restrictions and all three smoking measures, between identification requirements and two smoking measures, and between each of three other policies (repackaging restrictions, statewide enforcement authority, and free distribution restrictions) and one smoking measure.

The paper concludes: "Our findings suggest that restricted access to tobacco during adolescence is associated with reduced smoking prevalence in adulthood, but the association are observed only among women."

The study then goes on to claim that if all 9 youth access policies were in place, there would be a 14% reduction in smoking prevalence for women, and a 29% reduction in heavy smoking among ever smokers.

The Rest of the Story

The results of this study must be interpreted in light of the existing literature, which documents that youth access policies have no impact whatsoever on youth smoking rates. It is simply too easy for youth to access cigarettes and at least half of youth do not purchase their cigarettes in the first place. It has been well-documented that even in communities where compliance checks show high rates of compliance with youth access laws, young people have no problem accessing cigarettes.

So how does one explain the conflicting results of this study?

There are two major problems that threaten the validity of the study conclusions.

First, confounding is a severe problem. The enactment of youth access policies is not a random phenomenon. It is possible that policy enactment is related to levels of anti-smoking sentiment in a state. And anti-smoking sentiment may itself be strongly related to youth smoking. Thus, it might appear that the youth access laws are reducing youth smoking, when really it is the anti-smoking sentiment in the state which is associated with lower smoking rates.

In fact, in my own research, I found this to be true at the local level in Massachusetts: passage of tobacco policies is strongly related to levels of local anti-smoking sentiment. We were able to control for that sentiment by including a variable that reflected the proportion of a town's residents who voted for a state cigarette tax initiative. Unfortunately, the new study was not able to control for this confounding variable.

But there is a more obvious problem which, in my view, makes it very difficult to support the conclusions that the paper draws.

The problem is called multiple comparisons.

Think of it this way: You have a hypothesis that there are more boys than girls enrolled in your child's elementary school. To study this, you could simply take a random sample of students in the school. You could estimate the proportion of boys and use a 95% confidence interval (equivalently, a significance level of 0.05) to judge whether its difference from 0.5 could be explained by chance alone.

However, instead of proceeding as above, you decide to take a sample from each of the 100 classrooms in the school. Now suppose that the truth is that there are exactly 10 boys and 10 girls in each classroom. By chance alone, it is almost certain that the sample from at least one classroom will show more boys than girls, with a difference that is significant at a level of 0.05.

The point is that if you run enough different statistical tests, even if there is no true effect, you are bound to find a positive result by chance alone.

In epidemiology and biostatistics, this is called the problem of multiple comparisons. And that is the trap that this paper falls into.

The paper does not simply formulate a hypothesis and then test it with a model. Instead, it develops 81 different hypotheses and tests each one. There are:

  • 9 different policies;
  • 3 different smoking behaviors; and
  • 3 different gender breakdowns (male, female, and total).

Thus, the paper runs 9 x 3 x 3 = 81 different analyses, each with its own statistical test.
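To make the count concrete, here is a minimal sketch that enumerates every combination of policy, smoking behavior, and gender breakdown. The policy labels are generic placeholders (an assumption for illustration), since only some of the nine policies are named above.

```python
from itertools import product

# Placeholder labels: only some of the nine policies are named in the post,
# so generic names are used here (an assumption for illustration).
policies = [f"policy_{i}" for i in range(1, 10)]
outcomes = ["ever smoking", "current smoking", "heavy smoking"]
groups = ["male", "female", "total"]

# One statistical test per (policy, outcome, group) combination.
tests = list(product(policies, outcomes, groups))

print(len(tests))  # 81
```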

You can see that by chance alone, with 81 different tests, you are likely to find a few that are statistically significant at the 0.05 level, even if there is no true effect of youth access policies. Thus, using a traditional significance level of 0.05 is not valid in this paper. But that is exactly the significance level that the paper uses.
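A rough simulation illustrates why "a few" significant results are to be expected. It is only a sketch under two assumptions that are not taken from the paper: the 81 tests are independent, and under a true null hypothesis each p-value is uniformly distributed between 0 and 1 (which holds for continuous test statistics). The function name and the seed are arbitrary.

```python
import random


def count_false_positives(n_tests=81, alpha=0.05, rng=None):
    """Simulate n_tests independent tests in which no true effect exists.

    Each simulated p-value is uniform on [0, 1], so a test comes out
    "significant" with probability alpha purely by chance.
    """
    if rng is None:
        rng = random.Random()
    return sum(rng.random() < alpha for _ in range(n_tests))


rng = random.Random(2013)  # fixed seed so the run is repeatable
n_studies = 20_000
counts = [count_false_positives(rng=rng) for _ in range(n_studies)]

mean_hits = sum(counts) / n_studies
share_with_any = sum(c > 0 for c in counts) / n_studies

print(f"expected false positives per study: {81 * 0.05:.2f}")
print(f"simulated average per study: {mean_hits:.2f}")
print(f"share of simulated studies with at least one: {share_with_any:.3f}")
```

On average about four of the 81 tests come out "significant" in each simulated study, even though no effect exists in any of them.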

My mentor at Yale University School of Medicine - Dr. Alvan Feinstein - demonstrated how the problem of multiple comparisons resulted in a number of bogus conclusions being drawn in several epidemiologic studies. He wrote an excellent chapter about the problem in his book on medical statistics.

A mathematician named Carlo Bonferroni developed a method to address the problem. Essentially, one makes the significance threshold more stringent by dividing the significance level by the number of comparisons being made. Thus, if one makes 81 comparisons, then instead of requiring a p-value below 0.05, one would require a p-value below about 0.0006 (0.05 divided by 81). Instead of a 95% confidence interval, one would need to use a confidence interval of roughly 99.94%.
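The arithmetic behind the correction can be sketched in a few lines. The function name is hypothetical, and the last step (the overall false-positive chance after correction) additionally assumes the 81 tests are independent.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


alpha = 0.05
n_comparisons = 81

per_test = bonferroni_threshold(alpha, n_comparisons)
ci_level = 100 * (1 - per_test)

# Chance of at least one false positive across all tests after correction,
# assuming the tests are independent (an assumption made for illustration).
familywise = 1 - (1 - per_test) ** n_comparisons

print(f"per-test threshold: {per_test:.6f}")
print(f"matching confidence interval: {ci_level:.2f}%")
print(f"overall chance of a false positive: {familywise:.4f}")
```

With the corrected threshold, the chance of any false positive across all 81 tests stays just under the intended 5%.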

If you look at the 95% confidence intervals for the "significant" findings in the paper, you'll see that all of them are very close to being non-significant, and it is very apparent that even a slightly stricter significance threshold would render all of these results non-significant. Certainly, applying the Bonferroni correction would invalidate all of the findings of this paper.

Essentially, this paper did what we sometimes call a "fishing expedition." That is, it tested out every possible hypothesis without any pre-existing rationale for why a particular policy would conceptually or empirically be expected to have an effect on the outcome. It is acceptable to do a fishing expedition, but one has to be aware that one cannot simply use the traditional 95% confidence level.

Incidentally, this is a problem that I see all the time among my students. Rather than develop one or two clear, conceptually based hypotheses, they simply run every possible analysis of every possible variable. Invariably, they come up with some "significant" results. I explain to them that these results are not actually "significant" in statistical terms, because with so many comparisons, one would expect to find several "significant" results by chance alone.

In fact, we can quantify the chance that this paper would have failed to find any significant relationship between at least one of the 9 policies and a smoking outcome if the truth were that there is no effect. Assuming no effect and independent tests, if you do 81 comparisons using a significance level of 0.05, your chance of not finding any "significant" result is only 1.6%.

In other words, with 81 comparisons, there was a 98.4% probability that this study would find at least one "significant" result if there were absolutely no true relationship between youth access policies and smoking behavior.
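These figures follow from a short calculation, sketched here under the assumption that the tests are independent, so that the chance of no false positive in n tests is 0.95 raised to the power n. The function names are illustrative. The same calculation covers the 100-classroom example from earlier.

```python
def prob_no_false_positive(n_tests, alpha=0.05):
    """Chance that none of n_tests independent null tests is 'significant'."""
    return (1 - alpha) ** n_tests


def prob_at_least_one(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent null tests is 'significant'."""
    return 1 - prob_no_false_positive(n_tests, alpha)


paper_none = prob_no_false_positive(81)
paper_any = prob_at_least_one(81)
classrooms_any = prob_at_least_one(100)

print(f"81 tests, no significant result: {paper_none:.1%}")
print(f"81 tests, at least one: {paper_any:.1%}")
print(f"100 classrooms, at least one: {classrooms_any:.1%}")
```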

As Dr. Feinstein explains, one has to be particularly careful with the problem of multiple comparisons when the conceptual foundation and/or empirical evidence relating to an association is suspect, as it is in this case (as the overwhelming scientific evidence leads us to conclude that there is no effect of youth access policies on youth smoking rates).
