Obsession with statistical significance

Another link courtesy of the Monkey Cage’s “Potpourri” link collection. This time it’s about the prevalence of papers in psychology journals that report p values of just under 0.05. The quest for statistical significance at the 0.05 level is probably the greatest shame of statistics education around the world. It is too often taught as a hard and fast rule, with 0.05 as the magic number. We hear all about 95% confidence intervals (which I detest as a way of summarising uncertainty) and t tests with p values under 0.05 supposedly proving that there’s an effect.

Something I’ve picked up from my statistics supervisors, both of whom are Bayesians, is that “significance” is a bit arbitrary, yet it’s treated as a revelation of some universal truth by those whose statistical training hasn’t been particularly thorough. A little bit of knowledge is dangerous, you might say, particularly among reviewers. While I’m not the biggest fan of hypothesis testing, there’s a right way and a wrong way to do it. The argument should be made that “this is the p value of the test under the hypothesis (and the model)” rather than just declaring an effect significant or not according to which side of a threshold a single estimated probability happens to fall.
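To make the “under the hypothesis (and the model)” part concrete, here is a minimal sketch of a two-sided p value for a sample mean. The model is my own assumption for illustration: normally distributed observations with a known standard deviation (a z test). The function name and the numbers are hypothetical, not taken from the linked paper.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, sigma, n):
    """P value for a z test of H0: mean == null_mean.

    Assumed model (illustrative only): n independent normal observations
    with known standard deviation sigma. Change the model and the p value
    changes with it.
    """
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Probability, under H0 and the model, of a statistic at least this extreme.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Two hypothetical studies with almost identical data, n = 100, sigma = 1.
p_a = two_sided_p_value(0.197, null_mean=0.0, sigma=1.0, n=100)
p_b = two_sided_p_value(0.195, null_mean=0.0, sigma=1.0, n=100)

print(f"study A: p = {p_a:.4f}")
print(f"study B: p = {p_b:.4f}")
```

The two sample means differ by 0.002, yet they land on opposite sides of 0.05. Calling one an effect and the other nothing is exactly the hard and fast rule I’m complaining about.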

What would I suggest in its place? Doing one’s regression in a Bayesian setting (the GLM is generally more flexible than ANOVA) and reporting the 95% credible interval. The credible interval summarises the posterior distribution of a parameter, so a 95% credible interval corresponds to a belief, given the data and the model, that the value lies in that range with probability 0.95. This is opposed to the confidence interval approach, which says that under repeated replication of the experiment, 95% of the intervals constructed this way would contain the true value; it makes no direct statement about the probability that any one computed interval does. The Bayesian posterior represents one’s prior belief (objective, subjective, whatever) updated by the data through a model. It’s more useful to report the range of values that one believes a posteriori that the parameter of interest might take.
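As a rough illustration of what reporting a credible interval looks like, here is a minimal sketch for the simplest case I can think of: an unknown mean with a normal prior and normally distributed noise of known standard deviation, where the posterior has a closed form. The prior settings, the simulated data, and the function names are all assumptions made for the example; a real regression would normally need MCMC or similar rather than this conjugate shortcut.

```python
import random
from statistics import NormalDist, fmean


def posterior_for_mean(data, sigma, prior_mean=0.0, prior_sd=10.0):
    """Conjugate normal-normal update for an unknown mean.

    Assumes known noise sd `sigma` and a Normal(prior_mean, prior_sd) prior;
    both are illustrative choices, not recommendations.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = len(data) / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * fmean(data))
    return NormalDist(post_mean, post_var**0.5)


def credible_interval(posterior, mass=0.95):
    """Central interval containing `mass` of the posterior probability."""
    tail = (1.0 - mass) / 2.0
    return posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail)


# Simulated data standing in for a real experiment (hypothetical effect of 0.3).
rng = random.Random(1)
data = [rng.gauss(0.3, 1.0) for _ in range(50)]

posterior = posterior_for_mean(data, sigma=1.0)
low, high = credible_interval(posterior)
prob_positive = 1.0 - posterior.cdf(0.0)

print(f"posterior mean: {posterior.mean:.3f}")
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
print(f"posterior probability the effect is positive: {prob_positive:.3f}")
```

The output is a statement about the parameter itself: where I think it is, how wide my uncertainty is, and how probable a positive effect is, with no threshold to cross.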

And this is the strength of the Bayesian paradigm. It is all about quantifying uncertainty rather than coming up with point estimates and stating wholesale whether a result is significant or not. Far better to state “this is what I think and this is how uncertain I am”.

