Goodbye P < 0.05. P-value is simply one item among many to gauge scientific evidence - 26/02/26
, Chadli Dziri b, Bob Occean c
Highlights |
• | For a century, a P-value < 0.05 has been considered the criterion for rejecting the null hypothesis (no significant difference between two specified populations, any observed difference being due to chance). |
• | The main advantage of P-values with the P < 0.05 threshold is that they give non-statisticians a quantitative basis for addressing uncertainty through the null-hypothesis approach. |
• | The arbitrary threshold of 0.05 turns statistical difference into a black-or-white judgment. |
• | In its 2016 statement, the American Statistical Association recommended not banning the P-value itself but rather banning the dichotomized P < 0.05. |
• | In the new paradigm, P-values should be considered as a continuum and as one item among others to gauge scientific evidence. |
• | There are several alternatives to the P-value with a 0.05 threshold, such as effect size with confidence intervals, number needed to treat (NNT) or number needed to harm (NNH), Bayesian methods, more stringent P-value thresholds (0.005 or 0.001), pragmatic trials, and the minimal clinically important difference. |
• | Each tool, whether the P-value or an alternative, should be recommended and interpreted according to the context. |
• | The new paradigm will require statisticians, researchers, publishers, and healthcare decision makers to radically change the way they interpret scientific data, abandoning century-old dichotomous analysis and the traditional statistical significance. |
Summary |
For a century, the P-value has been routinely used in almost every research paper, with a threshold of 0.05 to reject the null hypothesis. The aim of this short review was to discuss the validity of this arbitrary (yet sacred) threshold. The history of the P-value shows that practitioners very quickly found a simple method that appealed to them, while statisticians saw no great need to curb an enthusiasm that seemed consensual. However, heavy reliance on P-values is of concern because of potential misuse and misinterpretation. The main pitfalls of P < 0.05 are the dichotomized approach with its black-or-white judgment, possible false positive results, the lack of information about the magnitude of the effect and its clinical relevance, and use out of context. These pitfalls explain why several statisticians and researchers recommend abandoning not the P-value itself but the threshold of 0.05 and the term “statistical significance”. We face a paradigm shift: demoting the P-value from its threshold-screening role and using alternative tools such as Bayesian methods, effect size with confidence intervals, more stringent thresholds, pragmatic trials, and the minimal clinically important difference. This will require statisticians, researchers, publishers, and health care decision makers to radically change the way they interpret scientific data by abandoning century-old dichotomous analysis, a true revolution to come.
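To make two of the alternatives named in the highlights concrete (effect size with a confidence interval, and NNT/NNH), the following minimal Python sketch computes an absolute risk difference, a Wald-type 95% confidence interval, and the corresponding number needed to treat or harm. It is an illustration only, not the authors' method: the function name `risk_difference_summary`, the choice of the Wald interval, and the trial counts are all hypothetical assumptions introduced here.

```python
import math


def risk_difference_summary(events_control, n_control,
                            events_treated, n_treated, z=1.96):
    """Summarise a two-arm trial by effect size rather than by a P-value.

    Hypothetical helper for illustration. Returns the absolute risk
    reduction (control risk minus treated risk), a Wald 95% confidence
    interval (an assumed, simple choice of interval), and 1/|ARR|,
    read as NNT when ARR > 0 and as NNH when ARR < 0.
    """
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    arr = risk_control - risk_treated

    # Standard error of a difference between two independent proportions.
    se = math.sqrt(risk_control * (1 - risk_control) / n_control
                   + risk_treated * (1 - risk_treated) / n_treated)

    # No difference in risk means no finite NNT or NNH.
    nnt_or_nnh = math.inf if arr == 0 else 1 / abs(arr)

    return {
        "arr": arr,
        "ci_low": arr - z * se,
        "ci_high": arr + z * se,
        "nnt_or_nnh": nnt_or_nnh,
    }


# Made-up counts: 30/100 events under control, 20/100 under treatment.
summary = risk_difference_summary(30, 100, 20, 100)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the interval spans zero while the point estimate suggests roughly ten patients treated per event avoided. Reporting the estimate together with its interval conveys both magnitude and uncertainty, which a bare "P < 0.05 or not" verdict does not.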
Keywords: Statistics, Statistical significance, P < 0.05, Clinical research
