poster
Assessment of Minimum False-Positive Risk of Primary Outcomes After Reducing the Nominal P Value Threshold for Statistical Significance From .05 to .005 in Anesthesiology Randomized Clinical Trials
keywords:
reproducible research
publication
statistics
bias
Objective A primary reason for reproducibility concerns in the biomedical literature may be that many published articles reporting statistically significant findings do not represent real effects.1,2 Several solutions have been proposed to mitigate the risks associated with false-positive findings.1,2 These proposals have been explored in other fields, but the resulting metrics have not been quantified for anesthesiology. This study sought to determine the ramifications of lowering the nominal P value threshold for statistical significance from .05 to .005 and to assess the minimum false-positive risk (minFPR) for primary outcomes in anesthesiology randomized clinical trials (RCTs).
Design This cross-sectional descriptive study aimed to determine these metrics for RCTs published in the top general anesthesiology journals, defined by impact factor. The target journals were Anaesthesia, Anesthesia & Analgesia, Anesthesiology, British Journal of Anaesthesia, Canadian Journal of Anesthesia, European Journal of Anaesthesiology, and Journal of Clinical Anesthesia. The Cochrane Highly Sensitive Search Strategy was used to identify RCTs in MEDLINE. All superiority RCTs published between January 1, 2019, and March 15, 2021, comparing 2 groups with at least 1 primary outcome were included. Study screening and data extraction were performed in duplicate. P values for primary outcomes were extracted, and the percentage of RCTs that would maintain statistical significance at a threshold of P < .005 was determined. For these outcomes, minFPRs were calculated assuming 1:1 prior odds of an intervention being effective, using previously recommended methods.3 Associations of study-level characteristics with maintenance of statistical significance at P < .005 and with minFPRs were assessed using logistic and median regression, respectively.
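As an illustration of the minFPR calculation, the following is a minimal sketch that assumes the "previously recommended methods" refer to the calibration in reference 3 (Sellke et al), which bounds the Bayes factor in favor of the null hypothesis from below by -e × P × ln(P) for P < 1/e. The function names and the prior_prob_effect parameter are hypothetical and are not taken from the study; the authors' actual implementation may differ.

```python
import math


def min_bayes_factor(p_value):
    """Lower bound on the Bayes factor in favor of the null hypothesis.

    Assumed form: -e * p * ln(p), which applies only for 0 < p < 1/e.
    """
    if not 0 < p_value < 1 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -math.e * p_value * math.log(p_value)


def min_fpr(p_value, prior_prob_effect=0.5):
    """Minimum false-positive risk for a given P value.

    prior_prob_effect is the prior probability that the intervention
    is effective; 0.5 corresponds to 1:1 prior odds.
    """
    bayes_factor = min_bayes_factor(p_value)
    # Prior odds of the null hypothesis (no real effect).
    prior_odds_null = (1 - prior_prob_effect) / prior_prob_effect
    # Smallest posterior odds of the null that the data allow.
    posterior_odds_null = bayes_factor * prior_odds_null
    # Convert odds to a probability.
    return posterior_odds_null / (1 + posterior_odds_null)


if __name__ == "__main__":
    for p in (0.05, 0.005, 0.001):
        print(f"P = {p}: minFPR = {min_fpr(p):.3f}")
```

Under these assumptions and 1:1 prior odds, a P value of exactly .05 corresponds to a minFPR of roughly 29%, and a P value of exactly .005 corresponds to roughly 7%. Smaller P values give smaller minFPRs.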
Results After searching, deduplication, and screening, 318 RCTs were included. The median (IQR) sample size was 80 (52-130) and did not differ significantly across journals. The majority of RCTs (273 of 318 [86%]) were single-center studies. P values below .05 occurred in 205 of 318 RCTs (64%; range by journal, 44%-77%). Of these 205, 119 (58%; 95% CI, 51%-65%) maintained statistical significance at the P < .005 threshold. The mean (SD) minFPR was 22% (20%) (range by journal, 16%-33%). Violin plots for P values and minFPRs by journal are shown in Figure 27. When restricted to RCTs with P < .005, the mean (SD) minFPR50 (ie, minFPR assuming a prior probability of 50%) was 2% (1.2%).
Conclusions Approximately 42% of primary outcomes in anesthesiology RCTs that were statistically significant at P < .05 would lose statistical significance under a more stringent P value threshold of .005. These primary outcomes carry a mean minimum false-positive risk of 22%. Adopting the P < .005 threshold for statistical significance could reduce the mean minFPR to just 2%. These results call a large portion of anesthesiology RCTs into question and provide impetus to improve study design, analysis, and reporting methods to reduce false positives and improve reproducibility.
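The proportions reported in the Results can be re-derived from the stated counts. The sketch below assumes a normal-approximation (Wald) 95% CI; the abstract does not state which interval method was used, and the function name is hypothetical.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    The Wald interval is an assumption made for illustration only.
    """
    proportion = successes / total
    half_width = z * math.sqrt(proportion * (1 - proportion) / total)
    lower = max(0.0, proportion - half_width)
    upper = min(1.0, proportion + half_width)
    return proportion, lower, upper


if __name__ == "__main__":
    # Counts taken from the Results.
    included = 318
    significant_at_05 = 205
    significant_at_005 = 119

    share_05 = significant_at_05 / included
    kept, lower, upper = proportion_with_wald_ci(
        significant_at_005, significant_at_05
    )
    lost = 1 - kept

    print(f"P < .05: {share_05:.0%} of included RCTs")
    print(f"Maintained at P < .005: {kept:.0%} ({lower:.0%}-{upper:.0%})")
    print(f"Lost significance: {lost:.0%}")
```

With these counts, the share that loses statistical significance is (205 - 119) / 205, which matches the approximately 42% cited in the Conclusions.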
References
- Niven DJ, McCormick TJ, Straus SE, et al. Reproducibility of clinical research in critical care: a scoping review. BMC Med. 2018;16:26. doi:10.1186/s12916-018-1018-6
- Colquhoun D. The false positive risk: a proposal concerning what to do about P-values. Am Stat. 2019;73:192-201. doi:10.1080/00031305.2018.1529622
- Sellke T, Bayarri MJ, Berger JO. Calibration of P values for testing precise null hypotheses. Am Stat. 2001;55:62-71. doi:10.1198/000313001300339950
Conflict of Interest Disclosures Philip M. Jones is deputy editor in chief at the Canadian Journal of Anesthesia. No other disclosures were reported.
Funding/Support Research time for Philip M. Jones and Janet Martin was provided by the Department of Anesthesia & Perioperative Medicine at the University of Western Ontario, London, ON, Canada.
Role of the Funder/Sponsor The Department of Anesthesia & Perioperative Medicine was not involved in the design or conduct of the study, nor the preparation, review, approval or submission of the abstract for presentation.
Additional Information This study was registered on March 15, 2021, with Open Science Framework (doi:10.17605/OSF.IO/H8KBZ).
Acknowledgments We are grateful to David Colquhoun, who reviewed a presubmission draft of the manuscript.