Peer Review Congress 2022

September 11, 2022

Chicago, United States

Quality of Reporting of Randomized Clinical Trials in Artificial Intelligence: A Systematic Review


quality of reporting

reporting guidelines

artificial intelligence

Objective To evaluate the reporting quality of randomized clinical trials (RCTs) of artificial intelligence (AI) in health care published from 2015 to 2020 against the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI) guideline [1].

Design In this systematic review, the PubMed and Embase databases were searched to identify eligible studies published from 2015 to 2020. Articles were included if AI (defined as AI, machine learning, or deep learning) was used as an intervention for a medical condition, if there was evidence of randomization, and if the study had a control group. Studies were excluded if they were nonrandomized studies, secondary studies, or post hoc analyses; if the intervention was not AI; if the target condition was not a medical disease; or if the study pertained to medical education. Two independent reviewers graded the included studies using the 43-item CONSORT-AI checklist. Disagreements were resolved by consensus following discussion with a senior reviewer. Each item was scored as fully reported, partially reported, or not reported; items that were not relevant to a study were labeled not applicable. The results were tabulated, and descriptive statistics were reported.
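The tabulation step described above can be illustrated with a minimal sketch. This is not the authors' analysis code. The label strings, function names, and toy data are all hypothetical, and the choice to use every checklist item as the denominator for the per-study fraction is an assumption the abstract does not confirm.

```python
from collections import Counter

# Hypothetical labels for the four outcomes named in the Design paragraph.
FULL = "fully reported"
PARTIAL = "partially reported"
NOT_REPORTED = "not reported"
NOT_APPLICABLE = "not applicable"
VALID_LABELS = {FULL, PARTIAL, NOT_REPORTED, NOT_APPLICABLE}


def tabulate_by_item(scores):
    """Count how often each label was assigned to each checklist item.

    `scores` maps a study identifier to a list of labels, one per item.
    Returns a list of Counters, one per checklist item.
    """
    n_items = len(next(iter(scores.values())))
    per_item = [Counter() for _ in range(n_items)]
    for study, labels in scores.items():
        if len(labels) != n_items:
            raise ValueError(f"{study}: expected {n_items} labels")
        for index, label in enumerate(labels):
            if label not in VALID_LABELS:
                raise ValueError(f"{study}: unknown label {label!r}")
            per_item[index][label] += 1
    return per_item


def items_fully_reported_in_all(per_item, n_studies):
    """Return 0-based indices of items scored fully reported in every study."""
    return [i for i, counts in enumerate(per_item) if counts[FULL] == n_studies]


def fraction_not_reported(labels):
    """Fraction of one study's items scored not reported.

    Assumption: the denominator is all items, including not applicable ones.
    """
    return sum(label == NOT_REPORTED for label in labels) / len(labels)


# Toy data only: 3 made-up studies scored on 4 made-up items.
toy_scores = {
    "study_a": [FULL, PARTIAL, NOT_REPORTED, NOT_APPLICABLE],
    "study_b": [FULL, FULL, NOT_REPORTED, NOT_APPLICABLE],
    "study_c": [FULL, NOT_REPORTED, NOT_REPORTED, FULL],
}
per_item = tabulate_by_item(toy_scores)
fully_everywhere = items_fully_reported_in_all(per_item, len(toy_scores))
```

With real data, each list would hold 43 labels, one per CONSORT-AI checklist item, and the per-item counters would feed the descriptive statistics.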

Results A total of 939 abstracts were screened, from which 73 full-text articles were reviewed for eligibility. Fifteen studies were included in the review. The number of participants ranged from 28 to 1058. The studies covered the following medical fields: medicine (n = 2), psychiatry (n = 3), gastroenterology (n = 5), cardiology (n = 2), ophthalmology (n = 1), endocrinology (n = 1), and neurology (n = 1). Studies were from China (n = 6), the United States (n = 6), the United Kingdom (n = 1), the Netherlands (n = 1), and Israel (n = 1). Only 3 items of the CONSORT-AI checklist were fully reported in all studies. Five items were not applicable in more than 85% of the studies (13 of 15). Twenty percent of the studies (3 of 15) did not report more than 50% of the CONSORT-AI checklist items.

Conclusions Reporting quality of RCTs on AI was suboptimal. Because reporting varied in the analyzed RCTs, caution must be exercised when interpreting their outcomes.


  1. Liu X, Faes L, Calvert MJ, Denniston AK. Extension of the CONSORT and SPIRIT statements. Lancet. 2019;394(10205):1225.

Conflict of Interest Disclosures None reported.
