Peer Review Congress 2022

September 11, 2022

Chicago, United States

A Machine Learning Powered Literature Surveillance Approach to Identify High-Quality Studies From PubMed in Disease Areas With Low Volume of Evidence


post publication peer review

quality of the literature

artificial intelligence

Objective The DynaMed Systematic Literature Surveillance process surveys a large set of clinical journals most likely to contain high-quality, high-relevance content on treatment, diagnosis, and prognosis across all medical conditions. For many conditions, limited content is retrieved from those journals. Therefore, a machine learning–powered process was designed, implemented, and tested to efficiently and accurately identify relevant articles published across all journals indexed in PubMed.1-3 This study reports the overall performance of this machine learning–augmented surveillance system.

Design Content-based search strategies were developed by a medical librarian. PubMed-retrieved references were probability ranked by a LightGBM machine learning algorithm for likelihood of reporting high-quality, clinically relevant evidence.1 Top-ranked references were included for screening, stratified by publication date (<18 months or ≥18 months). Clinical experts trained in critical appraisal of the literature manually screened the references and identified those to be used for updating the topic. The following metrics were used to evaluate the machine learning system: median probability ranking by machine learning of the 15 highest-ranked references, overall and by topic; total and median number of references retrieved by topic; and median position of the first selected reference in the probability-ranked list compared with PubMed reference lists ranked as most recent and best match.
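The ranking and stratification step described above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the `Reference` fields, the `top_k` cutoff, and the choice to rank separately within each date stratum are assumptions, and the probabilities are taken as already produced by the classifier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Reference:
    pmid: str            # hypothetical identifier field
    probability: float   # model score, assumed precomputed by the classifier
    published: date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def top_ranked_by_stratum(refs, as_of, top_k, cutoff_months=18):
    """Split references at the 18-month cutoff, then keep the top_k
    highest-probability references within each stratum."""
    strata = {"recent": [], "older": []}
    for ref in refs:
        age = months_between(ref.published, as_of)
        key = "recent" if age < cutoff_months else "older"
        strata[key].append(ref)
    return {
        key: sorted(group, key=lambda r: r.probability, reverse=True)[:top_k]
        for key, group in strata.items()
    }
```

Calling `top_ranked_by_stratum(refs, as_of=date(2022, 5, 1), top_k=20)` would return two screening lists, one per date stratum; the value 20 is arbitrary, since the abstract does not state the screening cutoff.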

Results As of May 2022, results were reviewed for 332 topics. Of 91,009 articles identified, the 8406 (9.2%) with the highest probability ranking were manually screened, and 576 references (6.9%) were selected to update 241 topics. The median number of references retrieved by topic was 184 (range, 7-3638). The median probability assigned to the 576 references was 0.047 (range, 0.002-0.996), and the median probability by topic was 0.079 (range, 0.047-0.803). The median position of the first selected reference was 2 for machine learning vs 9 for the PubMed most recent strategy and 20 for the PubMed best match strategy. Overall, the median difference in position was 22 for machine learning vs the PubMed most recent strategy and 54.5 for machine learning vs the PubMed best match strategy. The 241 topics were distributed among 29 specialties, with pediatrics and infectious diseases accounting for 27%. The most common article type selected was cohort study (29%).
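As a minimal sketch of how the position metric and the reported percentages could be computed, assuming each topic has a ranked list of reference IDs and a set of expert-selected IDs (the function names and data layout here are hypothetical and not taken from the study):

```python
from statistics import median


def first_selected_position(ranked_ids, selected_ids):
    """1-based position of the first selected reference in a ranked list,
    or None if no selected reference appears in it."""
    selected = set(selected_ids)
    for position, ref_id in enumerate(ranked_ids, start=1):
        if ref_id in selected:
            return position
    return None


def median_first_position(rankings_by_topic, selected_by_topic):
    """Median, across topics, of the first selected reference's position.
    Topics with no selected reference in the list are skipped."""
    positions = []
    for topic, ranked_ids in rankings_by_topic.items():
        pos = first_selected_position(ranked_ids, selected_by_topic.get(topic, ()))
        if pos is not None:
            positions.append(pos)
    return median(positions) if positions else None


# Arithmetic behind the percentages reported in the Results.
screened_share = round(100 * 8406 / 91009, 1)  # share of retrieved articles screened
selected_share = round(100 * 576 / 8406, 1)    # share of screened articles selected
```

Running the same function over the machine learning ranking, the most recent ordering, and the best match ordering for each topic would give the three medians being compared; a lower value means a reviewer reaches a usable reference sooner.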

Conclusions This study provides precise estimates of the performance of a regression-based machine learning algorithm in assisting literature surveillance for topics with a low volume of evidence.


  1. Abdelkader W, Navarro T, Parrish R, et al. A deep learning approach to refine the identification of high-quality clinical research articles from the biomedical literature: protocol for algorithm development and validation. JMIR Res Protoc. 2021;10(11):e29398. doi:10.2196/29398
  2. Del Fiol G, Michelson M, Iorio A, Cotoi C, Haynes RB. A deep learning method to automatically identify reports of scientifically rigorous clinical research from the biomedical literature: comparative analytic study. J Med Internet Res. 2018;20(6):e10281. doi:10.2196/10281
  3. Abdelkader W, Navarro T, Parrish R, et al. Machine learning approaches to retrieve high-quality, clinically relevant evidence from the biomedical literature: systematic review. JMIR Med Inform. 2021;9(9):e30401. doi:10.2196/30401

Conflict of Interest Disclosures None reported.
