Peer Review Congress 2022

September 11, 2022

Chicago, United States

Searching for Misconduct and Paper Mills in Peer-Review Comments



editorial and peer review process

peer review

Objective The objective was to test and compare various methods to detect text duplication in peer reviews submitted by 2 or more reviewers.

Design Peer review fraud is a significant concern.1,2 A data set of peer review comments submitted to SAGE Publishing was analyzed to search for duplicate text, a possible sign of fake peer review.3 Peer review comments for each article peer reviewed by 19 SAGE Publishing journals were downloaded from the ScholarOne peer review management system and loaded into a Pandas DataFrame. Journals were chosen based on the availability of data; therefore, the data set should be considered biased. Similar comments were found using a number of search methods, including MinHash Locality Sensitive Hashing (MinHash LSH), for detecting near-duplicate text strings, and Elasticsearch, a scalable search engine, combined with RapidFuzz, a fast string-comparison library, for distinguishing similar from dissimilar comments.
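The near-duplicate search described above can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, implements MinHash and LSH banding by hand rather than through a dedicated library, and substitutes difflib.SequenceMatcher for RapidFuzz in the confirmation step. The signature length, band count, shingle size, similarity threshold, and all function names are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

NUM_PERM = 64   # hash functions per signature (assumed value)
BANDS = 16      # LSH bands, so 64 / 16 = 4 rows per band (assumed value)
SHINGLE = 3     # words per shingle (assumed value)


def shingles(text, k=SHINGLE):
    """Return the set of k-word shingles of a comment."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash(text):
    """MinHash signature: for each seed, the minimum hash over all shingles."""
    grams = shingles(text)
    signature = []
    for seed in range(NUM_PERM):
        salt = seed.to_bytes(4, "big")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode(), digest_size=8, salt=salt).digest(), "big")
            for g in grams
        ))
    return signature


def candidate_pairs(comments):
    """Bucket signatures band by band; comments sharing any bucket become candidates."""
    rows = NUM_PERM // BANDS
    buckets = defaultdict(set)
    for key, text in comments.items():
        sig = minhash(text)
        for band in range(BANDS):
            chunk = tuple(sig[band * rows:(band + 1) * rows])
            buckets[(band, chunk)].add(key)
    pairs = set()
    for keys in buckets.values():
        pairs.update(combinations(sorted(keys), 2))
    return pairs


def similar_pairs(comments, threshold=0.8):
    """Confirm each candidate pair with a direct string comparison.

    SequenceMatcher stands in for RapidFuzz here (a substitution, not the
    method reported in the abstract).
    """
    confirmed = []
    for a, b in sorted(candidate_pairs(comments)):
        ratio = SequenceMatcher(None, comments[a], comments[b]).ratio()
        if ratio >= threshold:
            confirmed.append((a, b, round(ratio, 2)))
    return confirmed
```

The two-stage design reflects the division of labor named in the abstract: a cheap hashing step narrows the full set of comments to a short list of candidate pairs, and a slower string comparison then separates similar from dissimilar comments. LSH is probabilistic, so pairs with moderate overlap can be missed; more bands with fewer rows each raises recall at the cost of more candidates to check.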

Results Of 62,974 peer reviewer accounts used to evaluate 66,815 articles, 357 accounts (0.6%) were identified that produced reviews with partially or fully duplicate comments. One large cluster of 47 accounts shared a number of reports, and the articles reviewed by these accounts included a number rejected because of suspected paper mill activity. This overlap suggests that the 47 accounts in the cluster were fake reviewer accounts administered by a paper mill. In total, 972 articles (1.5%) had reviews from reviewer accounts associated with duplicate commenting activity, and 77 articles had reviews from the 47 suspected paper mill accounts (Figure 33). Different search methods identified different suspect accounts and clusters. These searches included (1) a search for exact duplicates, which took 16 seconds to load data into memory and less than 1 second to execute and found 29 accounts that had produced duplicate comments, and (2) a search for similar comments using Elasticsearch, which took 18 minutes and 29 seconds to index and 9 hours, 19 minutes, and 2 seconds to execute and found 204 accounts that had produced similar comments.
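The exact-duplicate search and the grouping of accounts into clusters can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the abstract reports using a Pandas DataFrame, whereas this version uses plain dictionaries so that it runs with the standard library alone, and the normalization rule, the tuple layout of the input, and the function names are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting does not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def duplicate_groups(reviews):
    """Map each comment text submitted by 2 or more accounts to those accounts.

    `reviews` is an iterable of (account_id, article_id, comment) tuples
    (an assumed layout, not the ScholarOne export format).
    """
    accounts_by_text = defaultdict(set)
    for account, _article, comment in reviews:
        accounts_by_text[normalize(comment)].add(account)
    return {text: accts for text, accts in accounts_by_text.items() if len(accts) > 1}


def account_clusters(groups):
    """Union-find over accounts: accounts sharing any duplicate text join one cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for accounts in groups.values():
        first, *rest = sorted(accounts)
        for other in rest:
            parent[find(other)] = find(first)

    clusters = defaultdict(set)
    for account in parent:
        clusters[find(account)].add(account)
    # Largest clusters first, ties broken alphabetically
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))
```

A single pass over the comments is enough for exact matching, which is consistent with the sub-second execution time reported for that search. Clustering is transitive: if account A shares a report with B, and B shares a different report with C, all three land in one cluster even though A and C never submitted the same text. A comment repeated by a single account is not flagged, in line with the stated objective of detecting duplication across 2 or more reviewers.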

Conclusions Efficient methods for identifying possible peer review fraud and paper mill activity were described. The methods should be tested on broader peer review sets and settings. When duplication is found, the findings must be considered in context before a judgment can be made about whether there is misconduct.


  1. Misra DP, Ravindran V, Agarwal V. Integrity of authorship and peer review practices: challenges and opportunities for improvement. J Korean Med Sci. 2018;33(46):e287. doi:10.3346/jkms.2018.33.e287
  2. Cohen A, Pattanaik S, Kumar P, et al. Organised crime against the academic peer review system. Br J Clin Pharmacol. 2016;81(6):1012-1017. doi:10.1111/bcp.12992
  3. Dadkhah M, Kahani M, Borchardt G. A method for improving the integrity of peer review. Sci Eng Ethics. 2018;24(5):1603-1610. doi:10.1007/s11948-017-9960-9

Conflict of Interest Disclosures None reported.

