Peer Review Congress 2022

September 09, 2022

Chicago, United States

Degree of Text Similarity and Prevalence of Potential Plagiarism in Biomedical Research Articles According to Linguistic Background and Field of Study


copyright and intellectual property

ethics and ethical concerns


Objective Text similarity detection software is widely used by biomedical journals to screen submitted manuscripts for potential plagiarism, with some journals rejecting manuscripts with high overall similarity scores (eg, >40%) without further review. However, considering that overall scores may be vulnerable to false positives resulting from common phrases, certain guidelines suggest examining the single-source scores to detect potential plagiarism.1 The degree of text similarity and prevalence of potential plagiarism in biomedical articles were examined according to linguistic background (English-speaking vs non–English-speaking) and field of study (clinical vs nonclinical).

Design This cross-sectional study was performed in June 2020 and followed the STROBE reporting guideline. We analyzed the iThenticate similarity reports of 480 articles randomly selected from an open access multidisciplinary journal, PLoS One. The articles were categorized into 8 preselected countries as English-speaking (USA, UK, Canada, Australia) vs non–English-speaking (Korea, China, France, Italy) and 6 fields of study as clinical (cardiology, gastroenterology, oncology) vs nonclinical (molecular biology, genetics, microbiology). The degree of text similarity was defined as the overall iThenticate score, and the presence of potential plagiarism was defined as either (1) a single-source score of greater than 10% according to the Springer Nature guideline1 or (2) an overall score of greater than 40%, which is a cutoff used at some journals for considering editorial actions.2,3 The similarity score in each manuscript section was measured by calculating the proportion of highlighted text in that section using ImageJ.
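The two cutoff-based definitions of potential plagiarism above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the `SimilarityReport` class, its field names, and the helper functions are hypothetical, and it assumes that the single-source criterion is applied to the largest single-source score in a report. Only the two cutoffs (greater than 10% and greater than 40%) come from the abstract.

```python
from dataclasses import dataclass

# Cutoffs stated in the abstract; both are strict "greater than" thresholds.
SINGLE_SOURCE_CUTOFF = 10.0  # percent, per the Springer Nature guideline
OVERALL_CUTOFF = 40.0        # percent, a cutoff used at some journals


@dataclass
class SimilarityReport:
    """Hypothetical container for one article's similarity report."""
    overall_score: float            # overall similarity score, in percent
    max_single_source_score: float  # assumed: largest single-source score, in percent


def plagiarism_flags(report: SimilarityReport) -> dict:
    """Evaluate each criterion separately, since prevalence is reported per criterion."""
    return {
        "single_source": report.max_single_source_score > SINGLE_SOURCE_CUTOFF,
        "overall": report.overall_score > OVERALL_CUTOFF,
    }


def prevalence(reports: list, criterion: str) -> float:
    """Percentage of reports flagged under one criterion ('single_source' or 'overall')."""
    flagged = sum(1 for r in reports if plagiarism_flags(r)[criterion])
    return 100.0 * flagged / len(reports)
```

Because both cutoffs are strict inequalities, a report with an overall score of exactly 40% or a single-source score of exactly 10% is not flagged in this sketch.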

Results The degree of text similarity differed significantly among countries, with articles from non–English-speaking countries having higher scores than those from English-speaking countries (30.9% vs 23.8%, respectively; P < .001) (Table 39). Among the non–English-speaking countries, there was no significant difference in the degree of text similarity between Asian and European countries (31.7% vs 30.1%, respectively; P = .27). Text similarity also differed among fields of study, with clinical articles having higher scores than nonclinical articles (29.5% vs 25.2%, respectively; P < .001). Section-level measurement showed that the Methods section had the highest degree of text similarity among manuscript sections. The overall prevalence of potential plagiarism was 13.5% (65/480) and 13.8% (66/480) according to the single-source score cutoff of greater than 10% and the overall score cutoff of greater than 40%, respectively. Except for the lower prevalence of potential plagiarism in English-speaking countries than in non–English-speaking countries according to the overall score cutoff (5.4% vs 22.1%, respectively; P < .001), no statistically significant differences were noted between English-speaking and non–English-speaking countries, Asian and European countries, or clinical and nonclinical articles.

Conclusions While the degree of text similarity differed significantly according to linguistic background and field of study, the prevalence of potential plagiarism was similar across countries and fields of study. Clinical researchers in non–English-speaking countries in particular may benefit from receiving English-language writing education to avoid unintended text similarity.

References 1. Springer. Plagiarism prevention with CrossCheck. Accessed February 24, 2022.

2. IEEE Robotics & Automation Society. Information for IROS editors. Accessed June 14, 2022.

3. ARRUS Journal of Mathematics and Applied Science. Plagiarism policy. Accessed June 14, 2022.

Conflict of Interest Disclosures None reported.

Funding/Support This work was supported by grant 2019-781 from the Asan Institute for Life Sciences at Asan Medical Center, Seoul, South Korea.

Role of the Funder/Sponsor The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the abstract; and decision to submit the abstract for presentation.
