VIDEO DOI: https://doi.org/10.48448/yqe9-bn80

poster

AMA Research Challenge 2024

November 07, 2024

Virtual only, United States

Leveraging AI in urologic literature: GPT-4 for sentiment analysis on the utilization of radiotherapy versus prostatectomy in prostate cancer as an upfront treatment

Background The exponential growth of biomedical literature, especially in urology, with 1.3 million new citations added to MEDLINE in 2023 alone, makes it difficult for clinicians to stay current and to conduct literature reviews and meta-analyses. Artificial intelligence, specifically through a technique called sentiment analysis, can help by enabling rapid extraction and summarization of insights from study data. Sentiment analysis uses large language models (such as GPT-4) to classify texts by sentiment (i.e., the tone expressed). This study tests GPT-4's ability to conduct sentiment analysis of literature on a common urology topic, the utilization of radiotherapy versus prostatectomy as treatment for localized prostate cancer, by comparing GPT-classified sentiments of studies with human-classified sentiments.

Methods A PubMed literature search using MeSH keywords for prostate cancer, prostatectomy, and radiotherapy yielded 29 relevant studies. Abstracts were reviewed by the research team, with members independently assigning a sentiment of prostatectomy-preferred, radiotherapy-preferred, or neutral. GPT-4 was then provided three abstracts human-classified as prostatectomy-preferred, radiotherapy-preferred, and neutral, giving the model a baseline for sentiment classification. GPT-4 subsequently classified the remaining 26 abstracts by sentiment and was prompted five separate times to ensure consistency. Its classifications were compared with the human-classified sentiments (the reference, or gold-standard, answers) using accuracy, precision, recall, and F1 score to gauge model performance.
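For illustration only, the sketch below shows how this kind of example-guided (few-shot) sentiment classification could be run against the OpenAI chat API. The prompt wording, model name, and the `labeled_examples` and `classify_abstract` names are assumptions for the sketch, not the authors' actual prompt or code.

```python
# Minimal sketch of few-shot sentiment classification with the OpenAI Python SDK (v1.x).
# Prompt wording, model name, and input variables are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LABELS = ["prostatectomy-preferred", "radiotherapy-preferred", "neutral"]

# Three human-classified abstracts serve as the baseline examples (placeholders here).
labeled_examples = [
    ("<abstract text 1>", "prostatectomy-preferred"),
    ("<abstract text 2>", "radiotherapy-preferred"),
    ("<abstract text 3>", "neutral"),
]

def classify_abstract(abstract: str) -> str:
    """Ask GPT-4 to assign one of the three sentiment labels to an abstract."""
    messages = [{
        "role": "system",
        "content": "You classify urology abstracts comparing radiotherapy and "
                   "prostatectomy. Reply with exactly one label: " + ", ".join(LABELS) + ".",
    }]
    # Provide the human-labeled examples as prior user/assistant turns.
    for text, label in labeled_examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": abstract})

    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages,
        temperature=0,  # reduce run-to-run variation across the five repeat prompts
    )
    return response.choices[0].message.content.strip().lower()

# Each of the remaining 26 abstracts would be classified this way, and the whole
# run repeated five times to check consistency, as described in the Methods.
```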

Results GPT-4 had 80.8% (21/26) accuracy in classifying sentiments compared with the manual, human-classified sentiments, correctly categorizing 3 radiotherapy-preferred, 7 prostatectomy-preferred, and 11 neutral abstracts. GPT-4 achieved 76.5% recall, 82.8% precision, and an F1 score of 79.5% (F1 ranges from 0 to 1, i.e., 0-100%, with roughly 75% as an approximate cutoff for “good”).
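As a reference for how these metrics can be computed for a three-class task, here is a short sketch using scikit-learn; macro-averaging of precision, recall, and F1 is an assumption, since the abstract does not state the averaging scheme, and the label lists are placeholders.

```python
# Sketch of the evaluation metrics for the 3-class sentiment task using scikit-learn.
# Macro-averaging is an assumption; y_true/y_pred below are placeholder data.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# y_true: human-classified (gold-standard) labels; y_pred: GPT-4 labels.
y_true = ["neutral", "prostatectomy-preferred", "radiotherapy-preferred", "neutral"]
y_pred = ["neutral", "prostatectomy-preferred", "neutral", "neutral"]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"accuracy={accuracy:.1%} precision={precision:.1%} recall={recall:.1%} F1={f1:.1%}")
```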

Conclusion GPT-4 is capable of fairly accurate (80.8%) sentiment analysis, even in specialized domains such as urology. Its errors were limited to distinguishing neutral from polarized (radiotherapy-preferred or prostatectomy-preferred) sentiments. Miscategorization of neutral abstracts may stem from the absence of specific words that signal neutrality, in contrast to the clearer lexical cues for positive and negative sentiment. Thus, GPT-4 shows promise for rapidly parsing large numbers of studies and capturing their overall sentiment, which may be useful for clinicians.
