
workshop paper
ADE Oracle at #SMM4H 2024: A Two-Stage NLP System for Extracting and Normalizing Adverse Drug Events from Tweets
keywords:
social media text mining & analytics in public health
spacy
meddra normalization
adverse drug event (ade) detection
computational linguistics (cl)
named entity recognition (ner)
natural language processing (nlp)
machine learning (ml)
pharmacovigilance
roberta
This study describes the approach of Team ADE Oracle for Task 1 of the Social Media Mining for Health Applications (#SMM4H) 2024 shared task. Task 1 challenges participants to detect adverse drug events (ADEs) within English tweets and normalize these mentions against the Medical Dictionary for Regulatory Activities (MedDRA) standards. Our approach utilized a two-stage NLP pipeline consisting of a named entity recognition model, retrained to recognize ADEs, followed by vector similarity assessment with a RoBERTa-based model. Despite achieving a relatively high recall of 37.4% in the extraction of ADEs, indicative of effective identification of potential ADEs, our model encountered challenges with precision. We found marked discrepancies in recall and precision between the test set and our validation set, which underscore the need for further efforts to prevent overfitting and enhance the model’s generalization capabilities for practical applications.
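The two-stage structure described above (extract candidate ADE spans, then map each span to its closest dictionary term by vector similarity) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: `extract_mentions` is a phrase lookup standing in for the retrained NER model, `embed` uses character trigram counts standing in for RoBERTa-based embeddings, and the `TERMS` dictionary with its identifiers is a made-up placeholder, not real MedDRA content.

```python
import math
from collections import Counter

# Hypothetical mini-dictionary; the IDs are placeholders, not MedDRA codes.
TERMS = {
    "T001": "headache",
    "T002": "nausea",
    "T003": "insomnia",
}


def extract_mentions(tweet, trigger_phrases):
    """Stage 1 stand-in: find known phrases (a trained NER model would go here).

    Returns a sorted list of (start, end, text) character spans.
    """
    lowered = tweet.lower()
    spans = []
    for phrase in trigger_phrases:
        start = lowered.find(phrase.lower())
        if start != -1:
            end = start + len(phrase)
            spans.append((start, end, tweet[start:end]))
    return sorted(spans)


def embed(text, n=3):
    """Toy embedding (assumption): sparse counts of character n-grams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(mention, terms):
    """Stage 2: return (term_id, score) of the most similar dictionary term."""
    vec = embed(mention)
    scored = ((term_id, cosine(vec, embed(name))) for term_id, name in terms.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    tweet = "This med gives me awful headaches"
    for start, end, text in extract_mentions(tweet, ["headaches"]):
        term_id, score = normalize(text, TERMS)
        print(f"{text!r} [{start}:{end}] -> {term_id} (similarity {score:.2f})")
```

Character trigrams capture only surface overlap, so a colloquial phrase such as "can't sleep" shares no trigrams with "insomnia" and would not be matched by this toy `embed`. A learned sentence encoder is intended to close that gap, which is presumably why the second stage relies on a RoBERTa-based model.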