

workshop paper
DILAB at #SMM4H 2024: RoBERTa Ensemble for Identifying Children’s Medical Disorders in English Tweets
keywords:
social media mining for health research
children’s medical disorders
social media analysis
binary classification
natural language processing
transfer learning
This paper details our system developed for the 9th Social Media Mining for Health Research and Applications Workshop (SMM4H 2024), addressing Task 5, which focuses on binary classification of English tweets reporting children’s medical disorders. Our objective was to improve the detection of tweets related to children’s medical issues. To this end, we used several pretrained language models, including RoBERTa and BERT. We fine-tuned these models on the task-specific dataset, adjusting model layers and hyperparameters to optimize performance. Because we observed unstable fluctuations in performance metrics during training, we implemented an ensemble approach that combines predictions from different training epochs. Our best-performing configuration achieves an F1 score of 93.8% on the validation set and 89.8% on the test set.
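The epoch-level ensemble described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-epoch predictions are combined, so the soft voting used here (averaging positive-class probabilities across epoch checkpoints) and the 0.5 decision threshold are assumptions, as are the function names `ensemble_epoch_predictions` and `f1_binary` and the toy probabilities. This is not the authors' implementation.

```python
import numpy as np


def ensemble_epoch_predictions(epoch_probs, threshold=0.5):
    """Combine per-epoch positive-class probabilities by soft voting.

    epoch_probs: array-like of shape (n_epochs, n_tweets), where each row
    holds the probabilities produced by one saved epoch checkpoint.
    Averaging and the 0.5 threshold are assumptions, not the paper's method.
    """
    probs = np.asarray(epoch_probs, dtype=float)
    if probs.ndim != 2:
        raise ValueError("expected shape (n_epochs, n_tweets)")
    mean_probs = probs.mean(axis=0)  # average over epochs, per tweet
    return (mean_probs >= threshold).astype(int)


def f1_binary(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy probabilities from three hypothetical epoch checkpoints for three tweets.
epoch_probs = [
    [0.9, 0.4, 0.2],
    [0.7, 0.6, 0.1],
    [0.8, 0.3, 0.4],
]
labels = ensemble_epoch_predictions(epoch_probs)
score = f1_binary([1, 1, 0], labels)
```

In this toy case the second checkpoint alone would label the second tweet positive, while the averaged probability falls below the threshold. This shows how averaging smooths out epoch-to-epoch fluctuations in individual predictions.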