VIDEO DOI: https://doi.org/10.48448/hjmw-qn66

workshop paper

ACL 2024

August 15, 2024

Bangkok, Thailand

UTRad-NLP at #SMM4H 2024: Why LLM-Generated Texts Fail to Improve Text Classification Models

keywords: natural language processing, large language model, text classification, data augmentation, synthetic data

In this paper, we present our approach to addressing the binary classification tasks, Tasks 5 and 6, as part of the Social Media Mining for Health (SMM4H) text classification challenge. Both tasks involved working with imbalanced datasets that featured a scarcity of positive examples. To mitigate this imbalance, we employed a Large Language Model to generate synthetic texts with positive labels, aiming to augment the training data for our text classification models. Unfortunately, this method did not significantly improve model performance. Through clustering analysis using text embeddings, we discovered that the generated texts significantly lacked diversity compared to the raw data. This finding highlights the challenges of using synthetic text generation for enhancing model efficacy in real-world applications, specifically in the context of health-related social media data.
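
The diversity comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' clustering analysis: it swaps in a simpler proxy, the mean pairwise cosine distance between embeddings, where a lower value suggests less varied texts. The helper `toy_embeddings` and all of its parameters are hypothetical stand-ins for real text embeddings, and the numbers it produces are toy data, not results from the paper.

```python
import math
import random
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings):
    """Average cosine distance over all unordered pairs; higher means more diverse."""
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def toy_embeddings(n, dim, spread, seed):
    """Hypothetical stand-in for text embeddings: noisy points around one direction."""
    rng = random.Random(seed)
    center = [1.0] * dim
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(n)]


# Toy data only: a wide cloud stands in for raw posts, a tight one for generated texts.
raw = toy_embeddings(n=50, dim=16, spread=1.0, seed=0)
generated = toy_embeddings(n=50, dim=16, spread=0.1, seed=1)

raw_diversity = mean_pairwise_distance(raw)
generated_diversity = mean_pairwise_distance(generated)
print(f"raw: {raw_diversity:.3f}  generated: {generated_diversity:.3f}")
```

With real data, the two lists would hold embeddings of the original positive examples and of the LLM-generated texts. A clustering step, as mentioned in the abstract, could then show whether the generated texts fall into only a few of the clusters that the raw data occupies.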


