
workshop paper
IMS_medicALY at #SMM4H 2024: Detecting Impacts of Outdoor Spaces on Social Anxiety with Data Augmented
keywords:
outdoor spaces
social media health mining
anxiety
synthetic data generation
bionlp
ensemble
Many individuals affected by Social Anxiety Disorder turn to social media platforms to share their experiences and seek advice. This includes discussing the potential benefits of engaging with outdoor environments. As part of #SMM4H 2024, Shared Task 3 focuses on classifying the effects of outdoor spaces on social anxiety symptoms in Reddit posts. In our contribution to the task, we explore the effectiveness of domain-specific models (trained on social media data – SocBERT) against general domain models (trained on diverse datasets – BERT, RoBERTa, GPT-3.5) in predicting the sentiment related to outdoor spaces. Further, we assess the benefits of augmenting sparse human-labeled data with synthetic training instances and evaluate the complementary strengths of domain-specific and general classifiers using an ensemble model. Our results show that (1) fine-tuning small, domain-specific models outperforms large general language models in most cases. Only one large language model (GPT-4) exhibits performance comparable to the fine-tuned models (52% F1). Further, we find that (2) synthetic data improves the performance of fine-tuned models in some cases, and (3) models do not appear to complement each other in our ensemble setup.
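The abstract does not say how the ensemble combines its member classifiers. As a minimal sketch of one common option, the following assumes simple majority voting over per-post labels, with ties resolved in favor of the model listed first. The function name `majority_vote`, the variable names, the label strings, and the example predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one ensembled label list.

    `predictions` is a list with one entry per model; each entry is a list
    of labels, one per post, and all entries have the same length.
    Ties are broken by model order: the earliest model whose vote is among
    the most frequent labels wins (an assumption made for this sketch).
    """
    ensembled = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in model order) that reached the top count.
        ensembled.append(next(v for v in votes if counts[v] == top))
    return ensembled


# Hypothetical predictions from three classifiers on four posts.
domain_specific = ["positive", "neutral", "negative", "unrelated"]
general_a = ["positive", "negative", "negative", "neutral"]
general_b = ["neutral", "negative", "negative", "positive"]

combined = majority_vote([domain_specific, general_a, general_b])
print(combined)
```

In a scheme like this, the ensemble can only beat its best member when the members make different errors. That is consistent with finding (3), where the models do not appear to complement each other.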