

workshop paper
SMM4H’24 Task 6: Extracting Self-Reported Age with LLM and BERTweet: Fine-Grained Approaches for Social Media Text
keywords:
age extraction
llm
hugging face
bertweet
social media
The paper presents two distinct approaches to Task 6 of the SMM4H’24 workshop: extracting self-reported exact ages from posts on two prominent social media platforms, Twitter (now X) and Reddit. The first approach uses the Mistral-7B-Instruct-v0.2 Large Language Model (LLM); the second uses the pre-trained language model BERTweet. Both target robust and generalizable age classification that overcomes a limitation of existing methods, which rely on predefined age groups. The proposed models aim to advance the automatic extraction of self-reported exact ages from social media posts, enabling more nuanced analyses of user demographics across platforms.