VIDEO DOI: https://doi.org/10.48448/3y79-aq86

poster

ACL 2024

August 13, 2024

Bangkok, Thailand

Don’t Augment, Rewrite? Assessing Abusive Language Detection with Synthetic Data

keywords: synthetic data, privacy, abusive language detection

Research on abusive language detection and content moderation is crucial to combating online harm. However, current limitations set by regulatory bodies and social media platforms can make it difficult to share collected data. We address this challenge by exploring the possibility of replacing existing English datasets for abusive language detection with synthetic data obtained by rewriting the original texts with an instruction-based generative model.

We show that such data can be effectively used to train a classifier whose performance is in line with, and sometimes better than, that of a classifier trained on the original data. Training with synthetic data also seems to improve robustness in a cross-dataset setting. A manual inspection of the generated data confirms that rewriting makes it impossible to retrieve the original texts online.
