EMNLP 2025

November 05, 2025

Suzhou, China

Detecting offensive language in Chinese is challenging because writers use homophonic substitutions to evade detection. We propose a framework to improve large language models’ robustness against such phonetic attacks. First, we construct HED-COLD, a homophone-enhanced dataset based on the Chinese Offensive Language Dataset (COLD). Second, we propose a homophone-aware pretraining strategy that aligns semantics and fuses features to learn robust mappings between original and perturbed text. Experimental results show that our approach achieves state-of-the-art performance on both the COLD test set and the toxicity benchmark ToxiCloakCN. Notably, it achieves greater gains in domains especially prone to homophonic attacks, such as gender and regional content. These results demonstrate improved robustness and generalization against phonetic adversarial attacks.
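
To make the idea of homophonic substitution concrete, the following is a minimal sketch of how original and perturbed text pairs could be generated. It is not the authors' procedure for building HED-COLD, which the abstract does not specify. The `HOMOPHONES` table, the `perturb` and `make_pairs` helpers, and the `rate` parameter are all hypothetical names introduced here for illustration, and the table holds only a few hand-picked same-pronunciation characters.

```python
import random

# Hypothetical toy homophone table (same pinyin and tone). This is an
# assumption for illustration; a real pipeline would presumably derive
# candidates from a pronunciation lexicon.
HOMOPHONES = {
    "你": ["拟"],        # nǐ
    "是": ["事", "市"],  # shì
    "他": ["她", "它"],  # tā
    "蛋": ["但", "淡"],  # dàn
}


def perturb(text, rate=0.5, rng=None):
    """Swap each character that has a listed homophone with probability `rate`."""
    rng = rng or random.Random()
    out = []
    for ch in text:
        candidates = HOMOPHONES.get(ch)
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))  # phonetic substitution
        else:
            out.append(ch)  # no homophone listed, or not sampled
    return "".join(out)


def make_pairs(texts, rate=0.5, seed=0):
    """Build (original, perturbed) pairs with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [(t, perturb(t, rate, rng)) for t in texts]
```

Pairs of this shape are the kind of aligned input that a homophone-aware training objective could consume: the two strings differ in characters but are meant to be read, and therefore classified, the same way.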
