EMNLP 2025

November 08, 2025

Suzhou, China

We tackle diacritic restoration for Arabic dialectal sentences using a multimodal model that combines text and speech. The text stream uses our own pretrained model, CATT, and the speech stream uses the Whisper-base encoder; a linear classification head on top produces token-level predictions. We integrate the two modalities via either Early Fusion or Cross-Attention Fusion, and the system remains robust when speech is absent. On both the official development and test sets, the model outperforms the baseline and the other participants' systems in WER and CER, and it maintains its advantage on challenging pronunciations.
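
The cross-attention variant with a text-only fallback can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fusion details, so the function names, the residual addition, and the omission of learned query/key/value projections are all assumptions made for illustration. Plain Python lists stand in for the CATT and Whisper-base encoder outputs.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention_fuse(text_states, speech_states=None):
    """Fuse per-token text vectors with speech frames (hypothetical helper).

    text_states: list of T vectors, one per text token.
    speech_states: list of S vectors of the same width, or None.
    Returns T vectors: each text vector plus its attended speech context.
    """
    if not speech_states:
        # Speech absent: fall back to the text stream alone (assumed behaviour).
        return [list(t) for t in text_states]
    scale = math.sqrt(len(text_states[0]))
    fused = []
    for t in text_states:
        # Each text token attends over all speech frames.
        weights = softmax([dot(t, s) / scale for s in speech_states])
        context = [
            sum(w * s[i] for w, s in zip(weights, speech_states))
            for i in range(len(t))
        ]
        # Residual addition is an assumption, not taken from the abstract.
        fused.append([a + b for a, b in zip(t, context)])
    return fused

def classify(fused, head_weights):
    """Linear head: one weight row per class, argmax per token."""
    preds = []
    for vec in fused:
        logits = [dot(vec, w) for w in head_weights]
        preds.append(logits.index(max(logits)))
    return preds
```

Calling `cross_attention_fuse(text_states)` without a speech argument returns the text vectors unchanged, so the same classification head can be applied whether or not audio is available.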
