EMNLP 2025

November 08, 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Grapheme-to-phoneme (G2P) models are essential components in text-to-speech (TTS) and pronunciation assessment applications. While standard forms of languages have gained attention in that regard, dialectal speech, which often serves as the primary means of spoken communication for many communities, as it is the case for Arabic, has not received the same level of focus. In this paper, we introduce an end-to-end dialectal G2P for Egyptian Arabic, a dialect without standard orthography. Our novel architecture accomplishes three tasks: (i) restores short vowels of the diacritical marks for the dialectal text; (ii) maps certain characters that happen only in the spoken version of the dialectal Arabic to their dialect-specific character transcriptions; and finally (iii) converts the previous step output to the corresponding phoneme sequence. We benchmark G2P on a modular cascaded system, a large language model, and our multi-task end-to-end architecture.

Downloads

Paper

Next from EMNLP 2025

The AraGenEval Shared Task on Arabic Authorship Style Transfer and AI Generated Text Detection
workshop paper

The AraGenEval Shared Task on Arabic Authorship Style Transfer and AI Generated Text Detection

EMNLP 2025

+6Hamza Alami
Ahmed Abdelali and 8 other authors

08 November 2025

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2026 Underline - All rights reserved