EMNLP 2025

November 05, 2025

Suzhou, China


Speech learning involves controlling a complex motor system for uttering speech sounds from articulatory gestures and discovering a set of discrete and invariant units that provide entry to the linguistic system. Importantly, children seem to learn the relationships between speech sounds, the corresponding articulatory gestures, and these units in a weakly-supervised manner, with no explicit labeling of auditory inputs and no access to the articulatory gestures they should produce to reach an acoustic target. In this study, we propose a computational agent learning to drive a virtual vocal apparatus in order to repeat an auditory speech input. This model combines i) an articulatory synthesizer able to reproduce complex speech stimuli from a limited set of interpretable articulatory parameters, ii) two internal models respectively providing articulatory-to-acoustic forward predictions and acoustic-to-articulatory inverse computations, and iii) a (discrete) speech unit discovery module based on vector-quantized variational autoencoders (VQ-VAE). From this architecture, we provide two contributions. In a first experiment, we analyze the quantized embeddings learned by the VQ-VAE from ground truth data, and we show an interesting complementarity between acoustic and articulatory modalities which is potentially useful for the discovery of invariance. Then, we evaluate the performance of the proposed agent both at the acoustic and articulatory levels. We show that while most of the agent's productions are intelligible, the underlying articulatory trajectories of those productions are not systematically plausible. Finally, we present future perspectives for testing a developmental scenario for speech learning using end-to-end neural models.
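The speech unit discovery module described above quantizes continuous embeddings against a learned codebook. The following is a minimal illustrative sketch of that nearest-codebook quantization step only; the codebook size, embedding dimension, and data here are arbitrary choices for demonstration, not values from the paper.

```python
import numpy as np

# Sketch of VQ-VAE quantization: each continuous frame embedding is
# snapped to its nearest codebook vector, yielding a discrete unit index.
# Sizes below (16 units, dim 8) are illustrative assumptions.

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))       # 16 discrete units, dimension 8

def quantize(frames, codebook):
    """Map each frame in (n, d) to the index of its nearest codebook entry."""
    # Squared Euclidean distance from every frame to every code vector.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)        # one discrete unit per frame
    return indices, codebook[indices]     # unit indices and quantized vectors

frames = rng.normal(size=(5, 8))          # 5 input frames (e.g. acoustic features)
indices, quantized = quantize(frames, codebook)
print(indices.shape, quantized.shape)     # (5,) (5, 8)
```

In the full model, such quantized embeddings can be learned from either acoustic or articulatory inputs, which is what allows the two modalities to be compared as in the paper's first experiment.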

