
David R Mortensen
morphology
multilinguality
computational historical linguistics
interlinear gloss
protoform reconstruction
large language models
natural language processing
annotation
tokenization
benchmark
emergent communication
low-resource language
automatic speech recognition
endangered languages
low resource languages
12 presentations
5 views
SHORT BIO
David Mortensen is an Assistant Research Professor in the Language Technologies Institute at Carnegie Mellon University. His research spans computational phonology, morphology, and historical linguistics, as well as speech technology and information extraction.
Presentations

Zero-Shot Cross-Lingual NER Using Phonemic Representations for Low-Resource Languages
Jimin Sohn and 5 other authors

Semisupervised Neural Proto-Language Reconstruction
Liang Lu and 2 other authors

Wav2Gloss: Generating Interlinear Glossed Text from Speech
Taiqi He and 8 other authors

XferBench: a Data-Driven Benchmark for Emergent Language
Brendon Boldt and 1 other author

Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia and 6 other authors

Counting the Bugs in ChatGPT’s Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model
Leonie Weissweiler and 12 other authors

Transformed Protoform Reconstruction
Young Min Kim and 3 other authors

WikiHan: A New Comparative Dataset for Chinese Languages
Kalvin Chang and 3 other authors

Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese
Chenxuan Cui and 2 other authors

Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble
David R Mortensen and 4 other authors

Evaluating the Morphosyntactic Well-formedness of Generated Texts
Adithya Pratapa and 6 other authors

Evaluating the Morphosyntactic Well-formedness of Generated Texts
Adithya Pratapa and 6 other authors