Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.
keywords:
word order flexibility
position encodings
morphological complexity
language modelling
Language model architectures are predominately first created for English and afterwards applied to other languages. This can lead to problems for languages that are structurally different from English. We study one specific architectural choice: positional encodings. We do this through the lens of the trade-off hypothesis: the supposed interplay between morphological complexity and word order flexibility. This hypothesis states there exists a trade-off between the two: a more morphologically complex language can have a more flexible word order, and vice-versa. Positional encodings are a direct target to investigate the implications of this hypothesis in relation to language modelling. We pre-train and evaluate three monolingual model variants with absolute, relative and no position encodings for seven typologically diverse languages and evaluate on four downstream tasks. We fail to find a consistent trend with various proxies for morphological complexity and word order flexibility. Our work shows choice of tasks, languages, and metrics are essential for drawing stable conclusions.