Verbs occur in a particular syntactic environment (frame) along with their arguments. In this paper we introduce a new Hindi verb alternations benchmark to investigate whether pretrained large language models (LLMs) can infer the frame-selectional properties of Hindi verbs. Our benchmark consists of minimal pairs such as 'Tina cut the wood' / 'Tina disappeared the wood' that are annotated with human judgments. We expect LLMs to assign lower probability to the unacceptable sentence in each pair. We create four variants of these alternations for Hindi to test knowledge of verb morphology and argument case-marking. Our results show that a masked monolingual model performs best, while causal models fare poorly. We further test the quality of the predictions using a cloze-style sentence completion task. While the models appear to infer the right mapping between verbal morphology and valency in the acceptability task, they do not generate the right verbal morphology in the cloze task. The model completions also lack pragmatic and world knowledge. Unlike other syntactic phenomena such as agreement, verbal alternations require LLMs to make both syntactic and semantic generalizations. Our work points towards the need for greater cross-linguistic investigation of verbal alternations.
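The minimal-pair evaluation described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `minimal_pair_accuracy`, the toy scorer `toy_log_prob`, and its word list are all hypothetical. In a real setup the scorer would presumably be replaced by a sentence-level score from a language model (for example, summed token log-probabilities for a causal model), which is an assumption here. The single pair is the example from the abstract.

```python
from typing import Callable, Iterable, Tuple

# A minimal pair: (acceptable sentence, unacceptable sentence).
MinimalPair = Tuple[str, str]


def minimal_pair_accuracy(
    pairs: Iterable[MinimalPair],
    log_prob: Callable[[str], float],
) -> float:
    """Fraction of pairs where the acceptable sentence scores strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one minimal pair is required")
    correct = sum(1 for good, bad in pairs if log_prob(good) > log_prob(bad))
    return correct / len(pairs)


# Hypothetical stand-in for a model: a flat cost per token, plus a penalty
# when a verb from a hand-written intransitive list is followed by more words.
# This is an assumption for illustration only, not a real language model.
TOY_INTRANSITIVE_VERBS = {"disappeared"}


def toy_log_prob(sentence: str) -> float:
    tokens = sentence.lower().split()
    score = -1.0 * len(tokens)
    for i, token in enumerate(tokens):
        if token in TOY_INTRANSITIVE_VERBS and i < len(tokens) - 1:
            score -= 5.0  # intransitive verb used with a following object
    return score


PAIRS = [("Tina cut the wood", "Tina disappeared the wood")]

if __name__ == "__main__":
    print(minimal_pair_accuracy(PAIRS, toy_log_prob))
```

Keeping the scorer as a plain callable means the same accuracy computation could be reused for masked and causal models, which would differ only in how they score a sentence.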