
workshop paper
How Useful is Context, Actually? Comparing LLMs and Humans on Discourse Marker Prediction
keywords:
llm
discourse marker
cloze test
context
mturk
pragmatics
This paper investigates the adverbial discourse particle "actually". We compare LLM and human performance on cloze tests involving "actually", using examples sourced from the Providence Corpus of speech around children. We explore the impact of utterance context on cloze test performance. We find that context is always helpful, though how much additional context helps, and which relative placement of context (i.e., before or after the masked word) helps most, differs across individual models and humans. The best-performing LLM, GPT-4, narrowly outperforms humans. In an additional experiment, we explore cloze performance on synthetic LLM-generated examples and find that several models vastly outperform humans.
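
To make the context manipulation concrete, the following is a minimal sketch of how a cloze item with a variable amount of preceding and following context could be assembled. It is an illustration only, not the paper's actual pipeline: the function name `build_cloze_item`, its parameters, the `[MASK]` placeholder, and the three-line dialogue are all assumptions invented for this example (the dialogue is not taken from the Providence Corpus).

```python
import re


def build_cloze_item(utterances, target_index, target_word="actually",
                     n_before=0, n_after=0, mask_token="[MASK]"):
    """Mask the first occurrence of target_word in one utterance and
    attach up to n_before preceding and n_after following utterances.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Match the target as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(target_word) + r"\b", re.IGNORECASE)
    # A callable replacement avoids backslash processing in mask_token.
    masked, n_subs = pattern.subn(lambda m: mask_token,
                                  utterances[target_index], count=1)
    if n_subs == 0:
        raise ValueError("target word not found in the target utterance")

    # Clamp the window so it never runs past the start of the dialogue.
    start = max(0, target_index - n_before)
    before = utterances[start:target_index]
    # Slicing past the end of a list is safe in Python, so no clamp is needed.
    after = utterances[target_index + 1:target_index + 1 + n_after]
    return before + [masked] + after


# Invented dialogue, used only to exercise the helper.
dialogue = [
    "Do you want the red cup?",
    "No, I actually want the blue one.",
    "Okay, here you go.",
]

no_context = build_cloze_item(dialogue, 1)
left_only = build_cloze_item(dialogue, 1, n_before=1)
right_only = build_cloze_item(dialogue, 1, n_after=1)
both_sides = build_cloze_item(dialogue, 1, n_before=1, n_after=1)
```

Varying `n_before` and `n_after` independently yields the conditions the abstract contrasts: no context, context before the masked word, context after it, or both. The same item could then be shown to a human annotator or passed to a model as a fill-in-the-blank prompt.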