

poster
Unexpected Phenomenon: LLMs' Spurious Associations in Information Extraction
Keywords:
spurious associations
large language model
information extraction
Information extraction plays a critical role in natural language processing. When applying large language models (LLMs) to this domain, we discover an unexpected phenomenon: LLMs' spurious associations. In tasks such as relation extraction, LLMs can accurately identify entity pairs even when the given relation (label) is semantically unrelated to the pre-defined original one. To find such labels, we design two strategies in this study: forward label extension and backward label validation. We also leverage the extended labels to improve model performance. Our comprehensive experiments show that spurious associations occur consistently in both Chinese and English datasets across various LLM sizes. Moreover, using extended labels significantly enhances LLM performance on information extraction tasks, yielding F1 gains of 9.55%, 11.42%, and 21.27% on the SciERC, ACE05, and DuEE datasets, respectively.
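The probing setup described above can be sketched as prompt construction. This is a minimal illustrative sketch, not the authors' exact templates: the relation labels, the prompt wording, and both helper functions are hypothetical, and no actual LLM call is made.

```python
# Hypothetical sketch of probing for spurious associations in relation
# extraction. All labels and prompt wording are illustrative assumptions.

ORIGINAL_LABEL = "employer"                          # pre-defined relation label
EXTENDED_LABELS = ["works for", "affiliated with"]   # forward label extension: candidate substitutes
UNRELATED_LABELS = ["banana", "weather"]             # semantically unrelated controls


def build_extraction_prompt(sentence: str, label: str) -> str:
    """Ask the model to extract entity pairs for a given relation label."""
    return (
        f"Sentence: {sentence}\n"
        f"Extract all entity pairs linked by the relation '{label}'. "
        "Answer as (head, tail)."
    )


def build_validation_prompt(candidate: str, original: str) -> str:
    """Backward label validation: ask whether a candidate label is a
    plausible substitute for the original relation."""
    return (
        f"Is '{candidate}' a reasonable alternative name for the relation "
        f"'{original}'? Answer yes or no."
    )


sentence = "Alice joined Acme Corp in 2020."
probes = [build_extraction_prompt(sentence, lbl)
          for lbl in [ORIGINAL_LABEL] + EXTENDED_LABELS + UNRELATED_LABELS]
# If the model still returns (Alice, Acme Corp) for an unrelated label such
# as 'banana', that behavior is a spurious association.
```

In this sketch, forward extension generates candidate substitute labels, while backward validation filters them; the paper's finding is that even labels that fail validation (the unrelated controls) can still elicit the correct entity pairs.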