EMNLP 2025

November 07, 2025

Suzhou, China


Pinpointing the subset of general data from large corpora that best enhances specific capabilities in large language models (LLMs) is an effective way to address domain-specific data scarcity and reduce training costs. However, mainstream methods, which rely on observing changes in benchmark metrics or on gradient similarity during training, are based on superficial observations and fail to link LLM capabilities to internal model components. They therefore lack interpretability and yield inefficient, limited performance gains. In this work, we propose a lightweight and highly interpretable method that associates LLM capabilities with internal components, specifically identifying a correspondence between specific capabilities and attention heads. We first delineate five fundamental application capabilities of LLMs. We then use probing techniques to identify the attention heads corresponding to each capability, thereby establishing a mapping between these capabilities and internal model components. For targeted instruction tuning, we decompose the complex abilities required by intricate tasks into combinations of these fundamental capabilities and select data directly using the corresponding attention heads. In experiments on LLaMA3.1-8B and Qwen2.5-7B, the method achieves a discrimination accuracy of over 70% for various capabilities. Results on the MMLU and BBH datasets show that our method outperforms the gradient-based selection method LESS by 1.3%-2% and other intermediate-state-based selection methods by 5%-6%.
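
The probe-then-select idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the probe type, so a simple mean-difference linear probe is assumed here, the per-head features are synthetic Gaussian vectors rather than real attention-head outputs, and every name (`fit_probe`, `select`, `INFORMATIVE`, and so on) is hypothetical. It only shows the general shape of the pipeline: fit one probe per head, rank heads by held-out discrimination accuracy, then score candidate examples with the top-ranked head.

```python
import random

random.seed(0)

DIM = 4                                    # toy per-head feature size (assumption)
HEADS = [(0, 0), (0, 1), (1, 0), (1, 1)]   # hypothetical (layer, head) ids
INFORMATIVE = (1, 0)                       # only this head carries signal in the toy data
SHIFT = 3.0                                # size of the synthetic capability signal

def sample(head, has_capability):
    """Synthetic feature vector for one example at one head."""
    x = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    if has_capability and head == INFORMATIVE:
        x[0] += SHIFT
    return x

def make_split(n):
    """Return {head: (pos_rows, neg_rows)} with n examples per class."""
    return {h: ([sample(h, True) for _ in range(n)],
                [sample(h, False) for _ in range(n)]) for h in HEADS}

def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_probe(pos, neg):
    """Mean-difference linear probe: boundary at the midpoint of the class means."""
    mu_p, mu_n = mean_vector(pos), mean_vector(neg)
    w = [p - q for p, q in zip(mu_p, mu_n)]
    b = -sum(wi * (p + q) / 2 for wi, p, q in zip(w, mu_p, mu_n))
    return w, b

def probe_score(probe, x):
    w, b = probe
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def probe_accuracy(probe, pos, neg):
    hits = sum(probe_score(probe, x) > 0 for x in pos)
    hits += sum(probe_score(probe, x) <= 0 for x in neg)
    return hits / (len(pos) + len(neg))

# Step 1: fit one probe per head and rank heads by held-out accuracy.
train, heldout = make_split(100), make_split(100)
ranked = []
for h in HEADS:
    probe = fit_probe(*train[h])
    ranked.append((probe_accuracy(probe, *heldout[h]), h, probe))
ranked.sort(key=lambda t: t[0], reverse=True)

# Step 2: score unlabeled candidates with the top-ranked head(s) and keep the best.
labels = [i % 2 == 0 for i in range(20)]   # ground truth, used only to inspect the result
candidates = [{h: sample(h, lab) for h in HEADS} for lab in labels]

def select(candidates, ranked, top_k_heads=1, n_select=5):
    top = ranked[:top_k_heads]
    scores = [sum(probe_score(p, c[h]) for _, h, p in top) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return order[:n_select]

selected = select(candidates, ranked)
print("top head:", ranked[0][1], "held-out accuracy:", round(ranked[0][0], 3))
print("selected candidate indices:", selected)
```

In a real setting the synthetic `sample` function would be replaced by features read from a model's attention heads, and the five fundamental capabilities would each get their own set of probes; how the paper combines heads across capabilities is not stated in the abstract, so the summed score used in `select` is only one plausible choice.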


