Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.
The ability to track entities is fundamental for language understanding, yet the internal mechanisms governing this capability in Small Language Models (SLMs) are poorly understood. Previous studies often rely on indirect probing or complex interpretability methods, leaving a gap for lightweight diagnostics that connect model behavior to performance. To bridge this gap, we introduce a framework to analyze entity tracking by measuring the attention flow between entity and non-entity tokens within SLMs. We apply this to analyze models both before and after Parameter-Efficient Fine-Tuning (PEFT). Our analysis reveals two key findings. First, SLMs' attentional strategies vary significantly with text type, but entities consistently receive a high degree of focus. Second, we show that PEFT -- specifically QLoRA -- dramatically improves classification performance on entity-centric tasks by increasing the model's attentional focus on entity-related tokens. Our work provides direct evidence for how PEFT can refine a model's internal mechanisms and establishes attention analysis as a valuable, lightweight diagnostic tool for interpreting and improving SLMs.
