Representation Finetuning (ReFT) has recently emerged as an efficient paradigm for adapting pretrained language models by editing hidden representations rather than model weights. However, our preliminary experiments reveal that ReFT is notably more sensitive to training data quality than traditional parameter-efficient finetuning methods, particularly to samples with incorrect labels, which can severely degrade performance. Inspired by prior work demonstrating that the hidden representations of generalizable neural networks exhibit low-dimensional manifold structure, we hypothesize that effective generalization in ReFT requires geometrically structured transformations between pre- and post-intervention representations. This implies that the intervention vectors representing these transformations should form a low-dimensional manifold, making the inconsistent transformations induced by label noise detectable as geometric outliers. To leverage this insight, we introduce Aligning Interventions on a learned Manifold (AIM), a representation-based data filtering method for ReFT, which identifies high-quality training samples by measuring the geometric consistency of their intervention vectors with respect to a robust reference manifold derived via principal component analysis on trusted data. Extensive experiments on both commonsense and arithmetic reasoning tasks confirm the effectiveness of AIM, showing consistent improvements over strong data selection baselines across multiple model scales.
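The core filtering idea — fit a PCA subspace on intervention vectors from trusted data, then score each candidate sample by its distance to that subspace — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, the rank `k`, and the median-based keep threshold are all assumptions introduced here for clarity.

```python
import numpy as np

def fit_reference_manifold(trusted_vectors, k):
    """PCA via SVD on mean-centered trusted intervention vectors.

    Returns the mean and the top-k principal directions, which together
    define the linear reference manifold (an assumption of this sketch;
    the paper's manifold estimate may be more robust).
    """
    mu = trusted_vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(trusted_vectors - mu, full_matrices=False)
    return mu, vt[:k]

def geometric_consistency(vectors, mu, components):
    """Distance from each intervention vector to the reference manifold:
    the norm of the residual after projecting onto the PCA subspace.
    Smaller scores mean more geometrically consistent samples."""
    centered = vectors - mu
    projected = centered @ components.T @ components
    return np.linalg.norm(centered - projected, axis=1)

# Toy usage: trusted vectors lie (by construction) in a rank-8 subspace
# of a 64-dim representation space; candidates are unstructured noise.
rng = np.random.default_rng(0)
trusted = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 64))
candidates = rng.normal(size=(50, 64))

mu, comps = fit_reference_manifold(trusted, k=8)
scores = geometric_consistency(candidates, mu, comps)
# Keep the most consistent half; the 50% cutoff is illustrative only.
keep = scores < np.quantile(scores, 0.5)
```

In this sketch, label-noise-induced interventions would show up as large residuals (high `scores`) and be filtered out, while samples whose intervention vectors align with the trusted subspace are retained.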
