Large Language Models (LLMs) have demonstrated remarkable generalization capabilities, yet aligning their outputs with human preferences typically requires expensive supervised fine-tuning. In this paper, we introduce a novel paradigm, the Textual Network, which enables test-time preference optimization (TPO) without any parameter updates. Unlike traditional numerical or gradient-based alignment methods, our approach operates entirely in the space of natural language, where both the attention mechanism and output refinement are realized through LLM-interpretable textual modules. Our proposed Textual Self-Attention Network (TSAN) emulates the core principles of self-attention by constructing a latent Q-K-V-style Textual Network: (1) candidate responses are scored and formatted as textual keys and values, (2) an LLM-based attention module interprets their relevance to the user query in natural language, and (3) a textual aggregator synthesizes a new, preference-aligned response guided by the learned attention. All components operate in the textual gradient space, enabling iterative optimization with interpretable updates and no gradient backpropagation through model weights. Empirical evaluations on instruction-following, alignment, safety, and mathematical reasoning tasks show that a base SFT model equipped with TSAN outperforms supervised models such as Llama-3.1-70B-Instruct and surpasses the state-of-the-art test-time alignment method, TPO, after just a few test-time iterations.
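The three-step textual Q-K-V pipeline described above can be sketched in code. This is a minimal illustrative sketch only: the function names, prompt wording, and the `llm` stub are assumptions for exposition, not the authors' implementation, and a real system would call an actual LLM where the stub appears.

```python
# Hedged sketch of the TSAN loop: candidates become textual keys/values,
# an LLM "attention" step reasons about their relevance in natural language,
# and a textual aggregator synthesizes a refined response. No model weights
# are updated; all "optimization" happens in text.

def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; a trivial stub so the sketch runs."""
    return f"[LLM output for: {prompt[:50]}...]"

def textual_keys_values(query: str, candidates: list[str]) -> list[dict]:
    """Step 1: score each candidate and format it as a textual key/value pair."""
    kv = []
    for i, cand in enumerate(candidates):
        score = llm(f"Score candidate {i} as an answer to '{query}': {cand}")
        kv.append({"key": score, "value": cand})
    return kv

def textual_attention(query: str, kv: list[dict]) -> str:
    """Step 2: an LLM attention module explains, in natural language,
    how relevant each key/value pair is to the user query."""
    listing = "\n".join(f"- key: {p['key']}\n  value: {p['value']}" for p in kv)
    return llm(f"Assess the relevance of each candidate to '{query}':\n{listing}")

def textual_aggregate(query: str, attention_text: str) -> str:
    """Step 3: a textual aggregator writes a new, preference-aligned
    response guided by the attention analysis (a textual-gradient update)."""
    return llm(
        f"Guided by this relevance analysis:\n{attention_text}\n"
        f"Write an improved response to '{query}'."
    )

def tsan_iterate(query: str, candidates: list[str], n_iters: int = 2) -> str:
    """Iterative test-time refinement: each round's refined response
    rejoins the candidate pool for the next round."""
    response = candidates[0]
    for _ in range(n_iters):
        kv = textual_keys_values(query, candidates)
        attn = textual_attention(query, kv)
        response = textual_aggregate(query, attn)
        candidates = candidates + [response]
    return response
```

The design point this sketch highlights is that every intermediate artifact (scores, attention, aggregation) is ordinary text, so the whole refinement loop is interpretable and requires no backpropagation.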
