Thailand

Aligning large language models (LLMs) with human expectations without human-annotated preference data is an important problem. In this paper, we propose a method to evaluate the response preference by using the output probabilities of response pairs under contrastive prompt pairs, which could achieve better performance on LLaMA2-7B and LLaMA2-13B compared to RLAIF. Based on this, we propose an automatic alignment method, Direct Large Model Alignment (DLMA). First, we use contrastive prompt pairs to automatically generate preference data. Then, we continue to evaluate the generated preference data using contrastive prompt pairs and calculate a self-rewarding score. Finally, we use the DPO algorithm to effectively align LLMs by combining this self-rewarding score. In the experimental stage, our DLMA method could surpass the RLHF method without relying on human-annotated preference data.

ACL 2024

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation

large language model

alignment

safety

poster

### Welcome!
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) will take place in Bangkok, Thailand from August 11th to 16th, 2024. Our Virtual Poster Sessions will take place online Thursday, August 22, 2024.

You are required to register for this event. **Please register [here](https://2024.aclweb.org/registration). **

If you have already registered, please check your inbox for an email from Underline granting you access to ACL 2024 content.

Please register!

The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) will take place in Bangkok, Thailand from August 11th to 16th, 2024. More information will be announced soon.

Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment. Distilling Step-by-Step (DSS), a novel method utilizing chain-of-thought (CoT) distillation, has demonstrated promise by imbuing smaller models with the superior reasoning capabilities of their larger counterparts. In DSS, the distilled model acquires the ability to generate rationales and predict labels concurrently through a multi-task learning framework. However, DSS overlooks the intrinsic relationship between the two training tasks, leading to ineffective integration of CoT knowledge with the task of label prediction. To this end, we investigate the mutual relationship of the two tasks from Information Bottleneck perspective and formulate it as maximizing the mutual information of the representation features of the two tasks. We propose a variational approach to solve this optimization problem using a learning-based method. Our experimental results across four datasets demonstrate that our method outperforms the state-of-the-art DSS. 
Our findings offer insightful guidance for future research on language model distillation as well as applications involving CoT. Codes are available at https://github.com/xinchen9/cot_distillation_ACL2024.

Learning to Maximize Mutual Information for Chain-of-Thought Distillation

Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens present in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to subject tokens, they recall object tokens and increase the output logits of those object tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs.

Identifying Semantic Induction Heads to Understand In-Context Learning

Dialogue discourse parsing (DDP) aims to capture the relations between utterances in the dialogue. In everyday real-world scenarios, dialogues are typically multi-modal and cover open-domain topics. However, most existing widely used benchmark datasets for DDP contain only textual modality and are domain-specific. This makes it challenging to accurately and comprehensively understand the dialogue without multi-modal clues, and prevents them from capturing the discourse structures of the more prevalent daily conversations. This paper proposes MODDP, the first multi-modal Chinese discourse parsing dataset derived from open-domain daily dialogues, consisting 864 dialogues and 18,114 utterances, accompanied by 12.7 hours of video clips. We present a simple yet effective benchmark approach for multi-modal DDP. Through extensive experiments, we present several benchmark results based on MODDP. The significant improvement in performance from introducing multi-modalities into the original textual unimodal DDP model demonstrates the necessity of integrating multi-modalities into DDP.

MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing

Large language models (LLMs) can generate long-form and coherent text, yet they often hallucinate facts, which undermines their reliability. To mitigate this issue, inference-time methods steer LLM representations toward the "truthful directions" previously learned for truth elicitation. However, applying these truthful directions with the same intensity fails to generalize across different query contexts. We propose LITO, a Learnable Intervention method for Truthfulness Optimization that automatically identifies the optimal intervention intensity tailored to each specific context. LITO explores a sequence of model generations based on increasing levels of intervention intensities. It selects the most accurate response or refuses to answer when the predictions are highly uncertain. Experiments on multiple LLMs and question-answering datasets demonstrate that LITO improves truthfulness while preserving task accuracy. The adaptive nature of LITO counters the limitations of one-size-fits-all intervention methods, maximizing truthfulness by reflecting the model's internal knowledge only when it is confident. Our code is available at https://github.com/launchnlp/LITO.

Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression

Videos are more informative than images because
they capture the dynamics of the scene.
By representing motion in videos, we can capture
dynamic activities. In this work, we introduce
GPT-4 generated motion descriptions that
capture fine-grained motion descriptions of activities
and apply them to three action datasets.
We evaluated several video-text models on the
task of retrieval of motion descriptions. We
found that they fall far behind human expert
performance on two action datasets, raising
the question of whether video-text models understand
motion in videos. To address it, we
introduce a method of improving motion understanding
in video-text models by utilizing
motion descriptions. This method proves to
be effective on two action datasets for the motion
description retrieval task. The results draw
attention to the need for quality captions involving
fine-grained motion information in existing
datasets and demonstrate the effectiveness of
the proposed pipeline in understanding finegrained
motion during video-text retrieval.

Diving Deep into the Motion Representation of Video-Text Models

While large language and vision-language models showcase impressive capabilities, they face a notable limitation: the inability to connect language with the physical world. To bridge this gap, research has focused on embodied language learning, where the language learner is situated in the world, perceives it, and interacts with it. This article explores the current standing of research in embodied language learning, highlighting opportunities and discussing common challenges. Lastly, it identifies existing gaps from the perspective of language understanding research within the embodied world and suggests potential future directions.

Embodied Language Learning: Opportunities, Challenges, and Future Directions

It is increasingly common to evaluate the same coreference resolution (CR) model on multiple datasets. Do these multi-dataset evaluations allow us to draw meaningful conclusions about model generalization? Or, do they rather reflect the idiosyncrasies of a particular experimental setup (e.g., the specific datasets used)? To study this, we view evaluation through the lens of measurement modeling, a framework commonly used in the social sciences for analyzing the validity of measurements. By taking this perspective, we show how multi-dataset evaluations risk conflating different factors concerning what, precisely, is being measured. This in turn makes it difficult to draw more generalizable conclusions from these evaluations. For instance, we show that across seven datasets, measurements intended to reflect CR model generalization are often correlated with differences in both how coreference is defined and how it is operationalized; this limits our ability to draw conclusions regarding the ability of CR models to generalize across any singular dimension. We believe the measurement modeling framework provides the needed vocabulary for discussing challenges surrounding what is actually being measured by CR evaluations.

Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective

In this paper, we propose a textless acoustic model with a self-supervised distillation strategy for noise-robust expressive speech-to-speech translation (S2ST).
Recently proposed expressive S2ST systems have achieved impressive expressivity preservation performances by cascading unit-to-speech (U2S) generator to the speech-to-unit translation model. 
However, these systems are vulnerable to the presence of noise in input speech, which is an assumption in real-world translation scenarios. 
To address this limitation, we propose a U2S generator that incorporates a distillation with no label (DINO) self-supervised training strategy into it's pretraining process.
Because the proposed method captures noise-agnostic expressivity representation, it can generate qualified speech even in noisy environment.
Objective and subjective evaluation results verified that the proposed method significantly improved the performance of the expressive S2ST system in noisy environments while maintaining competitive performance in clean environments.

Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation

We investigate intention detection in persuasive multi-turn dialogs employing the largest available Large Language Models (LLMs).
Much of the prior research measures the intention detection capability of machine learning models without considering the conversational history.
To evaluate LLMs' intention detection capability in conversation, we modified the existing datasets of persuasive conversation and created datasets using a multiple-choice paradigm.
It is crucial to consider others' perspectives through their utterances when engaging in a persuasive conversation, especially when making a request or reply that is inconvenient for others.
This feature makes the persuasive dialogue suitable for the dataset of measuring intention detection capability.
We incorporate the concept of `face acts,' which categorize how utterances affect mental states.
This approach enables us to measure intention detection capability by focusing on crucial intentions and to conduct comprehensible analysis according to intention types.

Evaluating Intention Detection Capability of Large Language Models in Persuasive Dialogues

A substantial body of work has provided evidence that the lexicons of natural languages are organized to support efficient communication. However, existing work has largely focused on word-internal properties, such as Zipf’s observation that more frequent words are optimized in form to minimize communicative cost. Here, we investigate the hypothesis that efficient lexicon organization is also reflected in valency, or the combinations and orders of additional words and phrases a verb selects for in a sentence. We consider two measures of valency diversity for verbs: valency frame count (VFC), the number of distinct frames associated with a verb, and valency frame entropy (VFE), the average information content of frame selection associated with a verb. Using data from 79 languages, we provide evidence that more frequent verbs are associated with a greater diversity of valency frames, suggesting that the organization of valency is consistent with communicative efficiency principles. We discuss our findings in relation to classical findings such as Zipf’s meaning-frequency law and the principle of least effort, as well as implications for theories of valency and communicative efficiency principles.

Downloads

Next from ACL 2024

Learning to Maximize Mutual Information for Chain-of-Thought Distillation

Stay up to date with the latest Underline news!

PRESENTATIONS

CONFERENCES

COMPANY

RESOURCES

.css-70qvj9{display:-webkit-box;display:-webkit-flex;display:-ms-flexbox;display:flex;-webkit-align-items:center;-webkit-box-align:center;-ms-flex-align:center;align-items:center;}Downloads

Next from ACL 2024

Learning to Maximize Mutual Information for Chain-of-Thought Distillation

Stay up to date with the latest Underline news!

PRESENTATIONS

CONFERENCES

COMPANY

RESOURCES

Downloads