Caiming Xiong

benchmark

evaluation

question answering

llm

dataset

dialogue

large language models

summarization

generalization

multi-hop question answering

text generation

reasoning

continual learning

conversational agents

fairness

19 presentations

8 views

Presentations

Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding

Lifu Tu and 6 other authors

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains

Simeng Han and 15 other authors

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Philippe Laban and 3 other authors

FOFO: A Benchmark to Evaluate LLMs’ Format-Following Capability

Congying Xia and 7 other authors

ARM: Alignment with Residual Energy-Based Model

Bo Pang and 2 other authors

Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

Kung-Hsiang Huang and 6 other authors

What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

Anthony Tiong and 5 other authors

Fair Abstractive Summarization of Diverse Perspectives

Yusen Zhang and 11 other authors

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning

Lifu Tu and 6 other authors

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

Yixin Liu and 7 other authors

HPE: Answering Complex Questions over Text by Hybrid Question Parsing and Execution

Ye Liu and 6 other authors

SummEdits: Measuring LLM Ability at Factual Reasoning Through The Lens of Summarization

Philippe Laban and 6 other authors

Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems

Lidiya Murakhovs'ka and 4 other authors

What's New? Summarizing Contributions in Scientific Literature

Hiroaki Hayashi and 4 other authors

CASPI: Causal-aware Safe Policy Improvement for Task-oriented Dialogue

Govardana Sachithanandam Ramachandran and 2 other authors

OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval

Tong Niu and 3 other authors
