
Caiming Xiong
benchmark
evaluation
question answering
llm
dataset
dialogue
large language models
summarization
generalization
multi-hop question answering
text generation
reasoning
continual learning
conversational agents
fairness
19 presentations
8 views
Presentations

Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding
Lifu Tu and 6 other authors

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains
Simeng Han and 15 other authors

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban and 3 other authors

FOFO: A Benchmark to Evaluate LLMs’ Format-Following Capability
Congying Xia and 7 other authors

ARM: Alignment with Residual Energy-Based Model
Bo Pang and 2 other authors

Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles
Kung-Hsiang Huang and 6 other authors

What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases
Anthony Tiong and 5 other authors

Fair Abstractive Summarization of Diverse Perspectives
Yusen Zhang and 11 other authors

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning
Lifu Tu and 6 other authors

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation
Yixin Liu and 7 other authors

HPE: Answering Complex Questions over Text by Hybrid Question Parsing and Execution
Ye Liu and 6 other authors

SummEdits: Measuring LLM Ability at Factual Reasoning Through The Lens of Summarization
Philippe Laban and 6 other authors

Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems
Lidiya Murakhovs'ka and 4 other authors

What's New? Summarizing Contributions in Scientific Literature
Hiroaki Hayashi and 4 other authors

CASPI: Causal-aware Safe Policy Improvement for Task-oriented Dialogue
Govardana Sachithanandam Ramachandran and 2 other authors

OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval
Tong Niu and 3 other authors