
Fan Yin
instruction tuning
robustness
interpretability
large language models
contrastive learning
faithfulness
uncertainty estimation
model interpretation
sensitivity
graph
multi-step reasoning
task selection
post-hoc interpretations
adversarial example detection
amortization
7 presentations
4 views
SHORT BIO
Hi, I am a third-year PhD student in the Computer Science department at the University of California, Los Angeles (UCLA), advised by Prof. Kai-Wei Chang. Previously, I received my B.S. degree in Computer Science from Peking University in 2020, where I worked with Prof. Xiaojun Wan. My research interests are robustness and interpretability for trustworthy NLP. My recent research tries to understand the characteristics of adversarial examples and to associate them with the interpretability and debugging of model behaviors.
Presentations

Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation
Di Wu and 4 other authors

Contrastive Instruction Tuning
Tianyi Yan and 7 other authors

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks
Po-Nien Kung and 4 other authors

Efficient Shapley Values Estimation by Amortization for Text Classification
Chenghao Yang and 5 other authors

Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning
Fan Yin and 5 other authors

ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
Fan Yin

On the Sensitivity and Stability of Model Interpretations in NLP
Fan Yin and 3 other authors