
Timothy Baldwin
Professor @ MBZUAI
Keywords: dataset, indonesian, unsupervised, bert, benchmark, lexical substitution, word embedding, question answering, summarization, domain-specific, multi-document summarization, large language models, commonsense reasoning, hallucinations, generation
16 presentations
33 views
SHORT BIO
Prior to joining MBZUAI, Baldwin spent 17 years at the University of Melbourne, including roles as Melbourne Laureate Professor, Director of the ARC Training Centre in Cognitive Computing for Medical Technologies (in partnership with IBM), Associate Dean Research Training in the Melbourne School of Engineering, and Deputy Head of the Department of Computing and Information Systems.
He has previously held visiting positions at Cambridge University, the University of Washington, the University of Tokyo, Saarland University, NTT Communication Science Laboratories, and the National Institute of Informatics.
Prior to joining the University of Melbourne in 2004, he was a senior research engineer at the Center for the Study of Language and Information, Stanford University (2001-2004).
Baldwin is president of the Association for Computational Linguistics (ACL) for 2022.
Presentations

Demystifying Instruction Mixing for Fine-tuning Large Language Models
Renxi Wang and 7 other authors

LM-Polygraph: Uncertainty Estimation for Language Models
Ekaterina Fadeeva and 12 other authors

Terminology in CL
Timothy Baldwin and 3 other authors

Noisy Label Regularisation for Textual Regression
Yuxia Wang and 2 other authors

Unsupervised Lexical Substitution with Decontextualised Embeddings
Takashi Wada and 3 other authors

LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization
Fajri Koto and 2 other authors

MultiSpanQA: A Dataset for Multi-Span Question Answering
Haonan Li and 3 other authors

CULG: Commercial Universal Language Generation
Haonan Li and 6 other authors

Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian
Fajri Koto and 2 other authors

What does it take to bake a cake? The RecipeRef corpus and anaphora resolution in procedural text
Biaoyan Fang and 2 other authors

One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji and 10 other authors

The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature
Yulia Otmakhova and 3 other authors

KFCNet: Knowledge Filtering and Contrastive Learning for Generative Commonsense Reasoning
Haonan Li and 5 other authors

IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization
Fajri Koto and 2 other authors

Target Word Masking for Location Metonymy Resolution
Haonan Li and 3 other authors

IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP
Fajri Koto and 3 other authors