Timothy Baldwin

Professor @ MBZUAI

dataset

indonesian

unsupervised

bert

benchmark

lexical substitution

word embedding

question answering

summarization

domain-specific

multi-document summarization

large language models

commonsense reasoning

hallucinations

generation

16 presentations

33 views

SHORT BIO

Prior to joining MBZUAI, Baldwin spent 17 years at the University of Melbourne, including roles as Melbourne Laureate Professor, Director of the ARC Training Centre in Cognitive Computing for Medical Technologies (in partnership with IBM), Associate Dean Research Training in the Melbourne School of Engineering, and Deputy Head of the Department of Computing and Information Systems.

He has previously held visiting positions at Cambridge University, University of Washington, University of Tokyo, Saarland University, NTT Communication Science Laboratories, and National Institute of Informatics.

Prior to joining the University of Melbourne in 2004, he was a senior research engineer at the Center for the Study of Language and Information, Stanford University (2001-2004).

Baldwin served as president of the Association for Computational Linguistics (ACL) in 2022.

Presentations

Demystifying Instruction Mixing for Fine-tuning Large Language Models

Renxi Wang and 7 other authors

LM-Polygraph: Uncertainty Estimation for Language Models

Ekaterina Fadeeva and 12 other authors

Terminology in CL

Timothy Baldwin and 3 other authors

Noisy Label Regularisation for Textual Regression

Yuxia Wang and 2 other authors

Unsupervised Lexical Substitution with Decontextualised Embeddings

Takashi Wada and 3 other authors

LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization

Fajri Koto and 2 other authors

MultiSpanQA: A Dataset for Multi-Span Question Answering

Haonan Li and 3 other authors

CULG: Commercial Universal Language Generation

Haonan Li and 6 other authors

Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian

Fajri Koto and 2 other authors

What does it take to bake a cake? The RecipeRef corpus and anaphora resolution in procedural text

Biaoyan Fang and 2 other authors

One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia

Alham Fikri Aji and 10 other authors

The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature

Yulia Otmakhova and 3 other authors

KFCNet: Knowledge Filtering and Contrastive Learning for Generative Commonsense Reasoning

Haonan Li and 5 other authors

IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization

Fajri Koto and 2 other authors

Target Word Masking for Location Metonymy Resolution

Haonan Li and 3 other authors

IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP

Fajri Koto and 3 other authors
