Language models perpetuate dialect bias, associating African American English (AAE) with negative traits and outcomes. We propose JustDial, a lightweight finetuning framework that aligns character trait associations between meaning-matched AAE and Standardized American English (SAE) text while preserving general model fluency through a KL-divergence regularization term. Experiments on GPT2-Medium show that JustDial removes any statistically significant correlation between dialect and predicted occupational prestige and reduces conviction and death-sentencing disparities by more than 98.7%, using only 100,000 text examples and one epoch of LoRA finetuning. Although this debiasing comes at a cost to general model performance, adjusting the regularization term in JustDial makes the tradeoff between debiasing and performance navigable. JustDial provides the first proof of concept for mitigating dialect prejudice in language models.
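To make the shape of such an objective concrete, the following is a minimal sketch of a loss that combines a trait-alignment term with a KL-divergence regularizer. It is an illustration under stated assumptions, not the JustDial implementation: the abstract does not give the form of the alignment term, the direction of the KL divergence, or how the two are weighted, so the squared-difference alignment, the KL(base || tuned) direction, the `reg_weight` parameter, and all function names here are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def alignment_loss(trait_logits_aae, trait_logits_sae):
    """Assumed alignment term: squared difference between the trait
    distributions predicted for an AAE text and its meaning-matched SAE text."""
    p_aae = softmax(trait_logits_aae)
    p_sae = softmax(trait_logits_sae)
    return sum((a - s) ** 2 for a, s in zip(p_aae, p_sae))


def combined_loss(trait_logits_aae, trait_logits_sae,
                  tuned_token_logits, base_token_logits, reg_weight):
    """Alignment term plus a weighted KL penalty that keeps the finetuned
    model's next-token distribution close to the base model's.
    reg_weight is a hypothetical knob for the debiasing-performance tradeoff."""
    align = alignment_loss(trait_logits_aae, trait_logits_sae)
    drift = kl_divergence(softmax(base_token_logits), softmax(tuned_token_logits))
    return align + reg_weight * drift
```

In this sketch, a larger `reg_weight` penalizes drift from the base model more heavily, favoring fluency over debiasing; a smaller value does the reverse. In practice the logits would come from the base and LoRA-finetuned models rather than hand-written lists.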
