The GENDER1PERSON test suite is designed to measure gender bias in translating singular first-person forms from English into two Slavic languages, Russian and Serbian. The test suite consists of 1,000 Amazon product reviews, uniformly distributed over 10 different product categories. Bias is measured through a gender score ranging from -100 (all reviews are feminine) to 100 (all reviews are masculine). The test suite shows that the majority of the systems participating in the WMT-2025 task for these two target languages prefer the masculine writer’s gender. No system is biased towards the feminine variant. Furthermore, for each language pair, seven systems are considered balanced, with gender scores between -10 and 10. Finally, the analysis of different products shows that the choice of the writer’s gender depends to a large extent on the product. Moreover, even the systems with overall balanced scores are shown to be biased, but in different ways for different product categories.
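The abstract does not give the exact formula behind the gender score, so the following is only a minimal sketch of one score that matches the stated range: the difference between masculine and feminine counts, normalised to percent. The function names, the handling of reviews with no gendered form (here simply left out of the counts), and the toy counts are all assumptions made for illustration, not the authors' implementation or results.

```python
def gender_score(masculine: int, feminine: int) -> float:
    """Assumed score: -100 if all reviews are feminine, 100 if all are masculine."""
    total = masculine + feminine
    if total == 0:
        raise ValueError("no gendered reviews to score")
    return 100.0 * (masculine - feminine) / total


def is_balanced(score: float, threshold: float = 10.0) -> bool:
    """Balanced range from the abstract: scores between -10 and 10."""
    return -threshold <= score <= threshold


# Toy (masculine, feminine) counts for two hypothetical product categories.
toy_counts = {"category_a": (20, 80), "category_b": (80, 20)}

per_category = {name: gender_score(m, f) for name, (m, f) in toy_counts.items()}
overall = gender_score(
    sum(m for m, _ in toy_counts.values()),
    sum(f for _, f in toy_counts.values()),
)
```

With these toy counts the overall score is 0 and so falls in the balanced range, while the two categories score -60 and 60. This is the pattern the abstract describes: a balanced overall score can hide opposite biases in individual product categories.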
