VIDEO DOI: https://doi.org/10.48448/advj-5h69

poster

ACL 2024

August 22, 2024

Bangkok, Thailand

Impacts of Misspelled Queries on Translation and Product Search

keywords: byte-pair encoding, spelling correction, e-commerce, data augmentation, robustness, machine translation

Machine translation is used in e-commerce to translate second-language queries into the primary language of the store, to be matched by the search system against the product catalog. However, many queries contain spelling mistakes. We first present an analysis of the spelling-robustness of a population of MT systems, quantifying how spelling variations affect MT output, the list of returned products, and ultimately user behavior. We then present two sets of practical experiments illustrating how spelling-robustness may be specifically improved. For MT, reducing the number of BPE operations significantly improves spelling-robustness in six language pairs. In end-to-end e-commerce, the inclusion of a dedicated spelling correction model, and the augmentation of that model's training data with language-relevant phenomena, each improve robustness and consistency of search results.
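As a rough illustration of the kind of spelling-robustness probe the abstract describes, the sketch below injects synthetic character-level misspellings into queries and measures how often a system's output stays the same. It is not the authors' method: the function names `misspell` and `consistency`, the three edit types, and the exact-match criterion are all assumptions made for illustration, and `system` stands in for any query-to-output callable (an MT model, a search backend, or a spelling corrector followed by either).

```python
import random


def misspell(query: str, rng: random.Random) -> str:
    """Apply one random character-level edit to a query (hypothetical noise model)."""
    if len(query) < 2:
        return query  # too short to perturb meaningfully
    i = rng.randrange(len(query) - 1)
    op = rng.choice(["swap", "delete", "duplicate"])
    if op == "swap":
        # Transpose the characters at positions i and i + 1.
        return query[:i] + query[i + 1] + query[i] + query[i + 2:]
    if op == "delete":
        # Drop the character at position i.
        return query[:i] + query[i + 1:]
    # Duplicate the character at position i.
    return query[:i] + query[i] + query[i:]


def consistency(system, queries, n_variants=5, seed=0):
    """Fraction of misspelled variants whose output exactly matches the clean-query output."""
    rng = random.Random(seed)
    same = 0
    total = 0
    for q in queries:
        clean_output = system(q)
        for _ in range(n_variants):
            same += system(misspell(q, rng)) == clean_output
            total += 1
    return same / total if total else 1.0
```

A score near 1.0 under this assumed metric would suggest that misspellings rarely change the output. Comparing scores for different configurations of the same pipeline (for example, with and without a spelling-correction step) is one way such a probe could be used.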
