VIDEO DOI: https://doi.org/10.48448/hkwn-xm71

workshop paper

ACL 2024

August 15, 2024

Bangkok, Thailand

Know Thine Enemy: Adaptive Attacks on Misinformation Detection Using Reinforcement Learning

keywords: misinformation, robustness, adversarial examples, reinforcement learning

We present XARELLO, a reinforcement-learning-based generator of adversarial examples for testing the robustness of text classifiers. Our solution is adaptive: it learns from previous successes and failures in order to better adjust to the vulnerabilities of the attacked model. This reflects the behaviour of persistent and experienced attackers, who are common in the misinformation-spreading environment. We evaluate our approach using several victim classifiers and credibility-assessment tasks, showing that it generates better-quality examples with fewer queries and is especially effective against modern LLMs. We also perform a qualitative analysis to understand the language patterns in misinformation text that play a role in the attacks.
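
To make the adaptive idea concrete, here is a minimal sketch of an attacker that carries what it has learned from one attacked text to the next. It is not the XARELLO implementation: the toy keyword-based victim, the substitution table, the epsilon-greedy action choice, and names such as `AdaptiveAttacker` and `victim_score` are all assumptions made for illustration only.

```python
import random

# Hypothetical toy victim, not a real credibility classifier: the score grows
# with the number of "alarm" words, and a text is flagged above 0.6.
ALARM_WORDS = {"shocking", "secret", "miracle"}

def victim_score(tokens):
    count = sum(t in ALARM_WORDS for t in tokens)
    return 1.0 - 0.5 ** count

def is_flagged(tokens):
    return victim_score(tokens) > 0.6

# Hypothetical attack actions: replace every occurrence of a word.
SUBSTITUTIONS = {
    "shocking": "surprising",
    "secret": "private",
    "miracle": "remarkable",
    "doctors": "physicians",
    "cure": "remedy",
    "new": "recent",
}

class AdaptiveAttacker:
    """Keeps one value estimate per action and reuses it across attacks."""

    def __init__(self, epsilon=0.2, alpha=0.5, seed=0):
        self.q = {word: 0.0 for word in SUBSTITUTIONS}
        self.epsilon = epsilon  # probability of trying a random action
        self.alpha = alpha      # learning rate of the value update
        self.rng = random.Random(seed)

    def attack(self, tokens, max_queries=10):
        tokens = list(tokens)
        score = victim_score(tokens)
        queries = 0  # counts modified texts sent to the victim
        while is_flagged(tokens) and queries < max_queries:
            options = sorted(w for w in set(tokens) if w in SUBSTITUTIONS)
            if not options:
                break
            if self.rng.random() < self.epsilon:
                word = self.rng.choice(options)      # explore
            else:
                word = max(options, key=self.q.get)  # exploit past experience
            tokens = [SUBSTITUTIONS[word] if t == word else t for t in tokens]
            new_score = victim_score(tokens)
            queries += 1
            # Reward is the drop in the victim's score; failures earn zero.
            reward = score - new_score
            self.q[word] += self.alpha * (reward - self.q[word])
            score = new_score
        return tokens, queries, not is_flagged(tokens)

training_texts = [
    "shocking new miracle cure".split(),
    "doctors hide secret miracle cure".split(),
    "shocking secret doctors new".split(),
]
attacker = AdaptiveAttacker()
for _ in range(20):
    for text in training_texts:
        attacker.attack(text)

# Attack an unseen text greedily, relying only on what was learned before.
attacker.epsilon = 0.0
adv, queries, success = attacker.attack("new doctors cure shocking miracle".split())
print(adv, queries, success)
```

In this toy setting, substitutions that never lower the victim's score keep a value of zero, so the trained attacker goes straight for the words that matter and needs fewer queries than blind trial and error. A real attack would also have to preserve the meaning of the text, which this sketch ignores.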


