VIDEO DOI: https://doi.org/10.48448/ebjb-8q83

poster

ACL 2024

August 13, 2024

Bangkok, Thailand

Bypassing LLM Watermarks with Color-Aware Substitutions

keywords: evasion, LLM, watermarking

Watermarking approaches have been proposed to identify whether circulated text is human-written or generated by a large language model (LLM). The state-of-the-art watermarking strategy of Kirchenbauer et al. (2023a) biases the LLM to generate specific (“green”) tokens. However, determining the robustness of this watermarking method under finite (low) edit budgets is an open problem. Additionally, existing attack methods fail to evade detection for longer text segments. We overcome these limitations and propose Self Color Testing-based Substitution (SCTS), the first “color-aware” attack. SCTS obtains color information by strategically prompting the watermarked LLM and comparing output token frequencies. It uses this information to determine token colors and substitutes green tokens with non-green ones. In our experiments, SCTS successfully evades watermark detection using fewer edits than related work. Additionally, we show both theoretically and empirically that SCTS can remove the watermark for arbitrarily long watermarked text.
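
To make the idea of a color-aware substitution concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes integer token ids, a hash-based `is_green` function standing in for the watermark's secret partition, and a green fraction `GAMMA` of 0.5. Most importantly, it assumes the attacker can query token colors directly, whereas SCTS has to infer them by prompting the watermarked LLM and comparing token frequencies. All names (`is_green`, `z_score`, `toy_watermarked_text`, `color_aware_substitute`) are hypothetical. Detection uses the standard one-proportion z-test on the green-token count.

```python
import hashlib
import math

GAMMA = 0.5        # assumed fraction of the vocabulary that is "green"
VOCAB_SIZE = 1000  # toy vocabulary of integer token ids (assumption)


def is_green(prev_token: int, token: int, gamma: float = GAMMA) -> bool:
    """Toy color oracle: color depends on the previous token and the current one.

    Stand-in for the watermark's keyed hash. SCTS does NOT have this oracle;
    it estimates colors by prompting the watermarked model.
    """
    digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def z_score(tokens: list[int], gamma: float = GAMMA) -> float:
    """z = (green_count - gamma * T) / sqrt(T * gamma * (1 - gamma))."""
    scored = len(tokens) - 1  # the first token has no predecessor to seed its color
    green = sum(is_green(p, t, gamma) for p, t in zip(tokens, tokens[1:]))
    return (green - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


def toy_watermarked_text(length: int, first_token: int = 0) -> list[int]:
    """Build an extreme toy 'watermarked' sequence in which every token is green."""
    tokens = [first_token]
    while len(tokens) < length:
        prev = tokens[-1]
        tokens.append(next(c for c in range(VOCAB_SIZE) if is_green(prev, c)))
    return tokens


def color_aware_substitute(tokens: list[int], budget: int) -> tuple[list[int], int]:
    """Replace green tokens with non-green ones, left to right, within an edit budget.

    Each replacement changes the context of the next token, so the next token's
    color is re-checked against the edited predecessor before deciding to edit it.
    """
    out = list(tokens)
    edits = 0
    for i in range(1, len(out)):
        if edits >= budget:
            break
        if is_green(out[i - 1], out[i]):
            out[i] = next(c for c in range(VOCAB_SIZE) if not is_green(out[i - 1], c))
            edits += 1
    return out, edits


if __name__ == "__main__":
    text = toy_watermarked_text(200)
    attacked, used = color_aware_substitute(text, budget=len(text))
    print(f"z before: {z_score(text):.2f}")
    print(f"z after:  {z_score(attacked):.2f} ({used} edits)")
```

In this toy setting, a detector that flags text whose z-score exceeds a threshold (4 is assumed here) would flag the original sequence but not the substituted one. A real attack must also choose replacements that preserve meaning, which this sketch ignores.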
