
poster

ACL 2024

August 12, 2024

Bangkok, Thailand

Compromesso! Italian Many-Shot Jailbreaks undermine the safety of Large Language Models

keywords:

model vulnerabilities

italian language processing

multilingual model safety

As diverse linguistic communities and users adopt Large Language Models (LLMs), assessing their safety across languages becomes critical. Despite ongoing efforts to align these models with safe and ethical guidelines, they can still be induced to behave unsafely through jailbreaking, a technique in which models are prompted to act outside their operational guidelines. Research on these vulnerabilities has so far focused predominantly on English, limiting the understanding of LLM behavior in other languages. We address this gap by investigating Many-Shot Jailbreaking (MSJ) in Italian, underscoring the importance of understanding LLM behavior in different languages. We base our analysis on a newly created Italian dataset to identify unique safety vulnerabilities in four families of open-source LLMs. We find that the models exhibit unsafe behaviors even with minimal exposure to harmful prompts and, more alarmingly, that this tendency rapidly escalates with more demonstrations.
