Existing methods for jailbreaking a Large Language Model (LLM) have largely focused on disguising a harmful request as benign, either through a single interaction with the LLM (as in single-turn methods) or through multiple interactions (as in multi-turn methods). In this paper, we propose Contextual History for Adaptive and Simple Exploitation (CHASE), a novel method for LLM jailbreaking that extends the success of existing multi-turn methods by showing that the conversational history of an LLM can additionally be exploited to increase the chances of a successful jailbreak. To our knowledge, CHASE represents the first attempt to address LLM jailbreaking by considering both the linguistic aspect of the problem (i.e., how to linguistically disguise a harmful request as benign) and the extra-linguistic aspect (i.e., how to exploit the conversational history of an LLM).
