Prompt tuning for Large Language Models (LLMs) is vulnerable to backdoor attacks. Mainstream methods implant backdoors through prompt tuning on rich training data. However, in real-world data-limited scenarios, these methods struggle to capture precise backdoor patterns, leading to weakened attack capabilities and significant side effects on the LLMs. To alleviate this problem, we propose an enhanced backdoor attack method based on contrastive-enhanced machine unlearning for data-limited scenarios, called BCU. Specifically, BCU introduces a multi-objective machine unlearning method that captures precise backdoor patterns by forgetting the association between non-trigger data and the backdoor patterns, thereby reducing side effects. Moreover, we design a contrastive learning strategy that strengthens the model's ability to capture backdoor patterns, enabling powerful backdoor attacks even in data-limited scenarios. Experimental results on 6 NLP datasets and 4 LLMs show that BCU exhibits strong backdoor attack capabilities with minimal side effects, whether the training data is rich or limited. Our findings highlight practical security risks of backdoor attacks against LLMs and motivate further research on defenses.
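To make the two ingredients concrete, the sketch below shows a generic multi-objective training loss of the kind the abstract describes: a task term on triggered data, a forgetting (unlearning) term that penalizes associating non-trigger data with the backdoor, and an InfoNCE-style contrastive term that pulls triggered representations together. All function names, weights, and the toy cosine-similarity setup are illustrative assumptions, not BCU's actual formulation.

```python
import math


def cosine(a, b):
    # Cosine similarity between two embedding vectors (plain lists).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def contrastive_loss(anchor, positives, negatives, tau=0.5):
    # InfoNCE-style loss: low when the anchor (a triggered sample's
    # embedding) is close to positives (other triggered samples) and
    # far from negatives (clean samples).
    pos = sum(math.exp(cosine(anchor, p) / tau) for p in positives)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))


def multi_objective_loss(task_loss, forget_loss, ctr_loss,
                         lam=1.0, mu=1.0):
    # Hypothetical combined objective: minimize the backdoor task loss,
    # *maximize* the loss tying non-trigger data to the backdoor
    # (machine unlearning via a negated term), and minimize the
    # contrastive term. lam and mu are illustrative trade-off weights.
    return task_loss - lam * forget_loss + mu * ctr_loss


# Toy 2-D embeddings: the contrastive term rewards triggered samples
# clustering together, away from clean samples.
aligned = contrastive_loss([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])
misaligned = contrastive_loss([1.0, 0.0], [[0.0, 1.0]], [[1.0, 0.0]])
print(aligned < misaligned)  # aligned positives give a lower loss
```

The negated `forget_loss` term is what realizes unlearning here: gradient descent on the combined objective performs gradient ascent on the non-trigger/backdoor association, erasing it while the other two terms preserve the intended trigger behavior.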