We explore Visual Prompt Injection (VPI) that maliciously exploits the ability of Large Vision-Language Models (LVLMs) to follow instructions drawn onto the input image.
We propose a new VPI method, Goal Hijacking via Visual Prompt Injection (GHVPI), that swaps the execution task of LVLMs from an original task to an alternative task designated by an attacker.
The quantitative analysis indicates that GPT-4V is vulnerable to GHVPI, with a notable attack success rate of 15.8%, which poses a significant security risk.
Our analysis also shows that successful GHVPI requires high character recognition capability and instruction-following ability in LVLMs.

Empirical Analysis of Large Vision Language Models against Goal Hijacking via Visual Prompt Injection

We study the problem of completing various visual document understanding (VDU) tasks, e.g., question answering and information extraction, on real-world documents through human-written instructions. To this end, we propose InstructDoc, the first large-scale collection of 30 publicly available VDU datasets, each with diverse instructions in a unified format, which covers a wide range of 12 tasks and includes open document types/formats. Furthermore, to enhance the generalization performance on VDU tasks, we design a new instruction-based document reading and understanding model, InstructDr, that connects document images, image encoders, and large language models (LLMs) through a trainable bridging module. Experiments demonstrate that InstructDr can effectively adapt to new VDU datasets, tasks, and domains via given instructions and outperforms existing multimodal LLMs and ChatGPT without specific training.

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

Amidst the flourishing era of Large Language Models (LLMs), there has been a surge in proposals for Large Vision-Language Models (LVLMs) that integrate LLMs with vision capabilities. 
However, it has been observed that LVLMs, after visual instruction tuning, often fail to exhibit the instruction-following ability inherent to the LLM before integration, leading to results where they do not follow task instructions as expected. 
This study is the first to quantitatively demonstrate that the instruction-following ability of LVLMs declines after visual instruction tuning. 
We further investigate factors contributing to the deterioration of this instruction-following ability. 
Our evaluation reveals that specifying the output format in instructions during (visual) instruction tuning could significantly impact the instruction-following ability of models.

Instruction-Following Evaluation for Large Vision-Language Models

This poster session includes Industry, Findings and SRW posters from the following area: 
Resources and Evaluation

In-Person Poster Session 5

poster

## Welcome to 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics!
This year, the conference is in Mexico City. NAACL was actually already planned for Mexico City in 2021, but due to the pandemic the entire conference was moved online. This year, finally, we get to go! So it is my sincere pleasure to welcome you to Mexico City, whether in person or virtually. Having the conference in Mexico City is a good opportunity to emphasize that NAACL is our flagship conference for ACL members not only in North America but also in Central and South America, even though NAACL has been bearing “North” in its name. At this year’s conference, we have a theme to match, with a theme track on the Languages of Latin America to showcase the linguistic diversity of the region.
 
The opportunity to present at NAACL should not depend on a researcher’s travel budget, or their family status. This is why it is so important to make virtual participation at NAACL as good an experience as possible – but we also want to provide a good experience for in-person participants. As a community, we are still working out the best way to do that. This year at NAACL, we are trying out a big virtual poster session ahead of the conference, with the hope that this will make for a lively and interactive experience. At the same time, we are reducing virtual oral presentations, which seem to be particularly tricky to make work well. A big thanks to the NAACL program chairs and to Luciana Benotti for all their ideas and work to improve the virtual experience. And participants, virtual as well as in-person: Please let us know what worked for you and what didn’t, so we can continue to improve hybrid conferences. 
 
I have been lucky to work with many amazing people. Without their insight, dedication and patience, and without the many hours of work they put in, NAACL would not have been possible. A huge thank you to the program chairs Helena Gomez, Kevin Duh, and Steve Bethard – you are the best! 
 
Finally, I would like to thank all authors, invited speakers and panelists, area chairs and reviewers, the volunteers organizing and chairing sessions, and all attendees, in-person and virtual. Thank you for helping us make NAACL 2024 come to life. 
 
Welcome and hope you all enjoy the conference! 
Katrin Erk 
The University of Texas at Austin 
NAACL-2024 General Chair 
*You can read the full Welcome message in the [Conference Handbook (downloadable)](https://drive.google.com/file/d/1H1NvW0VASQjkSCYgw3mr-3l4yYDSxz6B/view?usp=sharing)*

To access the event page you need to register [**here.**](http://acl.swoogo.com/naacl2024) 
Your access to the event page is limited based on your registration type. If you registered for workshops only, you will gain access to full workshops content on the day of the workshops program.


NAACL 2024

2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Main Track - Natural Language Processing

technical paper

We are pleased to announce the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24), which will be held in Vancouver, British Columbia at the Vancouver Convention Centre – West Building from 20-27 February, 2024.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-24 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, and exhibit programs, and a range of other activities to be announced.

We expect AAAI-24 to be an in-person conference: one author of each accepted paper will be expected to present the work in person unless exceptional circumstances prevent this.

To access the AAAI-24 event page you need to register [here](https://aaai.org/aaai-conference/registration/).

AAAI 2024

Ryota Tanaka
