Prompt tuning has shown promise for continual visual question answering (CVQA), enabling modular and transferable knowledge across tasks. However, existing approaches often overlook the guiding role of prompts in the model’s implicit reasoning process. This oversight can lead to inconsistent reasoning paths and performance degradation across tasks. To address this issue, we propose the E Logic Prompt framework, which employs energy-based models (EBMs) to model the semantic compatibility between prompts and queries. In this framework, prompts function not only as adapters but also as reasoning guides that help maintain coherence throughout the inference process. The framework enforces logical consistency at three levels. At the input level, it selects semantically aligned prompts by minimizing the energy between queries and prompts. Within the model, it aligns intermediate representations with prompts across layers to preserve step-by-step reasoning. Across tasks, it applies energy-based constraints to regulate prompt behavior, effectively suppressing semantic drift and enabling prompt reuse. Together, these three levels of consistency enhance the guiding capacity of prompts, allowing them to steer the model toward more stable and coherent reasoning. Extensive experiments show that E Logic Prompt outperforms existing methods in both accuracy and knowledge retention, while effectively maintaining balanced cross-modal reasoning throughout continual learning.
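The input-level selection step can be illustrated with a toy sketch. This is not the paper's implementation: the energy function, embedding dimensionality, and function names below are all illustrative assumptions. Here the energy is taken to be negative cosine similarity, so that semantically aligned query–prompt pairs receive low energy, and the prompt minimizing that energy is selected from a pool.

```python
import numpy as np

def energy(query: np.ndarray, prompt: np.ndarray) -> float:
    """Toy energy function: negative cosine similarity, so aligned
    query-prompt pairs get LOW energy (assumed form, not the paper's EBM)."""
    q = query / (np.linalg.norm(query) + 1e-8)
    p = prompt / (np.linalg.norm(prompt) + 1e-8)
    return -float(q @ p)

def select_prompt(query: np.ndarray, prompt_pool: list) -> int:
    """Input-level consistency: return the index of the prompt in the
    pool that minimizes the energy with the given query embedding."""
    energies = [energy(query, p) for p in prompt_pool]
    return int(np.argmin(energies))

# Usage: the pool entry most directionally aligned with the query wins.
query = np.array([1.0, 0.0])
pool = [np.array([0.0, 1.0]),   # orthogonal to the query
        np.array([1.0, 0.1]),   # nearly aligned with the query
        np.array([-1.0, 0.0])]  # opposed to the query
print(select_prompt(query, pool))  # prints 1, the nearly aligned prompt
```

In a full system, a learned energy head (e.g. an MLP over joint query–prompt features) would replace the cosine form, but the selection-by-minimization pattern is the same.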
