Conversational agents have typically been developed for either task-oriented dialogue (TOD) or open-ended chitchat, with limited success in integrating the two. Yet real-world conversations often move fluidly between these modes. To address this, we introduce TACT (TOD-And-Chitchat Transition), a dataset for transition-aware dialogue modeling that features structurally diverse and integrated mode flows. TACT supports both user- and agent-driven mode switches, enabling robust modeling of complex dialogue dynamics. To evaluate an agent’s ability to initiate and recover from mode transitions, we propose two new performance metrics, Switch and Recovery. Models trained on TACT outperform baselines in both intent detection and mode transition handling. Moreover, applying Direct Preference Optimization (DPO) to TACT-trained models yields additional gains, achieving 75.74% joint mode-intent accuracy and a 40.86% win rate against GPT-4o in human evaluation. These results show that pairing structurally diverse data with DPO improves response quality and transition control, facilitating the development of proactive agents.
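
As an illustration of what a joint mode-intent accuracy could measure, the sketch below counts a turn as correct only when both the predicted dialogue mode and the predicted intent match the reference. This definition, the `(mode, intent)` label format, the function name `joint_mode_intent_accuracy`, and the example labels are all assumptions made for illustration; the abstract does not specify how the metric is computed.

```python
from typing import List, Tuple

# Hypothetical per-turn label: (dialogue mode, intent). Not taken from TACT.
Turn = Tuple[str, str]


def joint_mode_intent_accuracy(predictions: List[Turn], references: List[Turn]) -> float:
    """Fraction of turns where BOTH mode and intent match the reference.

    Assumed definition for illustration only.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # A turn is correct only if the full (mode, intent) pair is identical.
    correct = sum(1 for pred, ref in zip(predictions, references) if pred == ref)
    return correct / len(references)


# Made-up toy labels, not real data.
references = [
    ("tod", "book_restaurant"),
    ("chitchat", "none"),
    ("tod", "find_hotel"),
    ("chitchat", "none"),
]
predictions = [
    ("tod", "book_restaurant"),
    ("tod", "none"),  # intent matches, mode does not: counted as wrong
    ("tod", "find_hotel"),
    ("chitchat", "none"),
]

score = joint_mode_intent_accuracy(predictions, references)
print(f"joint mode-intent accuracy: {score:.2%}")
```

Under this assumed definition, a joint score is never higher than the mode accuracy or the intent accuracy taken alone, because a single mismatch in either label makes the whole turn incorrect.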