Recent progress in large language models (LLMs) has given rise to Large Reasoning Models (LRMs) that externalize multi-step, System 2-style reasoning, achieving state-of-the-art results on complex tasks. However, this explicit reasoning introduces notable computational overhead, while traditional LLMs remain efficient but struggle with tasks demanding deep, stepwise thought. In this work, we systematically study the trade-off between efficiency and robustness inherent in System 1 (intuitive, fast) and System 2 (deliberate, explicit) reasoning in modern language models. Through empirical analysis, we show that enforcing concise reasoning on LRMs improves efficiency but can hinder performance, whereas augmenting LLMs with explicit reasoning traces enhances both confidence and accuracy. Motivated by these insights, we propose a curriculum-based distillation framework that incrementally teaches small models to reason, beginning with concise traces and gradually introducing more complex reasoning. Experiments on challenging mathematical benchmarks demonstrate that our approach enables small models to achieve both strong reasoning ability and inference efficiency. Our findings highlight the importance of dynamic, flexible reasoning strategies and staged learning for building practical, adaptable language models.
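The staged curriculum described above can be sketched as a simple data schedule: distillation examples are ranked by reasoning-trace length, and the student model is trained on cumulative stages that start with concise traces and gradually admit longer, more complex ones. This is an illustrative sketch under assumed conventions, not the paper's implementation; the function name, stage count, and word-count proxy for trace complexity are all assumptions.

```python
def curriculum_stages(examples, num_stages=3):
    """Split (question, trace) pairs into cumulative stages of increasing trace length.

    Stage k contains the shortest k/num_stages fraction of the data, so
    later stages add longer, more complex reasoning on top of shorter traces.
    Trace length in words is used here as a crude proxy for reasoning complexity.
    """
    ranked = sorted(examples, key=lambda ex: len(ex["trace"].split()))
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = round(len(ranked) * k / num_stages)
        stages.append(ranked[:cutoff])
    return stages


# Toy distillation data: math questions paired with reasoning traces
# of increasing length (all examples are illustrative).
data = [
    {"question": "2+2?", "trace": "4"},
    {"question": "12*3?", "trace": "12*3 = 36"},
    {"question": "17*24?", "trace": "17*24 = 17*20 + 17*4 = 340 + 68 = 408"},
]

stages = curriculum_stages(data)
for i, stage in enumerate(stages, start=1):
    print(f"stage {i}: {len(stage)} examples")
```

In an actual distillation run, each stage would correspond to a training phase whose loss targets the teacher's traces at that complexity level; the cumulative stages mean earlier, concise-reasoning skills are rehearsed while harder traces are introduced.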