Vision Language Models (VLMs) have demonstrated strong performance in multimodal understanding, offering promise for the circuit-to-netlist translation task. However, the diverse component symbols and complex connections in circuit images challenge VLMs' ability to understand physical layouts and reason about electrical connection logic. To address these challenges, we propose Circuit-Think, the first multimodal reasoning framework for automated circuit-to-netlist translation, which employs a Trajectory-Guided Reinforcement Learning (TGRL) paradigm for structured logical reasoning on circuit images. Circuit-Think initializes reasoning capabilities through supervised fine-tuning (SFT) on image-netlist pairs, then optimizes reasoning trajectories and netlist generation decisions using TGRL. First, TGRL introduces a step-by-step reasoning paradigm that guides the model with stepwise reward functions to simulate the human cognitive trajectory of "identifying ports, recognizing devices, and inferring connections". Second, we customize a multi-level reward that maps reasoning traces and answers into graph structures and node sets, jointly optimizing logical consistency and netlist accuracy via graph similarity and set matching. Third, TGRL contains a reflective learning mechanism for low-scoring samples, which corrects the reasoning trajectory using reference answers as hints, avoiding the local optima caused by sparse reward signals or erroneous reasoning paths. Moreover, we construct a circuit image-netlist reasoning dataset of 3,100 samples with step-by-step annotations for converting circuit images to netlists. Extensive experiments demonstrate that Circuit-Think achieves state-of-the-art (SOTA) netlist accuracy and significantly improves accuracy on downstream tasks. Our circuit image-netlist reasoning dataset is open-source.
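The multi-level reward described above can be illustrated with a minimal sketch: a predicted and a reference netlist are each mapped to a device (node) set and a device-to-net connection (edge) set, then scored by set matching on devices plus graph similarity on connections. The netlist format, parsing rules, Jaccard scoring, and equal weighting here are all illustrative assumptions, not the paper's actual reward implementation.

```python
# Hypothetical multi-level reward sketch: combines node-set matching
# (devices) with edge-overlap graph similarity (connections).
# Netlist syntax assumed: "<device> <net> <net> ... <type>", e.g.
# "M1 out in gnd gnd NMOS". This is an assumption for illustration.

def parse_netlist(lines):
    """Map netlist lines to a node set (devices) and an edge set
    (device-net incidence pairs)."""
    nodes, edges = set(), set()
    for line in lines:
        parts = line.split()
        device, nets = parts[0], parts[1:-1]  # last token = device type
        nodes.add(device)
        for net in nets:
            edges.add((device, net))
    return nodes, edges

def jaccard(a, b):
    """Set-matching score: |intersection| / |union| (1.0 if both empty)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def multilevel_reward(pred_lines, ref_lines, w_node=0.5, w_edge=0.5):
    """Weighted sum of device-set matching and connection-graph similarity."""
    pred_nodes, pred_edges = parse_netlist(pred_lines)
    ref_nodes, ref_edges = parse_netlist(ref_lines)
    return w_node * jaccard(pred_nodes, ref_nodes) + \
           w_edge * jaccard(pred_edges, ref_edges)
```

A perfect prediction scores 1.0, while a netlist with the right devices but a mislabeled net receives partial credit through the edge term, giving the policy a denser signal than exact-match rewards.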
