To facilitate the large-scale deployment of autonomous driving in real-world scenarios, developing low-cost and high-performance 3D object detection systems has become a critical technical challenge. Although high-beam LiDARs provide denser point cloud data, their prohibitive hardware cost and high power consumption limit their practicality. In contrast, low-beam LiDARs offer advantages in affordability and energy efficiency, but often suffer from inadequate perception accuracy due to their sparser point cloud data. This paper focuses on the task of multimodal 3D object detection with low-beam LiDARs, and proposes a novel approach that integrates temporal and spatial representation learning to enhance detection accuracy under sparser sensor conditions. Specifically, our approach comprises: (1) a Temporal Feature Prediction Learning (TFPL) module, which predicts the current Bird's-Eye-View (BEV) representation from a sequence of historical BEV features; (2) a Spatial Feature Observation Learning (SFOL) module, which aligns BEV features from high-beam and low-beam LiDAR to encourage the low-beam features to approximate high-beam representations; (3) an Uncertainty-Aware Fusion (UAF) strategy, which performs feature-wise weighting between the predicted and observed BEV features by leveraging channel-wise variances, effectively mitigating perturbations in the learned BEV representations. Extensive experiments on the KITTI and nuScenes 3D object detection datasets demonstrate that the proposed approach significantly improves detection performance under low-beam LiDAR configurations.
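The abstract describes UAF as weighting predicted against observed BEV features using channel-wise variances. A minimal sketch of that idea follows; the function name, the use of spatial variance as the uncertainty estimate, and the inverse-variance weighting rule are all assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def uncertainty_aware_fusion(pred_bev, obs_bev, eps=1e-6):
    """Hypothetical sketch of an uncertainty-aware fusion step:
    combine predicted and observed BEV features (shape (C, H, W))
    with per-channel inverse-variance weights, so the feature map
    with lower channel-wise variance contributes more.
    """
    # Channel-wise variance over spatial dims, shape (C, 1, 1).
    var_pred = pred_bev.var(axis=(1, 2), keepdims=True)
    var_obs = obs_bev.var(axis=(1, 2), keepdims=True)
    # Inverse-variance weights; eps guards against division by zero.
    w_pred = 1.0 / (var_pred + eps)
    w_obs = 1.0 / (var_obs + eps)
    # Normalized convex combination per channel.
    return (w_pred * pred_bev + w_obs * obs_bev) / (w_pred + w_obs)
```

Because the weights are positive and sum to one per channel, the fused features are a convex combination of the two inputs, which is one simple way to damp perturbations in either branch.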