Dual-Modal Gated Fusion-Based Multi-Modal 3D Object Detection for Nighttime Autonomous Driving Scenarios
Autonomous driving technology is a critical component in the advancement of new energy vehicles and an essential enabler of sustainable development goals at the societal level. However, autonomous driving in nighttime scenarios suffers from unstable and degraded perception under low-light conditions, which significantly constrains the practical performance of existing perception systems. The root cause is that the visual degradation and modal reliability imbalance prevalent in nighttime scenarios destabilize feature fusion within 3D object detection, a key technology in autonomous driving, and consequently undermine detection precision.
In this paper, a BEV-based multi-modal 3D object detection approach is presented for nighttime autonomous driving. The approach incorporates adaptive modeling components tailored to nighttime scenarios while preserving the original BEV representation and detection pipeline. Without modifying the core inference structure, the method improves robustness to low-light conditions and stabilizes cross-modal feature integration, thereby maintaining reliable perception under challenging illumination. Extensive experiments on the nuScenes nighttime subset demonstrate that the proposed method consistently outperforms the BEVFusion baseline while introducing negligible additional model parameters and inference overhead. In particular, an overall NDS improvement of 1.13% is achieved under nighttime conditions, validating the effectiveness and practical applicability of the proposed approach in low-light and complex autonomous driving environments.
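To make the dual-modal gated fusion idea named in the title concrete, the following is a minimal PyTorch sketch of one plausible realization: a per-location sigmoid gate that blends camera and LiDAR BEV feature maps, so the network can down-weight degraded nighttime image features. The module name, layer choices, and tensor shapes are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class DualModalGatedFusion(nn.Module):
    """Illustrative gated fusion of camera and LiDAR BEV features.

    Hypothetical sketch; the module design in the paper may differ.
    Both inputs are assumed to be BEV feature maps of shape (B, C, H, W).
    """

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-location gate from the concatenated modalities.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.out_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, cam_bev: torch.Tensor, lidar_bev: torch.Tensor) -> torch.Tensor:
        # g near 1 favors the camera branch; g near 0 falls back to LiDAR,
        # letting the network suppress unreliable low-light image features.
        g = self.gate(torch.cat([cam_bev, lidar_bev], dim=1))
        fused = g * cam_bev + (1.0 - g) * lidar_bev
        return self.out_conv(fused)

# Usage with toy BEV maps (batch 2, 256 channels, 180x180 grid).
if __name__ == "__main__":
    fusion = DualModalGatedFusion(channels=256)
    cam = torch.randn(2, 256, 180, 180)
    lidar = torch.randn(2, 256, 180, 180)
    print(fusion(cam, lidar).shape)  # torch.Size([2, 256, 180, 180])

Because the gate is purely convolutional and operates on the existing BEV maps, a module of this kind adds few parameters and leaves the downstream detection head untouched, consistent with the abstract's claim of negligible overhead.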