MEBM-Speech: Multi-scale Enhanced BrainMagic for Robust MEG Speech Detection
arXiv:2603.02255v1 Announce Type: new
Abstract: We propose MEBM-Speech, a multi-scale enhanced neural decoder for speech activity detection from non-invasive magnetoencephalography (MEG) signals. Built upon the BrainMagic backbone, MEBM-Speech integrates three complementary temporal modeling mechanisms: a multi-scale convolutional module for short-term pattern extraction, a bidirectional LSTM (BiLSTM) for long-range context modeling, and a depthwise separable convolutional layer for efficient cross-scale feature fusion. A lightweight temporal jittering strategy and average pooling further improve onset robustness and boundary stability. The model performs continuous probabilistic decoding of MEG signals, enabling fine-grained detection of speech versus silence states – an ability crucial for both cognitive neuroscience and clinical applications. Comprehensive evaluations on the LibriBrain Competition 2025 Track1 benchmark demonstrate strong performance, achieving an average F1 macro of 89.3% on the validation set and comparable results on the official test leaderboard. These findings highlight the effectiveness of multi-scale temporal representation learning for robust MEG-based speech decoding.