Towards Safety-Compliant Transformer Architectures for Automotive Systems
arXiv:2601.18850v1 Announce Type: new
Abstract: Transformer-based architectures have shown remarkable performance in vision and language tasks but pose unique challenges for safety-critical applications. This paper presents a conceptual framework for integrating Transformers into automotive systems from a safety perspective. We outline how multimodal Foundation Models can leverage sensor diversity and redundancy to improve fault tolerance and robustness. Our proposed architecture combines multiple independent modality-specific encoders whose representations are fused into a shared latent space, supporting fail-operational behavior if one modality degrades. We illustrate how different input modalities can be fused to maintain consistent scene understanding. By structurally embedding redundancy and diversity at the representational level, this approach bridges the gap between modern deep learning and established functional safety practices, paving the way for certifiable AI systems in autonomous driving.
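A minimal sketch of the fusion scheme the abstract describes, assuming a PyTorch implementation: each modality gets its own independent encoder, the latents of healthy modalities are pooled into the shared scene representation, and a degraded modality is masked out so the system remains operational. All class names, dimensions, and the mean-pooling fusion rule are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Independent encoder for one sensor modality (e.g. camera, lidar, radar)."""

    def __init__(self, input_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, latent_dim),
            nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class FailOperationalFusion(nn.Module):
    """Fuses per-modality latents into a shared space; a modality flagged as
    degraded is excluded, so the fused representation degrades gracefully
    instead of failing outright (illustrative assumption, not the paper's code)."""

    def __init__(self, input_dims: dict[str, int], latent_dim: int):
        super().__init__()
        self.encoders = nn.ModuleDict(
            {name: ModalityEncoder(dim, latent_dim) for name, dim in input_dims.items()}
        )

    def forward(self, inputs: dict[str, torch.Tensor],
                healthy: dict[str, bool]) -> torch.Tensor:
        # Encode only the modalities currently reported as healthy.
        latents = [
            self.encoders[name](x)
            for name, x in inputs.items()
            if healthy.get(name, False)
        ]
        if not latents:
            raise RuntimeError("all modalities degraded: fail-safe state required")
        # Mean-pool the healthy latents into the shared scene representation.
        return torch.stack(latents, dim=0).mean(dim=0)


# Usage: radar is flagged as degraded; fusion proceeds on camera + lidar alone.
model = FailOperationalFusion({"camera": 512, "lidar": 256, "radar": 64}, latent_dim=128)
batch = {
    "camera": torch.randn(4, 512),
    "lidar": torch.randn(4, 256),
    "radar": torch.randn(4, 64),
}
scene = model(batch, healthy={"camera": True, "lidar": True, "radar": False})
print(scene.shape)  # torch.Size([4, 128])
```

Because fusion here is a symmetric pooling over whichever encoders remain healthy, dropping a sensor requires no retraining or architectural change; more elaborate schemes (e.g. attention-weighted fusion or learned degradation gates) would slot in at the same point.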