ENFOR-SA: End-to-end Cross-layer Transient Fault Injector for Efficient and Accurate DNN Reliability Assessment on Systolic Arrays

arXiv:2602.00909v1 Announce Type: new
Abstract: Recent advances in deep learning have produced highly accurate but increasingly large and complex DNNs, making traditional fault-injection techniques impractical. Accurate fault analysis requires RTL-accurate hardware models. However, this significantly slows evaluation compared with software-only approaches, particularly when combined with expensive HDL instrumentation. In this work, we show that such high-overhead methods are unnecessary for systolic array (SA) architectures and propose ENFOR-SA, an end-to-end framework for DNN transient fault analysis on SAs. Our two-step approach employs cross-layer simulation and uses RTL SA components only during fault injection, with the rest executed at the software level. Experiments on CNNs and Vision Transformers demonstrate that ENFOR-SA achieves RTL-accurate fault injection with only 6% average slowdown compared to software-based injection, while delivering at least two orders of magnitude speedup (average $569times$) over full-SoC RTL simulation and a $2.03times$ improvement over a state-of-the-art cross-layer RTL injection tool. ENFOR-SA code is publicly available at https://github.com/rafaabt/ENFOR-SA.

Liked Liked