Quantifying the Impact of Signal Simplification, Data Quantity, and Task Difficulty on Vision Transformer Performance for ECG Rhythm Classification

Vision transformers (ViTs) have demonstrated considerable promise for classifying electrocardiogram (ECG) rhythms. However, much of the existing research is conducted in highly controlled, data-sterile settings that fail to reflect the substantial variability of real-world ECG signals. This paper addresses that gap by examining how signal simplification, data quantity, and task difficulty influence the performance of the SwinV2 ViT model on ECG rhythm classification. Our systematic analysis shows that even heavy signal abstraction has only a limited impact on performance, with all models achieving over 95% accuracy, whereas the amount of training data plays a crucial role: there is a nearly 15% accuracy gap between the models trained on the most and the least data. Finally, our analysis shows that the model adapts effectively to an increased class count, which is essential given the varied nature of ECG diagnosis. In summary, these results highlight the importance of carefully balancing signal clarity, dataset size, and diagnostic variety when designing ECG classification systems. Striking this balance is crucial for building reliable, scalable AI solutions for cardiac assessment in real-world clinical settings.
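As a minimal sketch of the kind of setup the abstract describes, the snippet below instantiates a SwinV2 image classifier with a configurable number of rhythm classes, assuming the Hugging Face `transformers` implementation of SwinV2. The hyperparameters (reduced depths, embedding dimension) and the five-class count are illustrative assumptions for a quick demonstration, not the paper's actual configuration.

```python
import torch
from transformers import Swinv2Config, Swinv2ForImageClassification

# Illustrative, deliberately small SwinV2 configuration; the class count
# (num_labels) is the knob corresponding to "task difficulty" in the abstract.
config = Swinv2Config(
    image_size=256,
    patch_size=4,
    embed_dim=32,          # small embedding dim for a fast sketch
    depths=[1, 1],         # two shallow stages (illustrative, not the paper's)
    num_heads=[2, 4],
    window_size=8,
    num_labels=5,          # e.g. 5 rhythm classes; increase for harder tasks
)
model = Swinv2ForImageClassification(config)
model.eval()

# One rendered ECG image as a 3-channel tensor (random stand-in data).
pixel_values = torch.randn(1, 3, 256, 256)
with torch.no_grad():
    logits = model(pixel_values).logits
print(logits.shape)  # torch.Size([1, 5]) — one score per rhythm class
```

Changing `num_labels` is all that is needed to move between binary and multi-class rhythm classification with the same backbone.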
