CATS: A Tool for Semantically-Aware Cache Performance Analysis of C Programs

The rapid development of Artificial Intelligence (AI) and Machine Learning (ML) poses new challenges for developers of high-performance systems. The performance of such systems is often limited not by computational power but by how efficiently they interact with the memory subsystem. Optimizing cache behavior therefore becomes critically important, yet existing analysis tools fall short of the demands of modern AI applications. They either provide only aggregated statistics or suffer from a “semantic gap”: they report results in terms of machine addresses rather than source code, which makes them ill-suited to analyzing the complex software systems typical of AI. This paper introduces CATS (C Annotated Trace-based Cache Simulator), a novel hybrid method and toolset for detailed cache efficiency analysis designed to overcome these limitations. CATS combines dynamic tracing with static source code analysis to generate semantically annotated memory traces. This approach is particularly relevant for optimizing AI applications, as it allows precise identification of which data structures (e.g., weight matrices, tensors, or input vectors) cause cache misses. For long-running tasks, such as training AI models, our method uses ML-based intelligent trace sampling, significantly reducing analysis time without sacrificing representativeness. The paper describes the methodology and architecture of CATS and presents the results of an experimental evaluation. In the long term, the data collected by CATS can be used to train AI models capable of automatically providing developers with code refactoring recommendations to improve performance. Applying CATS early in development helps identify and resolve cache issues before final implementation, reducing the cost of performance optimization.
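To make the idea of a semantically annotated trace concrete, the following is a minimal sketch, not the CATS implementation. It assumes a direct-mapped 4 KiB cache with 64-byte lines and a synthetic address trace, and it maps each address back to a named region so that misses are reported per data structure rather than per address. All names (`Region`, `Cache`, `record_access`, `replay_demo_trace`, `print_report`) and the base addresses are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LINE_BYTES 64u  /* assumed cache line size */
#define NUM_SETS   64u  /* assumed geometry: 64 sets x 64 B = 4 KiB, direct-mapped */
#define DIM        64u  /* hypothetical matrix dimension */

/* A named address range, standing in for one semantic annotation. */
typedef struct {
    const char   *name;
    uintptr_t     base;
    size_t        size;
    unsigned long accesses;
    unsigned long misses;
} Region;

typedef struct {
    uintptr_t line[NUM_SETS];   /* line number currently held by each set */
    int       valid[NUM_SETS];
} Cache;

/* Returns 1 on a miss (and fills the line), 0 on a hit. */
static int cache_access(Cache *c, uintptr_t addr)
{
    uintptr_t line = addr / LINE_BYTES;
    size_t set = (size_t)(line % NUM_SETS);
    if (c->valid[set] && c->line[set] == line)
        return 0;
    c->valid[set] = 1;
    c->line[set] = line;
    return 1;
}

/* Simulates one access and charges it to the region containing addr. */
static void record_access(Cache *c, Region *regions, size_t n, uintptr_t addr)
{
    int miss = cache_access(c, addr);
    for (size_t i = 0; i < n; i++) {
        Region *r = &regions[i];
        if (addr >= r->base && addr - r->base < r->size) {
            r->accesses++;
            r->misses += (unsigned long)miss;
            return;
        }
    }
}

/* Replays a synthetic trace from a cold cache: one pass over a DIM x DIM
 * float matrix stored row by row (visited row-wise or column-wise),
 * then one sequential pass over a DIM-element float vector. */
static void replay_demo_trace(Region regions[2], int column_major)
{
    assert(regions != NULL);
    Cache cache;
    memset(&cache, 0, sizeof cache);
    /* Synthetic base addresses, chosen only for illustration. */
    regions[0] = (Region){ "weights", 0x10000u, DIM * DIM * sizeof(float), 0, 0 };
    regions[1] = (Region){ "input",   0x20000u, DIM * sizeof(float),       0, 0 };

    for (size_t a = 0; a < DIM; a++) {
        for (size_t b = 0; b < DIM; b++) {
            size_t row = column_major ? b : a;
            size_t col = column_major ? a : b;
            record_access(&cache, regions, 2,
                          regions[0].base + (row * DIM + col) * sizeof(float));
        }
    }
    for (size_t j = 0; j < DIM; j++)
        record_access(&cache, regions, 2, regions[1].base + j * sizeof(float));
}

/* Prints one line per region: the per-data-structure view of the misses. */
static void print_report(const Region *regions, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%-8s accesses=%lu misses=%lu\n",
               regions[i].name, regions[i].accesses, regions[i].misses);
}
```

Under these assumptions, calling `replay_demo_trace(r, 1)` followed by `print_report(r, 2)` attributes 4096 misses to `weights` for the column-wise pass, against 256 for the row-wise pass (`column_major = 0`), while `input` incurs 4 misses in both cases. A per-structure breakdown of this kind is what address-only reports leave the developer to reconstruct by hand.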
