How Developers Adopt, Use, and Evolve CI/CD Caching: An Empirical Study on GitHub Actions

arXiv:2604.13129v1 Announce Type: new
Abstract: Continuous Integration/Continuous Delivery (CI/CD) caching is widely used to reduce repeated computation and improve CI/CD efficiency, yet maintaining effective caching requires ongoing maintenance effort. In this paper, we present the first empirical study on how developers configure and evolve caching in CI/CD workflows on GitHub Actions. We analyze 952 GitHub repositories (266 cache adopters and 686 non-adopters), to compare repository characteristics, characterize caching usage at the job and step levels, uncover patterns in caching configuration evolution, and identify the drivers of cache-related changes. Our analysis spans 1,556 workflow files, 10,373 commits, and 17,185 workflow configuration changes, including an average of 9.37 cache-related changes per repository. Our main observations are: (1) cache-adopting repositories are more active and popular than non-adopters; (2) caching is used across multiple CI/CD job types through a variety of caching mechanisms rather than a single standardized approach; (3) caching configurations evolve through frequent, repetitive maintenance patterns, with rapid updates in build and test jobs and slower evolution in other job types; and (4) cache-related modifications are driven by distinct maintenance needs: parameter updates are mainly human-driven to fix issues, while version updates occur later and are often bot-driven for dependency maintenance. Our findings quantify the substantial maintenance effort involved in CI/CD caching and highlight opportunities to improve reliability and tool support.

Liked Liked