Learning Centre Partitions from Summaries
arXiv:2509.16337v2 Announce Type: replace-cross
Abstract: Multi-centre studies increasingly rely on distributed inference, where sites share only centre-level summaries. Homogeneity of parameters across centres is often violated, motivating methods that both emph{test} for equality and emph{learn} centre groupings before estimation. We develop multivariate Cochran-type tests that operate on summary statistics and embed them in a sequential, test-driven emph{Clusters-of-Centres (CoC)} algorithm that merges centres (or blocks) only when equality is not rejected. We derive the asymptotic $chi^2$-mixture distributions of the test statistics and provide plug-in estimators for implementation. To improve finite-sample integration, we introduce a multi-round bootstrap CoC that re-evaluates merges across independently resampled summary sets; under mild regularity and a separation condition, we prove a emph{golden-partition recovery} result: as the number of rounds grows with $n$, the true partition is recovered with probability tending to one. We also give simple numerical guidelines, including a plateau-based stopping rule, to make the multi-round procedure reproducible. Simulations and a real-data analysis of U.S. airline on-time performance (2007) show accurate heterogeneity detection and partitions that change little with the choice of resampling scheme.