What Is Redundancy?

Redundancy is a central but persistently ambiguous concept in multivariate information theory. Across the literature, the same term denotes at least three distinct ideas: (i) predictive sufficiency of multiple variables for a task, (ii) conditional irrelevance of some variables given others, and (iii) overlapping information content among inputs that is relevant to an output. These notions are routinely conflated in decompositions of mutual information, leading to incompatible axioms, contradictory interpretations, and apparent paradoxes, particularly when inputs are statistically independent.
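To fix intuitions, here is one possible formalization of the three readings for two inputs X_1, X_2 and a target Y (our notation, not a settled convention; note that (iii) deliberately resists a closed Shannon form):

```latex
\begin{align*}
\text{(i) predictive sufficiency:} \quad & H(Y \mid X_1) = 0 \;\text{and}\; H(Y \mid X_2) = 0, \\
\text{(ii) conditional irrelevance:} \quad & I(Y; X_2 \mid X_1) = 0, \\
\text{(iii) overlapping content:} \quad & \text{a shared component of } I(Y; X_1) \text{ and } I(Y; X_2), \\
& \text{for which no agreed Shannon expression exists.}
\end{align*}
```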
Here, we argue that the difficulty of defining redundancy is not primarily technical, but conceptual: the field has not converged on which problem redundancy is meant to solve. We show that predictive redundancy and conditional irrelevance are task-relative, data-dependent properties that need not correspond to Shannon information quantities, while informational redundancy, if it is to appear as a term in a decomposition of mutual information, must be grounded in mutual information between inputs.
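For reference, these are the bookkeeping constraints that any such informational-redundancy term must satisfy in the standard two-input partial information decomposition (the Williams-Beer form; the labels are conventional, while the measure of Red is exactly what remains contested):

```latex
\begin{align*}
I(Y; X_1, X_2) &= \mathrm{Red} + \mathrm{Unq}_1 + \mathrm{Unq}_2 + \mathrm{Syn}, \\
I(Y; X_1) &= \mathrm{Red} + \mathrm{Unq}_1, \\
I(Y; X_2) &= \mathrm{Red} + \mathrm{Unq}_2.
\end{align*}
```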
Using simple functional examples, biased input ensembles, and connections to information fragmentation analysis, we demonstrate how input correlations induce apparent redundancy in outputs without reflecting overlapping informational content in the underlying function. We conclude by proposing an explicit separation of redundancy concepts and outlining the minimal commitments required for each to be meaningfully operationalized. This separation clarifies why redundancy remains elusive, why no single measure can satisfy all intuitions, and how future work can proceed without redefining information itself.
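As a minimal sketch of the kind of functional example meant here (our own construction, under assumed parameters, not necessarily the paper's): let the target copy x1 while x2 is merely a noisy copy of x1. The function never reads x2, yet x2 carries substantial information about the output purely through the input correlation.

```python
import numpy as np
from collections import Counter

def mi_bits(a, b):
    """Plug-in estimate of I(A;B) in bits from paired samples."""
    n = len(a)
    joint = Counter(zip(a.tolist(), b.tolist()))
    pa = Counter(a.tolist())
    pb = Counter(b.tolist())
    total = 0.0
    for (va, vb), c in joint.items():
        # (c/n) * log2( p(a,b) / (p(a) p(b)) ), with counts cancelled into c*n
        total += (c / n) * np.log2((c * n) / (pa[va] * pb[vb]))
    return total

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.integers(0, 2, size=n)                  # fair coin
x2 = np.where(rng.random(n) < 0.9, x1, 1 - x1)   # noisy copy: agrees with x1 90% of the time
y = x1                                           # the function reads only x1

print(f"I(X1;Y) = {mi_bits(x1, y):.3f} bits")    # ~1.000
print(f"I(X2;Y) = {mi_bits(x2, y):.3f} bits")    # ~0.531 = 1 - H(0.9), from correlation alone
```

Both single-input informations are positive, so by the bookkeeping above any nonnegative decomposition must place their overlap somewhere; measures that read redundancy off the observed ensemble (such as the original I_min) report roughly 0.53 bits of it here, even though the function itself never consults x2. The gap between that ensemble-induced overlap and overlap in the underlying function is the distinction this separation of concepts is meant to enforce.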
