Statistical Guarantees for Reasoning Probes on Looped Boolean Circuits

arXiv:2602.03970v1 Announce Type: new
Abstract: We study the statistical behaviour of reasoning probes in a stylized model of looped reasoning, given by Boolean circuits whose computational graph is a perfect $\nu$-ary tree ($\nu \ge 2$) and whose output is appended to the input and fed back iteratively for subsequent computation rounds. A reasoning probe has access to a sampled subset of internal computation nodes, possibly without covering the entire graph, and seeks to infer which $\nu$-ary Boolean gate is executed at each queried node, representing uncertainty via a probability distribution over a fixed collection of $\mathtt{m}$ admissible $\nu$-ary gates. This partial observability induces a generalization problem, which we analyze in a realizable, transductive setting.
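
To fix intuition for the setup, the following minimal Python sketch mirrors the model described above: a perfect $\nu$-ary tree of Boolean gates whose root output is appended to the transcript and fed back, and a probe that only sees a sampled subset of internal nodes. The gate library, the convention that each round reads the most recent $\nu^{\mathrm{depth}}$ bits of the transcript, and all names below are illustrative assumptions, not constructions taken from the paper.

import random

# Illustrative sketch (assumptions, not the paper's construction):
# a looped Boolean circuit whose computation graph is a perfect nu-ary tree.
NU, DEPTH = 2, 3                          # nu >= 2, tree depth
GATES = {                                 # a fixed collection of m admissible nu-ary gates
    "AND": lambda bits: int(all(bits)),
    "OR":  lambda bits: int(any(bits)),
    "XOR": lambda bits: sum(bits) % 2,
}

def sample_circuit():
    """Assign a random admissible gate to each internal node (BFS order)."""
    n_internal = (NU ** DEPTH - 1) // (NU - 1)
    return [random.choice(list(GATES)) for _ in range(n_internal)]

def evaluate(circuit, leaf_bits, node=0):
    """Evaluate the perfect nu-ary tree rooted at `node` on `leaf_bits`."""
    n_internal = len(circuit)
    if node >= n_internal:                # leaf: read an input bit
        return leaf_bits[node - n_internal]
    children = [evaluate(circuit, leaf_bits, NU * node + k + 1) for k in range(NU)]
    return GATES[circuit[node]](children)

def looped_run(circuit, init_bits, rounds):
    """Feed the output back: each round reads the most recent nu**DEPTH transcript bits."""
    transcript = list(init_bits)
    for _ in range(rounds):
        transcript.append(evaluate(circuit, transcript[-NU ** DEPTH:]))
    return transcript

# Partial access: the probe sees N sampled internal nodes and, for each,
# outputs a distribution over the m admissible gates (uniform here as a placeholder
# for the GCN-based probe studied in the paper).
circuit = sample_circuit()
queried = random.sample(range(len(circuit)), k=4)     # N = 4 queried nodes
uniform_probe = {node: {g: 1 / len(GATES) for g in GATES} for node in queried}
print(looped_run(circuit, [0, 1] * (NU ** DEPTH // 2), rounds=5))

The probe's task on the unqueried nodes is what creates the generalization problem analyzed in the abstract.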
We show that, when the reasoning probe is parameterized by a graph convolutional network (GCN)-based hypothesis class and queries $N$ nodes, the worst-case generalization error attains the optimal rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ with probability at least $1-\delta$, for $\delta \in (0,1)$. Our analysis combines snowflake metric embedding techniques with tools from statistical optimal transport. A key insight is that this optimal rate is achievable independently of graph size, owing to the existence of a low-distortion one-dimensional snowflake embedding of the induced graph metric. As a consequence, our results provide a sharp characterization of how structural properties of the computational graph govern the statistical efficiency of reasoning under partial access.
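
Read as an explicit high-probability guarantee, the stated rate can be written as follows; the constant $C$, the probe $\hat h_N$, and the error functional $\mathrm{err}$ are notation introduced here for concreteness, not definitions from the paper:

$$ \mathrm{err}(\hat h_N) \;\le\; C\,\frac{\sqrt{\log(2/\delta)}}{\sqrt{N}} \quad \text{with probability at least } 1-\delta, \qquad \delta \in (0,1), $$

where $\hat h_N$ is the GCN-based probe fitted to the $N$ queried nodes, $\mathrm{err}$ its error on the remaining (unqueried) nodes in the transductive setting, and $C$ does not depend on the size of the computational graph.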
