Atomic Information Flow: A Network Flow Model for Tool Attributions in RAG Systems

arXiv:2602.04912v1 Announce Type: new
Abstract: Many tool-based Retrieval Augmented Generation (RAG) systems lack precise mechanisms for tracing final responses back to specific tool components, a critical gap as systems scale to complex multi-agent architectures. We present Atomic Information Flow (AIF), a graph-based network flow model that decomposes tool outputs and LLM calls into atoms: indivisible, self-contained units of information. By modeling LLM orchestration as a directed flow of atoms from tool and LLM nodes to a response super-sink, AIF enables granular attribution metrics for AI explainability.
Motivated by the max-flow min-cut theorem in network flow theory, we train a lightweight Gemma3 (4B parameter) language model as a context compressor to approximate the minimum cut of tool atoms using flow signals computed offline by AIF. We note that the base Gemma3-4B model struggles to identify critical information, reaching only 54.7% accuracy on HotpotQA and barely outperforming lexical baselines (BM25). However, post-training on AIF signals boosts accuracy to 82.71% (+28.01 points) while achieving 87.52% (+1.85%) context token compression, bridging the gap with the Gemma3-27B variant, a model nearly $7\times$ larger.
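To make the max-flow min-cut idea concrete, here is a minimal Python sketch on a toy graph. It is an illustration under stated assumptions, not the AIF implementation: the node names, the unit capacities, and the helper `max_flow_min_cut` are all hypothetical, and the abstract describes training a model to approximate the minimum cut rather than computing it exactly as done here. The sketch uses a standard Edmonds-Karp search (breadth-first augmenting paths) and reads the cut off the final residual graph.

```python
from collections import deque

def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max flow; returns (flow value, sorted list of min-cut edges)."""
    # Residual graph: forward edges keep their capacity, reverse edges start at 0.
    residual = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form the source side of a min cut.
    reachable = set(parent)
    cut = sorted(
        (u, v)
        for u, vs in capacity.items()
        for v in vs
        if u in reachable and v not in reachable
    )
    return flow, cut

# Hypothetical toy graph: two tools emit three atoms, two of which reach the response.
graph = {
    "source": {"tool_search": 2, "tool_calc": 2},
    "tool_search": {"atom_a": 1, "atom_b": 1},
    "tool_calc": {"atom_c": 1},
    "atom_a": {"response": 1},
    "atom_c": {"response": 1},
}

flow, cut = max_flow_min_cut(graph, "source", "response")
print(flow, cut)
```

In this toy setup `atom_b` never reaches the response, so it carries no flow and does not appear in the cut; the cut edges are the ones leading into the two atoms that the response actually depends on. That is the intuition behind treating a minimum cut as a compressed context, though how AIF assigns capacities and flow signals is not specified in the abstract.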
