I mapped the “Dynamic Grammar” of LLMs: How hidden states move, stabilize, and decide

Hi everyone,

I’m an independent researcher (no lab affiliation) who has spent the last year diving deep into the internal dynamics of Transformers. Instead of looking at outputs or attention heads, I’ve been tracking the geometric trajectories of hidden states layer-by-layer during inference.

I wanted to share my latest findings (preprints linked below) because they reveal a structured “dynamic grammar” that seems universal across architectures, from GPT-2 to Llama-3.2.

The Core Idea

Most observability tools treat LLMs as static input-output machines. I treat them as dynamical systems. By measuring metrics like trajectory curvature (ct_t, defined as curvature × displacement), functional capacity, and state transitions, I found that LLMs don’t just “generate text”: they navigate a latent space through specific, reproducible phases.
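To make the ct_t idea concrete, here is a minimal sketch of one way a curvature × displacement signal could be computed from per-layer hidden states. This is an illustration under my own assumptions, not necessarily the exact definition used in the preprints: "curvature" is taken as the turning angle between consecutive layer-to-layer steps, "displacement" as the norm of the later step, and `curvature_displacement` is a hypothetical helper name.

```python
import numpy as np

def curvature_displacement(hidden_states):
    """Turning angle x step length along a layer-wise trajectory.

    hidden_states: array-like of shape (n_layers, d_model) for one token.
    Returns an array of length n_layers - 2.
    Assumption: curvature = angle between consecutive steps (radians),
    displacement = norm of the later step.
    """
    h = np.asarray(hidden_states, dtype=float)
    steps = np.diff(h, axis=0)                  # layer-to-layer displacement vectors
    norms = np.linalg.norm(steps, axis=1)       # length of each step
    dots = np.sum(steps[:-1] * steps[1:], axis=1)
    denom = norms[:-1] * norms[1:]
    # Where a step has zero length, treat the angle as 0 (cosine = 1).
    cos = np.divide(dots, denom, out=np.ones_like(dots), where=denom > 0)
    angles = np.arccos(np.clip(cos, -1.0, 1.0))
    return angles * norms[1:]

# Toy trajectory: one unit step, then a right-angle turn of length 2.
toy = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
ct = curvature_displacement(toy)
```

With Hugging Face models, the input could come from a forward pass with `output_hidden_states=True`, stacking one token position across layers.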

Key Findings (V20–V24)

  1. A Universal Dynamic Grammar (V24)

Across 7 models (GPT-2, OPT, Qwen, TinyLlama, Phi-1.5, Llama-3.2, DistilGPT2), I observed a conserved sequence of internal states:

B (Branching/Hesitation): Initial exploration.

A (Adaptive/Stable): The main processing phase (an attractor state).

D (Decision/Bifurcation): Final commitment to a token.

Result: B → A → D appears to be the “standard cognitive path” for coherent generation. Deviations from this path often correlate with errors or hallucinations.
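As a rough illustration of how a trajectory could be checked against this path, the sketch below collapses repeated labels and compares the result with B → A → D. It assumes the states have already been assigned as a string of per-layer (or per-step) labels; the function names and the string encoding are hypothetical, not taken from the preprints.

```python
from itertools import groupby

def collapse_runs(states):
    """Collapse consecutive repeats, e.g. 'BBAAAD' -> 'BAD'."""
    return "".join(label for label, _ in groupby(states))

def follows_standard_path(states, path="BAD"):
    """True if the collapsed label sequence equals the reference path."""
    return collapse_runs(states) == path

coherent = "BBAAAD"   # hypothetical labels for a clean generation
deviant = "BABAD"     # hypothetical labels with a detour back to B
```

Here `follows_standard_path(coherent)` is true and `follows_standard_path(deviant)` is false, which gives a simple per-token flag to correlate with errors.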

  2. Geometry > Neurons (V22)

Using orthogonal rotation controls, I showed that functional information (syntax, decision, stabilization) is encoded in the relative geometry of the representation space, not in individual neurons. If you rotate the latent space, the information remains decodable. This suggests LLMs think in shapes, not just activations.
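A small synthetic sketch of what a rotation control looks like in practice. It uses random data rather than real hidden states, a least-squares linear probe as a stand-in for whatever probe the preprint uses, and a hypothetical helper `random_orthogonal`; all of these are my assumptions for illustration.

```python
import numpy as np

def random_orthogonal(dim, rng):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Flip column signs so the result does not depend on QR sign conventions.
    # The determinant may be +1 or -1, so this can include a reflection.
    return q * np.sign(np.diag(r))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))      # stand-in for hidden states
w_true = rng.standard_normal(16)
y = X @ w_true                          # synthetic, linearly decodable target

Q = random_orthogonal(16, rng)
X_rot = X @ Q                           # same points, rotated basis

# Fit a linear probe before and after rotation.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
w_rot, *_ = np.linalg.lstsq(X_rot, y, rcond=None)

gram_preserved = np.allclose(X @ X.T, X_rot @ X_rot.T)   # relative geometry unchanged
probe_preserved = np.allclose(X @ w, X_rot @ w_rot)      # same predictions
```

One caveat worth keeping in mind: for a linear probe that is refit after the rotation, this invariance holds by construction, so the more informative comparison is against a readout tied to individual coordinates (single neurons), which a rotation does scramble.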

  3. Ambiguity Changes the Path, Not the Chaos (V23)

When prompts are ambiguous, models don’t necessarily become “chaotic.” Instead, they delay commitment. They spend more time in the exploration phase (B) and less time rushing to decision (D). Phi-1.5, interestingly, shows a unique oscillating pattern (B↔A) during reasoning tasks, distinct from the smoother convergence of other models.
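The "delayed commitment" effect can be quantified with two simple statistics over a labeled trajectory: the fraction of time spent in each state, and the index at which D first appears. The sketch below assumes states are given as a string of B/A/D labels; the example sequences are made up for illustration and are not data from the preprints.

```python
from collections import Counter

def dwell_fractions(states):
    """Fraction of positions spent in each of the states B, A and D."""
    if not states:
        raise ValueError("states must be non-empty")
    counts = Counter(states)
    return {label: counts[label] / len(states) for label in "BAD"}

def first_commitment(states):
    """Index of the first D label, or -1 if the trajectory never reaches D."""
    return states.find("D")

clear_prompt = "BAAADDDD"       # hypothetical: early commitment
ambiguous_prompt = "BBBBAADD"   # hypothetical: longer exploration, later commitment

clear_stats = dwell_fractions(clear_prompt)
ambiguous_stats = dwell_fractions(ambiguous_prompt)
```

Comparing `ambiguous_stats["B"]` with `clear_stats["B"]`, and the two `first_commitment` values, gives a direct test of whether ambiguity shifts time toward B and pushes D later.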

  4. Architecture Matters More Than Size (V20)

Models cluster by their dynamic signatures (e.g., GD_ratio), not just parameter count. Small models like Qwen-0.5B show distinct stability regimes compared to GPT-2, despite similar sizes.

The Preprints (Open Access)

[June 2026] A Runtime Trajectory Dynamics Framework (V20): Introduces the 5-state taxonomy (Stable, Turbulence, Branching, Bifurcation, Committed) and the bicephalic operator.

Link: https://doi.org/10.5281/zenodo.20602685

[May 2026] Dynamic-Layer Controllability (V21): Shows how perturbations affect recovery and that emergent organization dominates the architectural skeleton.

Link: https://doi.org/10.5281/zenodo.20400171

[May 2026] Conditional Dynamic Signatures (V22): Audits normalization effects and variance decomposition. Explicitly documents falsified claims.

Link: https://doi.org/10.5281/zenodo.20361289

[May 2026] Four Dynamical Regimes (V19/V20): Introduces ct_t (curvature × displacement) as a predictor of collapse and instability.

Link: https://doi.org/10.5281/zenodo.20348878

Why I’m Posting This

I’m not selling a product. I’m building an open framework (LIMEN) to make LLM internals auditable and controllable. I believe that if we want safe AI, we need to monitor its “vital signs” (dynamic stability) in real-time, not just its output.

I’d love feedback from the community, especially on:

Have you seen similar “universal motifs” in larger models (>7B)?

Critiques of the methodology (normalization, probe training).

Ideas for causal interventions based on these dynamic states.

submitted by /u/Turbulent-Metal-9491