Four Dynamical Regimes in Large Language Models: An Empirical Phase Map

We introduce ct_t = delta_t × curvature_t, a token-level instability metric computed from the L2-normalised hidden states of large language models, and ratio_norm = max(ct_t) / mean(ct_t), a scalar regime indicator. Across 10 open-source models (158 runs), four dynamical regimes emerge consistently: UNDERACTIVE (ratio 1.55-1.70), ADAPTIVE (2.27-2.92), TRANSITION (~2.97), and CHAOTIC (4.42-35.55). Qwen is the only model family observed in the ADAPTIVE zone in this panel. The regime ordering is robust to variations in temperature (0.1-1.0) and token budget (mean ratio 2.384, std 0.343). ratio_norm correlates with training loss at r = 0.922 (n = 20) but only at r = 0.716 with perplexity, suggesting that it captures a partially distinct diagnostic dimension. A single-threshold collapse predictor (late_ct < 0.001) reaches accuracy = 1.0 on n = 8 samples; this result awaits held-out validation. A hybrid control architecture (LIMEN dynamic monitor + task-aware semantic guard) raises TinyLlama's score on an adversarial benchmark from 2/10 to 6/10 (n = 10); separating the contribution of dynamic monitoring from that of the guard prompt still requires an ablation. No gain is observed on TruthfulQA-Light (baseline 20/20, hybrid 20/20). We identify two structurally distinct failure modes: immediate trajectory collapse and late-divergence tension. All negative results are explicitly documented.
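
For readers who want to experiment with the metric, here is a rough sketch of how ct_t, ratio_norm, and the late_ct threshold check might be computed from a matrix of per-token hidden states. The abstract does not define delta_t, curvature_t, or late_ct, so the definitions below are assumptions made for illustration only: delta_t is taken as the step length between consecutive L2-normalised hidden states, curvature_t as the turning angle between consecutive steps, and late_ct as the mean of ct_t over a hypothetical window of the last 8 tokens. The function names and the `late_window` parameter are likewise invented here and may not match the authors' implementation.

```python
import numpy as np


def ct_series(hidden, eps=1e-12):
    """Return an assumed ct_t series for a (T, d) array of hidden states."""
    # L2-normalise each token's hidden state (this step is from the abstract).
    h = hidden / (np.linalg.norm(hidden, axis=1, keepdims=True) + eps)
    # Displacement between consecutive normalised states, shape (T-1, d).
    v = np.diff(h, axis=0)
    step = np.linalg.norm(v, axis=1)
    # ASSUMPTION: delta_t is the step length, aligned with the curvature below.
    delta = step[1:]
    # ASSUMPTION: curvature_t is the turning angle between consecutive steps.
    cos = np.sum(v[1:] * v[:-1], axis=1) / (step[1:] * step[:-1] + eps)
    curvature = np.arccos(np.clip(cos, -1.0, 1.0))
    return delta * curvature  # shape (T-2,)


def ratio_norm(ct, eps=1e-12):
    """ratio_norm = max(ct_t) / mean(ct_t), as stated in the abstract."""
    return float(np.max(ct) / (np.mean(ct) + eps))


def collapse_flag(ct, late_window=8, threshold=1e-3):
    """Single-threshold check late_ct < 0.001.

    ASSUMPTION: late_ct is the mean of ct_t over the last `late_window` tokens;
    the window length is hypothetical.
    """
    return bool(np.mean(ct[-late_window:]) < threshold)


# Usage with synthetic data standing in for real model hidden states.
rng = np.random.default_rng(0)
states = rng.normal(size=(32, 16))
ct = ct_series(states)
print(ratio_norm(ct), collapse_flag(ct))
```

In practice the `hidden` array would come from a model's per-token hidden states at some chosen layer; which layer the authors use is not stated here.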

submitted by /u/Turbulent-Metal-9491