When to Route? Regime-Adaptive Meta-Policies for Hierarchical Portfolio Agents
Modular decision systems expose multiple operating points, but downstream utility can vary by regime. Portfolio construction is a useful setting because routing, aggregation, and allocation can help or hurt depending on market structure. We instantiate a hierarchy with three operating points: direct optimizer, routed consensus, and alpha-augmented optimizer. Across universes, these modes are not uniformly ranked. Routing helps when dispersion/decorrelation is high; direct optimization is safer in low-signal settings; alpha augmentation helps in concentrated signal-rich settings. We identify the market characteristics (cross-sectional dispersion, concentration, forecast signal density) that predict which mode dominates. A rolling adaptive meta-policy that selects among operating points based on recent performance achieves competitive or superior risk-return profiles without foreknowledge of the optimal mode. We validate against classical baselines demonstrate that the operating-point structure persists across time frequencies from 15-minute bars to daily rebalancing, and confirm robustness under realistic transaction costs (0–15 bps). More broadly, our results suggest for hierarchical decision systems: the key is not to find one universally best configuration, but to characterize when each operating mode is most effective. To support future research and ensure reproducibility, we make source code publicly available at https://github.com/kouzhizhuo/Regime-Adaptive-Portfolio-Agents.