I built an RL trading bot that learned risk management on its own, without me teaching it
After 20 dead versions and about 2 months of work, my RL agent (NASMU) passed its walk-forward backtest across 2020–2026. But the most interesting part wasn’t the results; it was what the model actually learned.

The setup:
– PPO + xLSTM (4 blocks), BTC/USDT 4h bars
– 35 features distilled from López de Prado, Hilpisch, Kaabar, Chan and others
– Triple Barrier labeling (TP/SL/Timeout)
– HMM for regime detection (bull/bear/sideways)
– Running on a Xeon E5-1650 v2 […]
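For anyone unfamiliar with the Triple Barrier labeling mentioned in the setup, here is a minimal sketch of the general idea, not NASMU's actual code. It assumes a long position and checks close prices only. The function name `triple_barrier_label` and the barrier widths and timeout in the example are hypothetical, picked purely for illustration.

```python
from typing import Sequence


def triple_barrier_label(
    closes: Sequence[float],
    entry_idx: int,
    tp_pct: float,
    sl_pct: float,
    max_bars: int,
) -> int:
    """Label a long entry: +1 if take-profit is hit first,
    -1 if stop-loss is hit first, 0 if neither is hit before the timeout.

    All parameters are illustrative assumptions, not values from the post.
    """
    entry = closes[entry_idx]
    upper = entry * (1.0 + tp_pct)  # take-profit barrier (TP)
    lower = entry * (1.0 - sl_pct)  # stop-loss barrier (SL)
    # Vertical barrier (Timeout): stop after max_bars or at the end of the data.
    last = min(entry_idx + max_bars, len(closes) - 1)

    for i in range(entry_idx + 1, last + 1):
        if closes[i] >= upper:
            return 1
        if closes[i] <= lower:
            return -1
    return 0


# Hypothetical 2% barriers with a 3-bar timeout.
tp_hit = triple_barrier_label([100, 101, 103, 99, 97], 0, 0.02, 0.02, 3)
sl_hit = triple_barrier_label([100, 99.5, 97.5, 104], 0, 0.02, 0.02, 3)
timeout = triple_barrier_label([100, 100.5, 99.5, 100.2, 110], 0, 0.02, 0.02, 3)
print(tp_hit, sl_hit, timeout)  # 1 -1 0
```

A fuller version would typically check intrabar highs and lows rather than closes, and scale the barriers by recent volatility instead of using fixed percentages.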