I built a GATv2 + MINCO + CBF drone swarm controller in Isaac Lab — here’s what actually worked (and what didn’t)

Capstone project: decentralized formation control for UAV swarms using CTDE (centralized training, decentralized execution) with a shared PPO policy in NVIDIA Isaac Lab.

**The stack (GNSC 5-layer architecture):**

– L1: Local sensing — 12D body-frame state + K-nearest neighbor relative positions (18D total obs)

– L2: GATv2 graph attention network — each drone reasons about K-nearest neighbors via sparse message passing

– L3: MINCO minimum-jerk trajectory filter (T=0.04s) + SwarmRaft agent dropout recovery

– L4: CBF-QP safety shield — mathematically guaranteed collision avoidance

– L5: Mission execution — formation reward managers, shape switching, polygon/grid/letter presets at play time

**The finding that surprised me most:**

MINCO’s value isn’t runtime smoothing — it’s a training stabilizer. An A/B comparison of policies trained with vs. without MINCO showed 77% lower steady-state jitter, 72% lower formation error, and 40% faster convergence. The trained policy internalizes smoothness so thoroughly that the runtime filter becomes unnecessary.
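For intuition about what the filter does per step: MINCO optimizes full multi-segment trajectories, but the classic single-segment minimum-jerk profile captures the smoothing effect. The sketch below is only that textbook profile as a stand-in for the T=0.04 s action filter described above (all names and values are illustrative, not the repo's code):

```python
import numpy as np

def min_jerk_step(x0, xf, T, dt):
    """One step along a quintic minimum-jerk interpolant from x0 toward xf.

    The blend weight s(tau) = 10*tau^3 - 15*tau^4 + 6*tau^5 is the unique
    quintic with zero velocity and acceleration at both endpoints, which is
    what minimizes integrated squared jerk over the segment.
    """
    tau = min(dt / T, 1.0)                    # normalized time in [0, 1]
    s = 10*tau**3 - 15*tau**4 + 6*tau**5      # minimum-jerk blend weight
    return x0 + (xf - x0) * s

# Smooth a raw policy setpoint halfway through a 0.04 s filter window.
smoothed = min_jerk_step(np.array([0.0]), np.array([1.0]), T=0.04, dt=0.02)
```

During training, feeding the plant these smoothed setpoints instead of raw policy outputs is plausibly what damps the exploration jitter the A/B test measured.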

**The bug that cost me the most time:**

The GATv2 adjacency matrix was being passed through `extras` — a side-channel that SKRL never forwards to the model. GATv2 silently fell back to self-loops only, functioning as a plain MLP the entire time. The fix: build the fully-connected edge index inside the model from the flat observation tensor, with caching.
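The workaround can be sketched in a few lines: since SKRL drops `extras`, the model reconstructs connectivity itself and caches it per swarm size. Names below are illustrative, not the repo's actual API:

```python
import torch

_edge_cache = {}  # (num_agents, device) -> cached fully-connected edge_index

def fully_connected_edges(num_agents, device):
    """Build (and cache) a fully-connected edge_index with no self-loops.

    Returns a (2, N*(N-1)) tensor in the usual PyG src/dst convention.
    Caching matters because the graph is rebuilt every forward pass.
    """
    key = (num_agents, str(device))
    if key not in _edge_cache:
        idx = torch.arange(num_agents, device=device)
        src, dst = torch.meshgrid(idx, idx, indexing="ij")
        mask = src != dst                                   # drop self-loops
        _edge_cache[key] = torch.stack([src[mask], dst[mask]])
    return _edge_cache[key]

edges = fully_connected_edges(4, device="cpu")  # shape (2, 12)
```

A cheap regression test for this class of bug: zero out the neighbor features and assert the model's output actually changes — a graph layer degenerated to self-loops won't.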

Trained on 8 agents, deployed on 20+ with the same checkpoint.
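The mechanism behind that zero-shot scaling is the K-nearest-neighbor observation: each agent only ever sees K neighbors, so the policy input size is fixed regardless of swarm size. A sketch of that idea (illustrative, not the repo's exact observation code):

```python
import torch

def knn_relative_obs(positions, k=3):
    """Per-agent observation: relative positions of the K nearest neighbors.

    positions: (N, dim) tensor of agent positions. Output is (N, K*dim),
    whose width is independent of N — which is what lets one checkpoint
    trained on a small swarm run on a much larger one.
    """
    n = positions.shape[0]
    dists = torch.cdist(positions, positions)         # (N, N) pairwise distances
    dists.fill_diagonal_(float("inf"))                # exclude self
    nn_idx = dists.topk(k, largest=False).indices     # (N, K) nearest neighbors
    rel = positions[nn_idx] - positions.unsqueeze(1)  # (N, K, dim) relative pos
    return rel.reshape(n, -1)                         # flat per-agent obs

obs8  = knn_relative_obs(torch.rand(8, 3))    # (8, 9)
obs20 = knn_relative_obs(torch.rand(20, 3))   # (20, 9) — same per-agent width
```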

Full repo: https://github.com/garykuepper/ggSwarm

submitted by /u/garygigabytes
