I trained a DQN agent to solve drone intercept cost optimization — here’s what it figured out on its own
Built a drone interception environment from scratch in Pygame, with no OpenAI Gym dependency. The state vector is 10-dimensional, tracking the 2 nearest drones with angle error, predicted position 15 steps ahead, distance, and vertical speed.

The reward structure is where it gets interesting:

Hit: +10
Building destroyed: -20
Shot fired: -0.5
Drone escaped: -5

The -0.5 firing penalty forces the agent to learn ammo conservation. What emerged: under low swarm density it fires aggressively, under high density it becomes selective. […]
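For anyone who wants to play with the numbers, here is a minimal sketch of how the four reward terms from the post could be combined per step. Only the reward values come from the post; `StepEvents`, `step_reward`, and `break_even_hit_probability` are hypothetical names for illustration, and the actual environment may count events differently.

```python
from dataclasses import dataclass

# Reward values as stated in the post.
R_HIT = 10.0
R_BUILDING_DESTROYED = -20.0
R_SHOT_FIRED = -0.5
R_DRONE_ESCAPED = -5.0


@dataclass
class StepEvents:
    """Hypothetical per-step event counts (an assumption, not the post's code)."""
    hits: int = 0
    buildings_destroyed: int = 0
    shots_fired: int = 0
    drones_escaped: int = 0


def step_reward(ev: StepEvents) -> float:
    """Sum the reward contributions of everything that happened this step."""
    return (
        R_HIT * ev.hits
        + R_BUILDING_DESTROYED * ev.buildings_destroyed
        + R_SHOT_FIRED * ev.shots_fired
        + R_DRONE_ESCAPED * ev.drones_escaped
    )


def break_even_hit_probability() -> float:
    """Hit probability p at which one shot has zero expected reward:
    p * R_HIT + R_SHOT_FIRED = 0. Ignores escape and building penalties."""
    return -R_SHOT_FIRED / R_HIT


if __name__ == "__main__":
    print(step_reward(StepEvents(hits=1, shots_fired=1)))  # shot that connects
    print(step_reward(StepEvents(shots_fired=1)))          # shot that misses
    print(break_even_hit_probability())
```

Taken in isolation, a shot pays off in expectation once its hit probability exceeds 0.5 / 10 = 5%. That ignores the -5 and -20 penalties a surviving drone can trigger later, so treat it as a rough lower bound on how picky the firing penalty alone makes the agent.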