MicroSafe-RL: Sub-microsecond safety shield with Gymnasium Wrapper for Sim-to-Real parity
Deploying RL agents on physical hardware often exposes a catastrophic failure mode: hardware drift. I built MicroSafe-RL as a real-time safety interceptor that constrains the action space based on hardware stability signatures.

Universal Gym Wrapper: I’ve added a MicroSafeWrapper that lets you apply the same safety shielding and reward shaping in simulation that you will use on the actual hardware.

Reward Shaping: The wrapper uses a safety signal to penalize entropy and “Chaos” states, helping […]
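
To make the shielding and reward-shaping idea concrete, here is a rough sketch of what such a wrapper could look like. It is not the actual MicroSafeWrapper, whose API is not shown in this post. Every name in it (ToyDriftEnv, SafetyShieldSketch, var_limit, penalty_weight) is hypothetical, and the rolling variance of observations is only an assumed stand-in for the hardware stability signature. The toy environment mimics the Gymnasium reset/step signatures so the block runs with the standard library alone.

```python
# Sketch only: NOT the real MicroSafe-RL code. All names here are hypothetical.
import random
from collections import deque
from statistics import pvariance


class ToyDriftEnv:
    """Stand-in 1-D environment that mimics the Gymnasium reset/step signatures."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0.0
        self.t = 0

    def reset(self, seed=None, options=None):
        if seed is not None:
            self.rng = random.Random(seed)
        self.state, self.t = 0.0, 0
        return self.state, {}

    def step(self, action):
        # Noise grows with time to imitate hardware drift.
        noise = self.rng.gauss(0.0, 0.01 * self.t)
        self.state += action + noise
        self.t += 1
        reward = -abs(self.state)
        terminated = False
        truncated = self.t >= 50
        return self.state, reward, terminated, truncated, {}


class SafetyShieldSketch:
    """Duck-typed wrapper: shrinks the allowed action range and penalizes instability."""

    def __init__(self, env, max_action=1.0, window=10, var_limit=0.05, penalty_weight=1.0):
        self.env = env
        self.max_action = max_action
        self.var_limit = var_limit          # assumed variance threshold for "unstable"
        self.penalty_weight = penalty_weight
        self.history = deque(maxlen=window)  # recent observations

    def _instability(self):
        # 0.0 = stable, 1.0 = at or beyond the assumed variance limit.
        if len(self.history) < 2:
            return 0.0
        return min(pvariance(self.history) / self.var_limit, 1.0)

    def reset(self, **kwargs):
        self.history.clear()
        obs, info = self.env.reset(**kwargs)
        self.history.append(obs)
        return obs, info

    def step(self, action):
        # Shield: the less stable the recent readings, the smaller the allowed action.
        instability = self._instability()
        limit = self.max_action * (1.0 - instability)
        safe_action = max(-limit, min(limit, action))
        obs, reward, terminated, truncated, info = self.env.step(safe_action)
        self.history.append(obs)
        # Reward shaping: subtract a penalty proportional to the updated instability.
        shaped = reward - self.penalty_weight * self._instability()
        info = dict(info, safe_action=safe_action, instability=instability)
        return obs, shaped, terminated, truncated, info
```

With Gymnasium installed, the same logic would more naturally live in a subclass of gymnasium.Wrapper that overrides reset and step, so it can wrap any registered environment. The point of the sketch is only that one class handles both the action clamp and the reward penalty, which is what allows the same shielding to be used in simulation and on hardware.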