MicroSafe-RL: Sub-microsecond safety shield with Gymnasium Wrapper for Sim-to-Real parity

Deploying RL agents on physical hardware often exposes a serious failure mode: hardware drift. I built MicroSafe-RL as a real-time safety interceptor that constrains the action space based on hardware stability signatures.

  • Universal Gym Wrapper: I’ve added a MicroSafeWrapper that allows you to apply the same safety shielding and reward shaping during simulation that you will use on the actual hardware.
  • Reward Shaping: The wrapper uses a safety signal to penalize entropy and “Chaos” states, helping the agent learn to avoid dangerous operating zones before deployment.
  • Sim-to-Real Parity: The Python profiler is a direct port of the C++ core, ensuring that the tuned parameters (kappa, alpha, beta, decay) transfer 1:1 to the physical machine.
  • Performance: The Python wrapper adds minimal overhead to training, and the C++ core is optimized for deterministic, O(1) execution.

https://github.com/Kretski/MicroSafe-RL
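
To make the shielding and reward-shaping idea concrete, here is a minimal sketch. It is not the actual MicroSafeWrapper: the class names SafetyShieldSketch and ToyDriftEnv are hypothetical, and the roles given to kappa, alpha, beta and decay are guesses made for illustration, since the post names these parameters without defining them. To stay dependency-free, it duck-types the Gymnasium reset/step API instead of subclassing gymnasium.Wrapper.

```python
import random


class ToyDriftEnv:
    """Hypothetical 1-D plant whose sensor noise grows over time (a stand-in for drift)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.t = 0
        self.x = 0.0

    def reset(self, seed=None, options=None):
        if seed is not None:
            self.rng = random.Random(seed)
        self.t = 0
        self.x = 0.0
        return self.x, {}

    def step(self, action):
        self.t += 1
        noise = self.rng.gauss(0.0, 0.01 * self.t)  # noise grows with time
        self.x += 0.1 * action + noise
        reward = -abs(self.x)
        truncated = self.t >= 50
        return self.x, reward, False, truncated, {}


class SafetyShieldSketch:
    """Illustrative shield: clamps actions and penalizes reward as instability rises.

    Assumed (not from the project) parameter roles:
      alpha: smoothing factor of the running mean/variance
      decay: how slowly the risk estimate relaxes after a spike
      kappa: how strongly risk shrinks the allowed action range
      beta:  weight of the risk penalty in the shaped reward
    """

    def __init__(self, env, kappa=2.0, alpha=0.1, beta=0.5, decay=0.9, max_action=1.0):
        self.env = env
        self.kappa, self.alpha, self.beta, self.decay = kappa, alpha, beta, decay
        self.max_action = max_action
        self.mean = 0.0
        self.var = 0.0
        self.risk = 0.0

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self.mean, self.var, self.risk = float(obs), 0.0, 0.0
        return obs, info

    def _update(self, obs):
        # Constant-time exponentially weighted mean/variance update: no history buffer.
        diff = obs - self.mean
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1.0 - self.alpha) * (self.var + diff * incr)
        # Risk jumps up immediately but relaxes only gradually.
        self.risk = max(self.var ** 0.5, self.decay * self.risk)

    def shield(self, action):
        # The allowed action range shrinks as the risk estimate grows.
        limit = self.max_action / (1.0 + self.kappa * self.risk)
        return max(-limit, min(limit, action))

    def step(self, action):
        safe_action = self.shield(action)
        obs, reward, terminated, truncated, info = self.env.step(safe_action)
        self._update(float(obs))
        shaped = reward - self.beta * self.risk  # reward shaping from the safety signal
        info = dict(info, safe_action=safe_action, risk=self.risk)
        return obs, shaped, terminated, truncated, info
```

Wrapping an environment is then a single call, for example SafetyShieldSketch(ToyDriftEnv()). Each step's info dict reports the clamped action and the current risk estimate, so you can log how often the shield intervenes during training.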

submitted by /u/Visible-Cricket-3762