MicroSafe-RL: Sub-microsecond safety shield with Gymnasium Wrapper for Sim-to-Real parity
Deploying RL agents on physical hardware often exposes a catastrophic failure mode: hardware drift. I built MicroSafe-RL as a real-time safety interceptor that constrains the action space based on hardware stability signatures.
- Universal Gym Wrapper: I’ve added a `MicroSafeWrapper` that allows you to apply the same safety shielding and reward shaping during simulation that you will use on the actual hardware.
- Reward Shaping: The wrapper uses a safety signal to penalize entropy and “Chaos” states, helping the agent learn to avoid dangerous operating zones before deployment.
- Sim-to-Real Parity: The Python profiler is a direct port of the C++ core, ensuring that the tuned parameters (`kappa`, `alpha`, `beta`, `decay`) transfer 1:1 to the physical machine.
- Performance: While the Python wrapper adds minimal overhead to your training, the C++ core is optimized for O(1) determinism.

https://github.com/Kretski/MicroSafe-RL
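To give a rough idea of what shielding plus reward shaping looks like around an environment, here is a minimal sketch. It is not the actual `MicroSafeWrapper` code: the class names (`ShieldSketch`, `ShieldedEnvSketch`, `ToyDriftEnv`), the scalar sensor reading, and the meanings given to `kappa`, `alpha`, `beta`, and `decay` are all assumptions made for illustration. It follows the Gymnasium `step` return convention (`obs, reward, terminated, truncated, info`) but does not import Gymnasium, so it runs with the standard library alone.

```python
from dataclasses import dataclass


@dataclass
class ShieldSketch:
    """Hypothetical shield; parameter semantics are assumed, not MicroSafe-RL's."""
    kappa: float = 1.0   # assumed: how strongly instability shrinks the action bound
    alpha: float = 0.5   # assumed: weight of the reward penalty
    beta: float = 1.0    # assumed: instability level treated as a "chaos" state
    decay: float = 0.9   # assumed: smoothing factor of the running statistics
    mean: float = 0.0
    var: float = 0.0

    def update(self, reading: float) -> float:
        """Constant-time update of an exponentially weighted mean and variance."""
        delta = reading - self.mean
        self.mean += (1.0 - self.decay) * delta
        self.var = self.decay * (self.var + (1.0 - self.decay) * delta * delta)
        return self.var

    def clamp(self, action: float, limit: float) -> float:
        """Shrink the allowed action range as the instability estimate grows."""
        bound = limit / (1.0 + self.kappa * self.var)
        return max(-bound, min(bound, action))

    def penalty(self) -> float:
        """Penalty grows with instability, plus a fixed extra above the beta threshold."""
        extra = 1.0 if self.var > self.beta else 0.0
        return self.alpha * (self.var + extra)


class ToyDriftEnv:
    """Toy stand-in for a machine whose sensor reading oscillates more over time."""

    def reset(self, **kwargs):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        sign = 1.0 if self.t % 2 else -1.0
        reading = 0.5 * self.t * sign
        # Raw reward simply favors large actions, which is what the shield must temper.
        return reading, float(action), False, False, {}


class ShieldedEnvSketch:
    """Duck-typed wrapper: clamps the action, then shapes the reward."""

    def __init__(self, env, shield, limit=1.0):
        self.env = env
        self.shield = shield
        self.limit = limit

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        safe_action = self.shield.clamp(float(action), self.limit)
        obs, reward, terminated, truncated, info = self.env.step(safe_action)
        self.shield.update(float(obs))
        shaped = reward - self.shield.penalty()
        info = dict(info, safe_action=safe_action, raw_reward=reward)
        return obs, shaped, terminated, truncated, info


env = ShieldedEnvSketch(ToyDriftEnv(), ShieldSketch(), limit=1.0)
env.reset()
for _ in range(5):
    _, shaped_reward, _, _, step_info = env.step(5.0)
    print(step_info["safe_action"], shaped_reward)
```

Because the same shield object handles both the clamping and the penalty, the agent sees in simulation the same constrained action space it would face on hardware, which is the parity idea described above. With a real Gymnasium environment the same logic would normally live in a `gymnasium.Wrapper` subclass.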
submitted by /u/Visible-Cricket-3762