AgentVerify: Compositional Formal Verification of AI Agent Safety Properties via LTL Model Checking
Autonomous AI agents operating in high-stakes domains—financial trading, medical diagnostics, autonomous code execution—lack formal safety guarantees for their core operational loops, including memory management, tool invocations, and human interactions. Current verification approaches either fail to scale to neural components or ignore the structured control flow of agentic systems entirely. We introduce AgentVerify (Compositional Formal Verification of AI Agent Safety Properties via LTL Model Checking), a model checking framework that specifies and verifies safety properties for agent architectures using temporal logic. AgentVerify defines compositional specifications for memory integrity, tool call pro tocols, MCP/skill invocations, and human-in-the-loop boundaries, enabling rigorous runtime monitoring and post-hoc behavioral analysis. In an empirical evaluation across 15 diverse agent scenarios (low- and high-difficulty), our post-hoc behavioral analysis component achieved a verification accuracy of 86.67% (mean over 3 seeds, σ=0.00), outperforming a monolithic contract verification baseline (80.00%) and a runtime monitoring baseline without temporal logic (46.67%). A monolithic neural verifier, which attempts to verify the LLM outputs directly, performed poorly at 13.33%, confirming that end-to-end neural verification is currently intractable for production-scale agents. These results demonstrate that formal methods applied to the agent’s observable control flow provide a tractable and effective path to safety assurance, complementing rather than replacing neural-centric efforts to align large language models.