AI Agents Need Inspectable State. That’s Why I Built LangMCP

Checkpoints, memory, and the debugging gap that traces don’t fill.

[Image: Inspecting an agent’s inner workings. AI generated via Gemini.]

The first time an AI agent forgets something important, the instinct is to blame the prompt. I’ve done that too.

You look at the system message. You reread the tool descriptions. You ask whether the model ignored an instruction, or whether the user said something ambiguous three turns ago.

Sometimes that is the problem.

But when you are building with LangGraph, the most interesting behavior often lives in checkpoints, thread state, long-term memory, namespaces, configurable IDs, and all the persistence details that decide whether a conversation feels coherent from one turn to the next.

At some point, the real question stops being:

“What did the model do?”

And becomes:

“What is actually in the database right now?”

That question is why I built LangMCP.

The debugging gap in stateful agents

Tools like LangSmith and Langfuse are excellent for traces. They tell you what happened during a run: which tools were called, what the model returned, and how a chain or graph executed.

But while building real agent systems, I kept running into a slightly different debugging problem.

I did not only want to know what happened during one execution. I wanted to inspect the state that survived after execution. You can do that with database consoles, local scripts, logs, and trace dashboards. I did that for a while.

But none of those felt like the right interface for an AI coding assistant.

I did not want to give the assistant arbitrary SQL access. I did not want database credentials floating around in prompts. I did not want every developer to keep a private collection of scripts for inspecting thread state.

I wanted something smaller and safer:

A local MCP server that understands LangGraph persistence and exposes only the inspection operations I actually need.

That became LangMCP.

What LangMCP is

LangMCP is a development MCP server for LangGraph checkpoint and store inspection.

It connects through named profiles, uses LangGraph-native checkpointer and store APIs, and gives MCP clients such as Cursor or Claude Desktop a read-only way to inspect persistence. The client gets a narrow, deliberate surface area:

  • listing profiles,
  • checking health,
  • discovering thread IDs,
  • inspecting thread state,
  • listing checkpoint history,
  • comparing checkpoints,
  • summarizing threads,
  • inspecting store namespaces,
  • searching long-term memory, and
  • summarizing user memory.

That surface is intentionally practical. It is designed to answer the question that matters during development:

“Why did this agent behave this way?”

Why MCP was the right boundary

MCP gives the project a clean shape.

The editor or assistant does not need direct database access. It talks to LangMCP. LangMCP owns the profiles, backend adapters, redaction, pagination, and read-only enforcement.

That separation matters. A useful assistant should be able to inspect state, but it should not accidentally become a migration tool.
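
A minimal sketch of that boundary, assuming a server that checks every request against an allowlist of read operations before any backend is touched. The class, the exception, and the operation names here are hypothetical and only illustrate the idea; they are not LangMCP’s internals.

```python
from dataclasses import dataclass, field

# Hypothetical operation names, used only for illustration.
READ_OPERATIONS = frozenset({"list_profiles", "get_thread_state", "list_checkpoints"})


class ReadOnlyViolation(Exception):
    """Raised when a client asks for an operation outside the read-only surface."""


@dataclass
class InspectionServer:
    # Maps operation name -> callable; stands in for real backend adapters.
    handlers: dict = field(default_factory=dict)

    def call(self, operation: str, **kwargs):
        # Reject anything not on the allowlist before touching a backend.
        if operation not in READ_OPERATIONS:
            raise ReadOnlyViolation(f"operation {operation!r} is not permitted")
        return self.handlers[operation](**kwargs)


server = InspectionServer(
    handlers={
        # Stub handler that returns an empty state for any thread.
        "get_thread_state": lambda thread_id: {"thread_id": thread_id, "values": {}},
    }
)

state = server.call("get_thread_state", thread_id="abc123")
```

The point of the allowlist is that a request like "delete_thread" fails by default: nothing needs to be written to block it, because it was never on the list.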

The workflow is simple:

1. Configure profiles in langmcp.toml.

2. Start the MCP server with stdio transport.

3. Ask the assistant about a thread, checkpoint, or user memory.

4. Let the assistant inspect the state through constrained operations.

5. Get back a verdict grounded in actual persistence data.

This is the part I like most about the design. It does not ask the model to be clever with infrastructure. It gives the model a safer lens into the system.

Getting started with LangMCP

1. Install the Python package:

uv pip install "langmcp[all]"

2. Configure a profile in langmcp.toml:

[profiles.dev]
checkpointer = "${POSTGRES_URI}"
store = "${POSTGRES_URI}"

3. Start the server and connect your editor:

langmcp serve --config ./langmcp.toml

Then ask your assistant: “Summarize thread abc123 and check if user memory exists for user_456.”

If you want to add LangMCP to an AI coding editor such as Cursor or VS Code, the mcp.json should have the following structure:

{
  "mcpServers": {
    "langmcp": {
      "command": "uvx",
      "args": ["langmcp[all]", "serve", "--config", "ABSOLUTE_PATH_TO_LANGMCP_TOML"],
      "env": {
        "LANGMCP_READ_ONLY": "true",
        "POSTGRES_URI": "postgresql://READONLY_USER:READONLY_PASSWORD@HOST:5432/DB_NAME"
      }
    }
  }
}

Tools are useful, but MCP has more to offer

The first version of LangMCP focused on tools.

That was the obvious starting point. Tools are perfect when the assistant needs to perform an action with arguments:

  • get_thread_state(thread_id)
  • compare_checkpoints(thread_id, checkpoint_id_a, checkpoint_id_b)
  • search_store(namespace_prefix, query)
  • analyze_memory_gaps(thread_id, user_id)
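
To make one of these concrete: comparing two checkpoints is, at its core, a diff over two state snapshots. The sketch below assumes each checkpoint’s state has already been loaded as a plain dict; diff_states is a hypothetical helper for illustration, not the implementation behind compare_checkpoints.

```python
def diff_states(before: dict, after: dict) -> dict:
    """Return keys added, removed, and changed between two state snapshots."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: {"before": before[k], "after": after[k]}
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Two made-up snapshots: the later one has lost user_id and gained a summary.
checkpoint_a = {"messages": ["hi"], "user_id": "user_456"}
checkpoint_b = {"messages": ["hi", "hello!"], "summary": "greeting"}

diff = diff_states(checkpoint_a, checkpoint_b)
```

In this made-up example the diff reports user_id under "removed", which is exactly the kind of detail that explains a later memory lookup coming back empty.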

But MCP is not only tools. As the project matured, I added resources and prompts too. That changed how the server feels.

Resources: treat persistence state like readable context

Resources are useful when data should feel like a readable object with a stable URI.

For LangMCP, that maps naturally to things like:

  • langmcp://profiles
  • langmcp://profiles/{profile}/health
  • langmcp://profiles/{profile}/threads
  • langmcp://profiles/{profile}/threads/{thread_id}/summary
  • langmcp://profiles/{profile}/threads/{thread_id}/checkpoints
  • langmcp://profiles/{profile}/users/{user_id}/memory-summary

This is a better fit for state that a client may want to attach as context. A thread summary is not really an “action” in the product sense. It is a view of the current state.

That distinction sounds small, but it makes the MCP surface feel more native. Tools answer requests. Resources expose inspectable state.
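
As a rough illustration of how stable URIs map to parameters, the sketch below matches a concrete URI against the templates listed above and extracts the placeholder values. It uses plain regular expressions; match_resource and compile_template are hypothetical helpers, and a real MCP server would normally rely on its SDK’s resource routing instead.

```python
import re

# URI templates copied from the list above.
TEMPLATES = [
    "langmcp://profiles",
    "langmcp://profiles/{profile}/health",
    "langmcp://profiles/{profile}/threads",
    "langmcp://profiles/{profile}/threads/{thread_id}/summary",
    "langmcp://profiles/{profile}/threads/{thread_id}/checkpoints",
    "langmcp://profiles/{profile}/users/{user_id}/memory-summary",
]


def compile_template(template: str) -> re.Pattern:
    """Turn a template with {name} placeholders into a compiled regex."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(pattern)


def match_resource(uri: str):
    """Return (template, params) for the first matching template, else None."""
    for template in TEMPLATES:
        match = compile_template(template).fullmatch(uri)
        if match:
            return template, match.groupdict()
    return None


result = match_resource("langmcp://profiles/dev/threads/abc123/summary")
```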

Prompts: package the debugging workflow

When debugging agent memory, the steps are often repeatable. You do not want the assistant to jump straight from “the agent forgot something” to a confident answer. You want it to inspect thread state, checkpoint history, config metadata, store namespaces, and user memory before reaching a conclusion.

So LangMCP includes reusable prompts such as:

  • debug_thread
  • investigate_memory_gap
  • compare_thread_checkpoints
  • inspect_user_memory

These prompts do not replace tools. They guide the investigation.

For example, a memory-gap investigation should usually ask:

  • Does the thread state contain the expected user ID?
  • Does the latest checkpoint look correct?
  • Does the store have items for that user?
  • Are the items under the expected namespace?
  • Did the assistant have enough context to use the memory?
  • Is the issue a missing write, a wrong namespace, a wrong thread config, or expected empty memory?

That is the kind of checklist I want encoded into the system, not reinvented in every debugging conversation.
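
The checklist above can be sketched as a small decision function. This is only an illustration of the ordering of the questions, assuming the evidence has already been gathered; the Evidence fields and classify_memory_gap are hypothetical names, not part of LangMCP’s prompts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    thread_user_id: Optional[str]     # user ID found in the thread state/config, if any
    expected_user_id: str             # user ID the developer expected
    items_in_expected_namespace: int  # store items under the expected namespace
    items_in_other_namespaces: int    # items for this user found elsewhere
    write_was_expected: bool          # did the agent have something worth storing?


def classify_memory_gap(e: Evidence) -> str:
    """Walk the checklist in order and return the first explanation that fits."""
    if e.thread_user_id != e.expected_user_id:
        return "wrong thread config"
    if e.items_in_expected_namespace > 0:
        return "memory present; check whether it reached the model's context"
    if e.items_in_other_namespaces > 0:
        return "wrong namespace"
    if e.write_was_expected:
        return "missing write"
    return "expected empty memory"


example = Evidence(
    thread_user_id="user_456",
    expected_user_id="user_456",
    items_in_expected_namespace=0,
    items_in_other_namespaces=3,
    write_was_expected=True,
)
verdict = classify_memory_gap(example)  # -> "wrong namespace"
```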

Safety is the product feature

LangMCP is read-only in v0.1.

When you build tools for AI-assisted engineering, capability is only half the story. The other half is blast radius.

LangMCP enforces read_only=true, accepts profile names instead of raw connection strings, and redacts secrets from health output and error messages. The intended setup is a read-only database user, especially for shared development or staging environments.

If the assistant can inspect persistence but cannot mutate it, I can ask more direct questions. I can let it gather evidence. I can use it during a real debugging session without feeling like every prompt needs a warning label.
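
To illustrate the redaction part, here is a minimal sketch of masking the password in any connection URI that appears in a message before it is returned to the client. The regular expression and the redact helper are my own illustration under the assumption that secrets show up as scheme://user:password@host URIs; LangMCP’s actual redaction rules may differ.

```python
import re

# Matches scheme://user:password@ and captures everything up to the password.
# Assumes any "/" in the password is percent-encoded, as URIs require.
_URI_PASSWORD = re.compile(
    r"(?P<prefix>\b[a-zA-Z][a-zA-Z0-9+.-]*://[^\s:/@]+:)(?P<password>[^\s/]+)(?=@)"
)


def redact(text: str) -> str:
    """Replace the password portion of any connection URI with ***."""
    return _URI_PASSWORD.sub(r"\g<prefix>***", text)


message = "could not connect to postgresql://reader:s3cret@db.internal:5432/agents"
safe_message = redact(message)
```

URIs without a password, such as postgresql://db.internal:5432/agents, pass through unchanged because there is no user:password@ segment to match.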

Backend support

LangMCP currently supports PostgreSQL, SQLite, and Redis for checkpoint inspection. PostgreSQL additionally gets store inspection through PostgresStore, giving it full checkpointer and store coverage. Store inspection is focused on PostgreSQL for now, since long-term memory workflows generally require the store API.

What I learned building it

The biggest lesson was that agent debugging tools should be opinionated. It is tempting to expose a powerful generic interface and let the model figure it out. But in practice, I want the opposite. I want fewer capabilities, better named operations, and defaults that reflect how the system should be inspected.

The second lesson was that state deserves first-class UX. AI engineers spend a lot of time designing prompts, tool calls, traces, and evals. But for stateful agents, persistence is part of the product. If memory, checkpoints, and thread state are hard to inspect, debugging becomes guesswork.

What comes next

LangMCP v0.1 is intentionally conservative. The natural next steps are broader adapters, HTTP transport with team auth, vector store inspection, and, eventually, carefully scoped write workflows such as updating thread state or resuming a thread. Those write workflows should come later, with more friction than reads, clearer permissions, and stronger auditability.

For now, the most valuable thing LangMCP can do is make invisible state visible.

Final thought

LangMCP is built to solve a practical developer frustration. Because LangGraph is stateful, an agent’s bugs are often preserved right in its checkpoints. Moving that state out of isolated database consoles and into your AI coding assistant’s context window changes how you debug. It means fewer blind prompt tweaks, faster root-cause analysis, and fewer late-night sessions spent wondering what a production agent just forgot.

If you want to try LangMCP or contribute to its development, check out the project here:

GitHub – xmassmx/langmcp: A MCP server that will connect with LangChain Checkpointers, Memory Stores, Vectorstores to aid in monitoring and observability during development of AI Applications


AI Agents Need Inspectable State. That’s Why I Built LangMCP was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
