Can You Really Triage a Security Alert with Claude Code?
TL;DR
- Claude Code with MCP turns Splunk into a real autonomous triage agent – it writes SPL, runs searches, and pivots on its own.
- No manual log pasting, no hallucinations, no constant hand-holding.
- Here’s exactly how to wire it up and what it actually delivers in practice.
Most AI security tooling falls into two categories: vendor-integrated black boxes, or chatbot wrappers where you paste logs manually and hope the model doesn’t hallucinate the hostname. Claude Code with MCP is neither. It runs against your real data, writes its own SPL, and pivots without you touching a keyboard.
This post proves it’s possible and shows you exactly how to wire it up.
What “triage with Claude Code” actually means
This isn’t co-pilot autocomplete. It’s not a SIEM plugin. It’s a coding agent with a live Splunk connection that treats your environment like a tool call.
Claude Code issues queries, reads results, and decides what to search next – on its own. The feedback loop is: run SPL, get data back, form a hypothesis, run more SPL. Your role is to define the alert, write the triage instructions in CLAUDE.md, and stay out of the way. That last part is harder than it sounds.
The key distinction: you’re not asking Claude Code “what does this log mean.” You’re handing it an alert and asking it to go find out whether the alert is real.
The MCP Splunk server
MCP (Model Context Protocol) is the interface layer that lets Claude Code call external tools. Not a chatbot API – tool calls with structured inputs and outputs. The model can issue a search, wait for results, and branch based on what comes back.
The splunk-mcp server exposes what you’d expect: run a search job, poll job status, list indexes, pull field extractions. That’s enough to do real triage.
What it can’t do out of the box: no threat intel enrichment, no write-back to your ticketing system, no lookup table updates. It’s read-only by design. That’s a feature, not a gap.
Auth is a scoped Splunk token. Read-only. Scope it to the indexes you want it to touch, nothing else.
Install splunk-mcp
pip install splunk-mcp
Set your environment variables:
export SPLUNK_HOST=your-instance.splunkcloud.com
export SPLUNK_PORT=8089
export SPLUNK_TOKEN=your-scoped-read-only-token
Wire it into Claude Code
Add the server entry to .claude/settings.json:
{
"mcpServers": {
"splunk": {
"command": "python",
"args": ["-m", "splunk_mcp"],
"env": {
"SPLUNK_HOST": "your-instance.splunkcloud.com",
"SPLUNK_PORT": "8089",
"SPLUNK_TOKEN": "your-token-here"
}
}
}
}
Run claude, you should see the Splunk MCP server connect on startup.
Tune CLAUDE.md
This is the part most people skip, and it’s the difference between a useful triage and 40 queries into noise.
Your CLAUDE.md should scope the alert, tell it when to stop, and define what good looks like.
A minimal triage CLAUDE.md looks like this:
You are triaging a Splunk alert. Use the Splunk MCP tool to investigate.
Relevant indexes: index=main, index=endpoint
Timeframe: last 24 hours unless evidence suggests otherwise
Entity under investigation: [host/user/IP from alert]
Your goal: determine whether this alert represents real attacker activity.
Stop when you have a supported yes/no, or after 10 searches.
Do not explore beyond what's relevant to the alert.
The triage prompt structure
Hand it the alert metadata, the search criteria or event ID, and the question. That’s it.
Don’t pre-filter the data. Don’t summarize the logs. Don’t tell it what you think is happening. The whole point is to let the agent form its own picture.
What Claude Code actually does
It writes real SPL and executes it against your environment.
It reads the results and decides what to look at next. Suspicious process? It pivots to network activity from the same host. Unusual login? It pulls login history.
By the time it reports back, it’s done 5-10 searches and has a coherent picture. No analyst typing required after the initial prompt.
Limitations
- No native threat intel enrichment (you can add it with extra MCP tools).
- No write-back to ticketing systems (copy/paste for now).
- Context window limits — constrain result sizes in the prompt.
- Model quality matters. Stronger models do coherent multi-step triage.