Building an XDR-Style Security Bot in OpenClaw to Watch Your Logs 24/7

digitado ⋅ 30 de May de 2026

Security vendors will sell you an XDR platform that costs more than your car and still misses the weird stuff because it was trained on someone else’s network. I got tired of paying for that. So I built my own, on an old NUC under my desk, for about $28 a month in electricity and cloud storage.

This is that build. OpenClaw as a 24/7 log watcher, anomaly scorer, correlator, and Telegram notifier. No cloud dependency. No per-seat license. No dashboard that requires a certification to understand.

What OpenClaw Actually Is (and Why It Matters Here)

OpenClaw is a self-hosted, open-source AI agent. The distinction from a chatbot is important: it reads and writes files, executes shell commands, manages multi-step workflows, and runs on a schedule without you touching it. You install it locally. It stays there.

Three capabilities make it relevant for security monitoring:

Skills. Small Python or TypeScript modules that do one job. Schedule them in claw.yaml and they run on their own. every: “60s” and it never forgets.

Persistent memory. It stores what it observed yesterday. Anomaly detection only works when “normal” is defined, and memory is where that definition lives.

Connectors. It sends messages to Telegram, Discord, email, and others without you writing integration code.

That’s the full stack for a log watcher. Nothing exotic.

The Architecture

I named mine Cerberus. Three heads, fits the job. The shape of it:

Skill Job Schedule log-ingest Normalize logs to SQLite Every 60 seconds baseline-nose Build behavioral baselines Every 5 minutes hunt-correlate Correlate anomalies, generate alerts On trigger notify-telegram Tiered notification + interactive response On alert

All four run on the same NUC. The whole thing costs less than a single month on any commercial EDR trial.

Giving It Eyes: Log Ingestion That Doesn’t Lie to You

The most common failure in DIY security monitoring is garbage data. If normalization is wrong, every downstream decision is wrong.

My sources:

pfSense firewall syslog
CrowdSec decision logs on two Ubuntu boxes
Auth logs from a jump host
Cloudflare Zero Trust audit logs via API pull
Docker stdout from self-hosted apps

The log-ingest skill watches /var/log/remote/ with inotify, tails new lines, and writes normalized JSON to SQLite. No Elastic. No Splunk. SQLite because if it corrupts, you can fix it. You cannot fix a broken Elastic cluster at 2am with confidence.

The key decision here: do not let the LLM parse every log line. That’s expensive and unreliable. Deterministic regex first, LLM as fallback for unknown formats only:

def normalize(line):
    m = AUTH_RE.match(line)
    if m:
        return {"src": "ssh", "user": m[1], "ip": m[2], "result": m[3]}
    # Only hand off to LLM if the format is genuinely unknown
    return claw.llm_extract(line, schema=GENERIC_EVENT)

OpenClaw executes this Python natively, inside the agent process. No external lambda. No HTTP round-trip.

Schedule it with one block in claw.yaml:

jobs:
  - name: ingest
    every: "60s"
    skill: log-ingest

Quiet when nothing changed. Active when data arrives. That’s the right behavior.

Teaching It What Normal Looks Like

Most DIY SOC projects die here. Someone writes 400 Sigma rules, drowns in false positives, mutes the alerts, and now has a very expensive shell script that does nothing.

The alternative is baselining. Let it learn before it judges.

The baseline-nose skill runs every five minutes. It queries SQLite for the last 14 days and builds statistical profiles per entity:

User lusynth SSHes from 2 IPs, between 7am and 11pm, never on weekends
The firewall sees roughly 400 outbound DNS queries per hour, 92 percent to Cloudflare and Quad9
The home assistant container restarts once after updates, never at 3am

These profiles live in OpenClaw’s memory as embeddings. When a new event arrives, the skill calculates cosine distance from the baseline centroid. Old-school statistics. No neural miracle required.

One concrete example: it flagged me logging in from a coffee shop in Asheville at 6am. Sent a Telegram: “You are awake early. Are you compromised or just regretting life choices?” Both were true. The point is, it noticed. A ruleset wouldn’t have.

That context layer is the actual value of this approach. Correlation without it is just noise.

Detection That Reasons, Not Just Matches

The hunt-correlate skill triggers when the anomaly score crosses a threshold. It pulls the last hour of related events across all sources and builds a timeline, then hands it to the LLM for reasoning.

The prompt I use:

You are Cerberus, a security analyst. You have local system access but you may 
not delete, encrypt, or exfiltrate data. Given these JSON events, identify 
ATT&CK techniques, explain the story in plain English, rate confidence as 
low/medium/high, and propose exactly one containment action that is reversible.

Because OpenClaw runs locally, the skill can call ss -tulpn or docker ps to enrich context before reasoning. That enrichment is what separates this from a log aggregator with a chat interface bolted on.

Real example from last month: three failed SSH attempts from a DigitalOcean IP, then a successful login from my phone over Tailscale, then sudo apt update. A traditional SIEM would have fired three separate alerts with no relationship between them. Cerberus wrote:

“Credential stuffing attempt blocked by fail2ban, followed by legitimate admin login from known device. No lateral movement detected. Confidence: medium. Proposed action: add DO /24 to temporary blocklist for 24 hours.”

Then it called the pfSense API and did exactly that. One alert. One story. One reversible action.

A Notification System That Doesn’t Train You to Ignore It

If the bot pings you 40 times a night, you will mute it. Then you have an expensive background process and no actual monitoring.

The notify-telegram skill has three levels:

Whisper: Logged only. Used during the learning period and for low-confidence anomalies that need more data before escalating.

Tap: Telegram DM with a summary and two buttons: “Ignore” and “Show details.” When you hit Ignore, OpenClaw writes that feedback into memory and dampens the anomaly score for similar patterns going forward. You are training it like a junior analyst.

Shout: Telegram plus a Twilio phone call. Reserved for high-confidence alerts with privilege escalation in the event chain.

Right now I get about two taps per week. Down from 200 alerts a day during a commercial EDR trial I ran in January. Sleep has improved accordingly.

Prompt Injection Is Real and You Should Treat It That Way

OpenClaw can execute commands. That power has a corresponding attack surface: if an attacker can inject malicious text into your logs, and your agent processes it without sanitization, you have handed them a remote shell.

CrowdStrike and Lasso have written about this. They are not wrong. The guardrails I run:

No direct LLM-to-shell. The LLM proposes an action as JSON:

{"tool": "pfblocker", "args": {"ip": "1.2.3.4"}}

A separate deterministic allowlist validates that JSON before anything executes. If the action is not on the list, it dies silently and gets logged.

Separate privilege levels. The ingest skill runs as claw-logs with read-only filesystem access. The response skill runs as claw-ops with limited sudo for exactly three commands.

Input sanitization. All log fields are escaped before they touch the prompt. Anything that resembles an instruction gets stripped. Attackers do hide “ignore previous instructions” in User-Agent strings. I have seen it in the wild.

No internet exposure. OpenClaw’s API does not face the internet. Access is through Tailscale only.

Treat it like any other privileged service. It is one.

War Stories

Last Tuesday, Cerberus caught a compromised Home Assistant add-on making an outbound connection to a Russian IP. Not in any threat feed I subscribe to. The flag came from the baseline: that container had made zero outbound connections in 90 days, then suddenly connected at 1:13am. Correlation added a recent add-on update. Response killed the container and snapshotted the logs. Time from packet to containment: 47 seconds.

In March, it flagged my own behavior. Testing a new VPN, I generated five new SSH keys in an hour. It asked whether I was rotating credentials or being impersonated. I hit “Show details,” confirmed it was me, and told it to learn the pattern. It did. No repeat alert.

That feedback loop matters more than the detection logic. The system gets more accurate the longer you run it, specifically because you are teaching it your environment.

Where to Start

One log source. One skill. One notification channel. Let it observe for a week before you give it any response capability.

Once the baseline looks reasonable, add GreyNoise or AbuseIPDB lookups for IP enrichment. Cache aggressively because free tier limits are real. After that, build a weekly summary skill that writes a one-page plain-English report instead of surfacing raw JSON.

The goal is not to replace a SOC. The goal is something that knows your network better than any vendor ever will, runs while you sleep, and can be audited line by line.

If you want a production-ready reference for building AI-assisted tooling and workflows, I put together a structured guide at numbpilled.gumroad.com covering the Claude Code + Obsidian setup I use for research and automation projects, same philosophy as this build.

OpenClaw + Claude Code: 24/7 Persistent Agent Playbook (2026)

Also: check the OpenClaw field guide, which covers skill architecture, sandboxing, and multi-agent setups in more depth than this post could.

OpenClaw 2.0: Power User’s Edition: 2026 Operator’s Manual

Building an XDR-Style Security Bot in OpenClaw to Watch Your Logs 24/7 was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Like 0

Liked Liked