[D] antaris-suite 3.0 (open source, free) — zero-dependency agent memory, guard, routing, and context management (benchmarks + 3-model code review inside)
So, I picked up vibe coding back in early 2025 when I was trying to learn how to make indexed chatbots and fine-tuned Discord bots that mimic my friend's mannerisms. I discovered agentic coding when Claude Code was released and pretty much became an addict. It's all I did at night. Then I got into agents, and when ClawBot came out it was game over for me (or at least my time). So I built one and started using it to code pretty much exclusively, using Discord to communicate with it. I'm trying to find a way out of my current job and I'm hoping this opens up some pathways.

Well, the evening/early morning after Valentine's Day, when I was finally able to sneak away to my computer and build, I came back to a zombified agent and ended up losing far more progress from the evening before than I'd like to admit. (Turns out when you use Discord as your sole method of communication, exporting your entire chat history, or even just telling it to read back to a certain timestamp, works really well for recovering lost memory.) Anyways, I decided to look into ways to improve its memory, and stumbled across some Reddit posts and articles that seemed like a good place to start. I swapped my method from a standard markdown file, stored every 4 hours plus on command, to a style of indexing memories, with the idea of building in a decay system for the memories plus recall and search functions. (Nothing new in the space, but it was fun to learn myself.)

That's how my first project was born: Antaris-Memory. It indexes its memories based on priority and uses local sharded JSONL storage. When it needs to recall something, it uses BM25 and decay-weighted search, and narrows down the top 5-10 memories based on the context of the conversation. That was my first module. No RAG, no vector DB, just persistent file-based memory.
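To make the recall idea concrete, here's a minimal, stdlib-only sketch of what "BM25 plus decay-weighted search" can look like. This is my own illustration of the general technique, not antaris' actual code: the `Memory` class, `recall` function, and `half_life_days` parameter are all hypothetical names.

```python
# Hypothetical sketch of decay-weighted BM25 recall over in-memory entries.
# Names (Memory, recall, half_life_days) are illustrative, not the antaris API.
import math
import time

class Memory:
    def __init__(self, text, priority=1.0, ts=None):
        self.text = text
        self.priority = priority          # higher = more important
        self.ts = ts if ts is not None else time.time()
        self.tokens = text.lower().split()

def bm25_scores(query, memories, k1=1.5, b=0.75):
    """Classic Okapi BM25 relevance of each memory to the query."""
    N = len(memories)
    avgdl = sum(len(m.tokens) for m in memories) / N
    q_terms = query.lower().split()
    # document frequency of each query term across the corpus
    df = {t: sum(1 for m in memories if t in m.tokens) for t in q_terms}
    scores = []
    for m in memories:
        s = 0.0
        for t in q_terms:
            tf = m.tokens.count(t)
            if tf == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(m.tokens) / avgdl))
        scores.append(s)
    return scores

def recall(query, memories, top_k=5, half_life_days=30.0):
    """Rank by BM25 relevance * exponential age decay * stored priority."""
    now = time.time()
    ranked = []
    for m, s in zip(memories, bm25_scores(query, memories)):
        age_days = (now - m.ts) / 86400
        decay = 0.5 ** (age_days / half_life_days)   # halves every half-life
        ranked.append((s * decay * m.priority, m))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in ranked[:top_k]]
```

The point of multiplying rather than adding the decay term is that a stale memory can still win if it is a much better lexical match, but ties break toward recent, high-priority entries.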
Now I'm on v3.0 of antaris-suite: six Python packages that handle the infrastructure layer of an agent (memory, safety, routing, and context) using pipeline coordination and shared contracts. Zero external dependencies in the core packages. No pulling memories from the cloud, no using other LLMs to sort through them, no API keys, nothing. Which, it turns out, makes it insanely fast.

If you use OpenClaw: there's a native plugin.

**What each package actually does:**

**Antaris-Memory**
**Antaris-Guard**
**Antaris-Router**
**Antaris-Context**
**Antaris-Pipeline**
**Antaris-Contract**
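Since the packages coordinate "using pipeline coordination and shared contracts", here's a rough sketch of that shape: each stage reads and writes one shared dict (the "contract") and the pipeline just runs them in order. Every function and key below is a stand-in I made up for illustration, not the real antaris interfaces.

```python
# Illustrative pipeline-over-shared-contract sketch. Stage names mirror the
# package roles (guard, recall, context, router, ingest); the bodies and the
# "Turn" contract keys are invented stand-ins, not antaris' real APIs.
from typing import Callable, Dict, List

Turn = Dict[str, object]          # the shared contract each stage reads/writes
Stage = Callable[[Turn], Turn]

def guard(turn: Turn) -> Turn:
    # toy safety check: flag obviously destructive input before it goes further
    turn["blocked"] = "rm -rf" in str(turn["input"])
    return turn

def recall_stage(turn: Turn) -> Turn:
    # pull a handful of candidate memories unless the turn was blocked
    turn["memories"] = [] if turn["blocked"] else list(turn["store"])[:5]
    return turn

def context_stage(turn: Turn) -> Turn:
    turn["context"] = "\n".join(turn["memories"])
    return turn

def route(turn: Turn) -> Turn:
    # toy routing rule: small context -> small model
    turn["model"] = "small" if len(turn["context"]) < 200 else "large"
    return turn

def ingest(turn: Turn) -> Turn:
    if not turn["blocked"]:
        turn["store"].append(str(turn["input"]))
    return turn

def run_pipeline(stages: List[Stage], turn: Turn) -> Turn:
    for stage in stages:
        turn = stage(turn)
    return turn

PIPELINE = [guard, recall_stage, context_stage, route, ingest]
```

The appeal of this design is that every stage has the same signature, so reordering, disabling, or swapping a stage is a one-line change to the `PIPELINE` list.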
**Benchmarks (Mac Mini M4, 10-core, 32GB):**

The Antaris vs mem0 numbers are a direct head-to-head on the same machine with a live OpenAI API key: 50 synthetic entries, varying corpus sizes (50, 100, 100,000, 500,000, 1,000,000), 10 runs averaged. Letta and Zep were measured separately (different methodology; see footnotes). Even with a full pipeline turn of guard + recall + context + routing + ingest, the antaris number was measured at the 1,000-memory corpus. The mem0 figure = measured search p50 (193ms) + measured ingest per entry (312ms).

LangChain ConversationBufferMemory: it's fast because it's a list append + recency retrieval, not semantic search. At 1,000+ memories it dumps everything into context. Not equivalent functionality.

Zep Cloud was measured via cloud API from a DigitalOcean droplet (US-West region), so latency is network-inclusive. Letta self-hosted: Docker + Ollama (qwen2.5:1.5b + nomic-embed-text) on the same DigitalOcean droplet. Each ingest generates an embedding via Ollama. Not a local in-process comparison.

Benchmark scripts are in the repo. For the antaris vs mem0 numbers specifically, you can reproduce them yourself in about 60 seconds with those scripts.

**Engineering decisions worth noting:**

- Storage is plain JSONL shards + a WAL. Readable, portable, no lock-in. At 1M entries, bulk ingest runs at ~11,600 items/sec with near-flat scaling (after the bulk_ingest fix).

GitHub: https://github.com/Antaris-Analytics/antaris-suite
Website: https://antarisanalytics.ai/ (original README and the original idea for the architecture)

At the time we believed this to be a novel solution to the Agent Amnesia problem; since then we've discovered a lot of these ideas have been discussed before, though a good amount haven't, like our Dream State Processing.
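For readers curious what "plain JSONL shards + a WAL" means in practice, here's a minimal stdlib sketch of the pattern: every record is appended (and fsynced) to a write-ahead log first, then appended to a shard chosen by hashing the entry id. The file layout and class names are my guesses for illustration, not antaris' actual on-disk format.

```python
# Hedged sketch of the JSONL-shards-plus-WAL pattern. The layout below
# (wal.jsonl + shard-NNN.jsonl) is an assumption, not antaris' real format.
import hashlib
import json
import os

class JsonlStore:
    def __init__(self, root, num_shards=8):
        self.root = root
        self.num_shards = num_shards
        os.makedirs(root, exist_ok=True)
        self.wal = os.path.join(root, "wal.jsonl")

    def _shard_path(self, entry_id):
        # stable shard choice: hash the id so the same entry always lands
        # in the same file, keeping per-file size roughly even
        h = int(hashlib.sha1(entry_id.encode()).hexdigest(), 16)
        return os.path.join(self.root, f"shard-{h % self.num_shards:03d}.jsonl")

    def append(self, entry_id, record):
        line = json.dumps({"id": entry_id, **record})
        # 1. durability first: the WAL gets the record before the shard does,
        #    so a crash between the two writes can be replayed from the log
        with open(self.wal, "a") as f:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. then the shard, which readers actually search
        with open(self._shard_path(entry_id), "a") as f:
            f.write(line + "\n")

    def load_all(self):
        out = []
        for name in sorted(os.listdir(self.root)):
            if name.startswith("shard-"):
                with open(os.path.join(self.root, name)) as f:
                    out.extend(json.loads(l) for l in f if l.strip())
        return out
```

The "readable, portable, no lock-in" claim follows directly from this kind of layout: every file is line-delimited JSON you can grep, tail, or rsync with no special tooling.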
Happy to answer questions on architecture, the benchmark methodology, or anything that looks wrong. <3 Antaris

submitted by /u/fourbeersthepirates