Hey r/MachineLearning,
I’ve been working on an MCP-powered “AI Research Engineer” and wanted to share it here for feedback and ideas.
GitHub: https://github.com/prabureddy/ai-research-agent-mcp
If it looks useful, a ⭐ on the repo really helps more MCP builders find it.
What it does
You give it a single high-level task like:
“Compare electric scooters vs bikes for my commute and prototype a savings calculator”
The agent then autonomously:
- researches the web for relevant data
- queries your personal knowledge base (notes/papers/docs) via RAG
- writes and executes Python code (models, simulations, visualizations) in a sandbox
- generates a structured research run: report, charts, code, data, sources
- self-evaluates the run with quality metrics (clarity, grounding, completeness, etc.)
It’s built specifically around MCP so you can run everything from Claude Desktop (or another MCP client) with minimal setup.
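To make the "structured research run" idea concrete, here is a minimal sketch of a per-run folder layout. It is not taken from the repo: the folder names, file names, and the `create_run_workspace` helper are assumptions for illustration only.

```python
import json
import tempfile
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical layout: these names are illustrative assumptions,
# not necessarily what the repo uses.
RUN_SUBDIRS = ("charts", "code", "data")


def create_run_workspace(root: Path, task: str) -> Path:
    """Create one folder per research run and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # A short random suffix avoids collisions between runs started in the same second.
    run_dir = root / f"run_{stamp}_{uuid.uuid4().hex[:8]}"
    run_dir.mkdir(parents=True, exist_ok=False)
    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir()
    # Placeholder artifacts that later steps would fill in.
    (run_dir / "report.md").write_text(f"# Research run\n\nTask: {task}\n", encoding="utf-8")
    (run_dir / "sources.json").write_text(json.dumps([], indent=2), encoding="utf-8")
    (run_dir / "evaluation.json").write_text(json.dumps({}, indent=2), encoding="utf-8")
    return run_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run = create_run_workspace(Path(tmp), "Compare scooters vs bikes")
        print(sorted(p.name for p in run.iterdir()))
```

Keeping every artifact of a run under one directory makes it easy to zip, diff, or re-evaluate a run later.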
Tech / architecture
MCP server in Python 3.10+
Tools:
- web_research: DuckDuckGo/Brave + scraping + content extraction
- rag_tool: local embeddings + ChromaDB over a knowledge_base directory
- code_sandbox: restricted Python execution with time/memory limits
- workspace: organizes each research run into its own folder (report, charts, code, data, evaluation)
- evaluator: simple self-critique + quality metrics per run
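For anyone who wants to poke at the code_sandbox idea, here is one common pattern for time/memory-limited execution, sketched with the standard library only. It is an assumption about how such a tool could work, not the repo's implementation; `run_limited` and its parameters are hypothetical names. Note that resource limits alone are not isolation: this does nothing to restrict filesystem or network access.

```python
import subprocess
import sys

try:
    import resource  # POSIX only; unavailable on Windows
except ImportError:
    resource = None


def run_limited(code: str, timeout_s: float = 5.0, max_mem_mb: int = 512) -> dict:
    """Run `code` in a separate Python process with wall-clock, CPU, and memory caps."""

    def _apply_limits() -> None:
        # Runs in the child process just before the interpreter starts.
        # Best effort: some platforms reject or ignore these limits.
        try:
            mem = max_mem_mb * 1024 * 1024
            resource.setrlimit(resource.RLIMIT_AS, (mem, mem))
            cpu = int(timeout_s) + 1
            resource.setrlimit(resource.RLIMIT_CPU, (cpu, cpu))
        except (ValueError, OSError):
            pass

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores env vars and user site
            capture_output=True,
            text=True,
            timeout=timeout_s,  # wall-clock limit; the child is killed on expiry
            preexec_fn=_apply_limits if resource is not None else None,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "timed_out": True, "stdout": "", "stderr": "", "returncode": None}
    return {
        "ok": proc.returncode == 0,
        "timed_out": False,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "returncode": proc.returncode,
    }


if __name__ == "__main__":
    print(run_limited("print(1 + 1)"))
    print(run_limited("while True: pass", timeout_s=1.0))
```

A separate process keeps a runaway script from taking down the MCP server itself. For stronger isolation, a container or VM boundary around the child process is the usual next step.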
RAG uses local sentence-transformers by default, so you can get started without external embedding APIs.
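To illustrate the chunk, embed, rank flow behind a RAG tool like this, here is a dependency-free sketch. It deliberately swaps the sentence-transformers embeddings and ChromaDB for a toy term-count vector and an in-memory list, so it shows the shape of the pipeline rather than the project's actual retrieval quality. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return scored[:k]


if __name__ == "__main__":
    notes = [
        "solar panels reduce electricity bills",
        "electric scooters are cheap to run",
        "bikes need little maintenance",
    ]
    print(top_k("solar panels payback", notes, k=1))
```

With a real embedding model, only `embed` and the similarity search change; the chunking and ranking structure stays the same.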
5–10 min setup: clone → install → add MCP config to Claude Desktop → restart.
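For the "add MCP config" step, Claude Desktop reads server entries from the `mcpServers` key of its `claude_desktop_config.json`. The sketch below builds such an entry in Python; the server name, the `python` command, and the script path are placeholders I made up, so check the repo's README for the real values.

```python
import json
from typing import Optional


def build_mcp_config(server_name: str, command: str, args: list[str],
                     env: Optional[dict[str, str]] = None) -> dict:
    """Build a config fragment with one server entry under "mcpServers"."""
    entry: dict = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    return {"mcpServers": {server_name: entry}}


if __name__ == "__main__":
    # Placeholder values: replace with the actual entry point from the repo.
    config = build_mcp_config(
        "ai-research-agent",
        "python",
        ["/absolute/path/to/ai-research-agent-mcp/server.py"],
    )
    print(json.dumps(config, indent=2))
```

If you already have other servers configured, merge the new entry into the existing `mcpServers` object instead of overwriting the file.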
Example flows
- “Deep dive: current state of EVs in 2026. Include market size, major players, growth trends, and a chart of adoption over time.”
- “Use my notes in knowledge_base plus web search to analyze whether solar panels are worth it for a home in California. Build a payback-period model and visualize cashflows.”
- “Use web_research + RAG + code execution to build a small cost-of-ownership calculator for my commute.”
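As a rough idea of the kind of code the agent might write for the payback-period flow, here is a minimal sketch. The function names are hypothetical and the numbers in the usage example are made up for illustration, not real solar pricing.

```python
from typing import Optional


def cumulative_cashflow(upfront_cost: float, yearly_savings: list[float]) -> list[float]:
    """Running balance: starts at -upfront_cost, then adds each year's savings."""
    balance = -upfront_cost
    series = [balance]
    for saving in yearly_savings:
        balance += saving
        series.append(balance)
    return series


def payback_year(series: list[float]) -> Optional[float]:
    """Fractional year at which the balance first reaches zero, or None if it never does.

    Assumes savings accrue evenly within each year (linear interpolation).
    """
    if series and series[0] >= 0:
        return 0.0
    for year in range(1, len(series)):
        prev, cur = series[year - 1], series[year]
        if prev < 0 <= cur:
            return (year - 1) + (-prev) / (cur - prev)
    return None


if __name__ == "__main__":
    # Hypothetical inputs: 12,000 upfront, 1,500 saved per year for 25 years.
    series = cumulative_cashflow(12_000, [1_500] * 25)
    print(payback_year(series))  # → 8.0
```

The `series` list is also what you would plot to visualize cashflows over time.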
Why I’m posting here
I’d really appreciate feedback from this community on:
MCP design:
- Do the tool surface and boundaries make sense for MCP?
- Anything you’d change about how web_research / rag_tool / code_sandbox are exposed?
Safety & sandboxing:
- Are there better patterns you’ve used for constrained code execution behind MCP?
- Any obvious gotchas I’m missing around resource limits or isolation?
RAG + research UX:
- Suggestions for better chunking/query strategies in this “research agent” context?
- Patterns you’ve used to keep the agent grounded in sources while still being autonomous?
Extensibility:
- Other tools you’d add to a “research engineer” server (data connectors, notebooks, schedulers, etc.)?
- Thoughts on integrating with other MCP clients beyond Claude Desktop / Cursor?
If you have time to glance at the repo and tear it apart, I’d love to hear what you think. Happy to answer implementation questions or discuss MCP patterns in more detail.
If you end up trying it and think it’s useful, please consider dropping a ⭐ on the GitHub repo and sharing any ideas/issues there as well.
Thanks!
MCP-Powered AI Research Engineer
https://preview.redd.it/kwh5dbntczhg1.png?width=1074&format=png&auto=webp&s=2c7729e95890dce291ad8e635feca5a2805583b2
https://preview.redd.it/4e0nlantczhg1.png?width=1076&format=png&auto=webp&s=f1e3f3eabe67ff887c8ca994f0090c74989621f6
https://preview.redd.it/zx4v3puuczhg1.png?width=4168&format=png&auto=webp&s=f798447d3b5bf5510400b832af96161488c4e25c
https://preview.redd.it/bmec8quuczhg1.png?width=3702&format=png&auto=webp&s=6a8fe3d1c47a464c6f733cfa4c2463d25ccd5d5b
https://preview.redd.it/3zv5hnuuczhg1.png?width=3568&format=png&auto=webp&s=162f410cc6edd2b46bd1c0a8f36a7e4a0afb9e12