[D][Showcase] MCP-powered Autonomous AI Research Engineer (Claude Desktop, Code Execution)

digitado ⋅ February 7, 2026

I’ve been working on an MCP-powered “AI Research Engineer” and wanted to share it here for feedback and ideas.

GitHub: https://github.com/prabureddy/ai-research-agent-mcp
If it looks useful, a ⭐ on the repo really helps more MCP builders find it.

What it does

You give it a single high-level task like:

“Compare electric scooters vs bikes for my commute and prototype a savings calculator”

The agent then autonomously:

researches the web for relevant data
queries your personal knowledge base (notes/papers/docs) via RAG
writes and executes Python code (models, simulations, visualizations) in a sandbox
generates a structured research run: report, charts, code, data, sources
self-evaluates the run with quality metrics (clarity, grounding, completeness, etc.)
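
For a concrete picture of a "structured research run", here is a minimal sketch of how one run could be laid out on disk. The file names (report.md, sources.json, evaluation.json), the timestamped folder name, and the create_run helper are all illustrative assumptions, not necessarily what the repo does; only the metric names come from the list above.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical layout: one folder per run, with subfolders for generated artifacts.
RUN_SUBDIRS = ("charts", "code", "data")
METRICS = ("clarity", "grounding", "completeness")


def create_run(root: Path, task: str) -> Path:
    """Create an empty run folder for `task` under `root` and return its path."""
    # Turn the task text into a short, filesystem-safe slug.
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")[:40]
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_dir = root / f"{stamp}-{slug}"

    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)

    # Placeholder files that later steps (report writer, evaluator) would fill in.
    (run_dir / "report.md").write_text(f"# {task}\n", encoding="utf-8")
    (run_dir / "sources.json").write_text("[]", encoding="utf-8")
    scores = {metric: None for metric in METRICS}
    (run_dir / "evaluation.json").write_text(json.dumps(scores, indent=2), encoding="utf-8")
    return run_dir
```

Calling create_run(Path("runs"), "Compare electric scooters vs bikes") would give one self-contained folder per task, which keeps reports, charts, and evaluation scores from different runs separate.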

It’s built specifically around MCP so you can run everything from Claude Desktop (or another MCP client) with minimal setup.

Tech / architecture

MCP server in Python 3.10+

Tools:

web_research: DuckDuckGo/Brave + scraping + content extraction
rag_tool: local embeddings + ChromaDB over a knowledge_base directory
code_sandbox: restricted Python execution with time/memory limits
workspace: organizes each research run into its own folder (report, charts, code, data, evaluation)
evaluator: simple self-critique + quality metrics per run
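
To make the code_sandbox idea concrete, here is a rough sketch of time- and memory-limited execution using only the standard library. It is an assumption about the general pattern, not the repo's actual implementation: run_sandboxed and its parameters are hypothetical names, it only works on POSIX systems, and it does not restrict filesystem or network access, so it is resource limiting rather than real isolation.

```python
import resource
import subprocess
import sys


def run_sandboxed(code: str, timeout_s: int = 5, memory_mb: int = 512) -> dict:
    """Run `code` in a separate Python process with time and memory caps (POSIX only)."""

    def apply_limits() -> None:
        # Executed in the child process just before the interpreter starts.
        limit_bytes = memory_mb * 1024 * 1024
        for kind, value in (
            (resource.RLIMIT_AS, limit_bytes),  # cap on virtual address space
            (resource.RLIMIT_CPU, timeout_s),   # cap on CPU seconds
        ):
            try:
                resource.setrlimit(kind, (value, value))
            except (ValueError, OSError):
                # Some platforms refuse certain limits; treat them as best effort.
                pass

    try:
        proc = subprocess.run(
            # -I runs Python in isolated mode (ignores environment variables and user site-packages).
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # wall-clock limit enforced by the parent
            preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}

    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

For example, run_sandboxed("print(2 + 2)") returns the captured output, while an infinite loop is stopped by either the CPU limit or the wall-clock timeout. Stronger isolation would need something like a container or a dedicated sandboxing layer on top.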

RAG uses local sentence-transformers by default, so you can get started without external embedding APIs.
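
To illustrate the retrieval flow (chunk, embed, rank by similarity) without any dependencies, here is a toy sketch. It deliberately swaps in a bag-of-words term-count vector where the real tool uses sentence-transformers embeddings and ChromaDB, so treat it as a stand-in for the shape of the pipeline only; chunk_text, embed, and retrieve are hypothetical names.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words (requires size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase term counts. A real setup would use a sentence embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the top-k (score, chunk) pairs ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The overlap between chunks is there so a sentence that straddles a chunk boundary still appears whole in at least one chunk; good values for size and overlap depend on the documents in knowledge_base.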

5–10 min setup: clone → install → add MCP config to Claude Desktop → restart.
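
For the "add MCP config" step, Claude Desktop reads server definitions from the mcpServers section of its claude_desktop_config.json. The snippet below is a small sketch that prints an entry in that shape. The server name, interpreter path, and server.py entry point are placeholders I am assuming for illustration, so check the repo README for the real values.

```python
import json


def claude_desktop_entry(server_name: str, python_path: str, server_script: str) -> dict:
    """Build an mcpServers entry that launches the server as a local subprocess."""
    return {
        "mcpServers": {
            server_name: {
                "command": python_path,   # interpreter from the project's virtualenv
                "args": [server_script],  # hypothetical entry point; adjust to the repo
            }
        }
    }


if __name__ == "__main__":
    entry = claude_desktop_entry(
        "ai-research-agent",                        # placeholder name
        "/path/to/venv/bin/python",                 # placeholder path
        "/path/to/ai-research-agent-mcp/server.py", # placeholder path
    )
    print(json.dumps(entry, indent=2))
```

The printed JSON can be merged into the existing config file; Claude Desktop picks up the new server after a restart.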

Example flows

“Deep dive: current state of EVs in 2026. Include market size, major players, growth trends, and a chart of adoption over time.”
“Use my notes in knowledge_base plus web search to analyze whether solar panels are worth it for a home in California. Build a payback-period model and visualize cashflows.”
“Use web_research + RAG + code execution to build a small cost-of-ownership calculator for my commute.”

Why I’m posting here

I’d really appreciate feedback from this community on:

MCP design:

Do the tool surface and boundaries make sense for MCP?
Anything you’d change about how web_research / rag_tool / code_sandbox are exposed?

Safety & sandboxing:

Are there better patterns you’ve used for constrained code execution behind MCP?
Any obvious gotchas I’m missing around resource limits or isolation?

RAG + research UX:

Suggestions for better chunking/query strategies in this “research agent” context?
Patterns you’ve used to keep the agent grounded in sources while still being autonomous?

Extensibility:

Other tools you’d add to a “research engineer” server (data connectors, notebooks, schedulers, etc.)?
Thoughts on integrating with other MCP clients beyond Claude Desktop / Cursor?

If you have time to glance at the repo and tear it apart, I’d love to hear what you think. Happy to answer implementation questions or discuss MCP patterns in more detail.

If you end up trying it and think it’s useful, please consider dropping a ⭐ on the GitHub repo and sharing any ideas/issues there as well.

Thanks!

MCP-Powered AI Research Engineer

https://preview.redd.it/kwh5dbntczhg1.png?width=1074&format=png&auto=webp&s=2c7729e95890dce291ad8e635feca5a2805583b2

https://preview.redd.it/4e0nlantczhg1.png?width=1076&format=png&auto=webp&s=f1e3f3eabe67ff887c8ca994f0090c74989621f6

https://preview.redd.it/zx4v3puuczhg1.png?width=4168&format=png&auto=webp&s=f798447d3b5bf5510400b832af96161488c4e25c

https://preview.redd.it/bmec8quuczhg1.png?width=3702&format=png&auto=webp&s=6a8fe3d1c47a464c6f733cfa4c2463d25ccd5d5b

https://preview.redd.it/3zv5hnuuczhg1.png?width=3568&format=png&auto=webp&s=162f410cc6edd2b46bd1c0a8f36a7e4a0afb9e12

submitted by /u/Kooky-Second2410