TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces
arXiv:2603.00623v1 Announce Type: new Abstract: Agentic systems augment large language models with external tools and iterative decision making, enabling complex tasks such as deep research, function calling, and coding. However, their long and intricate execution traces make failure diagnosis and root cause analysis extremely challenging. Manual inspection does not scale, while directly applying LLMs to raw traces is hindered by input length limits and unreliable reasoning. Focusing solely on final task outcomes further discards critical behavioral information required […]