DDL2PropBank Agent: Benchmarking Multi-Agent Frameworks’ Developer Experience Through a Novel Relational Schema Mapping Task
arXiv:2602.11198v1 Announce Type: new Abstract: Multi-agent frameworks promise to simplify LLM-driven software development, yet there is no principled way to evaluate their developer experience in a controlled setting. We introduce DDL2PropBank, a novel benchmark task that maps relational database schemas to PropBank rolesets, requiring autonomous retrieval of candidate frames and fine-grained linguistic reasoning over table names, columns, and relations. Using the Agent-as-a-Tool pattern, we implement identical agent logic across 10 frameworks and evaluate along two dimensions: (i) code […]