Evaluating LLMs as Human Surrogates in Controlled Experiments
arXiv:2604.15329v1 Announce Type: new

Abstract: Large language models (LLMs) are increasingly used to simulate human responses in behavioral research, yet it remains unclear when LLM-generated data support the same experimental inferences as human data. We evaluate this by directly comparing off-the-shelf LLM-generated responses with human responses from a canonical survey experiment on accuracy perception. Each human observation is converted into a structured prompt, and models generate a single 0–10 outcome variable without task-specific training; identical statistical analyses are […]