RubricBench Exposes a Big Flaw in AI Grading

RubricBench measures how far AI-generated grading rubrics drift from human standards—and shows why automated evaluation can misfire.
