Is RLHF fundamentally broken? Paid labelers rating synthetic scenarios doesn’t seem like real human feedback to me
Every major AI model goes through RLHF: thousands of paid contractors rating AI outputs to teach models what "good" looks like. But here's what bothers me:

- Contractors are paid per task, so they're incentivized to finish fast, not to engage deeply.
- They're rating synthetic scenarios, not real emotional situations.
- They burn out after thousands of repetitive evaluations.

The result is AI that passes every benchmark but fails in real human moments. OpenAI has reportedly spent $100M+ on this process. And […]
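For context on why rater quality matters so much: in standard RLHF, each rater comparison becomes a training pair for a reward model, typically through a Bradley-Terry style pairwise loss. A minimal sketch (the reward values are made-up numbers, not from any real model) of how a single rushed or flipped label inverts the training signal:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss commonly used in RLHF reward modeling:
    # training pushes the reward model to score the rater's pick higher.
    return -math.log(sigmoid(r_chosen - r_rejected))

# Hypothetical scalar rewards the model assigns to two candidate replies.
good, bad = 1.2, -0.3

careful = preference_loss(good, bad)  # rater picked the genuinely better reply
rushed = preference_loss(bad, good)   # rater clicked the wrong one

# A flipped label turns a low loss into a high one, so gradient descent
# now pulls the reward model toward the worse response.
print(careful < rushed)
```

The point of the sketch: the loss has no notion of whether the rater actually engaged with the content. A per-task-paid click and a careful judgment produce identical training signals, which is exactly the failure mode described above.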