LLM Alignment Should Go Beyond Harmlessness–Helpfulness and Incorporate Human Agency
Large language models (LLMs) are transforming communication, research, and decision-making, but misalignment (when models diverge from human values, safety requirements, or user intent) poses serious risks. In this position paper, we argue that many alignment failures stem from operational choices made during training and deployment. We posit that alignment should shift from static, post-training constraints toward dynamic, participatory approaches that safeguard pluralism, autonomy, and human flourishing. We outline forward-looking directions, including pluralistic evaluation, transparency, and the Flourishing–Justice–Autonomy (FJA) framework, and present a roadmap for advancing alignment research and practice.