From Bias Mitigation to Bias Negotiation: Governing Identity and Sociocultural Reasoning in Generative AI
arXiv:2602.18459v1 Announce Type: new
Abstract: LLMs act in the social world by drawing upon shared cultural patterns to make social situations understandable and actionable. Because identity is often part of the inferential substrate of competent judgment, ethical alignment requires regulating when and how systems invoke identity. Yet the dominant governance regime for identity-related harm remains bias mitigation, which treats identity primarily as a source of measurable disparities or harmful associations to be detected and suppressed. This leaves underspecified a positive, context-sensitive role for identity in interpretation. We call this governance problem bias negotiation: the normative regulation of identity-conditioned judgments of sociocultural relevance, inference, and justification. Empirically, we probe the feasibility of bias negotiation through semi-structured interviews with multiple publicly deployed chatbots. We identify recurring repertoires for negotiating identity, including probabilistic framing of group tendencies and harm-value balancing. We also observe failure modes in which models avoid hard tradeoffs or apply principles inconsistently. Bias negotiation matters for justice because a positive role for sociocultural reasoning is required to recognize and potentially remediate structural inequities. It is equally implicated in core model functionality, because sociocultural competence is needed for systems that operate across heterogeneous cultural contexts. Because bias negotiation is a procedural capability expressed through deliberation and interaction, it cannot be validated by static benchmarks alone. To support targeted training, we introduce a broad but explicit framework that decomposes bias negotiation into an action space of negotiation moves (what to observe and score) and a complementary set of case features (over which the model negotiates), enabling systematic test-suite design and evaluation.