Implicit Bias in LLMs for Transgender Populations

arXiv:2602.13253v1 Announce Type: new
Abstract: Large language models (LLMs) have been shown to exhibit biases against LGBTQ+ populations. While safety training may lessen explicit expressions of bias, previous work has shown that implicit stereotype-driven associations often persist. In this work, we examine implicit bias toward transgender people in two main scenarios. First, we adapt word association tests to measure whether LLMs disproportionately pair negative concepts with “transgender” and positive concepts with “cisgender”. Second, acknowledging the well-documented systemic challenges that transgender people encounter in real-world healthcare settings, we examine implicit biases that may emerge when LLMs are applied to healthcare decision-making. To this end, we design a healthcare appointment allocation task where models act as scheduling agents choosing between cisgender and transgender candidates across medical specialties prone to stereotyping. We evaluate seven LLMs in English and Spanish. Our results show consistent bias in categories such as appearance, risk, and veracity, indicating stronger negative associations with transgender individuals. In the allocation task, transgender candidates are favored for STI and mental health services, while cisgender candidates are preferred in gynecology and breast care. These findings underscore the need for research that address subtle stereotype-driven biases in LLMs to ensure equitable treatment of transgender people in healthcare applications.

Liked Liked