Fairness Calibration in Credit Scoring via Counterfactual Perturbation and Group-Wise Regularization

This study develops a fairness-calibrated credit-scoring method that combines counterfactual perturbation with group-wise regularization. Using 3.1 million credit files with demographic annotations, the method evaluates whether score outputs remain stable when protected attributes are counterfactually perturbed within a causal graph. A fairness-adjusted gradient-boosting model is then trained with penalties on group-level prediction disparities. The final model reduces demographic disparity in predicted default probability from 0.112 to 0.034 while maintaining an ROC-AUC of 0.89. Counterfactual-stability checks show that 94.6% of predictions remain invariant under perturbation. These results demonstrate that fairness calibration can be achieved with minimal loss of predictive power.
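The group-wise regularization step can be illustrated with a short sketch. The snippet below shows one plausible realization, assuming an XGBoost-style custom objective that adds a squared group-disparity penalty to the binary log-loss; the names (`make_fair_objective`, `lam`, `group`) and the synthetic data are illustrative assumptions, not the paper's implementation, and the penalty's second-order term is dropped from the Hessian for simplicity.

```python
# Hedged sketch: gradient boosting with a group-disparity penalty,
# i.e. loss = log-loss + lam * (mean_p[group 0] - mean_p[group 1])^2.
# All names here are hypothetical; the paper's actual code may differ.
import numpy as np
import xgboost as xgb

def make_fair_objective(group, lam=1.0):
    """Return an XGBoost custom objective penalizing group disparity.

    `group` is a 0/1 array marking the protected group for each
    training row, aligned with the rows of the training DMatrix.
    """
    g0 = group == 0
    g1 = group == 1
    n0, n1 = g0.sum(), g1.sum()

    def objective(raw, dtrain):
        y = dtrain.get_label()
        p = 1.0 / (1.0 + np.exp(-raw))        # sigmoid of raw margin
        gap = p[g0].mean() - p[g1].mean()     # group-level score disparity
        sign = np.where(g0, 1.0 / n0, -1.0 / n1)
        # Gradient: log-loss term plus the chain-ruled disparity penalty.
        grad = (p - y) + 2.0 * lam * gap * sign * p * (1.0 - p)
        # Hessian: log-loss curvature only; the penalty's second-order
        # term is omitted (a common simplification).
        hess = p * (1.0 - p)
        return grad, hess

    return objective

# Hypothetical usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
group = rng.integers(0, 2, size=1000)
y = (X[:, 0] + 0.5 * group + rng.normal(size=1000) > 0).astype(float)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3, "eta": 0.1},
                    dtrain, num_boost_round=50,
                    obj=make_fair_objective(group, lam=5.0))
```

Raising `lam` trades predictive fit for a smaller gap between group mean scores, which mirrors the disparity-versus-AUC trade-off the abstract reports.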
