Optimizing Learning-to-Rank Models for Ex-Post Fair Relevance
Learning-to-rank (LTR) models rank items based on specific features, aiming to maximize ranking utility by prioritizing highly relevant items. However, optimizing only for ranking utility can lead to representational harm and may fail to address implicit bias in relevance scores. Prior studies introduced algorithms to train stochastic ranking models, such as the Plackett-Luce ranking model, that maximize expected ranking utility while achieving fairness in expectation (ex-ante fairness). Even so, an individual sampled ranking is not guaranteed to satisfy group fairness (ex-post fairness). Post-processing methods can enforce ex-post fairness, but the LTR model is unaware of this step during training, which creates a mismatch between the objective the LTR model optimizes and the one it is supposed to optimize. In this paper, we first propose a novel objective in which the relevance (or the expected ranking utility) is computed over only those rankings that satisfy given representation constraints for groups of items. We call this the ex-post fair relevance. We then give a framework for training Group-Fair LTR models to maximize our proposed ranking objective.
Leveraging an efficient sampler for ex-post group-fair rankings and efficient algorithms to train the Plackett-Luce LTR model, we demonstrate their use in training the Group-Fair Plackett-Luce model in our framework. Experiments on the MovieLens and Kiva datasets show that our Group-Fair Plackett-Luce model improves both fairness and relevance compared to post-processing. In scenarios with implicit bias, our algorithm generally outperforms existing LTR baselines in both fairness and relevance.
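
To make the proposed objective concrete, the following minimal sketch estimates an ex-post fair relevance by Monte Carlo: rankings are drawn from a Plackett-Luce model, and the utility is averaged over only those rankings that satisfy the group constraints. Several choices here are assumptions for illustration and are not taken from the paper: the constraints are modeled as per-group lower and upper bounds on counts in the top k, the utility is DCG at k, and naive rejection sampling stands in for the efficient group-fair sampler mentioned above. All function names are hypothetical.

```python
import math
import random


def sample_pl_ranking(scores, rng):
    """Draw one ranking from a Plackett-Luce model with log-weights `scores`.

    Uses the Gumbel-max trick: add i.i.d. Gumbel(0, 1) noise to each score
    and sort items by the perturbed score in descending order.
    """
    keys = []
    for s in scores:
        u = rng.random()
        while u == 0.0:  # avoid log(0); rng.random() lies in [0, 1)
            u = rng.random()
        keys.append(s - math.log(-math.log(u)))
    return sorted(range(len(scores)), key=lambda i: -keys[i])


def satisfies_constraints(ranking, groups, k, lower, upper):
    """Check assumed representation constraints on the top-k positions.

    `groups[i]` is the integer group id of item i; `lower[g]` and `upper[g]`
    bound how many items of group g may appear in the top k.
    """
    counts = [0] * len(lower)
    for item in ranking[:k]:
        counts[groups[item]] += 1
    return all(lo <= c <= hi for lo, c, hi in zip(lower, counts, upper))


def dcg_at_k(ranking, relevance, k):
    """Discounted cumulative gain of the top-k items (assumed utility)."""
    return sum(
        relevance[item] / math.log2(pos + 2)
        for pos, item in enumerate(ranking[:k])
    )


def ex_post_fair_relevance(scores, relevance, groups, k, lower, upper,
                           n_samples=2000, seed=0):
    """Average DCG@k over sampled rankings that satisfy the constraints.

    Rejection sampling is an illustrative stand-in, not the paper's sampler.
    Returns (estimated relevance, fraction of samples accepted).
    """
    rng = random.Random(seed)
    utilities = []
    for _ in range(n_samples):
        ranking = sample_pl_ranking(scores, rng)
        if satisfies_constraints(ranking, groups, k, lower, upper):
            utilities.append(dcg_at_k(ranking, relevance, k))
    if not utilities:
        raise ValueError("no sampled ranking satisfied the constraints")
    return sum(utilities) / len(utilities), len(utilities) / n_samples


if __name__ == "__main__":
    scores = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5]   # model scores (log-weights)
    relevance = [3, 2, 2, 1, 1, 0]             # true relevance labels
    groups = [0, 0, 0, 1, 1, 1]                # two groups of three items
    value, accepted = ex_post_fair_relevance(
        scores, relevance, groups, k=4, lower=[1, 1], upper=[3, 3]
    )
    print(f"estimated ex-post fair relevance: {value:.3f}")
    print(f"fraction of samples accepted: {accepted:.3f}")
```

Rejection sampling becomes wasteful when few rankings satisfy the constraints, which is one reason a dedicated sampler for group-fair rankings is preferable in practice. The estimate above is also not differentiable as written; training a model against such an objective would require a gradient estimator, which this sketch does not attempt.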