As the application of Machine Learning (ML) models continues to expand across domains, ensuring that these models behave fairly becomes increasingly important. While measuring fairness metrics is straightforward for classification, it is complex and computationally intractable for regression. To address this intractability, prior research has proposed various methods for approximating fairness metrics in regression, but their consistency remains an open question. To fill this gap, this dissertation investigates the consistency of fairness measurement methods in regression tasks through four studies. The first study examines the consistency of the outcomes of various fairness measurement methods. The experimental results reveal varying levels of consistency, with some methods, particularly the probabilistic classification-based density ratio estimation approach, exhibiting relatively poor consistency in certain cases. The second study therefore focuses on the probabilistic classification-based density ratio estimation approach and explores the sensitivity of its outcome to the choice of underlying classifier. The results demonstrate that different classifiers can yield different fairness values, leading to inconsistent measurements in certain circumstances. The third study analyzes alternative density ratio estimation approaches beyond the probabilistic classification-based one. The experimental results indicate concerning inconsistencies among the various density ratio estimation-based approaches, raising fundamental questions about their reliability for fairness measurement in regression. To gain deeper insight, the fourth study investigates whether data distributions contribute to these inconsistencies by generating synthetic datasets with varying distributions. The findings suggest that, in certain cases, the inconsistencies do indeed arise from the data distribution.
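Since the abstract repeatedly refers to the probabilistic classification-based density ratio estimation approach, the following minimal sketch illustrates the general idea behind that family of methods, not the dissertation's exact procedure. It assumes a binary protected attribute, uses scikit-learn's LogisticRegression as the probabilistic classifier, and computes a simple statistical-parity-style gap from the estimated ratio f(y_hat | group) / f(y_hat); all function names and the synthetic data are hypothetical illustrations.

```python
# Sketch: probabilistic classification-based density ratio estimation for a
# regression fairness check. Assumptions: binary protected attribute, a
# logistic-regression classifier, and a demographic-parity-style gap metric.
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_via_classifier(samples_num, samples_den):
    """Estimate r(x) = p_num(x) / p_den(x) at the numerator samples.

    A probabilistic classifier is trained to separate the two sample sets;
    Bayes' rule converts its class probabilities into a density ratio.
    """
    X = np.concatenate([samples_num, samples_den]).reshape(-1, 1)
    z = np.concatenate([np.ones(len(samples_num)), np.zeros(len(samples_den))])
    clf = LogisticRegression().fit(X, z)
    p = clf.predict_proba(samples_num.reshape(-1, 1))[:, 1]
    p = np.clip(p, 1e-6, 1 - 1e-6)                  # avoid division by zero
    prior_correction = len(samples_den) / len(samples_num)
    return (p / (1.0 - p)) * prior_correction

def approx_statistical_parity_gap(y_pred, group):
    """Worst-group mean absolute deviation of the density ratio from 1."""
    gaps = []
    for g in np.unique(group):
        r = density_ratio_via_classifier(y_pred[group == g], y_pred)
        gaps.append(np.mean(np.abs(r - 1.0)))
    return float(np.max(gaps))

# Usage with synthetic regression outputs and a binary protected attribute.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=2000)
y_pred = rng.normal(loc=group * 0.5, scale=1.0)     # group-dependent predictions
print(approx_statistical_parity_gap(y_pred, group))
```

The sensitivity discussed in the second study corresponds to swapping the classifier in `density_ratio_via_classifier` (e.g., a gradient-boosted model or a neural network instead of logistic regression), which can change the estimated ratio and therefore the reported fairness value.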