Counterfactual Explanations Help Identify Sources of Bias
By the end of 2020, eXplainable Artificial Intelligence (XAI) had become a mainstream topic. One important development is counterfactual explanations, which (among other benefits) can help identify and reduce bias in machine learning models. A counterfactual explanation shows how a minimal change in input features would alter a model's prediction. This approach has been crucial in exposing biased behavior, especially in sensitive applications such as credit scoring or hiring. By identifying how protected attributes (e.g., gender or race) affect outcomes, practitioners can better address and mitigate unfair biases in AI systems (Verma et al., 2020).
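To make the idea concrete, here is a minimal Python sketch of two counterfactual probes on a toy credit-scoring model. Everything in it is an assumption made for illustration: the weights are hand-set rather than learned, the feature names (income, debt, gender) and the helper functions are hypothetical, and the brute-force search stands in for the optimization-based methods used in practice. The first probe flips only the protected attribute and checks whether the decision changes. The second looks for the smallest change in a single feature that flips the decision.

```python
import math

# Hypothetical, hand-set weights for a toy credit model (not a trained model).
# The non-zero weight on "gender" is deliberate, so there is bias to detect.
WEIGHTS = {"income": 0.08, "debt": -0.05, "gender": 1.2}
BIAS = -4.0


def approve_probability(applicant):
    """Logistic score of the toy model for one applicant (a dict of features)."""
    score = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def predict(applicant):
    """True means the application is approved."""
    return approve_probability(applicant) >= 0.5


def protected_attribute_counterfactual(applicant, attribute="gender"):
    """Flip a binary protected attribute and report whether the decision changes."""
    flipped = dict(applicant)
    flipped[attribute] = 1 - applicant[attribute]
    return flipped, predict(applicant) != predict(flipped)


def minimal_feature_change(applicant, feature, step=1.0, max_steps=200):
    """Brute-force search for the smallest change (in multiples of `step`)
    to one feature that flips the decision. Returns None if nothing is found."""
    original = predict(applicant)
    for n in range(1, max_steps + 1):
        for sign in (1, -1):
            candidate = dict(applicant)
            candidate[feature] = applicant[feature] + sign * n * step
            if predict(candidate) != original:
                return candidate[feature] - applicant[feature]
    return None


# Illustrative applicant; income and debt are in arbitrary units.
applicant = {"income": 50.0, "debt": 20.0, "gender": 0}

flipped, decision_changed = protected_attribute_counterfactual(applicant)
income_delta = minimal_feature_change(applicant, "income")

print("approved originally:", predict(applicant))
print("decision changes when only gender is flipped:", decision_changed)
print("smallest income change that flips the decision:", income_delta)
```

If the decision changes when only the protected attribute is flipped, the model is using that attribute (or the sketch's stand-in for it) directly, which is the kind of biased behavior the paragraph above describes. In a real system the protected attribute is often excluded from the inputs but correlated with other features, so a probe like this is a starting point rather than a complete audit.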
Reference: Verma, S., & Rubin, J. (2020). Fairness Definitions Explained. Proceedings of the 2020 ACM/IEEE International Workshop on Software Fairness.