XAI Today

Revisiting the Performance-Explainability Trade-Off

Julian Hatwell

I was very excited to read and review the paper Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI) last month. I wrote an extensive section on this topic for my Ph.D. thesis (although I coined the name Accuracy-Interpretability Trade-Off, or AITO). I have always felt that the subject is too rarely discussed, and never with enough depth or scientific rigour. The Performance-Explainability Trade-off (PET) is the notion that improving model performance (by this, the authors must mean accuracy or related measures such as true positive rate or AUC/ROC) comes at the cost of explainability.

The authors of this paper state their goal as refining the discussion of PET in the field of Requirements Engineering for AI systems. Frankly, once the text gets going, the paper is quite generic with respect to this self-stated niche, although that in no way detracts from their position on the topic itself. For the most part, the paper focuses on Cynthia Rudin’s influential critique of the performance-explainability trade-off. The authors describe Rudin as particularly critical of post-hoc explainability techniques, arguing that they can produce misleading or incomplete explanations that fail to remain faithful to the model’s decision-making process. Again, this was also a foundational point in my thesis on XAI: what good is an explanation of an output other than the one the model actually produced? Proxy (simplified) explanatory models are particularly prone to this failure.

Rudin also contends that interpretable models can often match the performance of black-box models, provided that sufficient effort is invested in knowledge discovery and feature engineering. This is known as the Rashomon Set argument, which posits that for many real-world tasks there exist multiple high-performing models, including some that are inherently explainable. The authors argue that while this is an intriguing theoretical claim, it lacks strong empirical backing and does not guarantee that such explainable models will be easily identifiable or practical to develop in all domains. On this point, I find myself in total agreement with the authors. The Rashomon Set argument is merely conjecture as far as I can tell, and it is something I would like to revisit in a future blog post.
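To make the Rashomon Set idea concrete, here is a deliberately contrived sketch: on a toy, hypothetical dataset (the data, the candidate rules, and their thresholds are all my own invention, not from the paper), several structurally different models tie on accuracy, and some of them are trivially interpretable single-threshold rules.

```python
# Toy illustration of the Rashomon Set idea: multiple distinct models
# achieve the same accuracy on one dataset, some of them interpretable.

# Hypothetical data: (feature_a, feature_b) -> label
data = [
    ((0.2, 0.9), 0), ((0.4, 0.8), 0), ((0.3, 0.7), 0), ((0.1, 0.6), 0),
    ((0.8, 0.2), 1), ((0.7, 0.3), 1), ((0.9, 0.1), 1), ((0.6, 0.4), 1),
]

def accuracy(model):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

# Candidate models: two simple, readable rules and one opaque combination.
candidates = {
    "threshold on feature_a": lambda x: int(x[0] > 0.5),
    "threshold on feature_b": lambda x: int(x[1] < 0.5),
    "weighted combination":   lambda x: int(0.7 * x[0] - 0.3 * x[1] > 0.2),
}

# On this toy data, all three candidates tie -- a (contrived) Rashomon set.
rashomon_set = {name: accuracy(m) for name, m in candidates.items()}
```

The set is engineered to tie, of course; the authors' point is precisely that nothing guarantees such an interpretable co-winner exists, or is findable, on a real problem.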

The authors’ strongest argument is that analyst- and researcher-led feature engineering has been vastly outpaced by the capabilities of deep learning, which learns features automatically as part of model training. Building a deep neural network is now so much faster that researcher and analyst time, as a resource, can be freed up and spent on making post-hoc explainability far more feasible. The authors argue that the real issue is not just whether performance and explainability are in tension, but how much effort is required to achieve both. They suggest that model development should be viewed as a multi-objective optimization problem, in which teams must balance the trade-offs between performance, explainability, and available resources, while also considering domain-specific risks such as ethical concerns or financial constraints. From this more nuanced position, they derive an extended framework called PET+ (Performance-Explainability-Time trade-off), which incorporates time and resource constraints into the equation.
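One way to picture this multi-objective framing is as a scalarised scoring exercise over candidate models. The sketch below is entirely my own gloss: the candidate models, their scores, the weights, and the scoring function are hypothetical, and the paper does not prescribe any particular scheme; it simply illustrates how development time can shift which model "wins".

```python
# A minimal sketch of the PET+ framing as multi-objective model selection.
# All numbers and weights are hypothetical, for illustration only.

candidates = {
    # name: (performance, explainability, development_time_weeks)
    "sparse decision list": (0.88, 0.95, 12),  # interpretable, costly to engineer
    "gradient boosting":    (0.92, 0.40, 3),
    "deep neural network":  (0.94, 0.20, 4),   # relies on post-hoc explanations
}

def pet_plus_score(perf, expl, weeks,
                   w_perf=0.5, w_expl=0.3, w_time=0.2, max_weeks=16):
    """Hypothetical weighted score: higher is better; longer builds are penalised."""
    time_score = 1 - weeks / max_weeks
    return w_perf * perf + w_expl * expl + w_time * time_score

# Rank candidates under this (arbitrary) weighting of the three objectives.
ranked = sorted(
    ((name, round(pet_plus_score(*vals), 3)) for name, vals in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
```

Under these particular weights the interpretable model comes out on top despite its longer build time; shift the weights towards raw performance or delivery speed and the ranking flips, which is exactly the kind of context-dependent balancing the authors are pointing at.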

I appreciate and commend the authors’ reflection on this chronically overlooked and misunderstood topic, and I hope their paper contributes towards better frameworks for evaluating modelling approaches in the future.
