How Subsets of the Training Data Affect a Prediction
I was quite excited by the title of a new paper, posted as a preprint this month. “Explainable Artificial Intelligence: How Subsets of the Training Data Affect a Prediction” by Andreas Brandsæter and Ingrid K. Glad appeared, at first glance, to align closely with my own work CHIRPS: Explaining random forest classification, published in June this year. It’s generally highly desirable to connect with other researchers with whom you share common ground and who are working contemporaneously; often, fruitful collaborations are born.
As it turns out, the authors have taken a fairly different approach from mine. The CHIRPS method uses a minimal number of constraints to discover a large, high-precision subset of neighbours in the training data that share the same classification from the model, and returns robust statistics that proxy for precision and coverage. Brandsæter and Glad’s method is a novel approach that works with regression and time series problems, and presupposes that there are subsets in the data (which may or may not be adjacent) that can be defined in advance to reveal regions of influence on the final prediction for a given data point. We share a recognition of the importance of interpretability in AI and machine learning, especially in critical applications.
The authors propose a methodology that uses Shapley values to measure the importance of different training data subsets in shaping model predictions. Shapley values, which originate in coalitional game theory, are adapted here to quantify the contribution of each subset of training data as if each subset were a “player” influencing the model’s prediction. This approach offers a fresh perspective by directly associating predictions with specific training data subsets, which can reveal patterns or biases that feature-based explanations might miss.
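Concretely, the Shapley value keeps its standard coalitional-game form. My reading (not the authors’ own notation) is that N is the set of pre-defined training subsets and v(S) is the model’s prediction for the instance being explained when the model is trained only on the union of the subsets in coalition S:

$$
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

The efficiency property then guarantees that the per-subset attributions sum to the difference between the full-data prediction and the empty-coalition reference value, so the prediction is fully accounted for.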
The paper delves into the theoretical framework of Shapley values in a coalitional game context and extends this to analyze subset importance. The authors describe how their methodology can pinpoint the impact of specific subsets on predictions, facilitating insights into model behavior, training data errors, and potential biases. By using subsets rather than individual data points or features, this approach is particularly well-suited to models that rely on large, high-dimensional datasets where feature importance alone may not fully capture influential patterns. This method is demonstrated to be useful in understanding how similar predictions may stem from different subsets of data, emphasizing the complex interactions within training data that influence predictions.
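To make the mechanics concrete, here is a minimal sketch of exact subset-level Shapley values for a regression problem. It is not the authors’ implementation: the function name subset_shapley, the use of scikit-learn’s LinearRegression as the underlying model, and the empty-coalition baseline argument are all my own assumptions.

```python
import numpy as np
from itertools import combinations
from math import factorial
from sklearn.linear_model import LinearRegression


def subset_shapley(subsets, x_test, baseline=0.0):
    """Exact Shapley values treating each pre-defined training subset as a 'player'.

    subsets  : list of (X, y) pairs, one per training-data subset
    x_test   : the single instance whose prediction we want to explain
    baseline : reference value returned for the empty coalition (no training data)
    """
    n = len(subsets)

    def value(coalition):
        # Characteristic function: retrain on the union of the coalition's
        # subsets and return the prediction for x_test.
        if not coalition:
            return baseline
        X = np.vstack([subsets[i][0] for i in coalition])
        y = np.concatenate([subsets[i][1] for i in coalition])
        model = LinearRegression().fit(X, y)
        return float(model.predict(x_test.reshape(1, -1))[0])

    phi = np.zeros(n)
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi
```

With, say, three or four subsets the 2^n retraining runs are trivial; the point of the sketch is only that the per-subset attributions sum to the full-model prediction minus the baseline, mirroring the efficiency property noted above.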
Through several case studies, the paper demonstrates how Shapley values for subset importance can be applied in real-world scenarios. For example, in time series data and autonomous vehicle predictions, subsets of training data based on chronological segmentation reveal how specific periods contribute to model outputs. This approach is shown to be valuable for identifying anomalies or segment-specific patterns that could affect model accuracy or introduce biases. Additionally, by explaining the squared error for predictions, the authors illustrate how this methodology can also diagnose errors in training data, which could improve overall model reliability.
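The chronological segmentation in the time series case studies maps naturally onto the same machinery. The helper below is a hypothetical illustration of how contiguous time segments could be turned into the players; swapping the characteristic function from the raw prediction to the squared error against the observed target would give the error-diagnosis variant the authors describe.

```python
import numpy as np


def chronological_subsets(X, y, timestamps, n_segments=4):
    """Split a time-ordered training set into contiguous chronological segments,
    each of which becomes one 'player' for the subset-level Shapley computation."""
    order = np.argsort(timestamps)
    X, y = X[order], y[order]
    bounds = np.linspace(0, len(y), n_segments + 1, dtype=int)
    return [(X[a:b], y[a:b]) for a, b in zip(bounds[:-1], bounds[1:])]
```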
The authors discuss limitations and challenges, particularly around the computational complexity of retraining models on multiple subsets to calculate Shapley values. They suggest that, while computationally intensive, this process can be optimized with parallel processing and may not need to be repeated for each new test instance. They also propose potential applications of this methodology in tailoring training data acquisition strategies, such as for cases where predictions are most critical, which can improve model performance by selectively sampling from influential subsets.
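That observation about reuse is worth making concrete: the coalition models can be fitted once, in parallel, and then any number of test instances can be explained from the cache. The sketch below uses joblib purely as an illustrative choice; it is not taken from the paper.

```python
from itertools import chain, combinations

import numpy as np
from joblib import Parallel, delayed
from sklearn.linear_model import LinearRegression


def _fit_coalition(subsets, coalition):
    # Fit one model on the union of the subsets in this coalition.
    X = np.vstack([subsets[i][0] for i in coalition])
    y = np.concatenate([subsets[i][1] for i in coalition])
    return coalition, LinearRegression().fit(X, y)


def train_coalition_models(subsets, n_jobs=-1):
    """Fit one model per non-empty coalition, in parallel.

    This is the expensive, one-off step; the returned dictionary can be cached
    and reused to explain any number of test instances without refitting."""
    n = len(subsets)
    coalitions = chain.from_iterable(
        combinations(range(n), k) for k in range(1, n + 1))
    fitted = Parallel(n_jobs=n_jobs)(
        delayed(_fit_coalition)(subsets, c) for c in coalitions)
    return dict(fitted)
```

With the cache in place, the value function in the earlier sketch reduces to a dictionary lookup keyed on the sorted coalition tuple, so the per-instance cost of an explanation no longer includes any model training.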
In conclusion, Brandsæter and Glad’s paper represents a significant advancement in explainable AI by emphasizing the training data’s impact on model predictions. By shifting focus to data-centric explanations, their approach highlights how subsets within the data contribute directly to individual predictions, expanding the interpretative toolkit beyond traditional feature importance. This approach aligns with my own work on CHIRPS, underscoring the notion that providing contextual information from training data strengthens model transparency and interpretability. Using training data as a reference framework enables explainable AI methods to draw on established statistical theory, which ultimately lends robustness to explanations, even in black-box models. Together, these methods suggest a promising direction for explainable AI, wherein training data subsets serve as crucial elements to understand and elucidate model behavior effectively.