Source: Deep Learning on Medium
Idea Behind LIME and SHAP
The intuition behind ML interpretation models
In machine learning, there has long been a trade-off between model interpretability and model performance: more complex models tend to perform better but are harder to interpret. Complex models such as deep neural networks, which often outperform interpretable models such as linear regression, have been treated as black boxes. The paper by Ribeiro et al. (2016) titled “Why Should I Trust You?” aptly encapsulates the issue with ML black boxes. Model interpretability is a growing field of research. Please read here for the importance of machine interpretability. This blog discusses the idea behind LIME (Local Interpretable Model-agnostic Explanations) and SHAP.
This blog is not about how to use, code, or interpret LIME or SHAP (there are plenty of good resources on this). Those resources show how to use the LIME and SHAP values assigned to the independent variables individually (one LIME/SHAP value per variable per data point). This blog is about understanding how we get these values and what we are actually doing when we use the LIME and SHAP libraries. Before jumping to LIME and then SHAP, here are a few primers for better understanding.
Local vs Global Interpretability
In linear regression models, the beta coefficients explain the prediction for all data points: if a variable's value increases by 1, the prediction increases by beta for every data point. This is global fidelity. In causal analysis, it corresponds to the “average” causal effect. However, it does not explain individual differences (heterogeneity): the effect of a variable for one user could differ from its effect for another. Capturing that is local fidelity, an explanation for an individual data point or for a local subsection of the joint distribution of the independent variables. Local explanations can be expected to be more accurate than global ones because, within a small region, the function is more likely to be approximately linear and monotonic. Even local regions can be highly non-linear, though, and local explainability then fails; this is a limitation of such methods. (This is also part of the smoothness assumption, without which many optimization methods would fail as well.) LIME and SHAP use this property of local explainability to build surrogate models for black-box machine learning models and so make them interpretable. A few examples are shown in Figure 1.
LIME and SHAP are surrogate models (Figure 1). This means they still use the black-box machine learning model. They tweak the input slightly (as we do in sensitivity tests) and observe the changes in the prediction. The tweak has to be small so that the new point stays close to the original data point (that is, within the local region). The surrogate model then models the changes in the prediction as a function of the changes in the input. For example, if the model prediction does not change much when the value of a variable is tweaked, that variable may not be an important predictor for that particular data point. Since surrogate models still treat the ML model as a black box, they are model agnostic.
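To make the global versus local distinction concrete, here is a minimal sketch. The function `black_box` (a simple quadratic) and the helper `ols_slope` are hypothetical stand-ins chosen for illustration; they are not part of LIME or SHAP. The sketch fits one straight line over the whole input range and another over a small neighborhood of a single point.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def black_box(x):
    # Hypothetical non-linear model standing in for a black box.
    return x ** 2


# Global fit: a single slope for the whole input range [-3, 3].
global_xs = [i / 10 for i in range(-30, 31)]
global_slope = ols_slope(global_xs, [black_box(x) for x in global_xs])

# Local fit: a slope for a small neighborhood around x0 = 2.
x0 = 2.0
local_xs = [x0 + i / 100 for i in range(-10, 11)]
local_slope = ols_slope(local_xs, [black_box(x) for x in local_xs])

print(f"global slope: {global_slope:.3f}")
print(f"local slope near x0={x0}: {local_slope:.3f}")
```

In this toy case the global slope is essentially zero, which would suggest the variable does not matter, while the local slope near x0 = 2 is close to 4 and describes what the function actually does around that point.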
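The "tweak the input and watch the prediction" idea can be sketched in a few lines. Everything here is an assumption made for illustration: `black_box` is a made-up model that ignores its second input, and `sensitivity` is a hypothetical helper, not a function from either library.

```python
def black_box(x):
    # Hypothetical model: depends on x[0] and x[2], ignores x[1].
    return 3.0 * x[0] + 0.0 * x[1] + x[2] ** 2


def sensitivity(model, x, eps=0.01):
    """Change in prediction when each input is nudged by eps, one at a time."""
    base = model(x)
    changes = []
    for i in range(len(x)):
        tweaked = list(x)   # copy so the original point is untouched
        tweaked[i] += eps   # small tweak keeps us in the local region
        changes.append(model(tweaked) - base)
    return changes


changes = sensitivity(black_box, [1.0, 5.0, 2.0])
for i, delta in enumerate(changes):
    print(f"variable {i}: prediction changes by {delta:.4f}")
```

The second variable produces no change in the prediction, so for this data point it would not be considered an important predictor. The model itself is only ever called, never inspected, which is what makes the approach model agnostic.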
Interpretable Data Representation
A data point has to be converted into a format that is easier to work with when building surrogate models (that is, when sampling data points in the neighborhood of the original data point). This representation is called interpretable (in the LIME paper) because it is understandable to humans: it converts the data into binary indicators. As shown in Figure 2, the d-dimensional data point x is converted into a d’-dimensional binary vector x’ as its interpretable representation. Figure 2 gives some examples and also shows how LIME uses the interpretable data format in surrogate models for interpretability. The objective (loosely) is to minimize the difference in response (prediction) between x and its neighbors.
Now we have the background to understand LIME. If we select a point x, we can draw samples (z’) around x by switching off some of the binary dimensions of its representation x’; each sample is weighted by its proximity to x. Once we have a sample z’, we recover a data point z from it. For example: (1) Let f(z) be the deep learning model that detects whether a sentence is hateful or not. Our dataset contains sentences. (2) If we consider a data point (a sentence x), we first convert it into the format x’ (words present or absent). (3) From x’, we sample z’ in the neighborhood of x’ (by switching off some of the binary dimensions uniformly). (4) These samples are converted to z (the sentence is recovered). In z, some of the words may not be present. (5) Using z as input, we get the value f(z). Figure 3 shows an example of sampling. It also covers what the LIME “value” is for an individual data point (and for each variable) and how we get it. Finally, it shows how we get global importance from the local importance (explainability) of a variable. Loosely, this means adding up the local weights such that even the switched-off dimensions are maximally covered (useful to keep in mind as we go through SHAP). As mentioned before, there are plenty of good resources that explain how to use (or interpret) these values, so this blog will not cover it.
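For text, the interpretable representation can be sketched as a word-presence vector. The helper names `to_interpretable` and `from_interpretable` below are hypothetical and only illustrate the mapping between x and x’; they are not the LIME library's API.

```python
def to_interpretable(sentence):
    """Map a sentence x to its words and a binary vector x' (1 = word present)."""
    words = sentence.split()
    return words, [1] * len(words)


def from_interpretable(words, mask):
    """Recover a sentence from a binary vector by dropping switched-off words."""
    return " ".join(word for word, keep in zip(words, mask) if keep)


words, x_prime = to_interpretable("you are not welcome here")
print(x_prime)

# A neighbor of x': the third word ("not") is switched off.
z_prime = [1, 1, 0, 1, 1]
print(from_interpretable(words, z_prime))
```

Each position of the binary vector answers a question a human can understand ("is this word present?"), which is the sense in which the representation is interpretable.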
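The five steps above can be put together in a rough sketch. This is not the LIME library's implementation, and several pieces are assumptions made for illustration: `black_box` is a made-up scoring function rather than a deep learning model, each word is switched off independently with probability 0.5, proximity is an exponential kernel on the fraction of removed words with an arbitrary `kernel_width`, and the local model is a weighted least-squares fit without the feature-selection step described in the LIME paper.

```python
import math
import random


def black_box(sentence):
    # Hypothetical "hatefulness" score; stands in for f(z).
    words = sentence.split()
    return 0.1 + 0.7 * ("hate" in words) + 0.2 * ("stupid" in words)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def lime_text(sentence, model, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around one sentence."""
    rng = random.Random(seed)
    words = sentence.split()          # step 2: x -> x' (all words present)
    d = len(words)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Step 3: sample z' by switching off some binary dimensions.
        z_prime = [rng.randint(0, 1) for _ in range(d)]
        # Step 4: recover the sentence z from z'.
        z = " ".join(w for w, keep in zip(words, z_prime) if keep)
        # Proximity weight: samples with fewer removed words count more.
        distance = (d - sum(z_prime)) / d
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(v) for v in z_prime])  # intercept + z'
        targets.append(model(z))      # step 5: f(z)
    # Weighted least squares via the normal equations (tiny ridge for stability).
    p = d + 1
    a = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
          + (1e-8 if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * t for w, r, t in zip(weights, rows, targets))
         for i in range(p)]
    coef = solve(a, b)
    return dict(zip(words, coef[1:])), coef[0]


lime_weights, intercept = lime_text("i hate this stupid movie", black_box)
for word, weight in lime_weights.items():
    print(f"{word:>8}: {weight:+.3f}")
```

The coefficient attached to each word is the LIME "value" for that variable at this data point. Because the toy model happens to be additive in word presence, the surrogate recovers its weights almost exactly; for a real black box the fit is only a local approximation.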
Disadvantages of LIME
Although LIME has the desirable property of additivity (the sum of the individual impacts equals the total impact), it has been criticized for lacking stability, consistency (if a model changes so that a variable's contribution increases or stays the same, that variable's attribution should not decrease) and missingness (a missing variable should get zero attribution). All three properties are fulfilled by SHAP, which is one reason it is more commonly used. LIME also requires the user to define what “local” means.
SHAP values use a concept similar to LIME's, but SHAP adds theoretical guarantees based on the game-theoretic concept of Shapley values. Please see this short video on Shapley values before reading further to understand SHAP. You can also see this for the theoretical background of Shapley values.
SHAP stands for SHapley Additive exPlanations. “Additive” is the key term. Like LIME, SHAP has the additive attribution property: the SHAP values of all the variables for a data point sum to the difference between the model's prediction for that point and the mean prediction (the base value), so base value plus SHAP values equals the final prediction. SHAP can be understood by keeping LIME in mind, as explained in Figure 4, which shows how SHAP values are calculated. In SHAP, we do not have to build a local model (like the linear regression in LIME); instead, the same function f(x) is used to calculate the Shapley values for each dimension.
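The additive property can be checked on a small example by computing exact Shapley values by brute force. This is a sketch under stated assumptions, not the `shap` library's code: `black_box` and the `background` rows are invented, and the value of a coalition is taken to be the average prediction when the coalition's variables are fixed to the values in x and the remaining variables are filled in from the background rows (which treats variables as independent).

```python
from itertools import combinations
from math import factorial


def black_box(x):
    # Hypothetical model with an interaction between x[1] and x[2].
    return 2.0 * x[0] + x[1] * x[2]


# Hypothetical background data used to "switch off" variables.
background = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 1.0],
    [2.0, 1.0, 3.0],
    [1.0, 1.0, 0.0],
]


def coalition_value(model, x, subset, background):
    """Average prediction with the variables in `subset` fixed to x's values."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values: weighted marginal contributions over all coalitions."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = coalition_value(model, x, set(subset) | {i}, background)
                without_i = coalition_value(model, x, set(subset), background)
                phi[i] += weight * (with_i - without_i)
    return phi


x = [3.0, 2.0, 4.0]
phi = shapley_values(black_box, x, background)
base_value = coalition_value(black_box, x, set(), background)  # mean prediction

print("SHAP values:", [round(p, 4) for p in phi])
print("base value + sum of SHAP values:", round(base_value + sum(phi), 4))
print("model prediction:", black_box(x))
```

The last two printed numbers agree: the values distribute the gap between the prediction and the mean prediction across the variables. No local surrogate is fitted anywhere; only `black_box` itself is evaluated.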
The Shapley value guarantees a fair distribution of the contribution among the variables (LIME does not provide this guarantee). LIME assumes that the local model is linear; SHAP makes no such assumption. SHAP value calculation is computationally expensive because exact Shapley values require evaluating all possible combinations (coalitions) of variables; in practice this is approximated through Monte Carlo sampling rather than brute force. A SHAP value is NOT the difference between the prediction with and without a variable; rather, it is the contribution of a variable to the difference between the actual prediction and the mean prediction. The variable importance at a global level is obtained by adding up (or averaging) the absolute SHAP values over all individual data points. Although SHAP uses all the variables, if we want to reduce the number of variables we can keep those with higher importance, drop the others, and rerun SHAP (because of properties like consistency, the order of variable importance is not expected to change, so less important variables can be ignored).
I hope the background of what we are doing in LIME and SHAP is now clearer. Both are model agnostic, and libraries are available for standard machine learning models. Due to its theoretical guarantees and simplicity, SHAP is widely used and may be more widely accepted. In LIME, we need to define what counts as a “neighbor”. We also fit a linear local model, even though a very complicated decision surface might not be linear even at a local level. In SHAP, we can use the same model we trained on the training data.
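Both the Monte Carlo idea and the global importance calculation can be sketched as follows. This is an illustrative permutation-sampling estimator, not what the `shap` library does internally (it ships its own estimators). The model `black_box`, the dataset `data`, and the function `sampled_shapley` are all hypothetical.

```python
import random


def black_box(x):
    # Hypothetical model: x[3] has no effect on the prediction.
    return 2.0 * x[0] + x[1] * x[2] + 0.0 * x[3]


def sampled_shapley(model, x, background, n_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over random variable orderings instead of all coalitions."""
    rng = random.Random(seed)
    n = len(x)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(n_permutations):
        rng.shuffle(order)
        for row in background:
            current = list(row)         # start from a background point
            previous = model(current)
            for i in order:
                current[i] = x[i]       # switch variable i to x's value
                value = model(current)
                phi[i] += value - previous
                previous = value
    scale = n_permutations * len(background)
    return [p / scale for p in phi]


# Hypothetical dataset; it doubles as the background sample.
data = [
    [0.0, 0.0, 0.0, 5.0],
    [1.0, 2.0, 1.0, 1.0],
    [2.0, 1.0, 3.0, 2.0],
    [1.0, 1.0, 0.0, 9.0],
]

# One row of SHAP values per data point.
shap_matrix = [sampled_shapley(black_box, x, data) for x in data]

# Global importance: mean absolute SHAP value of each variable.
global_importance = [
    sum(abs(row[j]) for row in shap_matrix) / len(shap_matrix)
    for j in range(4)
]
print("global importance:", [round(v, 4) for v in global_importance])
```

Each ordering requires only as many model calls as there are variables, so the cost grows with the number of sampled orderings rather than with the number of coalitions. The unused fourth variable gets zero global importance, which is the kind of signal one would use to decide which variables to drop.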