Model Interpretation with Microsoft’s Interpret ML

Original article was published on Artificial Intelligence on Medium

Photo by Mauro Sbicego on Unsplash

The predictive power of a machine learning model and its interpretability have long been considered opposing goals. But not anymore! Over the past two or three years, researchers, industry and the broader data science community have shown renewed interest in making machine learning more transparent, or even “white box”. I have written a series of articles on model interpretation (ELI5, Permutation Importance and LIME) explaining the need for model interpretation and the techniques involved. In this article, I will discuss a new package from Microsoft: InterpretML.


Interpret ML is an open-source package that brings state-of-the-art machine learning interpretability techniques under one roof. With this package, you can train interpretable glassbox models and explain blackbox systems. It is a collection of tools and packages focused on interpretation, covering many key explainability techniques. Interpret-text supports a collection of interpretability techniques for models trained on text data. Azureml-Interpret provides an AzureML wrapper for running all the interpretation techniques on Azure.

Interpretability Approach

In Interpret, the explainability algorithms are divided into two major groups: Glassbox Models and Blackbox Explanations.

Glassbox Models

Glassbox models are learning algorithms that are designed to be interpretable, e.g. linear models and decision trees. Glassbox models typically provide exact explainability, that is, you can trace and reason about how any glassbox model makes its decisions. Besides the regular scikit-learn models, the glassbox module contains a new model called the Explainable Boosting Machine (EBM).

Explainable Boosting Machine (EBM)

EBM is an interpretable model developed at Microsoft Research. It uses modern machine learning techniques like bagging, gradient boosting, and automatic interaction detection along with GAMs (Generalized Additive Models). This often makes EBMs as accurate as state-of-the-art techniques like random forests and gradient boosted trees. However, unlike these blackbox models, EBMs produce interpretable explanations.


Linear models are highly interpretable but often do not reach the accuracy of more complex models.

$$Y = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \epsilon$$

To address this, statisticians created GAMs (Generalized Additive Models), which keep the additive structure (and hence the interpretability of the linear model) but make it more flexible and accurate by replacing each linear term with a learned function of a single feature.

$$Y = f_1(X_1) + f_2(X_2) + f_3(X_3) + \dots + f_n(X_n)$$

Microsoft researchers enhanced GAMs further with techniques like boosting and bagging, and added pairwise interaction terms. EBM remains interpretable like a linear model while providing accuracy at the level of complex models like XGBoost.

$$Y = \sum_i f_i(X_i) + \sum_{i,j} f_{ij}(X_i, X_j)$$

Underneath, EBM works in the following way. Assume the dataset has n features. EBM trains a small tree on Feature 1 only and, using boosting, passes the residual on to the next tree. It then trains a small tree that can look only at Feature 2 and passes on the residual. This continues up to Feature n. This round of per-feature tree fitting and residual passing can be repeated, for example, 5000 times. At the end, you have 5000 trees trained on each feature. In this way, EBM can find the best feature function f() for each feature and show how each feature contributes to the model’s prediction. All of these per-feature models are trained in parallel, in the sense that we keep cycling through the features.
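The cycling idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, not the InterpretML implementation: it uses one-split stumps instead of small trees, a squared-error residual on a regression target, no bagging and no interaction terms. The helper names (fit_stump, fit_cyclic_boosting, shape_function, predict) and the toy data are made up for this example.

```python
def fit_stump(column, residual):
    """Least-squares stump on one feature: returns (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(column))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for v, r in zip(column, residual) if v <= threshold]
        right = [r for v, r in zip(column, residual) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def fit_cyclic_boosting(X, y, rounds=300, learning_rate=0.1):
    """Round-robin boosting: each stump sees a single feature and the current residual."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    intercept = sum(y) / len(y)
    residual = [target - intercept for target in y]
    stumps = [[] for _ in range(n_features)]  # one list of stumps per feature
    for _ in range(rounds):
        for j, column in enumerate(columns):
            stump = fit_stump(column, residual)
            if stump is None:  # constant feature, nothing to split on
                continue
            threshold, left, right = stump
            left, right = learning_rate * left, learning_rate * right
            stumps[j].append((threshold, left, right))
            # Pass the updated residual on to the next feature's stump.
            residual = [r - (left if v <= threshold else right)
                        for v, r in zip(column, residual)]
    return intercept, stumps


def shape_function(feature_stumps, value):
    """f_j(value): the summed output of every stump trained on feature j."""
    return sum(left if value <= t else right for t, left, right in feature_stumps)


def predict(model, row):
    """Additive prediction: intercept plus one shape-function value per feature."""
    intercept, stumps = model
    return intercept + sum(shape_function(s, v) for s, v in zip(stumps, row))


# Toy data on a 10 x 10 grid: an additive target with one non-linear term.
X = [(a, b) for a in range(10) for b in range(10)]
y = [a ** 2 + 2 * b for a, b in X]

model = fit_cyclic_boosting(X, y)
mean_y = sum(y) / len(y)
variance = sum((t - mean_y) ** 2 for t in y) / len(y)
mse = sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)
print(f"variance of y: {variance:.1f}, model MSE: {mse:.3f}")
```

Because every stump only ever looks at one feature, the stumps for feature j can be summed into a single curve f_j, and that curve is what a per-feature plot displays.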

Implementation (Prediction & Interpretation)

I have provided a detailed implementation of EBM classifiers in my GitHub notebook. I will be using the Bank Marketing Data Set (LINK). The data relates to direct marketing campaigns of a Portuguese banking institution. We need to predict whether the customer will subscribe (‘yes’) or not (‘no’) to the product (a bank term deposit) as part of this campaign. We start by installing the InterpretML library:

pip install interpret

After preprocessing the data (described in the notebook), we first explore it with ClassHistogram(), which supports EDA on the dataset. To use it, the dataset should not have missing values, so make sure preprocessing is done beforehand.

from interpret import show
from interpret.data import ClassHistogram

hist = ClassHistogram().explain_data(X_train, y_train, name='Train Data')
show(hist)

This creates a dashboard with Plotly histograms, where the colours show whether the customer subscribed (1) or did not subscribe (0) to the product.

Summary of Training Data across all dimensions

Now, let’s try out EBM. We will fit it on the training data:

from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier(random_state=42)
ebm.fit(X_train, y_train)

After fitting, it’s time to check the global explanations with EBM:

ebm_global = ebm.explain_global(name='EBM')
show(ebm_global)

Global Interpretation with EBM

The feature importance summary shows that the two categorical features contact_cellular and contact_telephone are very important. You can also look into individual features to see their impact; for example, the Age feature shows that the campaign had more success with older people (60+).
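As far as I understand it, the default importance shown in the summary is the average absolute contribution of each feature across the training rows. The sketch below illustrates that idea only; the function name mean_abs_importance and the contribution values are invented for the example and are not taken from the notebook.

```python
def mean_abs_importance(contributions_per_row):
    """Rank features by their average absolute contribution across rows."""
    totals = {}
    for row in contributions_per_row:
        for feature, score in row.items():
            totals[feature] = totals.get(feature, 0.0) + abs(score)
    n_rows = len(contributions_per_row)
    averaged = [(feature, total / n_rows) for feature, total in totals.items()]
    return sorted(averaged, key=lambda item: item[1], reverse=True)


# Hypothetical per-row contributions (log-odds scores), one dict per record.
rows = [
    {"contact_cellular": 0.5, "age": -0.1, "month_jun": 0.2},
    {"contact_cellular": -0.7, "age": 0.3, "month_jun": 0.1},
    {"contact_cellular": 0.6, "age": -0.5, "month_jun": -0.3},
]

importance = mean_abs_importance(rows)
for feature, value in importance:
    print(f"{feature}: {value:.2f}")
```

Taking the absolute value matters: a feature that pushes some predictions up and others down would otherwise average out to roughly zero and look unimportant.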

EBM can also be used for local explanations. In the snippet below, we look at the explanations for the first 4 records. We can see that for record 3 the predicted value is 0.085 while the actual value was 0. Even though the month of June contributed positively, not calling via cellular and calling via telephone seem to have pushed the prediction down.

ebm_local = ebm.explain_local(X_test[:4], y_test[:4], name='EBM')
show(ebm_local)

EBM Local Explanations
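A local explanation of an additive classifier can be read as a sum: the intercept plus each feature's contribution gives a log-odds score, and the logistic function turns that score into a probability. The sketch below shows the arithmetic only. The function name explain_prediction, the intercept and the contribution values are all made up for illustration and do not come from the notebook.

```python
import math


def explain_prediction(intercept, contributions):
    """Combine additive log-odds contributions into a probability and rank them."""
    logit = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return probability, ranked


# Hypothetical contributions for a single record (log-odds units).
intercept = -1.7
contributions = {
    "month_jun": 0.4,           # pushes the prediction up
    "contact_cellular": -0.6,   # pushes the prediction down
    "contact_telephone": -0.5,  # pushes the prediction down
}

probability, ranked = explain_prediction(intercept, contributions)
print(f"predicted probability: {probability:.3f}")
for feature, score in ranked:
    print(f"{feature}: {score:+.2f}")
```

One positive contribution is outweighed by two negative ones here, which is the same kind of reading the local explanation chart supports for a real record.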

Prediction with EBM

We will use the ROC curve to assess the prediction quality of EBM. The code below draws the ROC curve:

from interpret.perf import ROC

ebm_perf = ROC(ebm.predict_proba).explain_perf(X_test, y_test, name='EBM')

ROC Curve for EBM

We can compare the prediction quality of EBM with Logistic Regression, a Classification Tree and Light GBM. The performance of EBM (AUC = 0.77) is very close to that of Light GBM (AUC = 0.78). The results can be found below.

lr_perf = ROC(lr_model.predict_proba).explain_perf(X_test, y_test, name='Logistic Regression')
tree_perf = ROC(rf_model.predict_proba).explain_perf(X_test, y_test, name='Classification Tree')
lgbm_perf = ROC(lgb_model.predict_proba).explain_perf(X_test, y_test, name='Light GBM')
ebm_perf = ROC(ebm.predict_proba).explain_perf(X_test, y_test, name='EBM')

Comparison of EBM with other Models
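To make the AUC numbers easier to read: AUC can be understood as the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The small function below computes it that way on toy data. It is an illustrative sketch with a made-up function name (roc_auc), not the code InterpretML uses internally.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for label, s in zip(y_true, scores) if label == 1]
    negatives = [s for label, s in zip(y_true, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and predicted probabilities for the positive class.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, predicted)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a gap of 0.01 between two models is small.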


The dashboard is a great feature of Interpret ML that lets you see all the results in one view. The available explanations are split into tabs, each covering an aspect of the pipeline.

  • Data covers exploratory data analysis, mostly at the feature level.
  • Performance covers model performance, both overall and for user-defined groups.
  • Global explains model decisions overall.
  • Local explains a model decision for every instance/observation.

lr_global = lr_model.explain_global(name='LR')
tree_global = rf_model.explain_global(name='Tree')
show([hist, lr_global, lr_perf, tree_global, tree_perf, ebm_global, ebm_local, ebm_perf], share_tables=True)

Overview of the Dashboard

Data tab: Provides a summary view and a feature-level view to gain insight into the data.

Data summary view

Performance tab: Shows the accuracy metrics of all the models being used.

Performance view

Global tab: Shows the global interpretation of all the models used for prediction in one view.

Global Interpretation view

Local tab: Shows the local interpretation of all the models used for prediction in one view.

Local Interpretation view


Microsoft’s Interpret ML provides a new state-of-the-art modelling technique, the Explainable Boosting Machine (EBM), together with state-of-the-art interpretability algorithms under a unified API. This API provides glassbox models, which are inherently intelligible and explainable to the user, and blackbox explainers, which generate explanations for any machine learning model irrespective of its complexity using interpretation techniques like LIME, SHAP and Partial Dependence Plots. The dashboard provides great interactive visualisations and makes it easy to compare interpretability algorithms. The package is now also supported in R.

You can find the code on my GitHub. In the next article, I will cover the blackbox functionality provided by Interpret ML.

If you have any questions on model interpretation, let me know; I am happy to help. Follow me on Medium or Twitter if you want to receive updates on my blog posts!