plotDependence

Plot dependence of Shapley values on predictor values

Since R2024b

Syntax

plotDependence(explainer,predictor)

plotDependence(explainer,predictor,Name=Value)

plotDependence(ax,___)

p = plotDependence(___)

Description

plotDependence(explainer,predictor) creates a dependence plot for the predictor specified by predictor, using the Shapley values in the shapley object explainer. The plot contains Shapley values for the query points in explainer.QueryPoints.

If predictor specifies a categorical predictor (explainer.CategoricalPredictors), then the function displays a box plot of the corresponding Shapley values for each category. Each box plot displays: the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers.
If predictor specifies a noncategorical predictor, then the function displays a scatter plot of the corresponding Shapley values.

If explainer.BlackboxModel is a classification model, the function displays a plot for class explainer.BlackboxModel.ClassNames(1) by default.

plotDependence(explainer,predictor,Name=Value) specifies additional options using one or more name-value arguments. For example, use color to display a second predictor in the plot by specifying the ColorPredictor name-value argument.

plotDependence(ax,___) displays the dependence plot in the target axes ax. Specify ax as the first argument in any of the previous syntaxes.

p = plotDependence(___) returns a BoxChart or Scatter object. Use p to query or modify the properties (BoxChart Properties or Scatter Properties) of the object after you create it.
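The target-axes and output-argument syntaxes can be combined, as in the following minimal sketch. The setup is an assumption made for illustration and does not come from this page: it trains a small tree classifier on the fisheriris data set, where the predictors get the default names x1 through x4.

```matlab
% Assumed setup (illustration only): a tree classifier on fisheriris.
load fisheriris
mdl = fitctree(meas,species);
explainer = shapley(mdl,QueryPoints=meas);

% Draw two dependence plots side by side in explicitly created axes.
t = tiledlayout(1,2);
ax1 = nexttile(t);
p1 = plotDependence(ax1,explainer,"x3");
ax2 = nexttile(t);
p2 = plotDependence(ax2,explainer,"x4");

% Both predictors are numeric, so each output is a Scatter object
% whose properties you can change after plotting.
p1.Marker = "x";
p2.SizeData = 20;
```

Passing the axes explicitly keeps each plot in its own tile, regardless of which axes are current.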

Examples

Shapley Dependence Plot for One Predictor

Train a classification model and create a shapley object. Use the fit object function to compute the Shapley values for the specified query points. Then for each predictor, visualize the dependence of the Shapley values on the predictor values by using the plotDependence object function.

Load the CreditRating_Historical data set. The data set contains customer IDs and their financial ratios, industry labels, and credit ratings.

tbl = readtable("CreditRating_Historical.dat");

Display the first three rows of the table.

head(tbl,3)

     ID      WC_TA    RE_TA    EBIT_TA    MVE_BVTD    S_TA     Industry    Rating
    _____    _____    _____    _______    ________    _____    ________    ______

    62394    0.013    0.104     0.036      0.447      0.142       3        {'BB'}
    48608    0.232    0.335     0.062      1.969      0.281       8        {'A' }
    42444    0.311    0.367     0.074      1.935      0.366       1        {'A' }

Train a blackbox model of credit ratings by using the fitcecoc function. Use the variables from the second through seventh columns in tbl as the predictor variables. A recommended practice is to specify the class names to set the order of the classes.

blackbox = fitcecoc(tbl,"Rating", ...
    PredictorNames=tbl.Properties.VariableNames(2:7), ...
    CategoricalPredictors="Industry", ...
    ClassNames={'AAA','AA','A','BBB','BB','B','CCC'});

Create a shapley object that explains the predictions for multiple query points. For faster computation, shapley subsamples 100 observations from the predictor data in blackbox to compute the Shapley values. Specify the sampled observations as the query points in the call to the fit object function.

rng("default") % For reproducibility
explainer = shapley(blackbox);
queryPoints = explainer.X(explainer.SampledObservationIndices,:);
explainer = fit(explainer,queryPoints);

Visualize the Shapley values for a specified predictor by using the plotDependence object function.

predictor = "MVE_BVTD";
plotDependence(explainer,predictor)

By default, the function shows the Shapley values for the first class, AAA. For noncategorical predictors, the function displays a scatter plot, where the x-axis corresponds to the predictor values and the y-axis corresponds to the Shapley values for the predictor.

For class AAA, the Shapley values for the MVE_BVTD predictor tend to increase as the predictor values increase from 0 to 4. For MVE_BVTD values greater than 4, the corresponding Shapley values tend to remain constant (between 1.5 and 2).

For categorical predictors, plotDependence displays box plots for each category in the categorical predictor. The function determines categorical predictors based on the CategoricalPredictors property of the shapley object.

Visualize the Shapley values for the categorical predictor Industry. Specify the class.

class = "A";
plotDependence(explainer,"Industry",ClassName=class)

For class A, the distribution of the Shapley values varies across different industries. For example, industry 3 has exclusively positive Shapley values, whereas industry 9 has exclusively negative Shapley values.

Shapley Dependence Plot with Additional Color Predictor

Train a regression model and create a shapley object using multiple query points. Then for each predictor, visualize the dependence of the Shapley values on the predictor values. Use color to see the dependence on a second predictor.

Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s.

load carbig

Create a table containing the predictor variables Acceleration, Cylinders, and so on, as well as the response variable MPG.

tbl = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Weight,MPG);

Removing missing values in a training set helps to reduce memory consumption and speed up training for the fitrkernel function. Remove missing values in tbl.

tbl = rmmissing(tbl);

Train a blackbox model of MPG by using the fitrkernel function. Specify the Cylinders and Model_Year variables as categorical predictors. Standardize the remaining predictors.

mdl = fitrkernel(tbl,"MPG",CategoricalPredictors=[2 5], ...
    Standardize=true);

Create a shapley object that explains the predictions for multiple query points. Because mdl does not contain training data, pass the predictor data in tbl to shapley so that it can compute the Shapley values. For faster computation, subsample 200 observations from tbl. Use all observations in tbl as query points.

explainer = shapley(mdl,tbl,NumObservationsToSample=200, ...
    QueryPoints=tbl);

Visualize the Shapley values for a specific predictor by using the plotDependence object function. Use color to display a second predictor. Note that if you want to specify a color predictor, the x-axis predictor must be a noncategorical predictor.

predictor = "Weight";
colorPredictor = "Horsepower";
plotDependence(explainer,predictor,ColorPredictor=colorPredictor)

[Figure: Shapley Dependence Plot. x-axis: Weight. y-axis: Shapley Values for Weight. The axes contain a scatter object and a constant line.]

For Weight values between 2000 and 4000, the corresponding Shapley values tend to decrease as the Weight values increase. Based on the color of the points in the plot, Horsepower values tend to increase as Weight values increase.

Input Arguments

`explainer` — Object explaining blackbox model
`shapley` object

Object explaining the blackbox model, specified as a shapley object. explainer must contain Shapley values; that is, explainer.Shapley must be nonempty.

`predictor` — Predictor variable
positive integer scalar | character vector | string scalar

Predictor variable to plot, specified as a positive integer scalar, character vector, or string scalar.

If you specify a positive integer scalar, it must be the index value corresponding to a column in the predictor data explainer.X.
If you specify a character vector or string scalar, it must be the name of a predictor variable. When explainer.BlackboxModel is a machine learning model object, the name must match one of the names in the PredictorNames property of the model (explainer.BlackboxModel.PredictorNames). When explainer.BlackboxModel is a custom model specified as a function handle, the name must match one of the variable names in explainer.X.

Example: "x1"

Data Types: single | double | char | string
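As a rough illustration of the two ways to identify a predictor, the sketch below plots the same predictor once by name and once by column index. The fisheriris model is an assumed setup, not part of this page. Because the model is trained on a numeric matrix, the predictor names default to x1 through x4.

```matlab
% Assumed setup (illustration only): a tree classifier on fisheriris.
load fisheriris
mdl = fitctree(meas,species);
explainer = shapley(mdl,QueryPoints=meas);

% Look up the column index that corresponds to a predictor name.
name = "x3";
idx = find(strcmp(explainer.BlackboxModel.PredictorNames,name));

% Plot the same predictor by name (left) and by index (right).
tiledlayout(1,2)
pByName = plotDependence(nexttile,explainer,name);
pByIndex = plotDependence(nexttile,explainer,idx);
```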

`ax` — Axes for plot
`Axes` object

Axes for the plot, specified as an Axes object. If you do not specify ax, then plotDependence creates the plot using the current axes. For more information on creating an Axes object, see axes.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: plotDependence(explainer,"x1",ColorPredictor="x3",ColorMap="abyss") creates a scatter plot of Shapley values for the numeric predictor x1 and uses the x3 predictor to color the points with the abyss colormap.

`ClassName` — Class label to plot
`explainer.BlackboxModel.ClassNames(1)` (default) | numeric scalar | logical scalar | character vector | string scalar | categorical scalar

Class label to plot, specified as a numeric scalar, logical scalar, character vector, string scalar, or categorical scalar. The value and data type of ClassName must match one of the class names in the ClassNames property of the machine learning model in explainer (explainer.BlackboxModel.ClassNames). The software accepts character vectors, string scalars, and categorical scalars interchangeably.

This argument is valid only when the machine learning model (BlackboxModel) in explainer is a classification model.

Example: ClassName="AAA"
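To compare classes, one option is to loop over the class names stored in the model and draw one plot per class. The following is a sketch under an assumed setup (a fisheriris tree classifier with default predictor names), not an example taken from this page.

```matlab
% Assumed setup (illustration only): a tree classifier on fisheriris.
load fisheriris
mdl = fitctree(meas,species);
explainer = shapley(mdl,QueryPoints=meas);

% For this model, ClassNames is a cell array of character vectors.
classNames = explainer.BlackboxModel.ClassNames;

% One dependence plot per class for the predictor x4.
tiledlayout(1,numel(classNames))
for k = 1:numel(classNames)
    ax = nexttile;
    plotDependence(ax,explainer,"x4",ClassName=classNames{k})
    title(ax,classNames{k})   % label each tile with its class
end
```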

`ColorPredictor` — Predictor variable to plot using color
`[]` (default) | positive integer scalar | character vector | string scalar

Predictor variable to plot using color, specified as a positive integer scalar, character vector, or string scalar.

If you specify a positive integer scalar, it must be the index value corresponding to a column in the predictor data explainer.X.
If you specify a character vector or string scalar, it must be the name of a predictor variable. When explainer.BlackboxModel is a machine learning model object, the name must match one of the names in the PredictorNames property of the model (explainer.BlackboxModel.PredictorNames). When explainer.BlackboxModel is a custom model specified as a function handle, the name must match one of the variable names in explainer.X.

For more information on how plotDependence maps color predictor values to the colormap, see Color Assignment for Color Predictor Values.

This argument is valid only when the variable predictor is not a categorical predictor.

Example: "x2"

Data Types: single | double | char | string

`ColorMap` — Colormap for plot
`"default"` (default) | `"bluered"` | colormap name | three-column matrix of RGB triplets

Colormap for the plot, specified as "default", "bluered", a colormap name, or a three-column matrix of RGB triplets.

A value of "default" sets the colormap to the default colormap for the target axes ax, and a value of "bluered" sets the colormap to a color scale that ranges from blue to red.
A colormap name specifies a predefined colormap, and a three-column matrix of RGB triplets specifies a custom colormap. For more information on the available colormaps and the creation of a matrix of RGB triplets, see the map argument of colormap.

This argument is valid only when the variable predictor is not a categorical predictor, and the color predictor variable ColorPredictor is specified.

Example: ColorMap="parula"

Example: ColorMap="bluered"

Data Types: single | double | char | string
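A custom colormap is a matrix with one RGB triplet per row and values between 0 and 1. The sketch below builds a simple blue-to-red ramp and passes it as ColorMap. The fisheriris model and the variable names ramp and customMap are assumptions made for illustration.

```matlab
% Assumed setup (illustration only): a tree classifier on fisheriris.
load fisheriris
mdl = fitctree(meas,species);
explainer = shapley(mdl,QueryPoints=meas);

% Build a 64-by-3 matrix of RGB triplets that runs from blue to red.
n = 64;
ramp = linspace(0,1,n)';
customMap = [ramp zeros(n,1) 1-ramp];

% Color the points for x3 by the values of x4 using the custom colormap.
plotDependence(explainer,"x3",ColorPredictor="x4",ColorMap=customMap)
```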

Output Arguments

`p` — Dependence plot
`BoxChart` object | `Scatter` object

Dependence plot, returned as a BoxChart or Scatter object.

If predictor specifies a categorical predictor, then p is a BoxChart object. For more information, see BoxChart Properties.
If predictor specifies a noncategorical predictor, then p is a Scatter object. For more information, see Scatter Properties.

More About

Shapley Values

In game theory, the Shapley value of a player is the average marginal contribution of the player in a cooperative game. In the context of machine learning prediction, the Shapley value of a feature for a query point explains the contribution of the feature to a prediction (response for regression or score of each class for classification) at the specified query point.

The Shapley value of a feature for a query point is the contribution of the feature to the deviation from the average prediction. For a query point, the sum of the Shapley values for all features corresponds to the total deviation of the prediction from the average. That is, the sum of the average prediction and the Shapley values for all features corresponds to the prediction for the query point.

For more details, see Shapley Values for Machine Learning Model.
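The additivity property can be checked numerically on a toy problem. The sketch below computes exact Shapley values by enumerating all feature subsets. It assumes an interventional value function (features in the subset are fixed at the query values and the prediction is averaged over background data) and a hypothetical three-feature model f. It is not a description of how shapley computes values internally.

```matlab
% Hypothetical toy model with three numeric features (illustration only).
f = @(X) 2*X(:,1) + X(:,2).*X(:,3);

rng(0)                 % For reproducibility
X = randn(50,3);       % background data
q = [1 2 -1];          % query point
M = size(X,2);

% Assumed value function: fix the features flagged in mask at the query
% values, keep the background values elsewhere, and average the prediction.
value = @(mask) mean(f(X.*(1-mask) + q.*mask));

phi = zeros(1,M);      % one Shapley value per feature
for j = 1:M
    others = setdiff(1:M,j);
    for s = 0:2^(M-1)-1
        % Decode s into a subset of the other features.
        mask = zeros(1,M);
        mask(others) = bitget(s,1:M-1);
        k = sum(mask);
        w = factorial(k)*factorial(M-k-1)/factorial(M);
        maskWithJ = mask;
        maskWithJ(j) = 1;
        % Weighted marginal contribution of feature j to this subset.
        phi(j) = phi(j) + w*(value(maskWithJ) - value(mask));
    end
end

% Average prediction plus all Shapley values recovers the prediction.
gap = abs(mean(f(X)) + sum(phi) - f(q));
```

Under these assumptions, gap is zero up to floating-point error. Exhaustive enumeration is practical only for a handful of features because the number of subsets doubles with each added feature.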

Tips

Use plotDependence when explainer contains Shapley values for many query points.

Algorithms

Color Assignment for Color Predictor Values

plotDependence maps color predictor values (ColorPredictor) to the colormap (ColorMap) as follows:

If the color predictor is numeric, the function maps the minimum and maximum values to the appropriate colormap endpoints, and maps the remaining values to the interior of the colormap range.
If the color predictor is nonnumeric, the function maps categories to discrete colors in the colormap.
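The mapping described above can be sketched in base MATLAB. This is an illustration under stated assumptions, not the function's actual implementation: it assumes linear scaling between the endpoints for numeric values and evenly spaced colormap rows for categories, neither of which this page specifies.

```matlab
cmap = parula(256);                 % any three-column colormap matrix

% Numeric color predictor: min maps to the first row, max to the last row,
% and the remaining values fall in between (linear scaling assumed).
v = [10 25 40 55 70]';
idx = 1 + round((v - min(v))./(max(v) - min(v))*(size(cmap,1) - 1));
rgbNumeric = cmap(idx,:);

% Nonnumeric color predictor: each category gets one discrete color
% (evenly spaced rows assumed).
g = categorical(["a";"b";"a";"c"]);
cats = categories(g);
catRows = round(linspace(1,size(cmap,1),numel(cats)));
rgbCategorical = cmap(catRows(double(g)),:);
```

In this sketch, observations that share a category share a color, and the numeric extremes land exactly on the colormap endpoints.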

Version History

Introduced in R2024b
