How to get predictor contribution to R^2 ?

Question

Amin Kassab-Bachi 2021-12-6

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/1604130-how-to-get-predictor-contribution-to-r-2

Commented: Walter Roberson 2024-2-26

CP.mat
w.mat

As a sensitivity analysis, I calculated the Shapley values of each predictor for each observation and averaged their normalized absolute values over all observations. However, the resulting values do not directly represent the percentage contribution of each predictor: the values from all predictors sum to more than 100%. I'm trying to compare my results to an article in which the relaimpo R package was used to determine the R^2 percentage for each predictor. With that package, the contributions of the different independent variables sum to the total R^2 of the model. I'd like to get the same result in MATLAB.

Thanks.

load CP

load w

rng('default') % For reproducibility

hyperopts = struct('AcquisitionFunctionName','expected-improvement-plus');

[Mdl_CP] = fitrlinear(w,CP,'OptimizeHyperparameters','auto','HyperparameterOptimizationOptions',hyperopts);

|=====================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Lambda |      Learner |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |
|=====================================================================================================|
|    1 | Best   |     0.44111 |      0.8655 |     0.44111 |     0.44111 |       188.65 | leastsquares |
|    2 | Best   |     0.25511 |     0.23621 |     0.25511 |      0.2662 |   4.3771e-08 |          svm |
|    3 | Best   |     0.25282 |    0.095941 |     0.25282 |     0.25289 |   8.3523e-08 | leastsquares |
|    4 | Accept |     0.25895 |     0.10276 |     0.25282 |     0.25283 |      0.19137 |          svm |
|    5 | Accept |     0.25509 |     0.20684 |     0.25282 |     0.25283 |   6.9496e-05 |          svm |
|    6 | Accept |     0.45529 |    0.049379 |     0.25282 |     0.25291 |       263.16 |          svm |
|    7 | Accept |     0.25282 |    0.062419 |     0.25282 |     0.25282 |   3.2264e-08 | leastsquares |
|    8 | Accept |     0.25501 |     0.13478 |     0.25282 |     0.25282 |    0.0059902 |          svm |
|    9 | Accept |     0.25511 |     0.19408 |     0.25282 |     0.25282 |   1.3176e-06 |          svm |
|   10 | Best   |     0.25281 |    0.050581 |     0.25281 |     0.25279 |    6.593e-05 | leastsquares |
|   11 | Accept |     0.25282 |     0.03587 |     0.25281 |      0.2528 |   5.1029e-06 | leastsquares |
|   12 | Best   |      0.2527 |     0.08496 |      0.2527 |     0.25273 |    0.0026352 | leastsquares |
|   13 | Accept |     0.25279 |     0.12599 |      0.2527 |     0.25266 |   0.00059206 | leastsquares |
|   14 | Accept |     0.25541 |     0.12623 |      0.2527 |     0.25266 |     0.035035 |          svm |
|   15 | Accept |     0.25282 |    0.065571 |      0.2527 |     0.25264 |   6.7407e-07 | leastsquares |
|   16 | Accept |     0.25276 |    0.038044 |      0.2527 |     0.25259 |    0.0012419 | leastsquares |
|   17 | Accept |     0.25282 |    0.037534 |      0.2527 |      0.2526 |   1.8086e-05 | leastsquares |
|   18 | Accept |     0.25282 |    0.042249 |      0.2527 |      0.2526 |   2.6769e-08 | leastsquares |
|   19 | Accept |     0.25511 |     0.11975 |      0.2527 |      0.2526 |   1.9808e-07 |          svm |
|   20 | Accept |     0.25282 |     0.04568 |      0.2527 |      0.2526 |   2.1026e-07 | leastsquares |
|=====================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Lambda |      Learner |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |
|=====================================================================================================|
|   21 | Accept |     0.25276 |    0.037319 |      0.2527 |     0.25265 |    0.0012639 | leastsquares |
|   22 | Accept |     0.25282 |    0.094409 |      0.2527 |     0.25265 |   1.8877e-06 | leastsquares |
|   23 | Accept |     0.25281 |    0.087808 |      0.2527 |     0.25265 |   0.00019974 | leastsquares |
|   24 | Accept |     0.25276 |    0.044042 |      0.2527 |     0.25268 |    0.0012435 | leastsquares |
|   25 | Best   |      0.2525 |    0.044093 |      0.2525 |      0.2525 |    0.0074555 | leastsquares |
|   26 | Accept |     0.25263 |    0.059475 |      0.2525 |      0.2525 |    0.0043332 | leastsquares |
|   27 | Accept |     0.25264 |     0.04981 |      0.2525 |     0.25254 |    0.0042096 | leastsquares |
|   28 | Accept |     0.25264 |    0.047394 |      0.2525 |     0.25256 |    0.0042037 | leastsquares |
|   29 | Accept |     0.32947 |    0.090975 |      0.2525 |     0.25254 |       2.5175 | leastsquares |
|   30 | Accept |     0.38563 |     0.05888 |      0.2525 |     0.25259 |        7.633 |          svm |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 18.5495 seconds
Total objective function evaluation time: 3.3346

Best observed feasible point:
     Lambda        Learner
    _________    ____________
    0.0074555    leastsquares

Observed objective function value = 0.2525
Estimated objective function value = 0.25259
Function evaluation time = 0.044093

Best estimated feasible point (according to models):
     Lambda        Learner
    _________    ____________
    0.0043332    leastsquares

Estimated objective function value = 0.25259
Estimated function evaluation time = 0.05548

%%

% The Shapley function is not available in Matlab 2019a. Therefore it was executed

% in MATLAB Online using the latest software version.

for i = 1:size(w,1)

explainer_CP(i) = shapley(Mdl_CP,w,'QueryPoint',w(i,:)); % Determine predictor contribution for each observation in the dataset

ShapVal_CP(i,:) = (explainer_CP(i).ShapleyValues.ShapleyValue); % Isolate the Shapley values from the Shapley objects

end

Array formation and parentheses-style indexing with objects of class 'shapley' is not allowed. Use objects of class 'shapley' only as scalars or use a cell array.

normShapVal_CP = normalize(abs(ShapVal_CP),2,"range"); % Normalize the range of Shapley values to [0 1] for each prediction (the observations are the same as the predictions in this case because the observations were used as QueryPoints)

meanShapVal_CP = mean(normShapVal_CP,1).*100; % Mean shapley value for individual predictors (weights)

bar(meanShapVal_CP)

title("Individual weight contribution to CP")

ylabel("Contribution(%)")

xlabel("weights")

%%

sum(meanShapVal_CP)

ans =

690.1947

1 Comment

Walter Roberson 2024-2-26

CP.mat
w.mat

load CP

load w

rng('default') % For reproducibility

hyperopts = struct('AcquisitionFunctionName','expected-improvement-plus');

[Mdl_CP] = fitrlinear(w,CP,'OptimizeHyperparameters','auto','HyperparameterOptimizationOptions',hyperopts);

|=====================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Lambda |      Learner |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |
|=====================================================================================================|
|    1 | Best   |     0.44111 |     0.92788 |     0.44111 |     0.44111 |       188.65 | leastsquares |
|    2 | Best   |     0.25511 |     0.28916 |     0.25511 |      0.2662 |   4.3771e-08 |          svm |
|    3 | Best   |     0.25282 |     0.15965 |     0.25282 |     0.25289 |   8.3523e-08 | leastsquares |
|    4 | Accept |     0.25895 |     0.11646 |     0.25282 |     0.25283 |      0.19137 |          svm |
|    5 | Accept |     0.25509 |     0.22451 |     0.25282 |     0.25283 |   6.9496e-05 |          svm |
|    6 | Accept |     0.45529 |    0.064721 |     0.25282 |     0.25291 |       263.16 |          svm |
|    7 | Accept |     0.25282 |    0.056241 |     0.25282 |     0.25282 |   3.2264e-08 | leastsquares |
|    8 | Accept |     0.25501 |     0.11157 |     0.25282 |     0.25282 |    0.0059902 |          svm |
|    9 | Accept |     0.25511 |     0.10316 |     0.25282 |     0.25282 |   1.3176e-06 |          svm |
|   10 | Best   |     0.25281 |    0.067996 |     0.25281 |     0.25279 |    6.593e-05 | leastsquares |
|   11 | Accept |     0.25282 |    0.058185 |     0.25281 |      0.2528 |   5.1029e-06 | leastsquares |
|   12 | Best   |      0.2527 |    0.065507 |      0.2527 |     0.25273 |    0.0026352 | leastsquares |
|   13 | Accept |     0.25279 |     0.05291 |      0.2527 |     0.25266 |   0.00059206 | leastsquares |
|   14 | Accept |     0.25541 |    0.079854 |      0.2527 |     0.25266 |     0.035035 |          svm |
|   15 | Accept |     0.25282 |    0.062219 |      0.2527 |     0.25264 |   6.7407e-07 | leastsquares |
|   16 | Accept |     0.25276 |    0.045887 |      0.2527 |     0.25259 |    0.0012419 | leastsquares |
|   17 | Accept |     0.25282 |    0.045201 |      0.2527 |      0.2526 |   1.8086e-05 | leastsquares |
|   18 | Accept |     0.25282 |    0.064552 |      0.2527 |      0.2526 |   2.6769e-08 | leastsquares |
|   19 | Accept |     0.25511 |     0.20968 |      0.2527 |      0.2526 |   1.9808e-07 |          svm |
|   20 | Accept |     0.25282 |    0.053456 |      0.2527 |      0.2526 |   2.1026e-07 | leastsquares |
|=====================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Lambda |      Learner |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |
|=====================================================================================================|
|   21 | Accept |     0.25276 |    0.053516 |      0.2527 |     0.25265 |    0.0012639 | leastsquares |
|   22 | Accept |     0.25282 |    0.064874 |      0.2527 |     0.25265 |   1.8877e-06 | leastsquares |
|   23 | Accept |     0.25281 |     0.07612 |      0.2527 |     0.25265 |   0.00019974 | leastsquares |
|   24 | Accept |     0.25276 |    0.057865 |      0.2527 |     0.25268 |    0.0012435 | leastsquares |
|   25 | Best   |      0.2525 |     0.07282 |      0.2525 |      0.2525 |    0.0074555 | leastsquares |
|   26 | Accept |     0.25263 |    0.058606 |      0.2525 |      0.2525 |    0.0043332 | leastsquares |
|   27 | Accept |     0.25264 |    0.055083 |      0.2525 |     0.25254 |    0.0042096 | leastsquares |
|   28 | Accept |     0.25264 |    0.098395 |      0.2525 |     0.25256 |    0.0042037 | leastsquares |
|   29 | Accept |     0.32947 |    0.065543 |      0.2525 |     0.25254 |       2.5175 | leastsquares |
|   30 | Accept |     0.38563 |    0.069852 |      0.2525 |     0.25259 |        7.633 |          svm |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 21.3994 seconds
Total objective function evaluation time: 3.5315

Best observed feasible point:
     Lambda        Learner
    _________    ____________
    0.0074555    leastsquares

Observed objective function value = 0.2525
Estimated objective function value = 0.25259
Function evaluation time = 0.07282

Best estimated feasible point (according to models):
     Lambda        Learner
    _________    ____________
    0.0043332    leastsquares

Estimated objective function value = 0.25259
Estimated function evaluation time = 0.066363

%%

% The Shapley function is not available in Matlab 2019a. Therefore it was executed

% in MATLAB Online using the latest software version.

for i = 1:size(w,1)

explainer_CP{i} = shapley(Mdl_CP,w,'QueryPoint',w(i,:)); % Determine predictor contribution for each observation in the dataset

ShapVal_CP(i,:) = (explainer_CP{i}.ShapleyValues.ShapleyValue); % Isolate the Shapley values from the Shapley objects

end

normShapVal_CP = normalize(abs(ShapVal_CP),2,"range"); % Normalize the range of Shapley values to [0 1] for each prediction (the observations are the same as the predictions in this case because the observations were used as QueryPoints)

meanShapVal_CP = mean(normShapVal_CP,1).*100; % Mean shapley value for individual predictors (weights)

bar(meanShapVal_CP)

title("Individual weight contribution to CP")

ylabel("Contribution(%)")

xlabel("weights")

%%

sum(meanShapVal_CP)

ans = 690.1735
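
A total far above 100% is what the "range" method of normalize would be expected to produce: it rescales each row to span [0,1], so the largest entry becomes 1 and the smallest 0, and nothing constrains the row to sum to 1. The following is a minimal sketch on a made-up 2-by-4 matrix (A is a placeholder standing in for abs(ShapVal_CP), not the actual data) that contrasts range normalization with dividing each row by its own total:

```matlab
% Toy matrix standing in for abs(ShapVal_CP): 2 observations x 4 predictors.
% The numbers are invented purely for illustration.
A = [1 2 3 4;
     2 2 2 6];

rangeNorm = normalize(A,2,"range");   % each row mapped to [0,1]: (x - min)/(max - min)
sumNorm   = A ./ sum(A,2);            % each row divided by its own total

disp(sum(rangeNorm,2))                % row sums are not constrained to equal 1
disp(sum(sumNorm,2))                  % every row sums to 1
disp(sum(mean(sumNorm,1).*100))       % column means, as percentages, sum to 100
```

With range normalization, the row sums depend on how the entries are spread between the row minimum and maximum, which is why the averaged percentages can add up to several hundred.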


Answer 1

Vatsal 2024-2-26

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/1604130-how-to-get-predictor-contribution-to-r-2#answer_1416863

Hi,

It appears that you are attempting to determine the relative importance of predictors in a model, similar to the 'relaimpo' package in R. To calculate the relative importance of each predictor, you can normalize the Shapley values so that they collectively sum to 100%.

Below is a modification to the code to achieve this:

for i = 1:size(w,1)
    explainer_CP{i} = shapley(Mdl_CP,w,'QueryPoint',w(i,:)); % Determine predictor contribution for each observation in the dataset (stored in a cell array, because shapley objects cannot form an object array)
    ShapVal_CP(i,:) = (explainer_CP{i}.ShapleyValues.ShapleyValue); % Isolate the Shapley values from the Shapley objects
end
absShapVal_CP = abs(ShapVal_CP); % Take the absolute value of the Shapley values
sumShapVal_CP = sum(absShapVal_CP,2); % Sum the absolute Shapley values for each observation
normShapVal_CP = absShapVal_CP ./ sumShapVal_CP; % Normalize the absolute Shapley values so they sum to 1 for each observation
meanShapVal_CP = mean(normShapVal_CP,1).*100; % Mean shapley value for individual predictors (weights)
bar(meanShapVal_CP)
title("Individual weight contribution to CP")
ylabel("Contribution(%)")
xlabel("weights")
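
Note that percentages obtained this way are shares of the absolute Shapley values of the predictions; they are not shares of R^2. If the article used the lmg metric of relaimpo (an assumption, since the question does not say which metric was used), each predictor's share is its gain in R^2 averaged over all orderings in which predictors can enter the model, and the shares sum to the model R^2. Below is a minimal sketch of that idea, assuming an ordinary least-squares fit with an intercept rather than the regularized fitrlinear model above. The data and all variable names (X, y, r2, lmg) are synthetic placeholders, and the loop visits all 2^p predictor subsets, so it is only practical for a modest number of predictors.

```matlab
% Illustrative LMG-style decomposition of R^2 for an ordinary least-squares fit.
% X and y are synthetic; replace them with real predictors and response.
rng(0);                               % reproducible synthetic data
n = 200; p = 4;
X = randn(n,p);
X(:,2) = X(:,2) + 0.5*X(:,1);         % make two predictors correlated
y = X*[1; 0.5; 0; -2] + randn(n,1);

% R^2 of an OLS fit (with intercept) for every subset of predictors.
% Subsets are encoded as bit masks 0 .. 2^p-1; r2(mask+1) holds the result.
sst = sum((y - mean(y)).^2);
r2 = zeros(2^p,1);                    % r2(1) = 0: intercept-only model
for mask = 1:2^p-1
    cols = logical(bitget(mask,1:p)); % predictors included in this subset
    A = [ones(n,1) X(:,cols)];
    res = y - A*(A\y);                % least-squares residuals
    r2(mask+1) = 1 - sum(res.^2)/sst;
end

% Shapley-weighted average of each predictor's marginal R^2 gain.
lmg = zeros(1,p);
for j = 1:p
    for mask = 0:2^p-1
        if bitget(mask,j), continue; end   % use only subsets without predictor j
        k = sum(bitget(mask,1:p));         % size of the subset
        wgt = factorial(k)*factorial(p-k-1)/factorial(p);
        lmg(j) = lmg(j) + wgt*(r2(bitset(mask,j)+1) - r2(mask+1));
    end
end

fullR2 = r2(end);                     % R^2 of the model with all predictors
disp(lmg)                             % per-predictor shares of R^2
disp(sum(lmg) - fullR2)               % zero up to rounding error
```

Dividing lmg by fullR2 and multiplying by 100 would give percentages of the explained variance that sum to 100.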

I hope this helps!
