How is predictor importance for classification trees calculated?
I am using MATLAB's predictorImportance function to evaluate the usefulness of features I am extracting from 360° images.
I don't fully understand how predictor importance estimates are calculated and was hoping for a mathematical explanation for the algorithm used.
I have read the MATLAB documentation on this, but I am unsure about a few things.
Firstly, what is risk? I have assumed it to be the impurity reduction if using the Gini index as the splitting criterion.
Secondly, what does "this sum is taken over best splits found at each branch node" mean when surrogate splits aren't used?
Finally, I don't understand why the estimates change when you reorder the columns in the feature matrix.
Thank you in advance to anyone able to shed light on this for me.
Answers (1)
Gaurav Garg
2021-1-27
Hi Ryan,
Yes, risk means impurity reduction if you use the Gini index as the splitting criterion. You can also specify 'twoing' or 'deviance' as the split criterion by following the doc here.
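For a more mathematical view, here is a sketch of the calculation as I understand it from the predictorImportance documentation. The symbols (t, t_L, t_R, P_t, E_t, B_j, N_branch) are my own notation, not MATLAB's, and the Gini form of E_t assumes the default split criterion.

```latex
% Risk of node t: node probability times node impurity
R_t = P_t \, E_t, \qquad
E_t^{\text{Gini}} = 1 - \sum_{k} p(k \mid t)^2

% Change in risk when branch node t is split into children t_L and t_R
\Delta R_t = R_t - R_{t_L} - R_{t_R}

% Importance of predictor j: sum over the branch nodes B_j that split on j,
% divided by the total number of branch nodes
\mathrm{imp}_j = \frac{1}{N_{\text{branch}}} \sum_{t \in B_j} \Delta R_t
```

Read this way, the risk of a single node is its impurity weighted by the node probability, and the change in risk at a split is the impurity reduction. Without surrogate splits, each branch node contributes only through the one split actually chosen there (the "best split"), so B_j contains only nodes whose chosen split uses predictor j. With surrogate splits, the documentation says the sum also includes the surrogate splits at each branch node.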
To understand why the estimates change when you reorder the columns, you can go through the doc here, which describes the algorithm used to select the split at each branch node.
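One way to explore the ordering effect yourself is to fit the same data with the columns permuted, once without and once with surrogate splits, and compare the estimates. This is only a minimal sketch: it uses the fisheriris sample data rather than your image features, and the permutation perm and all variable names are hypothetical. As far as I know, the documentation states that the estimates do not depend on predictor order when surrogate splits are used but can depend on it otherwise; whether a difference actually shows up will depend on the data.

```matlab
% Sample data from Statistics and Machine Learning Toolbox:
% meas is 150-by-4, species holds the class labels.
load fisheriris
perm = [4 3 2 1];   % hypothetical reordering of the predictor columns

% Trees grown without surrogate splits (the default).
treeA = fitctree(meas, species);
treeB = fitctree(meas(:, perm), species);
impA  = predictorImportance(treeA);
impB  = predictorImportance(treeB);

% Trees grown with surrogate splits.
treeAs = fitctree(meas, species, 'Surrogate', 'on');
treeBs = fitctree(meas(:, perm), species, 'Surrogate', 'on');
impAs  = predictorImportance(treeAs);
impBs  = predictorImportance(treeBs);

% Map the permuted estimates back to the original column order
% so that the rows printed below are directly comparable.
impBback  = zeros(size(impB));   impBback(perm)  = impB;
impBsback = zeros(size(impBs));  impBsback(perm) = impBs;

disp('Without surrogate splits (original order; permuted order mapped back):')
disp([impA; impBback])
disp('With surrogate splits (original order; permuted order mapped back):')
disp([impAs; impBsback])
```

If the two rows differ only in the first display, a likely explanation is that two predictors offered equally good splits at some node and the one chosen depended on column order, which is common when features are strongly correlated.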