How to access training data in regression trees in TreeBagger

I need to access training data (x) in each regression tree within an ensemble of trees created by TreeBagger.
I am using TreeBagger.Trees, which returns a cell array with all the trees in the ensemble. The problem is that the trees are CompactRegressionTrees, which do not include the data used to train each regression tree.
I am wondering whether I can make TreeBagger use RegressionTrees instead of CompactRegressionTrees when building the ensemble, or whether there is another way to access the training data at the leaf nodes of CompactRegressionTrees.

Answers (1)

Ilya 2015-8-13
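The OOBIndices property stores a logical matrix with one row per observation and one column per tree, where true marks an observation that is out-of-bag for that tree; the observations used to grow a tree are therefore the complement of its column. This property won't tell you, though, whether an observation was sampled multiple times for the same tree.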
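As a rough sketch of that idea (not part of the original answer), the snippet below recovers the distinct training rows of one tree from OOBIndices and then asks the compact tree which node each row lands in. The toy data, the tree count, and the variable names inBag, Xt, node and rowsInNode are assumptions made for illustration. It also assumes a release in which the name-value pair is spelled 'OOBPrediction' (older releases used 'OOBPred') and in which predict on a compact regression tree returns node numbers as a second output.

```matlab
rng(1);                                   % reproducible toy data (assumption)
X = randn(100, 3);
Y = X(:,1) + 0.1*randn(100, 1);

% Ask TreeBagger to store out-of-bag information so OOBIndices is available.
B = TreeBagger(20, X, Y, 'Method', 'regression', 'OOBPrediction', 'on');

inBag = ~B.OOBIndices;                    % 100-by-20 logical; true = row used by that tree

t  = 5;                                   % any tree index
Xt = X(inBag(:, t), :);                   % distinct training rows seen by tree t

% Node number that each of those rows falls into in the compact tree.
[~, node] = predict(B.Trees{t}, Xt);
rowsInNode = Xt(node == node(1), :);      % rows sharing a node with the first row
```

Because inBag is only a mask, it lists each training row once and cannot show how many times a row was drawn for a given tree.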
If you need access to that info, your best shot is to introduce another property in the TreeBagger class to hold numeric indices of observations used for each tree. Take a look at line 1945 or so, which should look like this:
idxtrain = weightedSample(s,w,fboot,sampleWithReplacement);
You just need to store the idxtrain array for each tree. I would add another output to the loopBody function and modify the call to loopBody accordingly.
I wouldn't recommend replacing compact trees with full trees. This is harder and would blow up memory consumption.
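To show what a stored idxtrain array would add over the logical mask, here is a small standalone sketch. It does not call TreeBagger's internal weightedSample; randi stands in for it as an assumed, unweighted bootstrap draw, and N, counts and inBag are illustrative names.

```matlab
N = 10;                                    % number of observations (toy value)
rng(0);
idxtrain = randi(N, N, 1);                 % stand-in bootstrap sample, drawn with replacement

% Number of times each observation was drawn for this tree.
counts = accumarray(idxtrain, 1, [N 1]);

% Logical view, analogous to ~OOBIndices(:,t): it loses the multiplicities.
inBag = counts > 0;

% Training matrix exactly as the tree would see it, repeats included:
% Xtrain = X(idxtrain, :);
```

The numeric indices keep the repeats, so counts can be recovered from them, whereas the logical mask only says whether an observation appeared at all.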
