How does TreeBagger handle missing values?

Question

I've seen bits and pieces of an answer, such as that NaNs get ignored in TreeBagger, but nothing explicit. How are the NaNs being ignored? Does the entire row or column containing a NaN get removed? Or, if an observation in the training data for an individual tree is missing a variable, is that variable simply not used in that tree but still used in the other trees of the random forest? Or do the missing values get imputed? If so, with what?

If anyone could give me a definitive answer on what the TreeBagger function does with them, that would be amazing.

Answer 1

Matlab 2017-11-25

A random forest is made up of decision trees, so I think the question comes down to how the tree-fitting function (fitctree, or fitrtree for regression) handles missing values. The question can be divided into two parts: training and prediction.

Training: by default, when a node is split, samples whose value for the candidate split variable is missing are ignored in the impurity computation. The row is not dropped entirely; it is only left out when that particular variable is evaluated. Alternatively, surrogate decision splits can be used to deal with missing values. The details are explained in the help documentation.

Prediction: this is the case where a sample is missing the attribute tested at a node. I'm not sure about this part. I believe it produces copies of the sample, and each copy goes down one branch with the corresponding probability. The main idea comes from the paper "Induction of Decision Trees" (Quinlan, 1986).
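
To make the training-time behavior concrete, here is a minimal Python sketch of scoring one candidate split while ignoring samples whose value for that variable is NaN. It is only an illustration of the idea described above, not MATLAB's actual implementation: the helper names `gini` and `split_gain_ignoring_nan`, the Gini criterion, and the toy data are all assumptions made for the example.

```python
import math


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gain_ignoring_nan(values, labels, threshold):
    """Impurity gain of the split `value < threshold`.

    Hypothetical helper: rows whose value is NaN are left out of the
    computation for this variable only; they are not removed from the data.
    """
    known = [(v, y) for v, y in zip(values, labels) if not math.isnan(v)]
    if not known:
        return 0.0  # nothing to evaluate for this variable
    left = [y for v, y in known if v < threshold]
    right = [y for v, y in known if v >= threshold]
    n = len(known)
    parent = gini([y for _, y in known])
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return parent - children


# Toy data: two of the six observations are missing this variable.
x = [1.0, 2.0, float("nan"), 8.0, 9.0, float("nan")]
y = ["a", "a", "b", "b", "b", "a"]

gain = split_gain_ignoring_nan(x, y, threshold=5.0)
print(gain)  # 0.5: the four non-missing rows are separated perfectly
```

The two rows with NaN still count when other variables are evaluated, which is why this is different from deleting the row. In TreeBagger itself, surrogate splits are the documented alternative for routing such rows.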
