Why does "stepwiselm" not remove terms with high p-values?
Hi,
I'm using stepwiselm to model some data, but I'm having difficulty understanding exactly what this command does:
stepwiselm(X,'linear','PEnter',0.05)
From what I understand, this should return a model in which no term has a p-value above 0.05. However, the output contains terms with p-values up to 0.51.
If I use the command below on the same data set, the output seems fine (no p-values above 0.05).
stepwiselm(X,'linear','upper','linear')
From what I understand, 'upper' prevents MATLAB from combining the terms in any way to get better results. So I assume it is rather a 'coincidence' that it removes terms with high p-values from the model.
Thanks a lot for your help!
Answers (1)
Aylin
2016-10-12
Hello,
I understand that you are trying to remove terms with high p-values from a stepwise regression model of your data. To do this, I would recommend setting the 'PRemove' property of the stepwiselm function.
Let me explain this further using your code. The first line of your code:
stepwiselm(X, 'linear', 'PEnter', 0.05)
builds a regression model of your X data. The starting model includes only 'linear' terms. The 'PEnter' property then allows an additional term to enter the model only if its p-value is less than 0.05. Note, however, that in the above line of code the 'PRemove' property is left at its default value of 0.10. This means that only terms with p-values greater than 0.10 are actually removed from the regression model. Please refer to the stepwiselm documentation for more information about the 'PRemove' property.
To exclude regression terms with a p-value greater than 0.05, your first line of code should be modified to:
stepwiselm(X, 'linear', 'PEnter', 0.025, 'PRemove', 0.05)
This should remove any regression terms with a p-value greater than 0.05.
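As a rough illustration of the two thresholds, here is a minimal sketch on synthetic data. The table tbl, its variable names, and the coefficients used to generate y are made up for this example and are not the asker's X. It fits the model with the modified thresholds and then lists any non-intercept coefficients whose p-value is still above 0.05, rather than assuming there are none.

```matlab
% Synthetic data for illustration only (hypothetical; not the asker's X).
rng(0);                                   % make the random numbers reproducible
n = 200;
P = randn(n, 4);                          % four candidate predictors
y = 5 + 2*P(:,1) - 3*P(:,2) + 0.5*randn(n, 1);   % only x1 and x2 affect y

% stepwiselm treats the last table variable as the response by default.
tbl = array2table([P y], 'VariableNames', {'x1', 'x2', 'x3', 'x4', 'y'});

% Start from the linear model; terms enter at p < 0.025 and leave at p > 0.05.
% 'Verbose', 0 suppresses the step-by-step display.
mdl = stepwiselm(tbl, 'linear', 'PEnter', 0.025, 'PRemove', 0.05, 'Verbose', 0);

% Collect the non-intercept coefficients whose p-value still exceeds 0.05.
names  = mdl.CoefficientNames(:);         % column cell array of coefficient names
pvals  = mdl.Coefficients.pValue;         % p-values in the same order
isTerm = ~strcmp(names, '(Intercept)');   % the intercept is always kept by default
aboveThreshold = names(isTerm & pvals > 0.05);

disp(names')
disp(aboveThreshold)
```

The intercept is excluded from the check because the default lower bound of the search keeps the constant term in the model regardless of its p-value.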
As you mentioned in your question, setting the 'Upper' property of the stepwiselm function to 'linear' constrains the regression model to use only linear terms. Yes, it is probably only a coincidence here that the p-values of the regression terms with this setting are all less than 0.05.
The MATLAB documentation for stepwiselm contains some detailed examples that can help clarify its use.
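To see the effect of the 'Upper' constraint in isolation, the following sketch (again on made-up synthetic data, with hypothetical variable names) caps the search at the linear model and checks that no interaction terms appear among the coefficients. It assumes that interaction coefficients are named with a colon, such as x1:x2.

```matlab
% Synthetic data for illustration only (hypothetical; not the asker's X).
rng(1);
n = 200;
P = randn(n, 4);
y = 5 + 2*P(:,1) - 3*P(:,2) + 0.5*randn(n, 1);   % only x1 and x2 affect y
tbl = array2table([P y], 'VariableNames', {'x1', 'x2', 'x3', 'x4', 'y'});

% 'Upper', 'linear' prevents the search from adding interaction terms,
% so with a linear starting model the steps can only remove (or re-add)
% linear terms.
mdlLin = stepwiselm(tbl, 'linear', 'Upper', 'linear', ...
    'PEnter', 0.025, 'PRemove', 0.05, 'Verbose', 0);

% Interaction coefficients contain a colon in their name (e.g. 'x1:x2').
namesLin = mdlLin.CoefficientNames(:);
hasInteraction = any(cellfun(@(s) any(s == ':'), namesLin));

% p-values of the remaining non-intercept terms.
pLin = mdlLin.Coefficients.pValue;
pTerms = pLin(~strcmp(namesLin, '(Intercept)'));

disp(namesLin')
disp(pTerms')
```

Here the removal threshold is set explicitly, so the remaining linear terms are below 0.05 by construction; with 'Upper' alone, as in the question, the default 'PRemove' of 0.10 would still apply.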