Probit: removing groups that perfectly predict failures
Hi all,
I have group-year panel data, as attached. Apologies, the data quality is very low.
There are 3 groups, each with 20 observations. The outcome y is a dummy variable for success. The first column of x is a continuous variable, "effort". The second column is a dummy indicating group A, and the third column is a dummy for group B. There is no dummy for group C, to avoid collinearity.
I want to predict the probability of success using a probit model. The code I tried is:
b = glmfit(x,y,'binomial','Link','probit');
b =
0.1857 (constant)
-1.8149 (effort)
-16.1148 (group A)
-16.2994 (group B)
As you can see in the data, all outcomes for group A are failures, so the second column of x predicts y == 0 perfectly. MATLAB also raises a warning:
Warning: The estimated coefficients perfectly separate failures from successes. This means the theoretical best estimates are not finite.
For the fitted linear combination XB of the predictors, the sample proportions P of Y=N in the data satisfy:
XB<-0.834093: P=0
XB=-0.834093: P=1
XB>-0.834093: P=0
However, glmfit still returns an estimated coefficient for the group A dummy, which is b(3) = -16.1148.
Question:
Since x(:,2) perfectly predicts failure, I would expect b(3) to be reported as 0. Is there an option in glmfit that drops the group A observations inside the function and then returns 0 as the coefficient for this column? That way I would get something like:
b =
0.1857 (constant)
-1.8149 (effort)
0 (group A)
xxx (group B)
Stata does this automatically using the command:
probit y effort i.group
It turns out the Stata estimates for the constant and effort are the same as the ones from glmfit. So the perfect-failure issue only affects the coefficients on the group dummies...
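For reference, the manual workaround I have in mind would look roughly like the sketch below. The data in it are synthetic stand-ins for my attachment (not the real values), and dummyCols, dropCols, keepRows and keepCols are just names I made up for illustration. It drops every dummy column whose observations are all failures, drops those observations too, refits, and writes 0 back for the dropped column. I am not sure this matches exactly what Stata does internally.

```matlab
% Synthetic stand-in for the attached data (hypothetical values, not the real file).
rng(1);
n = 20;                                        % observations per group
effort = rand(3*n, 1);                         % continuous regressor
groupA = [ones(n,1); zeros(2*n,1)];            % dummy for group A
groupB = [zeros(n,1); ones(n,1); zeros(n,1)];  % dummy for group B
x = [effort, groupA, groupB];
y = double(rand(3*n,1) < 0.5);
y(groupA == 1) = 0;                            % group A: all failures, as in my data

% Find dummy columns whose "on" observations are all failures.
dummyCols = [2 3];                             % columns of x that hold group dummies
dropCols  = [];
keepRows  = true(size(y));
for j = dummyCols
    on = x(:,j) == 1;
    if any(on) && all(y(on) == 0)
        dropCols(end+1) = j;                   % this dummy perfectly predicts failure
        keepRows = keepRows & ~on;             % drop its observations as well
    end
end

% Refit the probit on the remaining rows and columns.
keepCols = setdiff(1:size(x,2), dropCols);
bSub = glmfit(x(keepRows, keepCols), y(keepRows), 'binomial', 'link', 'probit');

% Put the estimates back into a full-length vector; dropped dummies are reported as 0.
b = zeros(size(x,2) + 1, 1);                   % +1 for the constant
b([1, keepCols + 1]) = bSub;
disp(b)
```

With this, b(3) comes back as exactly 0, and the other entries are the estimates from the reduced sample. I would still prefer a built-in option if one exists.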
Thank you!!!