Multivariate GLMFIT and GLMVAL

5 次查看(过去 30 天)
I have a problem with plotting the results of a glm model with several predictors.
I first fit the model with GLMFIT as:
[b,dev,stats]= glmfit(x, y, distr);
where 'x' is a matrix of N observations by M predictors, and 'y' a vector of observed responses.
Then I produce predictions as:
xpred= zeros(100,size(x,2));
for k= 1:size(x,2)
xpred(:,k)= linspace(min(x(:,k)),max(x(:,k)),100);
end
ypred= glmval(b,xpred,link,stats);
Then I want to plot the effect of each individual predictor on the response variable as:
for j= 1:size(x,2)
figure(j);
plot(x(:,j),y,'k.',xpred(:,j),ypred,'b-','LineWidth',1)
end
But the line (predicted values) for most of the predictors goes totally out of the cloud of observed values. Am I doing something totally wrong or is just that the fit is poor? Any help would be much appreciated. Thanks

采纳的回答

Tom Lane
Tom Lane 2012-1-24
Consider the code below. It fits a multi-predictor model, but plots the fit as a function of one predictor at a time, with the others held fixed at their mean values. In contrast, your code has each predictor increase along with the others.
This has its own issues. For one thing, the superimposed points represent data, and may or may not be comparable to the line that constrains predictors to their means. But it is one of the things you can do to visualize a multi-predictor fit as a function of one variable at a time.
x = mvnrnd([0 0],[1 .9;.9 1],200);
y = 5*x(:,1) + 5*x(:,2) + randn(200,1);
[b,dev,stats] = glmfit(x,y,'normal');
% Create an array with all predictors at their mean
meanx = mean(x);
xpred = repmat(meanx,100,1);
xplot = xpred;
ypred = zeros(100,size(x,2));
for k= 1:size(x,2)
% Allow the kth predictor to vary, others stay fixed
xplot(:,k)= linspace(min(x(:,k)),max(x(:,k)),100);
xpred(:,k) = xplot(:,k);
ypred(:,k) = glmval(b,xpred,'identity',stats);
xpred(:,k) = meanx(k); % put the mean back
end
for j= 1:size(x,2)
figure(j);
plot(x(:,j),y,'k.',xplot(:,j),ypred(:,k),'b-','LineWidth',1)
end
If you run this you'll see that the lines don't match the points very well. That's because I generated the x values with correlation 0.9, meaning that it's unlikely that one predictor will stay fixed while the other increases. If you adjust 0.9 down toward 0.0, you should see better agreement between the fit and the data points.

更多回答(2 个)

Peter Perkins
Peter Perkins 2012-1-19
Francisco, it's pretty hard to say what's happening without knowing anything about the data or the model, but this may be an artifact of the way you've created xpred. Notice that the first row of xpred has the min values for all of the predictors, and the last row has the max values. You've created what is essentially a straight line through the design variable space that may or may not have anything to do with the actual cloud of points in your data. That could explain why ypred doesn't look much like y.
Hope this helps.

Francisco de Castro
Peter, thanks for your comment. However, to plot the predicted values of the model I need the values of each predictor to span its observed values, so the line of predicted (in the x axis) can go through the cloud of observed vals. That's why I use 100 values from min(x(:,j)) to max(x(:,j)) for each.
A friend of mine suggested that the problem is that I need to re-fit a single-predictor model for each plot. Like this:
for j= 1:size(x,2)
XPRED= linspace(min(x(:,j)),max(x(:,j)),100);
[b,dev,stats]= glmfit(XPRED, y, distr);
ypred= glmval(b,XPRED,link,stats);
figure(j); hold on
plot(x(:,j),y,'k.',XPRED,ypred,'b-','LineWidth',1)
end
This works, of course, however, then each plot represents a different 1-predictor model, not the contribution of each predictor to the multi-predictor model.

类别

Help CenterFile Exchange 中查找有关 Analysis of Variance and Covariance 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by