Why can't I fit a function with more than five variables?
Why can't I fit a function with more than five variables? It gives the errors below; who can help me?
20 Comments
David Goodmanson
2020-6-6
Edited: David Goodmanson
2020-6-8
Hello zw,
Given the hierarchy of arithmetic operations in MATLAB, this expression appears to be reasonable as an algebraic statement, contrary to what the previous comment here indicated. Its applicability as a suitable model is a different question.
the cyclist
2020-6-6
Posting code is far more useful than posting an image of code, because then we could copy & paste it into a workspace to test it.
It would also be helpful if you uploaded a MAT file with your workspace variables, so we could run your exact code and data.
Image Analyst
2020-6-6
I agree with David. Given your data, I can't see how or why your model even makes sense. Where did you come up with that model??? I don't believe any real world situation would ever need such a crazy model.
Rik
2020-6-6
But I want to do it like this; I have seen other people do it this way. How can I solve it? Are there any other functions that can fit this model?
Alex Sha
2020-6-6
refer to the results below:
Root of Mean Square Error (RMSE): 0.0111861472021732
Sum of Squared Residual: 0.00212720811688768
Correlation Coef. (R): 0.928730260655716
R-Square: 0.862539897057634
Parameter Best Estimate
---------- -------------
b1 2.17150492153399E-12
b2 1.39716451311507
b3 -2.74854828910142
b4 -3.96812040839357
b5 4.23887915310233
b6 -0.0779426686949187
Alex Sha
2020-6-7
Hi Wan, try a program named "1stOpt"; it is great for curve fitting:
Function y=b1*x1^b2 *x2^b3*x3^b4*x4^b5*x5^b6;
Data;
1000 3 0.00184 0.212 0.01 0.2308
1100 3.1 0.00368 0.42401 0.02 0.28698
1200 3.2 0.00245 0.28267 0.03 0.25763
1300 3.3 0.00123 0.14134 0.04 0.20281
1400 3.4 0.00147 0.1696 0.05 0.21634
1500 3.5 0.00137 0.1575 0.06 0.21959
1600 3.6 0.00153 0.17667 0.07 0.22708
1700 3.7 0.00258 0.29777 0.08 0.23734
1800 3.8 0.0022 0.25327 0.09 0.23406
1900 3.9 0.00273 0.315 0.1 0.28892
2000 4 0.00204 0.23556 0.11 0.25137
2100 4.1 9.11E-04 0.105 0.12 0.1937
2200 4.2 0.00123 0.14133 0.13 0.2096
2300 4.3 0.00517 0.59555 0.14 0.29073
2400 4.4 0.00293 0.3377 0.15 0.2656
2500 4.5 0.00172 0.19852 0.16 0.20503
2600 4.6 0.00176 0.20262 0.17 0.22096
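To sanity-check these coefficients in MATLAB, here is a minimal sketch, assuming the table above is loaded as a 17x6 matrix with columns x1..x5 and y (e.g. via xlsread as in the accepted answer below):
data = xlsread('data_youhua.xlsx','Sheet1'); % assumed to hold the 17x6 table above
x = data(:,1:5);
y = data(:,6);
b = [2.1715e-12 1.3972 -2.7485 -3.9681 4.2389 -0.07794]; % rounded values from the table above
yhat = b(1) .* prod(bsxfun(@power, x, b(2:6)), 2); % y = b1*x1^b2*...*x5^b6
res = y - yhat;
RMSE = sqrt(mean(res.^2))
Rsq = 1 - sum(res.^2) / sum((y - mean(y)).^2)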
huazai2020
2020-6-7
Edited: huazai2020
2020-6-7
Hi Alex, what is the R-squared after your fit? Could you share the software with me if it is free, and how can I learn it? Thank you.
David Goodmanson
2020-6-7
Edited: David Goodmanson
2020-6-8
Hi zw,
It's time to look at all of the data. Past time, actually.
The following plot shows the data on a log scale. As you can see, y and x4 (meaning x(:,4), and similarly for the others) are almost the same. x3 is offset from x4 by a nearly constant amount on the log scale, and a quick check shows that those two columns differ by a factor that is very close to constant. So whatever effect x4 has on the fit, x3 is hardly capable of changing it. This is one instance of what Image Analyst points out as highly correlated variables.
Comparing x4 to y, there are still some oscillatory residuals. But since the rest of the columns of x are quite smooth, they will not be able to remove those residuals. Consequently the b exponents for x1, x2 and x5 are pretty meaningless. Probably, when taken together as a blob, they provide a slight level and slope correction to x4. But that's it.
Image Analyst
2020-6-7
I never really see people do that. Are you sure you're not thinking of a multiple linear regression, where the x terms are summed instead of multiplied by each other? Is there some real-world physical system you're modeling where 5 variables are multiplied by each other? If so, what is it? What do your x variables represent? And are you sure you're not thinking of a sum instead of a 5-way product?
huazai2020
2020-6-8
It is what other people do in their articles. I have a question: if I do not know the form of the equation before fitting, how can I fit it?
David Goodmanson
2020-6-8
Hi h2020
has your nom de plume changed?
Flags are intended to mark messages that in one way or another do not conform to the guidelines. They are not intended for follow-up questions, so it would be better if you deleted the flag and posted a comment.
Here is the code I used, after getting x and y into matrix form with x being 17x5 and y being 17x1.
n = size(x,1);
semilogy(1:n,x,1:n,y,'-o')
legend({'x1','x2','x3','x4','x5','y'})
grid on
David Goodmanson
2020-6-8
Hi h2020,
While I would not go so far as to say this can't/shouldn't be modeled, I think you should take a close look at the variables involved. Could you comment on what is the physical significance of the five columns of x, and how are they expected to relate to y? If the basic mechanism is known, is there a model that might naturally apply?
As far as x is concerned, x3 and x4 are almost the same, being almost exactly proportional to each other, so they are not really independent variables. y most closely resembles x3 in the sense that there appears to be some correlation in their variations. In terms of their shape rather than their size, x1 and x2 resemble each other. x5 is similar except for a downturn on the left that does not appear to be of much help for a potential fit.
On the basis of what is going on physically, one of x1, x2, x5 might be expected to have a closer relationship to y than the others. But no matter what the mathematical form of the model is, fitting all three will probably result in one or two extra parameters that do not mean very much (as Image Analyst has mentioned).
huazai2020
2020-6-9
These variables are just random; they are only meant to test the fitting capability in MATLAB. I see there are many functions such as nlinfit, fittype and so on in MATLAB. What is the difference, and which is better for the present model in which more than four variables are multiplied together?
Walter Roberson
2020-6-11
y=b1*x1^b2 *x2^b3*x3^b4*x4^b5*x5^b6
Take the log of both sides, assuming the x* are real and positive:
log(y) = log(b1) + b2*log(x1) + b3*log(x2) + b4*log(x3) + b5*log(x4) + b6*log(x5)
So this is a linear fit in log(x*) and log(y). You can construct
nx = size(x,1);
B = [ones(nx,1), log(x)] \ log(y(:));
B(1) = exp(B(1)); %it is the only log term
B is now the best fit in log space, suitable for use as initial values for fitting in linear space.
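As a sketch of that last step (assuming x is the 17x5 predictor matrix and y the response, and that the Statistics Toolbox functions mentioned elsewhere in this thread are available), B can be passed straight to nlinfit as the starting guess:
modelfun = @(b,x) b(1) .* prod(bsxfun(@power, x, b(2:6)), 2); % y = b1*x1^b2*...*x5^b6
beta = nlinfit(x, y(:), modelfun, B(:).'); % refine the log-space estimate in linear space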
Walter Roberson
2020-6-14
Consider the model
log(y) = log(b1) + b2*log(x1) + b3*log(x2) + b4*log(x3) + b5*log(x4) + b6*log(x5)
but suppose you have several x* and y values. Then you can create a series of equations
log(y(1)) = log(b1)*1 + b2*log(x1(1)) + b3*log(x2(1)) + b4*log(x3(1)) + b5*log(x4(1)) + b6*log(x5(1))
log(y(2)) = log(b1)*1 + b2*log(x1(2)) + b3*log(x2(2)) + b4*log(x3(2)) + b5*log(x4(2)) + b6*log(x5(2))
log(y(3)) = log(b1)*1 + b2*log(x1(3)) + b3*log(x2(3)) + b4*log(x3(3)) + b5*log(x4(3)) + b6*log(x5(3))
...
Now arrange those in matrix form:
log(y(:)) = [1, log(x1(1)), log(x2(1)), log(x3(1)), log(x4(1)), log(x5(1));
1, log(x1(2)), log(x2(2)), log(x3(2)), log(x4(2)), log(x5(2));
1, log(x1(3)), log(x2(3)), log(x3(3)), log(x4(3)), log(x5(3));
...
] * [log(b1); b2; b3; b4; b5; b6]
where * is algebraic matrix multiplication.
This can then be written more compactly as
log(y(:)) = [Column of 1s, log(x1(:)), log(x2(:)), log(x3(:)), log(x4(:)), log(x5(:))] ...
* [log(b1); b2; b3; b4; b5; b6]
and since your x1 = x(:,1) and x2 = x(:,2) and so on, the log(x1(:)), log(x2(:)) and so on can be written more compactly as log(x), so
log(y(:)) = [Column of 1's, log(x)] * [log(b1); b2; b3; b4; b5; b6]
This is a system of linear equations. If we say
b = log(y(:))
A = [Column of 1's, log(x)]
X = [log(b1); b2; b3; b4; b5; b6]
then we get the familiar A*X = b .
If A were square (that is, if you had exactly 6 samples) then mathematically you would multiply both sides on the left by inv(A), getting
inv(A) * A * X = inv(A) * b
and inv(A) * A would be the identity matrix, and inv(A) * b could be calculated as all of those values are known, so the vector of unknowns X = inv(A) * b
You have more than 6 samples, so you do not have a square system and cannot use inv(), but you can do the equivalent of
pinv(A) * A * X = pinv(A) * b
to get X = pinv(A) * b as a solution that attempts to minimize error.
The MATLAB operation A\b computes essentially the same thing as pinv(A) * b, but by a different route: for a full-rank overdetermined system both give the least-squares solution, with A\b using a QR factorization rather than the SVD-based pseudoinverse.
ones(nx,1) is the "column of 1s" mentioned earlier.
Now, this is all a least-squares calculation in log space, and it gives the best fit you can get in log space for that model. But you probably want the least-squares fit in linear (non-log) space, so take the result as initial values to feed into a nonlinear least-squares fitting routine such as nlinfit or fitnlm.
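As a small numerical check of the pinv/backslash remark above (a sketch, assuming x and y are the 17x5 and 17x1 data matrices):
A = [ones(size(x,1),1), log(x)]; % 17x6 design matrix in log space
b = log(y(:));
X1 = pinv(A) * b; % pseudoinverse (SVD-based) solution
X2 = A \ b; % backslash (QR-based) least-squares solution
norm(X1 - X2) % near machine precision when A has full column rank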
Accepted Answer
the cyclist
2020-6-7
I think the primary problem is that some of your variables are highly correlated with each other, and therefore (a) add very little information to the model, and (b) will contribute to over-fitting because of the extra parameters. The following works:
data = xlsread('data_youhua.xlsx','Sheet1');
X = data(:,[1 3]);
y = data(:,6);
modelfun = @(b,x) b(1).*x(:,1).^b(2) .*x(:,2).^b(3);
beta0 = [-1 0.1 0.2];
mdl = fitnlm(X,y,modelfun,beta0);
predicted_y = predict(mdl,X);
figure
hold on
hd = plot(y,'.');
hp = plot(predicted_y,'.');
set([hd hp],'MarkerSize',24)
for ny = 1:numel(y)
hc = line([ny ny],[y(ny) predicted_y(ny)]);
set(hc,'Color','black')
end
legend([hd hp],{'data','prediction'})
print('-dpng','-r600','test.png')
and results in the following fit
Note that I did not draw the fit as a continuous line. The reason is that the x-axis here is not a continuous variable. It is just the ordinal count of your data points. There are almost certainly better ways to plot the comparison of the data and the fit, but this is at least not incorrect.
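One way to see the correlation problem directly (a sketch, assuming the same file and column layout as above) is to look at the pairwise correlations of the log-transformed predictors:
data = xlsread('data_youhua.xlsx','Sheet1');
C = corrcoef(log(data(:,1:5))) % correlations between log(x1) ... log(x5)
% off-diagonal entries close to +/-1 indicate nearly redundant predictors, e.g. x3 vs x4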
15 Comments
huazai2020
2020-6-7
Thank you. I have two questions: 1. As Alex said above, I can use 1stOpt to fit it, but as you can see I cannot do it here; which is right? 2. As you can see from the error below, what is the reason?
the cyclist
2020-6-7
Sorry, use this code instead:
data = xlsread('data_youhua.xlsx','Sheet1');
X = data(:,[1 3]);
y = data(:,6);
modelfun = @(b,x) b(1).*x(:,1).^b(2) .*x(:,2).^b(3);
beta0 = [-1 0.1 0.2];
% mdl = fitnlm(X,y,modelfun,beta0);
% predicted_y = predict(mdl,X);
fit_coefficients = nlinfit(X,y,modelfun,beta0);
predicted_y = modelfun(fit_coefficients,X);
figure
hold on
hd = plot(y,'.');
hp = plot(predicted_y,'.');
set([hd hp],'MarkerSize',24)
for ny = 1:numel(y)
hc = line([ny ny],[y(ny) predicted_y(ny)]);
set(hc,'Color','black')
end
legend([hd hp],{'data','prediction'})
print('-dpng','-r600','test.png')
Notice that here I used nlinfit (as you did in your original code), instead of fitnlm.
You must have the Statistics Toolbox, since you are able to use nlinfit, so I assumed you would also have fitnlm. Do you have a very old version of MATLAB? (fitnlm was introduced in R2013b.)
huazai2020
2020-6-7
My version is R2012b; can you share your toolbox with me so I can update? I can run your second code, but I have a question: you only fit two of the variables (x(:,1) and x(:,2)), which is not what I want. Can I use 1stOpt to fit it? It can fit it, but I do not know whether that is right.
the cyclist
2020-6-7
No, I cannot share a toolbox. That is against the license agreement. You need an up-to-date license of your own.
I can spend some time seeing if nlinfit can work for your model ... but why?? Your model is definitely worse. Although I don't like plotting it this way, take a look at the line plot:
In terms of how well it fits the data, it is nearly indistinguishable from the 6-parameter model. But because it uses fewer parameters, it is much less likely to be over-fit. I will be more motivated to think about your more complex model if I understand a reason why it should be used. Otherwise, I feel that I am simply giving you bad statistical modeling advice.
the cyclist
2020-6-7
Well, I got curious, so I'll just give you the bad advice. Maybe an evil professor is requiring you to fit a particular model.
If you use
beta0 = [1 1 1 1 1 1];
instead of your starting values, then nlinfit will return coefficients. You will get a warning that it failed to converge, though. This is yet further confirmation that it is a bad model for the data. Be a hero -- do it the right way.
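For completeness, a sketch of that full six-parameter fit (assuming the same data file and column layout as before; expect the non-convergence warning mentioned above):
data = xlsread('data_youhua.xlsx','Sheet1');
X = data(:,1:5);
y = data(:,6);
modelfun6 = @(b,x) b(1) .* prod(bsxfun(@power, x, b(2:6)), 2); % y = b1*x1^b2*...*x5^b6
beta0 = [1 1 1 1 1 1];
beta6 = nlinfit(X, y, modelfun6, beta0); % typically warns that the iteration limit was reached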
huazai2020
2020-6-8
This model was made by myself. I have a question: how can we find the equation type before fitting?
the cyclist
2020-6-8
The best way to choose the equation type to fit is to have an understanding of the underlying process that generated the data. If there is a theoretical understanding of what that process is, then a particular equation will be appropriate. For an extremely simple example, if you are measuring the force on a spring, a simple one-parameter theoretical equation might just be
F = k * x
and then you could fit for k.
Most real-life examples aren't that easy, but the principle is the same.
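For instance, with measured displacements x and forces F (made-up numbers, just to show the mechanics), the one-parameter fit is a single least-squares solve:
x = [0.1 0.2 0.3 0.4 0.5].'; % displacements, hypothetical example values
F = [0.21 0.39 0.62 0.79 1.02].'; % measured forces
k = x \ F % least-squares estimate of the spring constant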
huazai2020
2020-6-8
Edited: huazai2020
2020-6-8
But it is not a simple equation as you said; what can I do? Is there any black box, such as machine learning, that can predict it?
the cyclist
2020-6-8
I'm not sure what you are trying to achieve here. With the two-parameter equation I used (based on your original equation), you seem to get a pretty good fit. How did you decide on your original 6-parameter equation? Was it pure guesswork and trial & error?
A machine-learning approach is probably not going to be very helpful for a problem with only 17 data points.
Without knowing a lot more about the actual meaning behind your data, it's not really possible to advise. Understanding the data themselves is always the key to building a useful statistical model.
Image Analyst
2020-6-8
"This is model is made by myself," <== well, take everyone's advice that you thought up a bad model. Sorry, but not every arbitrary model is a good choice for any arbitrary set of data. In fact your data basically looks random, and there are so few points (17 points), that it's probably pointless to even try to apply ANY model to it.
the cyclist
2020-6-9
It is not possible to give a simple answer to that question. How many data points are needed depends on a number of factors, such as the number of dimensions, how correlated the variables are, and so on. Also, I would not get hung up on machine learning vs. other statistical modeling approaches. They are often mathematically equivalent. (For example, one can build an ordinary linear regression via a "machine learning" algorithm.)
I am sorry if the following sounds harsh, but you are clearly very much a beginner in understanding mathematical modeling, and it is simply not possible to teach you everything you need to know, via this forum.
There are entire online courses devoted to how to think about problems like this. One of my favorites is Learning from Data.
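To illustrate the equivalence mentioned above (a sketch with made-up data, not part of the original comment): plain gradient descent on the squared error recovers the same coefficients as the closed-form linear regression:
rng(0) % reproducible synthetic data
x = rand(50,1); y = 2*x + 1 + 0.05*randn(50,1);
A = [ones(50,1), x];
w = zeros(2,1); % the "machine learning" route: gradient descent
for iter = 1:5000
    w = w - 0.1 * (A.' * (A*w - y)) / 50; % step down the squared-error gradient
end
[w, A\y] % the two solutions agree to high precision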
Stephen23
2020-6-9
"how many data should we have to do the machine learning?"
If you are looking for a simple number answer then you will be disappointed.
People have spent millions of human-hours researching and working in this field; if they had discovered a simple answer or formula, we would all be using it. Instead you actually have to learn about machine learning (or whatever your choice of data-fitting method is) and how its limitations relate to the kind of data you are working with:
Five independent variables for seventeen data points is just noise.
huazai2020
2020-6-11
What do you mean by "Five independent variables for seventeen data points is just noise"? I do not understand.
Rik
2020-6-11
You have way too little data to fit that many variables. Only if your data fits the function perfectly does that make any sense.
More Answers (0)