Find an approximation function z = f(x,y) for a 3-dimensional set of data

14 views (last 30 days)
I have a set of data (attached), where x is the first row, y is the first column, and z is the result of some function z = f(x,y). There is no exact function that would reproduce z in every cell given x and y. However, I would like to find the closest function, or several candidate functions, that produce very close results. Is this possible in MATLAB for a novice like me, and where should I look / what should I read to do this exercise? If it is not possible in MATLAB, that would also be very useful to know. Thank you!
2 Comments
Alex Sha 2024-2-4
Have a look at the function below:
Sum Squared Error (SSE): 5.61072195513791
Root of Mean Square Error (RMSE): 0.163846146106527
Correlation Coef. (R): 0.999574092032803
R-Square: 0.999148365463203
Parameter Best Estimate
--------- -------------
p1 -0.39149248886381
p2 -0.283129212082827
p3 -0.170132169616882
p4 0.00532452597021556
p5 0.429447957187154
p6 0.0427534991611015
p7 0.000160529102276903
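For reference, statistics like these can be reproduced for any candidate model. A minimal sketch, assuming the data has been loaded into z as in the answers below, and where zfit is a placeholder for the candidate model's predictions on the same grid as z:
% Goodness-of-fit statistics for a candidate surface zfit compared with the data z
res  = z(:) - zfit(:);                       % residuals
SSE  = sum(res.^2)                           % Sum Squared Error
RMSE = sqrt(mean(res.^2))                    % Root of Mean Square Error
Cm   = corrcoef(z(:), zfit(:));
R    = Cm(1,2)                               % Correlation Coef.
R2   = 1 - SSE/sum((z(:) - mean(z(:))).^2)   % R-Square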


Answers (2)

John D'Errico 2024-1-30
Edited: John D'Errico 2024-1-30
This is almost always a difficult problem when you have no physical reasoning to suggest a valid model. That forces you to guess a model that would be consistent with your data. @Star Strider seems to have made a reasonable stab at a solution, so unless I can suggest better, I'm just adding this response with my own thoughts. So far, I don't know where I am going as I write this...
First, ALWAYS PLOT YOUR DATA!
A1 = readmatrix('Data Matlab.xlsx');
x = A1(1,2:end);      % first row: x values
y = A1(2:end,1);      % first column: y values
z = A1(2:end,2:end);  % body of the table: z = f(x,y)
surf(x,y,z)
xlabel X
ylabel Y
zlabel Z
What do I see there? For large y, the surface seems to be approaching a linear function of x. And for small y, the function has an apparently infinite slope. It seems to go to infinity.
Next, I spent some time rotating the plot around on my own, and for ANY value of y, the surface appears to be still virtually linear in x.
A model that would have a property like that might be something like this:
z = (a*x + b)*f(y)
So if y is fixed, then we would see a simple linear model in x. Does this seem to work? Pick a single value of y, for example y = 10.
y1 = y(1);
z1 = z(1,:);
P1 = fit(x',z1','poly1')
P1 = 
     Linear model Poly1:
     P1(x) = p1*x + p2
     Coefficients (with 95% confidence bounds):
       p1 =     0.00192  (0.001828, 0.002012)
       p2 =     -0.4429  (-0.6192, -0.2666)
P2 = fit(x',z1','poly2')
P2 = 
     Linear model Poly2:
     P2(x) = p1*x^2 + p2*x + p3
     Coefficients (with 95% confidence bounds):
       p1 =   1.744e-07  (1.731e-07, 1.756e-07)
       p2 =     0.00131  (0.001306, 0.001314)
       p3 =    -0.01787  (-0.02129, -0.01445)
plot(P1,'r')
hold on
plot(P2,'g')
plot(x,z1,'k:')
title("y = " + y(1))
xlabel X
ylabel 'Z(x,y==10)'
hold off
So a little bit of curvature, but not a lot. A straight line fit would not be bad, and a quadratic essentially nails it. The green curve pretty much overlaps on top of the data (in black).
Looking at the fit itself, the quadratic coefficient is clearly bounded away from zero. So at this point, I would tentatively suggest the model might look like
z = (a*x^2 + b*x + c)*f(y)
So we expect some pretty simple behavior as a function of x. Does this still follow for other values of y? y(14) == 0.5, so let's look there next.
P2 = fit(x',z(14,:)','poly2')
P2 = 
     Linear model Poly2:
     P2(x) = p1*x^2 + p2*x + p3
     Coefficients (with 95% confidence bounds):
       p1 =  -6.702e-07  (-6.872e-07, -6.532e-07)
       p2 =    0.006351  (0.006291, 0.006412)
       p3 =    -0.02908  (-0.07643, 0.01828)
plot(P2,'g')
hold on
plot(x,z(14,:),'k:')
title("y = " + y(14))
xlabel X
ylabel 'Z(x,y==0.5)'
hold off
Again, a quadratic model fits nicely, but here you should see the curve is concave down, whereas the curve was concave up for larger values of y. And that invalidates my simple idea of a separable model. But even so, the model is still very much only a quadratic in x. We would now be forced to accept a slightly more complex form for a model. Perhaps something like this:
z = a + f(y)*x + g(y)*x^2
where at least one of f(y) and g(y) has a singularity as y approaches zero. However, if this model applies, then g(y) does not have a singularity, because the coefficient of x^2 changes sign and goes negative.
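One quick way to see how f(y) and g(y) would need to behave is to fit the quadratic in x to every row of z and plot the fitted coefficients against y, along these lines (a sketch, reusing x, y, and z from above):
% Fit a quadratic in x to each row of z, then plot the coefficients vs. y
C = zeros(numel(y), 3);
for i = 1:numel(y)
    Pi = fit(x', z(i,:)', 'poly2');
    C(i,:) = coeffvalues(Pi);   % [x^2 coefficient, x coefficient, constant]
end
figure
semilogx(y, C, 'o-')
legend('x^2 coefficient', 'x coefficient', 'constant')
xlabel Y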
So next, let's fix x at some value. What happens there? When x==500, for example, I'll pick a fairly general model in y.
mdl = fittype('a + b*Y+c*(1./Y).^d','indep','Y')
mdl = 
     General model:
     mdl(a,b,c,d,Y) = a + b*Y+c*(1./Y).^d
ymdl = fit(y,z(:,1),mdl)
Warning: Start point not provided, choosing random start point.
ymdl = 
     General model:
     ymdl(Y) = a + b*Y+c*(1./Y).^d
     Coefficients (with 95% confidence bounds):
       a =   4.213e-10  (-4.768e-11, 8.904e-10)
       b =  -1.156e-11  (-4.592e-11, 2.281e-11)
       c =       2.106  (2.106, 2.106)
       d =      0.4915  (0.4915, 0.4915)
Ah. Do you see? There is no constant term. No linear term in y. And the other coefficients are quite tightly determined (look at the confidence bounds). If we try other fixed values for x, again this same pattern holds, although the exponent on y changes.
ymdl = fit(y,z(:,end),mdl)
Warning: Start point not provided, choosing random start point.
ymdl = 
     General model:
     ymdl(Y) = a + b*Y+c*(1./Y).^d
     Coefficients (with 95% confidence bounds):
       a =    1.04e-11  (-3.714e-11, 5.793e-11)
       b =  -8.891e-14  (-1.738e-12, 1.56e-12)
       c =       10.65  (10.65, 10.65)
       d =      0.2885  (0.2885, 0.2885)
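To see that pattern across all of the fixed x values at once, the same kind of fit can be repeated for every column of z. A sketch only: it keeps just the dominant c*(1/Y)^d term, and a rough StartPoint of [1 0.5] is assumed here to avoid the random-start warning.
% Fit c*(1/Y)^d to every column of z (each fixed x), then look at how
% the fitted c and d depend on x
mdl2 = fittype('c*(1./Y).^d','indep','Y');
cdpar = zeros(numel(x), 2);
for j = 1:numel(x)
    fj = fit(y, z(:,j), mdl2, 'StartPoint', [1 0.5]);
    cdpar(j,:) = [fj.c, fj.d];
end
figure
subplot(2,1,1)
plot(x, cdpar(:,1), 'o-')
ylabel('c')
subplot(2,1,2)
plot(x, cdpar(:,2), 'o-')
ylabel('d')
xlabel('x')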
Anyway, the model is now becoming clearer. A few more iterations, looking at what would be needed for a model, and you should be able to deduce very accurately what the model should be. However, since this is clearly a homework exercise, I won't go any further. (Why do I know that? Your data is TOO perfect. No noise at all.)

Star Strider 2024-1-30
The value of the surface along the ‘x’ axis appears to be increasing linearly, so one option would be to add a term specifically for that, in addition to a general exponential fit.
You will need to experiment; however, something like this should work for a start —
A1 = readmatrix('Data Matlab.xlsx');
x = A1(1,2:end);
y = A1(2:end,1);
z = A1(2:end,2:end);
xy = ndgrid(y.',x); % Grid Of 'y' Values Over The Surface (Independent Variable #1)
xm = repmat(x, numel(y), 1); % Grid Of 'x' Values (Independent Variable #2)
objfcn = @(b) xy.^b(1) + xy.^b(2) + xm.*b(3) + b(4); % Function To Approximate Observed Data
B0 = rand(4,1); % Initial Parameter Estimates
[B,rn] = fminsearch(@(b)norm(z - objfcn(b)), B0 )
Exiting: Maximum number of function evaluations has been exceeded - increase MaxFunEvals option. Current function value: 19.198476
B = 4×1
   -0.7376
   -0.7505
    0.0026
    0.0364
rn = 19.1985
figure
surfc(x, y, z)
hold on
mesh(x, y, objfcn(B))
hold off
colormap(turbo)
grid on
xlabel('x')
ylabel('y')
zlabel('z')
title('Original Surface & Estimate Plotted Together')
figure
surfc(x, y, z-objfcn(B))
Ax = gca;
colormap(turbo)
grid on
xlabel('x')
ylabel('y')
zlabel('z')
title('Error Plot')
This is not perfect; however, it appears to have captured the essential trends. I use fminsearch here because it does a derivative-free optimisation. A more robust option would be to use ga or other Global Optimization Toolbox functions to estimate its parameters.
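For example, a ga-based variant might look like the sketch below. It requires the Global Optimization Toolbox, and the parameter bounds lb and ub are only illustrative guesses that may need adjusting.
% Estimate the same parameters with ga instead of fminsearch
rng default                                          % reproducible run
lb = [-2 -2 -1 -1];                                  % guessed lower bounds
ub = [ 2  2  1  1];                                  % guessed upper bounds
[Bga, rnga] = ga(@(b) norm(z - objfcn(b)), 4, [], [], [], [], lb, ub)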
