# From a set of curves, how to find the one that is "most similar" to the reference curve?

NIvvei on 10 Oct 2019
Edited: Matt J on 11 Oct 2019
I have one reference curve and a set of curves from simulated results. These curves have different shapes and different scales. My objective is to find the one curve that is "most similar" to the reference curve. I have two related questions:
1. My objective is to find a curve that "looks the most similar" to the reference, so what would be a good metric to define such "similarity"?
2. What functions or toolboxes can help do this job easily?
For example, in the following picture, the green curve is the reference and the blue curve is one of many candidates. These candidates may have very different shapes. Even if a candidate has a shape similar to the reference curve, its scale may be different. So basically I'll need to "squeeze" or "stretch" the candidate curves and then test their similarity to the reference curve. (figure borrowed from https://stackoverflow.com/questions/9119316/how-to-find-out-the-scaling-factors-to-match-two-curves-in-matlab)
Any hint is appreciated, thank you very much!

Rik on 10 Oct 2019
Apart from all the 'boring' strategies of using tried and tested goodness-of-fit parameters (many of which are implemented in base Matlab and the stats toolbox), I have a fun suggestion:
For all the points in one curve, find the distance to the closest point on the other curve. Then take either the median or the maximum of these distances, and you have a GoF parameter that is relatively insensitive to a shift in x and to scaling. This should give results that track how visually similar two curves are.

NIvvei on 11 Oct 2019
One issue I have in implementing what you suggested is that these curves have different numbers of data points. For example, the reference curve has 100 points while every candidate curve has 120 points. In this case, I am not sure what you mean by "closest point"?
Rik on 11 Oct 2019
pdist2 or a similar function could do the trick. I'm currently on mobile, so I can't code an example for you.
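A minimal sketch of the closest-point idea with pdist2 might look like the following. It assumes the Statistics and Machine Learning Toolbox is available; the variable names and the example data (a 100-point reference and a 120-point candidate) are hypothetical and only illustrate that the two curves do not need the same number of points.

```matlab
% Hypothetical example data: a 100-point reference and a 120-point candidate.
xRef  = linspace(0, 2*pi, 100).';
yRef  = sin(xRef);
xCand = linspace(0, 2*pi, 120).';
yCand = sin(xCand) + 0.1;            % candidate offset by 0.1 in y

% D(i,j) is the Euclidean distance from candidate point i to reference point j.
D = pdist2([xCand, yCand], [xRef, yRef]);

% For each candidate point, the distance to its closest reference point.
dClosest = min(D, [], 2);

% Two possible goodness-of-fit summaries, as suggested above.
gofMedian = median(dClosest);
gofMax    = max(dClosest);
```

Because the distances mix x and y units, it may be worth rescaling both axes to comparable ranges first if they differ a lot; whether that is appropriate depends on the data.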

Matt J on 10 Oct 2019
Edited: Matt J on 10 Oct 2019
The code below uses fminspleas (a MATLAB File Exchange submission) to do a simple 1D registration of the two curves. The final registration error can be used to measure the similarity of x2,y2 with the reference x1,y1. You can repeat this comparison with as many data sets {x2,y2} as needed.
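% Interpolant of the reference curve
F=griddedInterpolant(x1,y1,'cubic');
% Reference evaluated on stretched/shifted x: p(1) = stretch, p(2) = shift
fun=@(p,xdata) F(xdata/p(1)-p(2));
% Initial guesses from the x-ranges and means of the two curves
a0=(max(x2)-min(x2))/(max(x1)-min(x1));
t0=mean(x2/a0)-mean(x1);
% Fit the nonlinear stretch/shift (pr) and the linear amplitude (Ar)
[pr,Ar]=fminspleas({fun},[a0,t0],x2,y2);
efun=@(x) Ar*fun(pr,x); % deformed reference
plot(x2,y2,'x',x2,efun(x2));legend('test','deformed reference')
shg
registrationError=norm(efun(x2)-y2)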
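To repeat the comparison over several candidates and pick the best one, a loop along the following lines could be used. This is only a sketch under stated assumptions: it swaps fminspleas for base-MATLAB fminsearch (solving the linear amplitude with a least-squares backslash inside the cost), uses 'pchip' interpolation with 'nearest' extrapolation instead of 'cubic', and the `candidates` cell array and its example data are hypothetical.

```matlab
% Reference curve (hypothetical data, echoing the sinusoid used in this thread).
x1 = linspace(0, 11, 100);
y1 = sin(x1);

% Hypothetical candidates, one per row: {x, y}.
candidates = { 1.1*(x1 + 0.5), 1.5*y1;    % stretched, shifted, rescaled reference
               x1,             x1/20 };   % linear ramp

% Interpolant of the reference; 'nearest' extrapolation keeps values bounded
% when a trial stretch/shift maps points outside the reference range.
F = griddedInterpolant(x1, y1, 'pchip', 'nearest');

nCand = size(candidates, 1);
errs  = zeros(nCand, 1);
for k = 1:nCand
    x2 = candidates{k,1}(:);
    y2 = candidates{k,2}(:);

    % Deformed reference for p = [stretch, shift].
    model = @(p) F(x2/p(1) - p(2));
    % For fixed p the best amplitude is a linear least-squares solve, so the
    % search only runs over the two nonlinear parameters.
    cost = @(p) norm(model(p)*(model(p) \ y2) - y2);

    % Same initial guess as in the code above.
    a0 = (max(x2) - min(x2)) / (max(x1) - min(x1));
    t0 = mean(x2/a0) - mean(x1);

    pr = fminsearch(cost, [a0, t0]);
    errs(k) = cost(pr);               % registration error of candidate k
end

[~, bestIdx] = min(errs);             % candidate with the smallest error
```

Since the error is an unnormalized norm, it grows with the amplitude and the number of points of each candidate; dividing by `norm(y2)` is one possible way to make errors more comparable across candidates, if that matters for the data at hand.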

#### 1 Comment

Matt J on 11 Oct 2019
For example, here is code for the comparison of a reference sinusoid to 2 different test curves. The first is a stretched/shifted version of the reference, while the second is just a linear ramp.
close all
% Reference curve
x1=linspace(0,11,100);
y1=sin(x1);
A=1.5; a=1.1; t=0.5; %stretch parameters
% Test curve 1: stretched/shifted sinusoid; test curve 2: linear ramp
x2=a*(linspace(3,12,150)+t); y2=A*sin(x2);
x3=x1; y3=x3/20;
reg1d(x1,y1,x2,y2)
reg1d(x1,y1,x3,y3)
function reg1d(x1,y1,x2,y2)
% Register the reference (x1,y1) to the test curve (x2,y2) and plot the result
F=griddedInterpolant(x1,y1,'cubic');
% p(1) = stretch, p(2) = shift
fun=@(p,xdata) F(xdata/p(1)-p(2));
flist={fun};
% Initial guesses from the x-ranges and means of the two curves
a0=(max(x2)-min(x2))/(max(x1)-min(x1));
t0=mean(x2/a0)-mean(x1);
% Fit the nonlinear stretch/shift (pr) and the linear amplitude (Ar)
[pr,Ar]=fminspleas(flist,[a0,t0],x2,y2);
efun=@(x) Ar*fun(pr,x); % deformed reference
registrationError=norm(efun(x2)-y2)
figure;
subplot(2,1,1)
plot(x1,y1,'x',x2,y2);
title 'INITIAL';legend('reference','test');
set(gca,'FontSize',15);
subplot(2,1,2)
plot(x2,efun(x2),'x', x2,y2);
title("ALIGNED: Error = "+registrationError);
legend('deformed reference','test');
set(gca,'FontSize',15);
end
We see that the first data set does well in terms of fitting error. The linear ramp does not do as well. 