Spline vs Linear Interpolation
Hello, I have a question: which is better for interpolating spectral data, linear or spline interpolation? Statistically, how do I compare the performance of spline and linear interpolation on the spectral data? In other words, how do I estimate the error associated with each method so I can choose the best one for my application? Thanks!
Accepted Answer
John D'Errico
2022-5-29
Edited: John D'Errico
2022-5-29
This is not remotely a question about MATLAB. I'll answer it only because I have some time, and because I do understand interpolation reasonably well. :)
Which interpolant is better? Both have their advantages. So asking which is better is like asking if an apple is better than an orange.
A linear interpolant has the advantage that it is shape preserving in one respect: it will never exceed the range of your data. And since spectra tend to be peaky, and then rapidly dive down to zero, a classic spline can often be problematic here, since a spline will probably oscillate above and below zero. That is meaningless behavior when used to interpolate an always positive spectral curve. As such, a linear interpolant can be viewed as the better method there.
At the same time, since spectra will be sampled at a finite set of points that are often too coarse to approximate the peaks of your curve well, a linear interpolant will be poor there, since a linear interpolant is just a straight-line, connect-the-dots interpolant.
If you read the last two paragraphs, I am telling you that a spline is better near the peaks of your spectra, while the linear interpolant is better near the baseline. You can't win here. Or, said differently, they both suck.
Statistically, there is nothing you can say, since statistics tells you nothing about an interpolatory curve fit. So I have no idea what you are asking there. Well, with one caveat. I once put together an analysis showing that in the presence of significant noise on a curve, a linear interpolant can be a lower variance predictor than a cubic spline. But that applies only to a curve where the noise really is pretty significant. Then the spline will have oscillations in it, because the spline will also be interpolating the noise. Again, this requires that the noise is significant, and spectra are typically measured carefully enough that the noise is not that large a component. So a spline will usually win in this respect on spectra.
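The variance claim can be illustrated with a small sketch. This is not the analysis mentioned above, just one way to see the effect, and it assumes independent noise of equal variance on every sample. Because both linear and (not-a-knot) spline interpolation are linear in the data values, the predicted value at a query point is a weighted sum of the samples, and the sum of squared weights tells you how much of the noise variance ends up in the prediction. The grid and the variable names are illustrative choices.

```matlab
% Sample locations and query points midway between samples
x  = linspace(0,2.5,40);
xq = (x(1:end-1) + x(2:end))/2;
n  = numel(x);
I  = eye(n);

% Interpolate unit vectors to recover the weight each sample
% gets in the prediction at each query point
Wlin = zeros(numel(xq),n);
Wspl = zeros(numel(xq),n);
for j = 1:n
    Wlin(:,j) = interp1(x,I(j,:),xq,'linear');
    Wspl(:,j) = interp1(x,I(j,:),xq,'spline');
end

% Variance of the prediction divided by the noise variance
% (only valid under the iid noise assumption stated above)
ampLin = sum(Wlin.^2,2);
ampSpl = sum(Wspl.^2,2);
fprintf('mean variance factor, linear: %g\n', mean(ampLin));
fprintf('mean variance factor, spline: %g\n', mean(ampSpl));
```

This looks only at variance. It says nothing about bias, which is where the spline gains on smooth, low-noise curves.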
As an example of what I am saying, consider the following vaguely spectral curve:
F = @(x) x.*(sin(5*x.^2) + 1).^4;
fplot(F,[0,2.5],'g--')
hold on
xint = linspace(0,2.5,40);
plot(xint,F(xint),'ro')
hold off
The dashed green curve above will serve as ground truth, with the red circles as the sampled points. Just pretend it is a spectrum for something. I chose it because it has peaks of varying widths, and because it then dives down to zero, where it will be flat, but not go below zero. I probably should have chosen a curve where some of the peaks come near each other, to partially overlap. So sue me. This is good enough.
But now plot the linear interpolant on that set of points. (Note that I can just do connect the dots as plot does for me.)
plot(xint,F(xint),'ro',xint,F(xint),'b-')
Do you accept that the curve, where the points straddle a peak, will approximate the spectrum poorly? At the same time, a linear interpolant will NEVER pass below zero. What does a spline do here?
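xf = linspace(0,2.5,1000);      % fine grid for evaluating the interpolant
spl = spline(xint,F(xint));     % cubic spline in pp form
% ppval is in base MATLAB; fnval (Curve Fitting Toolbox) gives the same values here
plot(xint,F(xint),'ro',xf,ppval(spl,xf),'b-')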
hold on
fplot(F,[0,2.5],'g--')
hold off
So the green dashed line is ground truth here. The spline is smoother, but it still fails terribly. It oscillates in places where it should never do so, going below zero, which is meaningless. It does a little better at hitting the peaks, but it still performs poorly where the peaks are sharp and poorly approximated by only a few scattered points.
You can do a little better if you use pchip as the interpolant, because it will never pass below zero. So the baseline oscillations go away. But pchip will now perform more poorly at the peaks, much like the linear interpolant there.
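pspl = pchip(xint,F(xint));     % shape-preserving piecewise cubic in pp form
% ppval is in base MATLAB; fnval (Curve Fitting Toolbox) gives the same values here
plot(xint,F(xint),'ro',xf,ppval(pspl,xf),'b-')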
hold on
fplot(F,[0,2.5],'g--')
hold off
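If you would rather check the baseline behavior numerically than by eye, a minimal sketch along these lines compares the smallest value each interpolant produces on the same test curve. It uses interp1 with its 'linear', 'spline' and 'pchip' methods instead of the pp forms, which is just a convenience choice.

```matlab
% Same test curve and sample points as in the answer
F    = @(x) x.*(sin(5*x.^2) + 1).^4;
xint = linspace(0,2.5,40);
yint = F(xint);
xf   = linspace(0,2.5,1000);

% Evaluate all three interpolants on the fine grid
ylin = interp1(xint,yint,xf,'linear');
yspl = interp1(xint,yint,xf,'spline');
ypch = interp1(xint,yint,xf,'pchip');

% Smallest interpolated value: a negative number means the
% curve dips below the nonnegative data
fprintf('min linear: %g\n', min(ylin));
fprintf('min spline: %g\n', min(yspl));
fprintf('min pchip : %g\n', min(ypch));
```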
You can't really win for trying here. Sorry. Is there a decent compromise? There are tricks using log transformations I've tried in the past. But they tend to be difficult to get to work perfectly. I recall that long ago, in a galaxy far, far away, I wrote an interpolant specifically designed to work well on spectra, that had the good behaviors of a cubic spline near the peaks, but the good behaviors of pchip near the valleys. That code lives in another universe though - the universe of APL. That also means it was written close to 40 years ago.
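For what it is worth, here is one possible reading of a log-transformation trick. It is only a sketch and not the method referred to above: spline the logarithm of the data, then exponentiate. The offset delta is a hypothetical tuning choice, needed because the test curve touches zero.

```matlab
% Same test curve and sample points as in the answer
F    = @(x) x.*(sin(5*x.^2) + 1).^4;
xint = linspace(0,2.5,40);
yint = F(xint);
xf   = linspace(0,2.5,1000);

% Hypothetical offset so that log() stays finite where the data hit zero
delta = 1e-3*max(yint);

% Spline in the log domain, then transform back
logspl = spline(xint,log(yint + delta));
ylog   = exp(ppval(logspl,xf)) - delta;   % always greater than -delta
ylog   = max(ylog,0);                     % clip the small residual undershoot

fprintf('min of log-domain spline: %g\n', min(ylog));
```

The result cannot go meaningfully negative, but oscillations in the log domain become multiplicative errors after exponentiation, so the peaks can be distorted, and the answer depends on delta. That fits the remark that such tricks are hard to get to work perfectly.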
4 Comments
John D'Errico
2022-5-30
Again, I don't know what you mean by estimating the error. Unless you know the true "function" behind your data, you cannot compute the error for an interpolant. If you did have a known function behind those spectra, then you might do something like compute the integral of the square of the error over some interval. But you don't have it.
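When a true function is available, as with the test curve from my answer, the integral of the squared error could be computed roughly as follows. This is a sketch that assumes interp1 and integral from base MATLAB; passing the sample points as 'Waypoints' is my own choice, made because the interpolants have kinks or derivative jumps there.

```matlab
% Known "truth" and the sampled points from the answer
F    = @(x) x.*(sin(5*x.^2) + 1).^4;
xint = linspace(0,2.5,40);
yint = F(xint);

names = {'linear','spline','pchip'};
ise   = zeros(1,numel(names));
for k = 1:numel(names)
    % Squared difference between interpolant and truth
    sqerr  = @(x) (interp1(xint,yint,x,names{k}) - F(x)).^2;
    % Break the integration at the interior sample points
    ise(k) = integral(sqerr,0,2.5,'Waypoints',xint(2:end-1));
    fprintf('%-6s integrated squared error: %g\n', names{k}, ise(k));
end
```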
In your case, you will have spectra, no more. So you don't have ground truth available. All you have is a list of points. At best, you can decide which behavior you are willing to tolerate for an interpolant in general. I've shown you the different basic behaviors, so you can expect what will happen.
Having said all of that, I suppose you could do things like take every other point, resampling the spectra to be twice as coarse. Then predict the spectra at the missing points, and see what you give up. But that still misses an important thing. A spline will predict NEGATIVE values for the spectra on the baseline, unless you use pchip. Go back and look at the oscillatory behavior I show on the baseline for a cubic spline. That is probably a BAD thing when trying to interpolate spectra. But splines exhibit that sort of thing as a classic behavior. (Think of this as a variation of the Gibbs phenomenon.) I suppose you can use the max function to delete any such negative lobes in the predicted interpolation.
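A rough sketch of that every-other-point check follows. The test curve stands in for a measured spectrum here, so replace x and y with your own data. Leaving out the last point when it would need extrapolation, and reporting an RMS error, are my assumptions about the details.

```matlab
% Stand-in for a measured spectrum (replace with your own x, y)
F = @(x) x.*(sin(5*x.^2) + 1).^4;
x = linspace(0,2.5,40);
y = F(x);

keep = 1:2:numel(x);      % points used to build the interpolant
held = 2:2:numel(x)-1;    % held-out points that lie between kept points

names = {'linear','spline','pchip'};
rmse  = zeros(1,numel(names));
for k = 1:numel(names)
    yhat    = interp1(x(keep),y(keep),x(held),names{k});
    yhat    = max(yhat,0);   % clip any negative lobes, as suggested
    rmse(k) = sqrt(mean((yhat - y(held)).^2));
    fprintf('%-6s RMS error at held-out points: %g\n', names{k}, rmse(k));
end
```

Keep in mind this tests the interpolants at twice the spacing you will actually use, so it tends to be pessimistic.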
Another thing you might do is to use a leave one out cross-validation scheme. So drop out each point consecutively in a typical curve, then predict the dropped out point using the desired interpolation. Compute the sqrt of the sum of squares of all of those errors as some sort of measure of the ability of each method, on your specific data. I'll give you a hint: the spline will probably be better, as long as your spectra are relatively smooth. If noise corrupts your data, again, as I pointed out, then a piecewise linear interpolant can actually come out ahead as a lower variance estimator. But the spline still has some serious issues, as I explained in my answer.
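The leave-one-out scheme might look like the sketch below. Again the test curve stands in for real data. Skipping the two endpoints is an assumption on my part, made so that no prediction requires extrapolation.

```matlab
% Stand-in for a measured spectrum (replace with your own x, y)
F = @(x) x.*(sin(5*x.^2) + 1).^4;
x = linspace(0,2.5,40);
y = F(x);
n = numel(x);

names = {'linear','spline','pchip'};
score = zeros(1,numel(names));
for k = 1:numel(names)
    err = zeros(1,n-2);
    for i = 2:n-1                      % interior points only
        idx = [1:i-1, i+1:n];          % drop point i
        err(i-1) = interp1(x(idx),y(idx),x(i),names{k}) - y(i);
    end
    score(k) = sqrt(sum(err.^2));      % sqrt of the sum of squared errors
    fprintf('%-6s leave-one-out score: %g\n', names{k}, score(k));
end
```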
You might consider a tool like pchip as a reasonable compromise. It is not truly a classical spline in terms of C2 differentiability, but it has good behavior near the baseline, and the high degree of differentiability is wasted on a problem like this. The only problem is how spiky your spectra are.