# Curve fitting and scatter plots

Craig on 22 Sep 2013
Edited: Image Analyst on 24 Nov 2013
Hi all,
This may not be possible, but is there a function or some other way to fit a curve to a selected subset of points within a scatter plot? Specifically, I am looking to fit a curve around the 'lower boundary' of a scatter plot. For example, say I had a scatter plot of hundreds of points that made up a circle. Is there a way to fit a curve around the bottom perimeter?
Best

#### 1 Comment

dpb on 22 Sep 2013
How do you define what is the "bottom perimeter", specifically?
One (probably crude) choice would be to pick all values in the data set whose y-values<=mean(y)
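A minimal sketch of that crude selection, assuming synthetic circle data like the example in the question (the data and the variable names `isLower`, `xLower`, `yLower` are illustrative only, not from Craig's data set):

```matlab
% Synthetic noisy circle (illustrative data only).
theta = linspace(0, 2*pi, 500);
radius = 5 + 0.5*rand(size(theta));
x = 12 + radius .* cos(theta);
y = 10 + radius .* sin(theta);

% Crude "bottom perimeter": keep the points at or below the mean y-value.
isLower = y <= mean(y);
xLower = x(isLower);
yLower = y(isLower);

% Show the selected points on top of the full scatter.
plot(x, y, 'b.', xLower, yLower, 'ro');
axis equal; grid on;
legend('All points', 'y <= mean(y)');
```

The logical mask `isLower` can then be used to index the arrays passed to whatever fitting routine you prefer.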

Image Analyst on 22 Nov 2013
Craig, try this to see if it's what you want. I create a bunch of noisy points around a circle. Then I select the bottom half of those. Then I fit a circle through the points, using only those points in the bottom half of the circle to determine the fitted equation. I think that's what you've been saying you want. Save the following in test.m and run it.
function test
clc; % Clear the command window.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
xCenter = 12;
yCenter = 10;
theta = 0 : 0.01 : 2*pi;
radius = 5 + 2*rand(1, length(theta));
x = radius .* cos(theta) + xCenter;
y = radius .* sin(theta) + yCenter;
plot(x, y, 'b*');
axis square;
xlim([0 20]);
ylim([0 20]);
grid on;
% Enlarge figure to full screen.
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
% Get the bottom half of them
selector = y < yCenter;
bottomHalfY = y(selector);
bottomHalfX = x(selector);
hold on;
plot(bottomHalfX, bottomHalfY, 'ro');
% Fit a circle to the bottom half noisy data:
[xc,yc,R,a] = circfit(bottomHalfX, bottomHalfY) % Use only the bottom-half points.
xFit = R .* cos(theta) + xc;
yFit = R .* sin(theta) + yc;
plot(xFit, yFit, 'r-', 'LineWidth', 3);
function [xc,yc,R,a] = circfit(x,y)
%CIRCFIT Fits a circle in x,y plane
% http://matlab.wikia.com/wiki/FAQ#How_can_I_fit_a_circle_to_a_set_of_XY_data.3F
% [XC, YC, R, A] = CIRCFIT(X,Y)
% Result is center point (xc,yc) and radius R. A is an optional
% output describing the circle's equation:
%
% x^2+y^2+a(1)*x+a(2)*y+a(3)=0
% by Bucher izhak 25/oct/1991
n=length(x); xx=x.*x; yy=y.*y; xy=x.*y;
A=[sum(x) sum(y) n;sum(xy) sum(yy) sum(y);sum(xx) sum(xy) sum(x)];
B=[-sum(xx+yy) ; -sum(xx.*y+yy.*y) ; -sum(xx.*x+xy.*y)];
a=A\B;
xc = -.5*a(1);
yc = -.5*a(2);
R = sqrt((a(1)^2+a(2)^2)/4-a(3));

Craig on 23 Nov 2013
This was food for thought, so thanks for your time. It has given me an idea of how to achieve what I need. I think basically I need a way to find the minimum y value in my plot at intervals along the x-axis (if my x-axis goes from 0 to 100, I would think 10 points would suffice). Then I could use the selector to highlight these points and fit the curve around them.
Any ideas?
Image Analyst on 23 Nov 2013
Assuming you have xData and yData, try this:
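edges = 0 : 10 : 100; % Segment edges: 0-10, 10-20, ..., 90-100.
minY = nan(1, length(edges) - 1); % NaN marks a segment with no points.
for k = 1 : length(edges) - 1
% Points with edges(k) <= x < edges(k+1).
indexes = xData >= edges(k) & xData < edges(k + 1);
if any(indexes)
minY(k) = min(yData(indexes));
end
end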
Now minY has the min y value for each segment 0-10, 10-20, 20-30, etc.
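To go from the per-segment minima to a fitted lower curve, one possible sketch follows. It assumes made-up data, also records the x-location of each minimum (`minX`, a name introduced here), and uses a quadratic purely as an example order:

```matlab
% Illustrative data only: a noisy cloud sitting above a parabola.
xData = 100 * rand(1, 5000);
yData = 0.01*(xData - 50).^2 + 20*rand(1, 5000);

% Minimum y (and the x where it occurs) in each 10-unit segment.
edges = 0 : 10 : 100;
numBins = length(edges) - 1;
minX = nan(1, numBins);
minY = nan(1, numBins);
for k = 1 : numBins
    inBin = find(xData >= edges(k) & xData < edges(k + 1));
    if ~isempty(inBin)
        [minY(k), iMin] = min(yData(inBin));
        minX(k) = xData(inBin(iMin));
    end
end

% Fit a polynomial through the minima, skipping any empty segments.
valid = ~isnan(minY);
coeffs = polyfit(minX(valid), minY(valid), 2);  % Order 2 is an assumption.
xFit = linspace(0, 100, 200);
yFit = polyval(coeffs, xFit);

plot(xData, yData, 'b.', minX, minY, 'ro', xFit, yFit, 'r-');
```

Narrower segments follow the lower edge more closely but are more sensitive to individual outliers.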
Craig on 24 Nov 2013
You are good. :) Thanks a lot.

Image Analyst on 22 Sep 2013
If you know the indexes of those points, then, sure. Just put the arrays with those indexes into polyfit() or whatever you're using.
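As a small illustration of fitting only the indexed points, here is a sketch with made-up data. The index list `idx` is hypothetical, and the quadratic order is chosen only because the sample data is quadratic:

```matlab
% Illustrative data only; idx is a hypothetical list of chosen points.
x = 0 : 0.5 : 10;
y = (x - 5).^2 + 3;
idx = [3 7 11 15 19];                 % Indexes of the points to fit.

coeffs = polyfit(x(idx), y(idx), 2);  % Fit using only the chosen points.
yFit = polyval(coeffs, x);            % Evaluate over the full x range.
```

A logical mask (for example `y < someThreshold`) works in place of `idx` in exactly the same way.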

SCADA Miner on 13 Nov 2013
Hi Craig, did you ever figure out how to do this? I want to do the same thing. I have a bunch of measurements which form a scatter plot, and I want to fit a curve which is the lower bound for, say, 95% of those points. I can think of a really inefficient way to do it, but surely something already exists? Cheers, Tom

Image Analyst on 13 Nov 2013
Why can't you also just use polyfit()? Then calculate your residuals by subtracting the fit from the data and sorting and taking the lowest 5%. I mean, that's the simple, intuitive way that I'd use. Would that work for you?
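A rough sketch of that approach, using made-up data and a first-order fit. The final step, shifting the fit down by the 5% residual level to get a lower bound, is one possible reading of the goal and an assumption rather than something stated in this thread:

```matlab
% Illustrative data only: a linear trend with scatter.
x = linspace(0, 100, 2000);
y = 0.5*x + 10 + 5*randn(size(x));

order = 1;                              % Polynomial order is your choice.
coeffs = polyfit(x, y, order);
residuals = y - polyval(coeffs, x);     % Data minus fit.

% Sort ascending; the first 5% are the points furthest below the fit.
[sortedRes, sortIdx] = sort(residuals);
numLow = ceil(0.05 * numel(x));
lowIdx = sortIdx(1 : numLow);

% Assumed lower bound: shift the fit down so that roughly 95% of the
% points lie on or above it.
offset = sortedRes(numLow);
yLower = polyval(coeffs, x) + offset;
```

Here `lowIdx` holds the indexes of the lowest 5% of points, and `yLower` is the shifted curve.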
Craig on 22 Nov 2013
I think the difficulty is that I am working with a scatter plot containing upwards of 200,000 data points, meaning my polynomial could be of order 10 or higher. I am not too familiar with polyfit. Could you explain how I could use it here?
dpb on 22 Nov 2013
The polynomial is only of the order you select.
Fit the data; if there's any way to know a priori to subset it at least some beforehand that would be a_good_thing (tm) but not mandatory. Then evaluate the result over the same fitted range, compute the residuals and as IA says, find the smallest group. You could then selectively refit those to improve the fit w/o the others polluting the estimates.
See
doc polyfit % and friends for more details
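The fit, residual, and refit sequence described above might look like the following sketch. The data is made up, and the order (2) and the 5% cutoff are assumptions picked for illustration:

```matlab
% Illustrative data only: points scattered above a quadratic floor.
x = linspace(0, 100, 5000);
y = 0.01*(x - 50).^2 + 5 + 10*rand(size(x));

order = 2;                      % You select the order; 2 is assumed here.
p1 = polyfit(x, y, order);      % First pass: fit all of the data.
res = y - polyval(p1, x);       % Residuals over the same fitted range.

% Keep the 5% of points with the most negative residuals...
[~, sortIdx] = sort(res);
keep = sortIdx(1 : ceil(0.05 * numel(x)));

% ...and refit using only those, so the upper points no longer
% influence the coefficient estimates.
p2 = polyfit(x(keep), y(keep), order);
```

The number of data points does not change the polynomial order; `polyfit` always returns `order + 1` coefficients regardless of how many points are passed in.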