How can I restrict data arrays to do a linear regression between 2 points?
I would like to do a linear regression using polyfit, but only on part of the dataset. I have 2 arrays, Wavelength (x axis) and Flux (y axis). I would like to regress the data in the range Wavelength > 1515 & Wavelength < 1750, and then find the slope of the trend line through the fluxes (y values) in this range. I do not know how to restrict my data set in this way (without importing the data again!). I tried scaling my axes, but the polyfit function still considered all values in my dataset.
Here is what I have so far:
if true
% code
%%Initialize variables.
filename = '/Users/lexiwilson/Documents/SURF/DataIrradiance/DEC/WSD_26DEC/WAIS1226201500166.asd.irr.pco.txt';
delimiter = {'\t',' '};
startRow = 39;
datetime = strcat('/Users/lexiwilson/Documents/SURF/DataIrradiance/DEC/WSD_26DEC/','122615_','00:33:56');
%%Read columns of data as strings:
% For more information, see the TEXTSCAN documentation.
formatSpec = '%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
% This call is based on the structure of the file used to generate this
% code. If an error occurs for a different file, try regenerating the code
% from the Import Tool.
textscan(fileID, '%[^\n\r]', startRow-1, 'ReturnOnError', false);
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
% Replace non-numeric strings with NaN.
raw = [dataArray{:,1:end-1}];
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Replace non-numeric cells with NaN
R = cellfun(@(x) ~isnumeric(x) && ~islogical(x),raw); % Find non-numeric cells
raw(R) = {NaN}; % Replace non-numeric cells
%%Allocate imported array to column variable names
Wavelength = cell2mat(raw(:, 1));
Flux = cell2mat(raw(:, 2));
%%Plot wavelength vs irradiance
figure()
plot(Wavelength, Flux);
title(filename);
xlabel('Wavelength (nm)');
ylabel('Irradiance (W/m^2)');
axis([350,2200,-0.5,2]);
%zoom to 1.6 micron window
figure()
plot(Wavelength, Flux);
title(filename);
xlabel('Wavelength (nm)');
ylabel('Irradiance (W/m^2)');
axis([1374,1838,-0.05,0.15]);
Ystartindx = find(Wavelength == 1515); %index of wavelength = 1515nm
Ystart = Flux(Ystartindx); %corresponding flux
Yendindx = find(Wavelength == 1750); %index of wavelength = 1750nm
Yend = Flux(Yendindx);%corresponding flux
hold on;
%make linear fit and print slope to console
waverange = find(Wavelength > 1515 & Wavelength < 1750);
fluxrange = find(Flux > Ystart & Flux < Yend);
P = polyfit(waverange,fluxrange,1);
fit = P(1)*waverange + P(2);
plot(waverange,fit,'k');
disp(P(1)); %print slope to console
%save plot in directory as jpeg
%saveas(gcf,datetime,'jpeg');
%%Clear temporary variables
clearvars filename delimiter startRow formatSpec fileID dataArray ans raw numericData col rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me R;
end
The errors I get say that my arrays waverange and fluxrange are not the same size (which they aren't). How can I make them the same size, and restrict the X and Y values to a range in the middle of my data set?
Accepted Answer
Star Strider
2016-7-20
The waverange seems to be defining your data range, so use it for both, and use polyval to evaluate the fit:
%make linear fit and print slope to console
waverange = find(Wavelength > 1515 & Wavelength < 1750);
P = polyfit(Wavelength(waverange),Flux(waverange),1);
fit = polyval(P, Wavelength(waverange));
See if that works.
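For reference, the size mismatch in the original code comes from the two find calls: each returns index positions (not data values), and because they test different conditions they generally return different numbers of indices. Below is a minimal, self-contained sketch of the windowed fit using a single logical mask for both arrays. The data here are synthetic stand-ins (a made-up slope and a 1 nm wavelength grid are assumptions, not values from the question), and the names inWindow, xFit, yFit, and trend are illustrative only. The result is called trend rather than fit because, if the Curve Fitting Toolbox is installed, a variable named fit would shadow its fit function.

```matlab
% Synthetic stand-in data (hypothetical; substitute the imported columns).
Wavelength = (350:2500)';                          % nm, 1 nm spacing assumed
trueSlope  = -2e-4;                                % made-up slope for this demo
Flux       = 0.5 + trueSlope*(Wavelength - 1515);  % exactly linear, no noise

% One logical mask selects the fit window. Indexing both arrays with the
% same mask guarantees the x and y inputs to polyfit have equal length.
inWindow = Wavelength > 1515 & Wavelength < 1750;
xFit = Wavelength(inWindow);
yFit = Flux(inWindow);

% First-degree polynomial fit: P(1) is the slope, P(2) is the intercept.
P = polyfit(xFit, yFit, 1);

% Evaluate the fitted line at the same wavelengths used for the fit.
trend = polyval(P, xFit);

disp(P(1));  % print slope to console
```

To overlay the line on the zoomed figure, something like plot(xFit, trend, 'k') after hold on should work, since xFit holds wavelength values rather than index positions.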