After importing data from Excel and plotting it, I need to subtract the noise, extract only the five peaks, and finally find the area under the peaks. Can anyone tell me what function to use?
I have attached a sample plot. The left plot was made in MATLAB, and I need to turn it into something like the plot on the right by reducing the noise. I then need to find the area under the five peaks.
Accepted Answer
Ryan Takatsuka
2018-7-20
You can probably apply a highpass filter to the data to isolate the peaks. This should remove the low frequency/offset of the plot, while allowing the quickly changing peaks to pass through unchanged.
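As a rough illustration of removing the slow offset, here is a minimal sketch that subtracts a moving-median baseline. This is a swapped-in stand-in for a true highpass filter, chosen because movmedian is part of base MATLAB. The synthetic signal, the peak shape, and the window length of 401 samples are all assumptions made for the example, not values taken from the attached data.

```matlab
% Synthetic stand-in for the measured signal (assumed shape, not the real data)
x = linspace(0, 10, 2001)';
baseline = 1000 + 50*x + 5*x.^2;          % slowly varying offset
peaks = zeros(size(x));
for c = [2 4 6 8]                         % assumed peak positions
    peaks = peaks + 2000 * exp(-(x - c).^2 / (2 * 0.1^2));
end
y = baseline + peaks;

% Estimate the slow component with a moving median whose window is much
% wider than a single peak, then subtract it from the signal
window = 401;                             % samples; an assumed value
trend = movmedian(y, window);
y_hp = y - trend;                         % peaks remain, offset is mostly removed
```

The window has to be clearly wider than the widest peak, otherwise the median follows the peaks and flattens them. The baseline that remains is small compared with the peak height but not exactly zero.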
Alternatively, you can locate the peaks of the data with something like:
[pks, locs] = findpeaks(data);
Because the peaks seem to have a consistent width, you can divide the data into small "subsections" and plot each individual subsection.
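To find the area under the curve, you can use a trapezoidal approximation with one of the following (x and y are the data vectors):
running_area = cumtrapz(x, y); % cumulative integral, same length as x
total_area = trapz(x, y); % total area as a single number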
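To show how the two functions relate, here is a small worked check on a curve whose exact area is known. The sine curve and the variable names are assumptions chosen for the example.

```matlab
% Area under sin(x) from 0 to pi; the exact value is 2
x = linspace(0, pi, 1001)';
y = sin(x);

total_area = trapz(x, y);          % one number for the whole curve
running_area = cumtrapz(x, y);     % running integral, one value per sample

% The last element of the running integral equals the total area
difference = abs(running_area(end) - total_area);
```

Use trapz when only the final number is needed, and cumtrapz when the integrated curve itself should be plotted.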
19 Comments
Joswin Leslie
2018-7-20
data = xlsread('P540.xlsx','Line1-5','B1:C9001');
x = data(:,1);
y = data(:,2);
plot(x,y,'r')
I used this function to import data and plot in MATLAB. I tried your code and I'm getting an error.
Also, my peaks will not be always of consistent width. I need to locate where the peaks are, and find the area under these peaks.
Can you please tell me why the error occurs?
Ryan Takatsuka
2018-7-20
It's not super clean, but this seems to work for isolating the peaks in the data.
Essentially there are a few steps:
- Identify the trend in the data and subtract it out.
- Find the peaks by detecting when the signal goes above a certain threshold. Assuming 'relatively' similar peak widths, the peaks can then be extracted. This should be an okay assumption because the detrended data is shifted to a mean of y = 0.
- Numerically integrate the data.
Hopefully this helps:
%% Import the data
data = xlsread('P540.xlsx', 'Line1-5', 'B1:C9001');
x = data(:,1);
y = data(:,2);
%% Get the trend
% Find the derivative of the signal
dy_dx = diff(y) ./ diff(x); % derivative
% Append a value to the end of the vector so the lengths remain the same,
% then square it so that steep rises and steep falls are both positive
dy_dx = [dy_dx; dy_dx(end)].^2;
% Find the points where the derivative is high. These are points where the signal is
% changing significantly, and they should not be used to calculate the trend.
ind = find(dy_dx>2e9);
% Get new x and y variables without the high derivative points
x_new = x;
y_new = y;
x_new(ind) = [];
y_new(ind) = [];
% Fit a polynomial to the data (this defines the trend)
p = polyfit(x_new, y_new, 2);
% Evaluate the trend on the full x vector and subtract it from the signal
y2 = polyval(p, x);
y2 = y-y2;
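%% Find the peaks
threshold = 500; % signal level that marks the start of a peak
i_peak_start = [];
i_peak_end = [];
peak_end_ind = 1;
while true
    % Offset of the next threshold crossing, counted from peak_end_ind
    next_cross = find(y2(peak_end_ind:end)>threshold, 1);
    if isempty(next_cross)
        break % no more peaks
    end
    % Start 50 samples before the crossing, but never before the previous
    % peak or before index 1 (this avoids zero or negative indices)
    peak_start_ind = max(next_cross - 50 + peak_end_ind, peak_end_ind);
    % Assume a peak width of 400 samples, clamped to the end of the data
    peak_end_ind = min(peak_start_ind + 400, numel(y2));
    i_peak_start(end+1) = peak_start_ind;
    i_peak_end(end+1) = peak_end_ind;
    if peak_end_ind == numel(y2)
        break % reached the end of the data
    end
end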
%% Create the clean version of the vectors
y_clean = zeros(size(y2));
for i=1:length(i_peak_start)
    y_clean(i_peak_start(i):i_peak_end(i)) = y2(i_peak_start(i):i_peak_end(i)) - y2(i_peak_start(i));
end
%% Calculate integral
y_int = cumtrapz(x, y_clean);
fprintf('The total area under the peaks: %0.5f \n', y_int(end))
%% Plots
figure
hold on
plot(x, y)
title('Original Data')
grid on
figure
plot(x, y_clean)
title('Peak-Only Data')
grid on
figure
plot(x, y_int)
title('Integrated-peak Data')
grid on
Joswin Leslie
2018-7-20
Thank you! This code really helped me. Instead of finding the total area under the curves, how can I find the area under each peak?
Image Analyst
2018-7-20
Edited: Image Analyst
2018-7-20
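If you have the Image Processing Toolbox it's trivial. Just sum the values of the array in each peak. Untested code:
binarySignal = y_clean > someSmallValue; % Threshold.
props = regionprops(binarySignal, y_clean, 'PixelValues');
peakAreas = zeros(1, length(props));
for k = 1 : length(props)
    % Sum only the samples that belong to peak k.
    % If x is uniformly spaced, multiply by the spacing to turn the sum into an area.
    peakAreas(k) = sum(props(k).PixelValues);
end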
By the way, I work daily with several spectroscopists doing this sort of signal and image processing.
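If the Image Processing Toolbox is not available, the same grouping of above-threshold samples can be sketched in base MATLAB with diff. The synthetic y_clean, the threshold value, and the names run_start and run_end are assumptions for illustration only.

```matlab
% Synthetic stand-in for x and y_clean: zeros with two small peaks
x = (0:0.5:9.5)';                    % uniform spacing of 0.5
y_clean = zeros(size(x));
y_clean(4:6) = [1; 2; 1];
y_clean(12:15) = [2; 4; 4; 2];
someSmallValue = 0.1;                % assumed threshold

% Mark samples above the threshold and find where each run starts and ends
mask = y_clean > someSmallValue;
d = diff([false; mask; false]);      % +1 at a run start, -1 just after a run end
run_start = find(d == 1);
run_end = find(d == -1) - 1;

% Integrate each run separately with the trapezoidal rule
peakAreas = zeros(numel(run_start), 1);
for k = 1:numel(run_start)
    idx = run_start(k):run_end(k);
    peakAreas(k) = trapz(x(idx), y_clean(idx));
end
```

Only samples above the threshold are integrated, so the small shoulders below the threshold are left out of each area.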
Ryan Takatsuka
2018-7-21
Because the peaks have already been separated, you can find the area under each peak by looping through all the peak start values (i_peak_start):
peak_areas = zeros(1, length(i_peak_start));
for i=1:length(i_peak_start)
    peak_areas(i) = trapz(x(i_peak_start(i):i_peak_end(i)), y_clean(i_peak_start(i):i_peak_end(i)));
end
This should create a vector, peak_areas, that contains the area under each peak.
Joswin Leslie
2018-7-23
Thanks Ryan. The above code really helped me to find the area of the five peaks. I have one more problem.
I have attached the Excel sheet. I was able to specify the number of rows in the code. That number of rows depends on where the value "0" appears in column A, and it will not always be constant. Similarly, I need to plot 9 more graphs for the values in column A ranging from 1 to 9. I need to find the area of all the peaks in all 10 plots.
In each of these 10 plots, there will be five peaks. I also need to find the average area of the 1st peak, the 2nd peak, and so on up to the 5th peak.
Ryan Takatsuka
2018-7-23
You can import the entire Excel spreadsheet, and then split the data up based on the value in column 1. The following should replace the first 4 lines of the original example code I provided. Essentially, it puts all of the data in a cell array, with each cell equal to one graph. By changing the value in graph_number different parts of the file will be plotted and analyzed. To fully automate this, the entire code can be placed in a for loop that iterates through each cell in the data variable.
%% Import the data
raw_data = xlsread('P540.xlsx', 'Line1-5');
% Find the row number where the value in column 1 changes
for i=1:9
    new_graph_index(i+1) = find(raw_data(:,1)==i,1);
end
% Manually add the row numbers for the beginning and end of the data.
% The last entry is one past the final row so that the final row is kept
% when the blocks below end at new_graph_index(k+1)-1.
new_graph_index(1) = 1;
new_graph_index(11) = size(raw_data,1) + 1;
% Split the raw data into a cell array for each graph
for k = 1:length(new_graph_index)-1
    data{k} = raw_data(new_graph_index(k):new_graph_index(k+1)-1,2:3);
end
% Select which graph to analyze
graph_number = 5;
% Set the x and y variables from the cell array
x = data{graph_number}(:,1);
y = data{graph_number}(:,2);
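The full automation described above (looping over every cell, then averaging each peak position across the graphs) could be sketched as follows. The synthetic data cell array, the fixed peak windows standing in for the peak-finding step, and the names all_areas and mean_areas are assumptions made for the example.

```matlab
% Synthetic stand-in for the cell array "data": 3 graphs with 2 peaks each
n_graphs = 3;
n_peaks = 2;
x = (0:20)';                                  % unit spacing
base = zeros(size(x));
base(3:7) = [0; 1; 2; 1; 0];                  % triangular peak, area 4
base(12:18) = [0; 1; 2; 3; 2; 1; 0];          % triangular peak, area 9
data = cell(1, n_graphs);
for g = 1:n_graphs
    data{g} = [x, g * base];                  % graph g is the base scaled by g
end

% Stand-in for the peak-finding step: here the peak windows are known
i_peak_start = [3 12];
i_peak_end = [7 18];

% One row per graph, one column per peak
all_areas = zeros(n_graphs, n_peaks);
for g = 1:n_graphs
    xg = data{g}(:,1);
    yg = data{g}(:,2);
    for p = 1:n_peaks
        idx = i_peak_start(p):i_peak_end(p);
        all_areas(g, p) = trapz(xg(idx), yg(idx));
    end
end

% Average of the 1st peak, 2nd peak, ... across all graphs
mean_areas = mean(all_areas, 1);
```

In a real run the peak windows would be recomputed for each graph inside the loop. Storing the areas in a matrix works only when every graph yields the same number of peaks.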
Joswin Leslie
2018-7-24
Thanks. But I don't necessarily need to plot the graphs (plotting is also fine). I need to find the area of all the peaks from all graphs and automate this. Then I need to find the average area of the peaks. For example, in this data there are 5 peaks in each graph. I need to find the average area of the first peak across all 10 plots, and similarly for the other 4 peaks.
Joswin Leslie
2018-8-9
Hey Ryan! I was trying this code on a different file, and I have attached the Excel file. I am getting an error on the line "data{k} = raw_data(new_graph_index(k):new_graph_index(k+1)-1,2:3);".
Can you please tell me why this error occurs? I have attached my complete code.
Ryan Takatsuka
2018-8-9
This is because it is trying to access the index "0" in the vector new_graph_index. Change the way the last point in the variable is calculated (lines 8-14) using something like this:
% Find the row number where the value in column 1 changes
for i=1:4
    new_graph_index(i+1) = find(raw_data(:,1)==i,1);
end
% Manually add the row numbers for the beginning and end of the data
% (one past the final row, so that the final row is kept)
new_graph_index(1) = 1;
new_graph_index(end+1) = size(raw_data,1) + 1;
This solves the index problem, but there seems to be a problem with the data itself. For example, look at row 958 in the Excel file. It looks like this data point is 10000X bigger than the rest of the data, causing the script to fail when analyzing it.
Joswin Leslie
2018-8-9
I'm sorry. I believe I attached the wrong excel file. I am attaching the new file now. I'm getting the following error:
Warning: Polynomial is badly conditioned. Add points with distinct X values, reduce the degree of the polynomial, or try centering and scaling as described in HELP POLYFIT.
> In polyfit (line 79)
In samples (line 45)
Array indices must be positive integers or logical values.
Error in samples (line 70)
y_clean(i_peak_start(i):i_peak_end(i)) = y2(i_peak_start(i):i_peak_end(i)) - y2(i_peak_start(i));
Can you please help me with this?
Ryan Takatsuka
2018-8-10
In the distance column, around lines 2597-2738, the numbers get extremely large.
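The polyfit warning quoted above suggests centering and scaling. A minimal sketch of that form of the call is shown here on synthetic data with large x values. The data and variable names are assumptions, and this only addresses the conditioning warning, not the outlying points in the file.

```matlab
% Synthetic data: an exact quadratic trend on x values near one million
x = 1e6 + (0:100)';
t = x - 1e6;
y = 3 + 2e-3*t + 1e-5*t.^2;

% Requesting the third output makes polyfit center and scale x internally;
% mu holds the mean and standard deviation of x
[p, ~, mu] = polyfit(x, y, 2);

% The same mu must be passed to polyval when evaluating the fit
y_trend = polyval(p, x, [], mu);
max_residual = max(abs(y - y_trend));
```

The coefficients in p then refer to the scaled variable (x - mu(1)) / mu(2), so they should not be compared directly with coefficients from an unscaled fit.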
Joswin Leslie
2018-8-10
They will. Is there any way possible to process it with these large numbers?
Ryan Takatsuka
2018-8-10
It'll be difficult to get any useful area information without a clean height vs. distance plot (it's hard to find the area under the curve without a reasonable looking curve).
More Answers (0)