Perform peak fitting, find peaks and label, and integrate certain areas

28 次查看(过去 30 天)
I have a data set where I need to do a baseline correction, and then integrate certain areas to calculate ratio between them printed as out put.
My baseline correction section seems to be working, but I do not get any peak results nor integration result. I think not finding peaks in the defined regions is the main issue.
I am sharing my code below, along with a sample data set.
% Load Raman data from a text file
data = readmatrix('path to my data.txt');
x = data(:, 1); % Wavenumbers
y = data(:, 2); % Intensity
% Baseline correction using the Asymmetric Least Squares (ALS) algorithm
baseline = als_baseline(y, 0.001, 10^6);
corrected_y = y - baseline;
% Find peaks in the spectrum- This section does not give any output
prominence_threshold = 120;
distance_threshold = 10;
peaks = findpeaks(corrected_y, 'MinPeakProminence', prominence_threshold, 'MinPeakDistance', distance_threshold);
% Perform peak fitting and integration
range1 = [2050, 2200];
range2 = [2800, 3000];
integrated_area1 = 0;
integrated_area2 = 0;
for i = 1:numel(peaks.loc)
position = x(peaks.loc(i));
intensity = corrected_y(peaks.loc(i));
if position >= range1(1) && position <= range1(2)
integrated_area1 = integrated_area1 + intensity;
end
if position >= range2(1) && position <= range2(2)
integrated_area2 = integrated_area2 + intensity;
end
end
result = integrated_area2 / ((integrated_area1 + integrated_area2) + (integrated_area1 + integrated_area2));
fprintf('Integration Result: %f\n', result);
% Define the ALS baseline correction function
function baseline = als_baseline(y, p, lam)
L = numel(y);
D = diff(speye(L), 2);
w = ones(L, 1);
for i = 1:10
W = spdiags(w, 0, L, L);
C = chol(W + lam * (D' * D));
z = C \ (C' \ (w .* y));
w = p * (y > z) + (1 - p) * (y < z);
end
baseline = z;
end
Can you please recommend a better way to do it? Or spotting of any code issues is much appreciated.

采纳的回答

Chiththaka Chaturanga
Hi,
Thank you so much for the update.
Both areas are having a broad peak (Sample_DataSet). Peak at 2050-2200 is smaller, while 2800-3000 is much bigger.
Yes, I want to measure the total area under the peak of selected areas, then do the following calculation from the results.
Int.1/(Int1+Int2)
result = integrated_area2 / (integrated_area1 + integrated_area2); I have updated this.
To double check, I did a calculation in excel adapting trapazoid rule, and got a very different result with the Sample_DataSet..!!
I also tried to do a peak fitting using the following code.
S = load('Sample_Raman_Data2.mat');
% Perform Lorentzian peak fitting.
S.Fit = PeakFit(S.Data, 'Window', [2050, 2200], 'PeakShape', 'Lorentzian';
It gives an error as
Unrecognized function or variable 'PeakFit'.
I tried to set up corect path and install the required additional packages in the correct place. But nothing was successful unfortunately. Totally lost in this approach here.
  8 个评论
Chiththaka Chaturanga
Mathieu NOE,
Thank you so much for your contrubution. Code wise, it is soved now now. Somehow peak fitting is not good, which affects my final result. I look into peak fitting, it is likely a higher order polynormial fitting, rather gaussian. So I checked literature to verify this, and most of the publications used gaussian.
Thanks again for your help. I will accept this answer, as now code is fine.
Mathieu NOE
Mathieu NOE 2023-6-28
ok , good to hear that the problem is solved
i was nevertheless surprised you accepted yourself indeed .. ?

请先登录,再进行评论。

更多回答(1 个)

Mathieu NOE
Mathieu NOE 2023-6-14
hello
fixed some minor bugs
hope it helps
% Load Raman data from a text file
data = readmatrix('03--Spectrum--006--Spec.Data 1.txt');
x = data(:, 1); % Wavenumbers
y = data(:, 2); % Intensity
% Baseline correction using the Asymmetric Least Squares (ALS) algorithm
baseline = als_baseline(y, 0.001, 10^6);
corrected_y = y - baseline;
% Find peaks in the spectrum- This section does not give any output
prominence_threshold = 10;
distance_threshold = 10;
[peaks,locs] = findpeaks(corrected_y, 'MinPeakProminence', prominence_threshold, 'MinPeakDistance', distance_threshold);
% Perform peak fitting and integration
range1 = [2050, 2200];
range2 = [2800, 3000];
integrated_area1 = 0;
integrated_area2 = 0;
for ci = 1:numel(locs)
position = x(locs(ci));
intensity = peaks(ci);
if position >= range1(1) && position <= range1(2)
integrated_area1 = integrated_area1 + intensity;
end
if position >= range2(1) && position <= range2(2)
integrated_area2 = integrated_area2 + intensity;
end
end
result = integrated_area2 / ((integrated_area1 + integrated_area2) + (integrated_area1 + integrated_area2));
fprintf('Integration Result: %f\n', result);
% Define the ALS baseline correction function
function baseline = als_baseline(y, p, lam)
L = numel(y);
D = diff(speye(L), 2);
w = ones(L, 1);
for i = 1:10
W = spdiags(w, 0, L, L);
C = chol(W + lam * (D' * D));
z = C \ (C' \ (w .* y));
w = p * (y > z) + (1 - p) * (y < z);
end
baseline = z;
end
  4 个评论
Mathieu NOE
Mathieu NOE 2023-6-15
I spent some time at this code and figure out what can be the issue - and wonder if the code reflects what the intention is
now, this code with the parameters we gave to findpeaks (prominence_threshold = 10;distance_threshold = 10;) will only return those two peaks
and as the first selected peak does not fullfill the statements vs
range1 = [2050, 2200];
range2 = [2800, 3000];
we are left with only one value used in this computation
if position >= range1(1) && position <= range1(2)
integrated_area1 = integrated_area1 + intensity;
end
if position >= range2(1) && position <= range2(2)
integrated_area2 = integrated_area2 + intensity;
end
so here
integrated_area1 = 0 (no peak valid)
integrated_area2 = 22.2544 (only one single (second) peak valid)
and of course, when you do this
result = integrated_area2 / ((integrated_area1 + integrated_area2) + (integrated_area1 + integrated_area2));
you can only get a result of 0.5 whatever the value of integrated_area2
also , why did you code this denominator with twice the same expression ? this is obviously the same as :
result = integrated_area2 / (2*(integrated_area1 + integrated_area2));
Mathieu NOE
Mathieu NOE 2023-6-15
I changed the value of to get more peaks being selected so now the result is different from 0.5
in the code I added a few lines to display how many peaks are retained for both ranges
I am still a bit concerned that what you call area_integral is not an area integral, but a simple amplitude sum of a few peaks. Aren't you sippose to do a real integral over the range you define , assuming this range actually contains a bump ?
% Load Raman data from a text file
data = readmatrix('03--Spectrum--006--Spec.Data 1.txt');
x = data(:, 1); % Wavenumbers
y = data(:, 2); % Intensity
% Baseline correction using the Asymmetric Least Squares (ALS) algorithm
baseline = als_baseline(y, 0.001, 10^6);
corrected_y = y - baseline;
% Find peaks in the spectrum- This section does not give any output
prominence_threshold = 3;
distance_threshold = 10;
[peaks,locs] = findpeaks(corrected_y, 'MinPeakProminence', prominence_threshold, 'MinPeakDistance', distance_threshold);
plot(x,corrected_y,'b',x(locs),peaks,'dr');
% Perform peak fitting and integration
range1 = [2050, 2200];
range2 = [2800, 3000];
integrated_area1 = 0;
integrated_area2 = 0;
n1 = 0; % new code
n2 = 0; % new code
for ci = 1:numel(locs)
position = x(locs(ci));
intensity = peaks(ci);
if position >= range1(1) && position <= range1(2)
integrated_area1 = integrated_area1 + intensity;
n1 = n1 + 1;
end
if position >= range2(1) && position <= range2(2)
integrated_area2 = integrated_area2 + intensity;
n2 = n2 + 1;
end
end
disp(['number of selected peaks in range 1 = ' num2str(n1)]); % new code
disp(['number of selected peaks in range 2 = ' num2str(n2)]); % new code
% result = integrated_area2 / ((integrated_area1 + integrated_area2) + (integrated_area1 + integrated_area2));
result = integrated_area2 / (2*(integrated_area1 + integrated_area2)); % same thing !
fprintf('Integration Result: %f\n', result);
% Define the ALS baseline correction function
function baseline = als_baseline(y, p, lam)
L = numel(y);
D = diff(speye(L), 2);
w = ones(L, 1);
for i = 1:10
W = spdiags(w, 0, L, L);
C = chol(W + lam * (D' * D));
z = C \ (C' \ (w .* y));
w = p * (y > z) + (1 - p) * (y < z);
end
baseline = z;
end

请先登录,再进行评论。

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by