How to find the similarity of peaks/patterns between two time series?

I have two time series that look like the plots below. My aim is to correlate them, for example by finding the daily peaks and checking whether they match or have any kind of relationship (the hypothesis is that one curve influences the other parameter). The patterns themselves seem similar, but I don't know how to quantify that (my script is below). So far I have tried cross-correlation, but I am not sure it is the best way to quantify it. I would appreciate any suggestions.
clear all, clc,% close all
PE = load('PE.mat');
PE = PE.PE;
start_date = datetime(2023, 1, 1);
end_date = datetime(2024, 5, 31);
num_points = length(PE);
timePE = linspace(start_date, end_date, num_points);
% --- Load Tidal data ---
b = load("tidalfile.mat")
tidal= b.tidal;
timetidal = b.timetidal;
% Convert timetidal if stored as datenum
if isnumeric(timetidal)
timetidal = datetime(timetidal, 'ConvertFrom', 'datenum');
end
% Select time range (optional), to check only
cutoff_date = datetime(2024, 5, 31);
idxPE = timePE <= cutoff_date;
PE_select = PE(idxPE);
PE_tbound = timePE(idxPE);
idxTide = timetidal <= cutoff_date;
tidal_select = tidal(idxTide);
tidal_tbound = timetidal(idxTide);
% Find Global max every winDays
winDays = 1; % window size in day
edges = PE_tbound(1):days(winDays):PE_tbound(end);
%
PE_peaks = [];
PE_times = datetime.empty;
for i = 1:length(edges)-1
idx = (PE_tbound >= edges(i)) & (PE_tbound < edges(i+1));
if any(idx)
[val, id] = max(PE_select(idx)); % global max in this window
PE_peaks(end+1) = val;
tmp = PE_tbound(idx);
PE_times(end+1) = tmp(id);
end
end
% For Tidal
Tide_peaks = [];
Tide_times = datetime.empty;
for i = 1:length(edges)-1
idx = (tidal_tbound >= edges(i)) & (tidal_tbound < edges(i+1));
if any(idx)
[val, id] = max(tidal_select(idx));
Tide_peaks(end+1) = val;
tmp = tidal_tbound(idx);
Tide_times(end+1) = tmp(id);
end
end
%% --- Detrend before cross-correlation ---
PE_peaks_detr = detrend(PE_peaks);
Tide_peaks_detr = detrend(Tide_peaks);
% Cross-correlation
n = min(length(PE_peaks_detr), length(Tide_peaks_detr)); % align lengths
[xc, lags] = xcorr(Tide_peaks_detr(1:n), PE_peaks_detr(1:n),'normalized');
[~, I] = max(abs(xc));
bestLag = lags(I);
% Plot comparison
figure
subplot(2,1,1)
plot(PE_times, PE_peaks_detr, '-', 'DisplayName','PE '); axis tight
hold on
yyaxis right
plot(Tide_times, Tide_peaks_detr, '-', 'DisplayName','Tidal'); axis tight
legend('Location','southoutside','Orientation', 'horizontal'); grid on; title('1-day window peaks')
subplot(2,1,2)
plot(lags*winDays, xc, '-','LineWidth',1.5), axis tight
xlabel('Lag (days)'); ylabel('Correlation');
title('Cross-correlation between PE and tidal peaks');
grid on
set(gcf, 'Position', [100, 100, 900, 350]);
3 Comments
Image Analyst 2025-9-11, 4:22
@nirwana "...that looks like as below..." <= Did I somehow overlook where the "below" is? Where is the screenshot of the plots? Where do I look for them?
nirwana 2025-9-11, 4:44
Edited: nirwana 2025-9-11, 4:45
@Image Analyst Sorry, I forgot to attach the figures. Here is the original overlay of PE and tidal data before detrending (partial length).
Here is my xcorr result.


Answers (1)

Umar 2025-9-11, 3:36
Edited: Umar 2025-9-11, 3:41

Hi @Nirwana,

Thank you for sharing your code and detailing your analysis objectives. I understand that your primary goal is to assess whether two time series—PE and tidal data—exhibit a quantifiable relationship, particularly in terms of daily peaks, and to determine whether one may influence the other.

Based on your comments:

I have two timeseries … my aim is how to correlate them … because the pattern itself seems similar but I don’t know how to quantify them … so far I am tried to do crosscorrelation but also not sure if it is the best way to quantify it. I will appreciate any suggestion from you.

I have prepared a MATLAB script (`PE_Tidal_Correlation.m`) that implements a workflow built mostly on core MATLAB functions to address these questions (note that `xcorr` requires Signal Processing Toolbox). The workflow and resulting plots provide direct answers as follows:

1. Daily Peak Extraction: The script identifies the maximum value within each day for both PE and tidal series. This isolates the daily peaks you are interested in and allows meaningful comparisons between the two series.

2. Detrending: Daily peaks are detrended using MATLAB’s built-in `detrend()` function. This removes linear trends, ensuring that correlations reflect day-to-day fluctuations rather than long-term trends.

3. Cross-Correlation: The script computes the cross-correlation (`xcorr()`) between the detrended daily peaks. This identifies the lag at which the two series are most strongly related. A clear peak at a nonzero lag is consistent with one curve leading the other, although correlation alone cannot establish that one influences the other.

4. Visualization:

Daily Peaks Comparison: Plots detrended daily maxima for both series on the same time axis, allowing visual assessment of pattern similarity.

Cross-Correlation Plot: Displays correlation values across different lags, highlighting the lag at which the series align most strongly.

5. Minimal Toolbox Dependencies: Peak extraction, detrending, and plotting use standard MATLAB functions. The only exception is `xcorr()`, which is part of Signal Processing Toolbox.

In summary, the script enables you to quantify the similarity between PE and tidal series both visually and numerically. Peaks are identified, aligned, and compared; correlations are calculated; and the resulting plots illustrate temporal alignment and degree of relationship.
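
As a side note, the daily-peak extraction and pairing described above can also be sketched without explicit loops, using timetables. This is only a minimal sketch under assumptions: the hourly data below are synthetic stand-ins for the PE and tidal series, and the variable names (`pe`, `tide`, `both`) are made up for illustration. `retime`, `synchronize`, `rmmissing`, `detrend`, and `corrcoef` are all base MATLAB.

```matlab
% Synthetic hourly series standing in for the PE and tidal data (assumption)
t = (datetime(2023,1,1):hours(1):datetime(2023,1,10,23,0,0))';
pe = sin(2*pi*hour(t)/24) + 0.01*(1:numel(t))';   % daily cycle plus slow trend
tide = cos(2*pi*hour(t)/12);                      % semi-daily cycle

ttPE = timetable(pe, 'RowTimes', t);
ttTide = timetable(tide, 'RowTimes', t);

% Daily maxima of each series
dailyPE = retime(ttPE, 'daily', 'max');
dailyTide = retime(ttTide, 'daily', 'max');

% Keep only days present in both series, then drop rows with missing values,
% so that the two peak vectors stay paired day by day
both = synchronize(dailyPE, dailyTide, 'intersection');
both = rmmissing(both);

% Zero-lag Pearson correlation of the detrended daily peaks
R = corrcoef(detrend(both.pe), detrend(both.tide));
fprintf('Paired days: %d, zero-lag correlation: %.3f\n', height(both), R(1,2));
```

Pairing by calendar day this way avoids comparing peak number k of one series with peak number k of the other when one record has gaps.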

References

Cross-Correlation Function. The MathWorks Inc. (2022). `xcorr`: Cross-correlation. Signal Processing Toolbox Documentation. Available: https://www.mathworks.com/help/signal/ref/xcorr.html

Detrending Function. The MathWorks Inc. (2022). `detrend`: Remove linear trends from data. MATLAB Documentation. Available: https://www.mathworks.com/help/matlab/ref/detrend.html

Datetime Function. The MathWorks Inc. (2022). `datetime`: Create datetime arrays. MATLAB Documentation. Available: https://www.mathworks.com/help/matlab/ref/datetime.html

Linspace Function. The MathWorks Inc. (2022). `linspace`: Generate linearly spaced vectors. MATLAB Documentation. Available: https://www.mathworks.com/help/matlab/ref/linspace.html

Max Function. The MathWorks Inc. (2022). `max`: Find maximum values. MATLAB Documentation. Available: https://www.mathworks.com/help/matlab/ref/max.html

2 Comments
nirwana 2025-9-11, 4:06
Edited: nirwana 2025-9-11, 4:18
Thanks for adding the references, but I don't see any difference between my code and your script. What I want to ask is whether there is any other method to quantify the similarity besides cross-correlation.
Umar 2025-9-11, 5:13
Edited: Umar 2025-9-11, 5:17

Hi @Nirwana,

Thank you for your thoughtful comments. After reviewing your feedback, I’ve taken a broader approach to quantifying the similarity between the two time series (PE and tidal data). While cross-correlation is a useful method, I’ve expanded the analysis to incorporate additional techniques that might provide a deeper understanding of the relationship between the two series. Below is an updated version of the script that includes Dynamic Time Warping (DTW), Pearson Correlation, and Mutual Information, which allow for a more comprehensive analysis beyond just cross-correlation.

Updated MATLAB Code:

clear all; clc; % clear workspace and command window
% --- Load Data ---
PE_struct = load('PE.mat');  % Load PE data
PE = PE_struct.PE;
tidal_struct = load('tidalfile.mat');  % Load Tidal data
tidal = tidal_struct.tidal;
timetidal = tidal_struct.timetidal;
% Convert timetidal if stored as datenum
if isnumeric(timetidal)
  timetidal = datetime(timetidal, 'ConvertFrom', 'datenum');
end
% Define PE time vector
start_date = datetime(2023, 1, 1);
end_date   = datetime(2024, 5, 31);
timePE = linspace(start_date, end_date, length(PE));
% Limit data to cutoff date
cutoff_date = datetime(2024, 5, 31);
PE_idx = timePE <= cutoff_date;
PE = PE(PE_idx);
timePE = timePE(PE_idx);
tidal_idx = timetidal <= cutoff_date;
tidal = tidal(tidal_idx);
timetidal = timetidal(tidal_idx);
% --- Extract Daily Peaks ---
winDays = 1;  % 1-day window
edges = timePE(1):days(winDays):timePE(end);
% Preallocate with NaN/NaT so that days without data stay marked as missing
nWin = length(edges) - 1;
PE_peaks = NaN(1, nWin);
PE_times = NaT(1, nWin);
Tide_peaks = NaN(1, nWin);
Tide_times = NaT(1, nWin);
% Extract PE peaks
for i = 1:nWin
  idx = timePE >= edges(i) & timePE < edges(i+1);
  if any(idx)
      [val, loc] = max(PE(idx));
      PE_peaks(i) = val;
      tmp = timePE(idx);
      PE_times(i) = tmp(loc);
  end
end
% Extract Tidal peaks
for i = 1:nWin
  idx = timetidal >= edges(i) & timetidal < edges(i+1);
  if any(idx)
      [val, loc] = max(tidal(idx));
      Tide_peaks(i) = val;
      tmp = timetidal(idx);
      Tide_times(i) = tmp(loc);
  end
end
% Keep only days where both series have a peak, so the vectors stay paired
valid = ~isnan(PE_peaks) & ~isnan(Tide_peaks);
PE_peaks = PE_peaks(valid);     PE_times = PE_times(valid);
Tide_peaks = Tide_peaks(valid); Tide_times = Tide_times(valid);
% --- Detrend Peaks ---
PE_detr = detrend(PE_peaks);
Tide_detr = detrend(Tide_peaks);
% --- Cross-Correlation (Optional) ---
n = min(length(PE_detr), length(Tide_detr));  % align lengths
[xc, lags] = xcorr(Tide_detr(1:n), PE_detr(1:n), 'normalized');
[~, I] = max(abs(xc));
bestLag = lags(I);
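% --- Dynamic Time Warping (DTW) Calculation ---
% dtw requires Signal Processing Toolbox. The distance is unnormalized,
% so it depends on the units and length of both series.
dtw_distance = dtw(PE_detr, Tide_detr);
disp(['DTW distance: ', num2str(dtw_distance)]);
% --- Pearson Correlation (zero lag) ---
% corrcoef is part of base MATLAB and gives the same value as corr here
R = corrcoef(PE_detr, Tide_detr);
pearson_corr = R(1,2);
disp(['Pearson Correlation: ', num2str(pearson_corr)]);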
% --- Mutual Information Calculation ---
% Histogram-based estimate; the result depends on the number of bins
nbins = 50; % Number of bins per axis
% Shared bin edges, so marginal and joint histograms use the same binning
edges_pe = linspace(min(PE_detr), max(PE_detr), nbins + 1);
edges_tide = linspace(min(Tide_detr), max(Tide_detr), nbins + 1);
% Marginal probabilities
p_pe = histcounts(PE_detr, edges_pe, 'Normalization', 'probability');
p_tide = histcounts(Tide_detr, edges_tide, 'Normalization', 'probability');
% Joint probability from the 2-D histogram
counts = histcounts2(PE_detr, Tide_detr, edges_pe, edges_tide);
p_joint = counts / sum(counts(:)); % Normalize to get the joint distribution
% Entropies in bits
H_pe = -sum(p_pe .* log2(p_pe + eps));  % eps to avoid log(0)
H_tide = -sum(p_tide .* log2(p_tide + eps));
H_joint = -sum(p_joint(:) .* log2(p_joint(:) + eps));
% Mutual information: I(X;Y) = H(X) + H(Y) - H(X,Y)
mi = H_pe + H_tide - H_joint;
disp(['Mutual Information: ', num2str(mi)]);
% --- Plot Results ---
figure;
subplot(2,1,1);
plot(PE_times, PE_detr, '-', 'DisplayName', 'PE'); hold on;
yyaxis right;
plot(Tide_times, Tide_detr, '-', 'DisplayName', 'Tidal');
xlabel('Time'); ylabel('Detrended Values');
legend('Location','southoutside','Orientation','horizontal'); 
grid on;
title('Daily Peaks Comparison');
subplot(2,1,2);
plot(lags * winDays, xc, '-', 'LineWidth', 1.5); grid on;
xlabel('Lag (days)'); ylabel('Correlation');
title(sprintf('Cross-correlation (Best Lag = %d days)', bestLag));
set(gcf, 'Position', [100, 100, 900, 350]);

Please see attached.

Explanation of Results:

1. Dynamic Time Warping (DTW): `179544.5271`. This distance quantifies how well the two time series align when shifts and stretches in time are allowed; a lower value indicates better alignment. DTW is useful when the patterns are similar but temporally misaligned. Note that the value is unnormalized and depends on the units and length of both series, so it is hard to judge on its own whether it is high or low. Standardizing both series (for example, z-scoring) before calling `dtw` makes the distance easier to compare.

2. Pearson Correlation: `-0.099041`. This measures the linear relationship between the two series at zero lag. The small negative value suggests a very weak inverse linear relationship, which means that while the series may look similar, their daily peaks do not co-vary in a clear linear fashion on the same day. Pearson correlation does not capture lagged or non-linear relationships that may exist between the series.

3. Mutual Information: `1.6276` (in bits, since the script uses `log2`). This metric captures the shared information between the two series, even if the relationship is non-linear, so it is a more general measure than Pearson correlation. However, histogram-based estimates are never negative and are biased upward when there are many bins and few samples. With 50 x 50 bins and only about 500 daily peaks, a value of this size can arise even for unrelated series, so it should be compared against a baseline (for example, the same estimate on shuffled or time-shifted data) before drawing conclusions.
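
To judge whether a mutual information value is meaningful, one option is to compare it with values obtained from surrogate data in which any real dependence has been destroyed. The following is a minimal sketch under assumptions, not a definitive implementation: `x` and `y` are synthetic stand-ins for the detrended daily peaks, `histMI` is a hypothetical helper written for this example, the bin count of 8 is an arbitrary choice, and the surrogates are circular shifts of `y` (which keep its autocorrelation intact).

```matlab
rng(0);                        % reproducible surrogates
n = 500;                       % roughly the number of daily peaks
x = randn(n, 1);               % synthetic stand-in for PE_detr (assumption)
y = 0.5*x + randn(n, 1);       % synthetic stand-in for Tide_detr (assumption)
nbins = 8;                     % far fewer bins than 50, to limit the bias

miObs = histMI(x, y, nbins);

% Null distribution: circularly shift y so that it no longer lines up with x
nSurr = 200;
miNull = zeros(nSurr, 1);
for k = 1:nSurr
    shift = randi([20, n - 20]);
    miNull(k) = histMI(x, circshift(y, shift), nbins);
end

% Fraction of surrogates with MI at least as large as the observed value
pValue = (1 + sum(miNull >= miObs)) / (1 + nSurr);
fprintf('MI = %.3f bits, surrogate mean = %.3f bits, p = %.3f\n', ...
    miObs, mean(miNull), pValue);

function mi = histMI(x, y, nbins)
    % Histogram-based mutual information in bits (hypothetical helper).
    % Marginals are taken from the joint histogram, so the binning is shared.
    ex = linspace(min(x), max(x), nbins + 1);
    ey = linspace(min(y), max(y), nbins + 1);
    pxy = histcounts2(x, y, ex, ey, 'Normalization', 'probability');
    px = sum(pxy, 2);                  % column vector of marginal probabilities
    py = sum(pxy, 1);                  % row vector of marginal probabilities
    ratio = pxy ./ (px * py);          % px * py is the outer product
    nz = pxy > 0;                      % skip empty cells (0 * log 0 = 0)
    mi = sum(pxy(nz) .* log2(ratio(nz)));
end
```

The surrogate mean is usually well above zero, which is the estimator bias; only the excess of the observed value over that baseline says something about shared information.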

How This Addresses Your Comments:

Beyond Cross-Correlation: This updated analysis extends beyond cross-correlation by incorporating DTW and Mutual Information, offering a more holistic view of the relationship between the two series. These additional methods account for both temporal misalignments (via DTW) and non-linear dependencies (via Mutual Information), which cross-correlation alone might not fully capture.

Answer to Your Query on Quantification Methods: The results from DTW, Pearson Correlation, and Mutual Information give you a multi-faceted perspective on the similarity between the two time series. Cross-correlation measures the temporal alignment of the series, while DTW gives a flexible alignment metric, Pearson gives linear correlation, and Mutual Information offers insight into shared information that might not be captured by linear methods.
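
Since a raw DTW distance depends on the units of the two series, it may help to standardize them first. The sketch below illustrates this with a plain dynamic-programming DTW written in base MATLAB, so it does not need `dtw` from Signal Processing Toolbox. Everything here is an assumption for illustration: `a` and `b` are synthetic series with the same shape but very different scales, and `simpleDTW` is a hypothetical helper using an absolute-difference cost.

```matlab
t = linspace(0, 4*pi, 200)';
a = 3 + 2*sin(t);                  % synthetic series in one unit (assumption)
b = 500 + 300*sin(t - 0.4);        % same shape, shifted, much larger scale

% z-score both series so that only the shape matters
za = (a - mean(a)) / std(a);
zb = (b - mean(b)) / std(b);

dRaw = simpleDTW(a, b);            % dominated by the difference in scale
dStd = simpleDTW(za, zb);          % reflects the difference in shape and timing
fprintf('Raw DTW: %.1f, z-scored DTW: %.3f\n', dRaw, dStd);

function d = simpleDTW(x, y)
    % Classic O(n*m) DTW with absolute-difference cost (hypothetical helper)
    n = numel(x);
    m = numel(y);
    D = inf(n + 1, m + 1);         % accumulated cost, padded with Inf borders
    D(1, 1) = 0;
    for i = 1:n
        for j = 1:m
            cost = abs(x(i) - y(j));
            % extend the cheapest of the three allowed predecessor paths
            D(i+1, j+1) = cost + min([D(i, j+1), D(i+1, j), D(i, j)]);
        end
    end
    d = D(n + 1, m + 1);
end
```

With standardized inputs, a distance near zero means the shapes match after warping, which is easier to interpret than a large number in mixed units.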

These methods together provide a comprehensive analysis that helps quantify the relationship more effectively than relying on cross-correlation alone.

If you need any further explanations or modifications, feel free to reach out. I’d be happy to discuss these results in more detail!
