Comparing Timeseries to get similar Timeseries based on Euclidean Distance

I have timeseries data in an array which I want to compare in order to build clusters of similar time series.
Generate sample data using the following piece of code:
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
Here we have 7 time series, where each row represents a time series and each column represents a timestamp.
First I compute the Euclidean distance between every pair of time series generated above. This can be done with
distance = squareform(pdist(timeseries));
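For reference, the Euclidean distance between two rows is the square root of the sum of squared element-wise differences. A minimal sketch that rebuilds the same kind of matrix without pdist is shown here, assuming implicit expansion is available (R2016b or later); the names X and D are illustrative and not part of the original code.

```matlab
% Sample data from the question: one time series per row
X = [1 2 3 4; 1 2 3 4; 1 2 3 4; 4 5 6 7; 4 5 6 8; 4 5 6 9; 4 5 6 10];

% Reshape to n-by-1-by-p and 1-by-n-by-p so that subtraction expands
% to an n-by-n-by-p array holding every pairwise difference
A = permute(X, [1 3 2]);
B = permute(X, [3 1 2]);

% Sum the squared differences over the timestamp dimension, then take the root
D = sqrt(sum((A - B).^2, 3));
```

D(i,j) then holds the distance between time series i and time series j, so the diagonal is zero and the matrix is symmetric.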
From the above distance matrix we can find the unique distances with the code below:
unique_distances = unique(distance);
I want to create an n-by-m matrix, where n is the number of time series (i.e. 7) and m is the number of unique distances (i.e. 8), as described below.
t1, t2, ... denote time series 1, 2, and so on.
The entry in the first row and first column shows how many time series have zero distance to the first time series, and so on for the other rows.
The entry in the first row and second column shows how many time series have a distance of 1 to the first time series, and so on for the other columns.
I am new to MATLAB. I obtained the desired result using the code below:
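% Preallocate: one row per time series, one column per unique distance
dist = nan(size(timeseries, 1), size(unique_distances, 1));
for i = 1:size(timeseries, 1)
    disp(i)  % progress output
    for j = 1:size(unique_distances, 1)
        disp(j)  % progress output
        % Count the series whose distance to series i equals the j-th unique distance
        dist(i,j) = sum(distance(i,:) == unique_distances(j));
    end
end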
I am looking for a vectorised approach to replace the above code.
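One possible loop-free formulation is sketched here; it is not taken from an answer in this thread. It uses the third output of unique to map every entry of the distance matrix to its position in unique_distances, and accumarray to count the (row, unique distance) pairs. The helper names idx, rows, n and m are illustrative.

```matlab
% Sample data and distance matrix, as in the question
timeseries = [1 2 3 4; 1 2 3 4; 1 2 3 4; 4 5 6 7; 4 5 6 8; 4 5 6 9; 4 5 6 10];
distance = squareform(pdist(timeseries));

% idx(k) is the position of distance(k) in unique_distances (column-major order)
[unique_distances, ~, idx] = unique(distance);

n = size(distance, 1);        % number of time series
m = numel(unique_distances);  % number of unique distances

% rows(i,j) = i, laid out in the same column-major order as idx
[rows, ~] = ndgrid(1:n, 1:n);

% Add 1 to dist(r, c) for every matrix entry in row r with unique-distance index c
dist = accumarray([rows(:), idx(:)], 1, [n, m]);
```

Each row of dist sums to n, because every entry of the corresponding row of distance is counted exactly once. The zero-distance column also counts each series' distance to itself, as the loop version does.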
I also need to cluster starting from the time series that has zero distance to the largest number of other time series, so I need to sort the matrix by that count as well. In this example it is already sorted: t1 has a distance of zero to 3 time series, as can be seen from the matrix, and 3 is also the maximum value.
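A hedged sketch of that sorting step follows. So that the snippet runs on its own, the count matrix is rebuilt in one line with implicit expansion (assumes R2016b or later); the names order and dist_sorted are illustrative. The first column of dist corresponds to zero distance because unique returns sorted values and distances are non-negative.

```matlab
% Sample data, distance matrix and unique distances, as in the question
timeseries = [1 2 3 4; 1 2 3 4; 1 2 3 4; 4 5 6 7; 4 5 6 8; 4 5 6 9; 4 5 6 10];
distance = squareform(pdist(timeseries));
unique_distances = unique(distance);

% Compare the n-by-n matrix against a 1-by-1-by-m vector, sum the matches
% along each row, and rearrange the n-by-1-by-m result to n-by-m
dist = permute(sum(distance == reshape(unique_distances, 1, 1, []), 2), [1 3 2]);

% Sort rows by the zero-distance count (first column), largest first
[~, order] = sort(dist(:, 1), 'descend');
dist_sorted = dist(order, :);
```

order records which original time series ends up in each row, so timeseries(order, :) gives the series in the same sorted order.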
2 Comments
Ameer Hamza 2020-12-26
You mentioned, "I want to create a n (number of time series i.e 4) by m (number of unique distances i.e. 8)." But the matrix you create has seven rows. Are the time series arranged along columns or rows? I think you intend to take the transpose of the matrix before passing it to pdist():
distance = squareform(pdist(timeseries.'));
Furqan Hashim 2020-12-27
You've correctly pointed out the mistake. I've edited my question and rephrased
"Here we have 4 timeseries where each column represent a timeseries and each row represen the timestamp."
to
"Here we have 7 timeseries where each row represent a timeseries and each column represents the timestamp."
Now we do not need to take the transpose; for simplicity, we can treat each row, rather than each column, as a time series.


Answers (0)

Release

R2017b
