How to group different sensors' data based on their similarities?

I have data from multiple sensors covering one year. Are there any unsupervised methods for grouping sensors whose data show similar characteristics or behavior?
For example, if I have electricity consumption data for 1000 buildings stored in a table with 1000 columns, how can I cluster the columns so that those with similar characteristics end up in the same group?
Thank you in advance for your time.
        Time               D1         D2         D3         D4        D5     ....    Dn
____________________    _______    _______    _______    _______    _______  ....  _______
01-Jan-2020 00:00:00     2.9675     32.502     23.454     3.5067       .              .
01-Jan-2020 00:01:00     -6.298    -96.793    -64.711    -9.9581       .              .
01-Jan-2020 00:02:00    -5.5285    -75.355     -54.29     -8.215       .              .
01-Jan-2020 00:03:00    -1.4514    -34.475    -24.879     -3.468       .              .
01-Jan-2020 00:04:00     3.9736     66.112     42.284      6.639       .              .
01-Jan-2020 00:05:00     3.1481     64.577     41.262     6.9614       .              .
01-Jan-2020 00:06:00    -44.042    -699.24    -414.33    -75.339       .              .
01-Jan-2020 00:07:00     4.4172     69.015     37.355     6.6763       .              .
01-Jan-2020 00:08:00     23.509      284.8     186.89     32.597       .              .
01-Jan-2020 00:09:00     17.329     214.71     124.45     20.634       .              .
  6 Comments
Walter Roberson 2022-6-25
Edited: dpb 2022-6-25
Principal component analysis and cross-correlation might help.
smoa 2022-6-25
Edited: smoa 2022-6-25
Thank you @Walter Roberson for your suggestions. I will try corr(x) to look at the pairwise correlations and perhaps find the sensors that behave alike.
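A minimal sketch of the correlation idea, using synthetic data in place of the real table. The signal shapes, the six-sensor layout, and the choice of two groups are all assumptions for illustration. Correlation ignores scale, which seems relevant here because D1 to D4 in the sample move together but differ a lot in magnitude.

```matlab
% Synthetic stand-in for the real data (assumption): 500 samples, 6 sensors.
rng(0);                                   % reproducible noise
t  = (1:500)';
s1 = sin(2*pi*t/50);                      % behaviour shared by sensors 1-3
s2 = cos(2*pi*t/17);                      % behaviour shared by sensors 4-6
X  = [s1*[1 10 5], s2*[2 8 4]] + 0.1*randn(500, 6);

R = corr(X);                              % 6-by-6 Pearson correlation between columns

% pdist compares rows, so transpose; 'correlation' distance is 1 - correlation.
D = pdist(X', 'correlation');
Z = linkage(D, 'average');                % agglomerative cluster tree
groups = cluster(Z, 'maxclust', 2);       % cut the tree into 2 groups (assumed count)
```

Here `groups(k)` is the group label of column k. Calling `dendrogram(Z)` can help when deciding how many groups are reasonable for real data.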


Answers (1)

Abhas 2025-5-28
Edited: Abhas 2025-5-28
Hi @smoa,
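You can use several unsupervised learning methods in MATLAB to cluster your building electricity consumption data by similar characteristics. Note that functions such as kmeans treat each row as one observation, so transpose your data so that each building is a row. Here are some approaches for your scenario: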
1. K-means Clustering: This is a good starting point for your use case as it:
  • Groups buildings with similar consumption patterns
  • Identifies representative centroids for each cluster
  • Is efficient for large datasets (1000 buildings)
  • Provides clear membership assignments
2. Hierarchical Clustering: This builds a tree (dendrogram) that:
  • Shows relationships between all buildings
  • Shows how clusters merge at different similarity levels
  • Lets you choose the number of clusters after the analysis
  • Helps you explore the natural grouping structure
3. PCA + Clustering: This two-step approach will:
  • Reduce the dimensionality of your time series data
  • Identify the most important consumption patterns
  • Make clustering more effective by removing noise
  • Improve visualization of the clusters
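4. Dynamic Time Warping (DTW): This is a distance measure rather than a clustering method, so pair it with, for example, hierarchical clustering. It is particularly useful for energy data because: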
  • It handles temporal shifts in consumption patterns
  • Buildings with similar patterns but different peak times can be grouped
  • It's more robust to phase differences than Euclidean distance
5. Spectral Clustering: Good for identifying complex relationships:
  • Can find non-convex cluster shapes
  • Can work better than k-means when clusters are not compact, roughly spherical groups
  • Considers the global structure of your dataset
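As a rough illustration of approaches 1 and 3 combined, here is a sketch on synthetic data. The two load shapes, the building count, the number of components, and k = 2 are assumptions, not values derived from your data. Each building is standardized first so that clusters reflect the shape of the consumption curve rather than its magnitude.

```matlab
% Synthetic stand-in (assumption): rows = time samples, columns = buildings.
% With a real timetable TT, X = TT.Variables would give the same layout.
rng(1);                                   % reproducible example
nT = 1440;  nB = 40;
t  = (1:nT)';
shapeA = 1 + sin(2*pi*t/nT);              % hypothetical load shape A
shapeB = 1 + cos(2*pi*t/nT);              % hypothetical load shape B
scale  = 1 + 9*rand(1, nB);               % buildings differ a lot in magnitude
X = [repmat(shapeA, 1, nB/2), repmat(shapeB, 1, nB/2)] .* scale ...
    + 0.2*randn(nT, nB);

Xn = zscore(X);                           % standardize each building (column)
F  = Xn';                                 % one row per building, as kmeans expects

[~, score] = pca(F, 'NumComponents', 5);  % compress each series to 5 scores
idx = kmeans(score, 2, 'Replicates', 10); % idx(k) = cluster of building k
```

For real data the number of clusters is usually unknown. One option is `evalclusters(score, 'kmeans', 'silhouette', 'KList', 2:10)`, which compares several values of k. With a full year of minute-level samples it may also be worth aggregating (for example to hourly means) before running PCA.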
See the following MathWorks documentation pages for more on each method:
  1. K-Means: https://www.mathworks.com/help/stats/kmeans.html
  2. Hierarchical: https://www.mathworks.com/help/stats/hierarchical-clustering.html
  3. PCA: https://www.mathworks.com/help/stats/pca.html
  4. DTW: https://www.mathworks.com/help/signal/ref/dtw.html
  5. Spectral Clustering: https://www.mathworks.com/help/stats/spectral-clustering.html
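To show how DTW can feed into hierarchical clustering, here is a small sketch. It assumes the Signal Processing Toolbox (for dtw) and uses made-up daily profiles: the one-peak and two-peak shapes, the 96-sample day, and the two-group cut are hypothetical. DTW is costly for long series and many pairs, so on real data it is probably more practical to apply it to short representative profiles (for example an average day per building) than to the raw yearly series.

```matlab
% Synthetic daily profiles (assumption): 96 samples per day, 8 buildings.
rng(2);                                   % reproducible example
n  = 96;  nS = 8;
t  = (0:n-1)'/n;
P  = zeros(n, nS);
for k = 1:nS
    shift = 0.03*randn;                   % small timing offset per building
    P(:,k) = exp(-((t - 0.3 - shift)/0.05).^2);              % morning peak
    if k > 4                              % buildings 5-8 also have an evening peak
        P(:,k) = P(:,k) + exp(-((t - 0.7 - shift)/0.05).^2);
    end
end
P = zscore(P);                            % compare shapes, not magnitudes

% Pairwise DTW distances between buildings (symmetric, zero diagonal).
D = zeros(nS);
for i = 1:nS-1
    for j = i+1:nS
        D(i,j) = dtw(P(:,i)', P(:,j)');   % row vectors: one channel, n samples
        D(j,i) = D(i,j);
    end
end

Z = linkage(squareform(D), 'average');    % cluster tree from the DTW distances
groups = cluster(Z, 'maxclust', 2);       % assumed number of groups
```

Because the timing offsets are absorbed by the warping, profiles with the same shape end up close together even when their peaks do not line up exactly.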
I hope this helps!

Release

R2022a
