How to group different sensors' data based on their similarities?
I have data from multiple sensors covering one year. I would like to know whether there are any unsupervised methods to divide and group the sensors' data so that sensors with similar characteristics/behavior end up together.
For example, if I have electricity consumption data for 1000 buildings stored in a table with 1000 columns, how can I divide or cluster these columns so that those with similar characteristics are placed in the same group?
I appreciate your time in advance.
Thank you.
Time D1 D2 D3 D4 D5 Dn
____________________ _______ _______ _______ _______ _______ .... _______
01-Jan-2020 00:00:00 2.9675 32.502 23.454 3.5067 . .
01-Jan-2020 00:01:00 -6.298 -96.793 -64.711 -9.9581 . .
01-Jan-2020 00:02:00 -5.5285 -75.355 -54.29 -8.215 . .
01-Jan-2020 00:03:00 -1.4514 -34.475 -24.879 -3.468 . .
01-Jan-2020 00:04:00 3.9736 66.112 42.284 6.639 . .
01-Jan-2020 00:05:00 3.1481 64.577 41.262 6.9614 . .
01-Jan-2020 00:06:00 -44.042 -699.24 -414.33 -75.339 . .
01-Jan-2020 00:07:00 4.4172 69.015 37.355 6.6763 . .
01-Jan-2020 00:08:00 23.509 284.8 186.89 32.597 . .
01-Jan-2020 00:09:00 17.329 214.71 124.45 20.634 . .
6 Comments
Walter Roberson
2022-6-25
Edited: dpb
2022-6-25
Principal component analysis and cross-correlation might help.
Answers (1)
Abhas
2025-5-28
Edited: Abhas
2025-5-28
You can use several unsupervised learning methods in MATLAB to cluster your building electricity consumption data by similar characteristics. Here are some approaches that fit your scenario:
1. K-means Clustering: This is ideal for your use case as it:
- Groups buildings with similar consumption patterns
- Identifies representative centroids for each cluster
- Is efficient for large datasets (1000 buildings)
- Provides clear membership assignments
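As a minimal sketch of the k-means approach, the snippet below clusters the columns of a matrix with kmeans from the Statistics and Machine Learning Toolbox. The synthetic data, the 5-minute resolution, the choice of two groups, and the variable names are assumptions for illustration only; they do not come from the question's data.

```matlab
% Synthetic stand-in for the real table: rows are time steps, columns are buildings.
rng(0);                                    % reproducible example
t = (0:287)' / 288 * 2*pi;                 % one day at 5-minute resolution (assumed)
groupA = sin(t)   + 0.1*randn(288, 20);    % 20 buildings sharing one daily shape
groupB = cos(2*t) + 0.1*randn(288, 20);    % 20 buildings with a different shape
X = [groupA, groupB];                      % 288-by-40, laid out like D1..Dn

% zscore standardizes each column (each building's series), so clustering
% reflects the shape of the profile rather than its magnitude.
% kmeans clusters rows, so transpose: each building becomes one observation.
Z = zscore(X)';                            % 40-by-288

k = 2;                                     % assumed number of groups
[idx, C] = kmeans(Z, k, 'Replicates', 5);  % idx(i) is the cluster of building i

% Each row of C is the standardized profile of one cluster centroid.
plot(C');
legend("Cluster 1", "Cluster 2");
```

With the real data, X would come from the table instead, for example X = T{:, 2:end} if the first column holds the timestamps (an assumption about the table layout). The number of clusters usually has to be chosen by inspection or with a criterion such as the silhouette value.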
2. Hierarchical Clustering: This creates a dendrogram that shows:
- Relationships between all buildings
- How clusters merge at different similarity levels
- Flexibility to choose the number of clusters after analysis
- Good for exploring the natural grouping structure
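A hedged sketch of the hierarchical route, using pdist, linkage, dendrogram, and cluster from the Statistics and Machine Learning Toolbox. The correlation distance, average linkage, and the cut into two clusters are illustrative choices, and the data is synthetic.

```matlab
% Synthetic stand-in: rows are time steps, columns are buildings.
rng(0);
t = (0:287)' / 288 * 2*pi;
X = [sin(t) + 0.1*randn(288, 20), cos(2*t) + 0.1*randn(288, 20)];

% Pairwise distance between buildings. pdist works on rows, hence X'.
% 'correlation' is one minus the sample correlation, so two buildings with
% the same shape but different magnitude are treated as close.
D = pdist(X', 'correlation');

% Build the cluster tree and show how buildings merge.
tree = linkage(D, 'average');
dendrogram(tree, 0);                  % 0 displays every leaf

% Cut the tree into a chosen number of groups (assumed to be 2 here).
idx = cluster(tree, 'MaxClust', 2);
```

Because the tree is built once, the 'MaxClust' value can be changed afterwards without recomputing the distances.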
3. PCA + Clustering: This two-step approach will:
- Reduce the dimensionality of your time series data
- Identify the most important consumption patterns
- Make clustering more effective by removing noise
- Improve visualization of the clusters
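The two steps could look roughly like the following, with pca followed by kmeans (both in the Statistics and Machine Learning Toolbox). The 90% explained-variance threshold and the two clusters are assumptions made for this example, not recommendations from the thread.

```matlab
% Synthetic stand-in: rows are time steps, columns are buildings.
rng(0);
t = (0:287)' / 288 * 2*pi;
X = [sin(t) + 0.1*randn(288, 20), cos(2*t) + 0.1*randn(288, 20)];

% One row per building, each series standardized.
Z = zscore(X)';                                   % 40-by-288

% score holds each building's coordinates in principal-component space;
% explained is the percentage of variance captured by each component.
[~, score, ~, ~, explained] = pca(Z);

% Keep the fewest components that reach 90% of the variance (assumed threshold).
nComp = find(cumsum(explained) >= 90, 1);

% Cluster in the reduced space.
idx = kmeans(score(:, 1:nComp), 2, 'Replicates', 5);

% Plot the buildings on the first two components, colored by cluster.
gscatter(score(:, 1), score(:, 2), idx);
xlabel("PC 1");
ylabel("PC 2");
```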
4. Dynamic Time Warping (DTW): Particularly useful for energy data because:
- It handles temporal shifts in consumption patterns
- Buildings with similar patterns but different peak times can be grouped
- It's more robust to phase differences than Euclidean distance
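One way to combine DTW with clustering is to compute a pairwise DTW distance matrix with dtw (Signal Processing Toolbox) and pass it to linkage. This is a sketch under assumptions: the synthetic single-peak and double-peak profiles, the 15-minute resolution, and the average linkage are all illustrative.

```matlab
% Synthetic stand-in: 5 single-peak and 5 double-peak daily profiles,
% each shifted in time relative to the others in its group.
rng(0);
t = (0:95)';                                  % one day at 15-minute resolution (assumed)
bump = @(c) exp(-(t - c).^2 / (2*4^2));       % smooth peak centered at sample c
shifts = -6:3:6;
A = zeros(96, 5);
B = zeros(96, 5);
for s = 1:5
    A(:, s) = bump(48 + shifts(s));
    B(:, s) = bump(30 + shifts(s)) + bump(66 + shifts(s));
end
X = [A, B] + 0.02*randn(96, 10);              % columns are buildings

% Pairwise DTW distances between buildings.
n = size(X, 2);
D = zeros(n);
for i = 1:n-1
    for j = i+1:n
        D(i, j) = dtw(X(:, i), X(:, j));
        D(j, i) = D(i, j);
    end
end

% squareform converts the symmetric matrix to the vector form linkage expects.
tree = linkage(squareform(D), 'average');
idx = cluster(tree, 'MaxClust', 2);
```

The double loop needs n*(n-1)/2 DTW evaluations, and each evaluation grows with the product of the two series lengths. For 1000 buildings with a year of minute-level data this is likely too slow as written, so downsampling or comparing averaged daily profiles first is probably necessary.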
5. Spectral Clustering: Good for identifying complex relationships:
- Can find non-convex cluster shapes
- Can outperform k-means when clusters are not compact or well separated in Euclidean space
- Considers the global structure of your dataset
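A brief sketch with spectralcluster (Statistics and Machine Learning Toolbox, R2019b or later). The synthetic data, the two clusters, and the neighbor count of 10 are assumptions for illustration.

```matlab
% Synthetic stand-in: rows are time steps, columns are buildings.
rng(0);
t = (0:287)' / 288 * 2*pi;
X = [sin(t) + 0.1*randn(288, 20), cos(2*t) + 0.1*randn(288, 20)];

% One row per building, each series standardized.
Z = zscore(X)';

% spectralcluster builds a nearest-neighbor similarity graph over the rows
% and clusters the buildings using the eigenvectors of its graph Laplacian.
% 'NumNeighbors' sets how many neighbors each building is linked to.
idx = spectralcluster(Z, 2, 'NumNeighbors', 10);
```

The neighbor count affects how well connected the graph is, so it is worth trying a few values on the real data.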
See the following MathWorks documentation pages to learn more about each method:
- K-Means: https://www.mathworks.com/help/stats/kmeans.html
- Hierarchical: https://www.mathworks.com/help/stats/hierarchical-clustering.html
- PCA: https://www.mathworks.com/help/stats/pca.html
- DTW: https://www.mathworks.com/help/signal/ref/dtw.html
- Spectral Clustering: https://www.mathworks.com/help/stats/spectral-clustering.html
I hope this helps!