Problem with the cluster/linkage function

Hey guys!
I have to cluster some data. The data are basically matrices that contain the price differences between nodes.
I have an adjacency matrix that stores whether two nodes are connected to each other. If they are connected, the entry is 1; all other entries are 0.
Secondly, every node has a price. The goal of the clustering is to group all nodes into n different zones based on the price differences and the adjacency matrix. The script to accomplish this is attached.
My problem is that I would like to use the 'ward' setting of the linkage function, so that I get different clusters in which the nodes are stored. But I can't really use this setting right now, because it only allows Euclidean distances, and for every node pair that is not connected I'm currently using inf. I can't replace inf with 0, because then the algorithm would treat the price difference of that pair as 0.
My goal is to group the nodes with the smallest possible price differences into the same clusters.
I don't want to use the 'single' setting, because it mostly leads to one big cluster: if I ask for 4 clusters, for example, I usually get one big cluster and 3 clusters with one node each.
Can somebody help me?

Answers (1)

Yash Sharma 2024-5-27
To achieve your goal of clustering nodes based on price differences while considering the adjacency matrix, you'll need to adapt your approach to work with the 'ward' linkage method in hierarchical clustering. As you've noted, the 'ward' method requires Euclidean distances, and using inf for disconnected nodes is not compatible with this requirement. Here's a strategy to reframe your problem to use the 'ward' linkage method effectively:
Step 1: Calculate Price Differences
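Compute the pairwise absolute price differences between all nodes. For pairs of nodes that are not connected, assign a large but finite value instead of inf, so that every entry of the matrix is a finite number. Note that the result is a dissimilarity matrix rather than a set of true Euclidean distances, which is what the 'ward' method is designed for, so the clustering should be treated as a heuristic.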
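A minimal sketch of this step could look as follows. The names prices and A, their values, and the penalty factor of 10 are assumptions for illustration and are not taken from the attached script; the penalty in particular needs tuning for real data.

```matlab
% Hypothetical example data: 5 nodes with prices and a symmetric adjacency matrix.
prices = [10; 11; 30; 31; 12];
A = [0 1 0 0 1;
     1 0 1 0 0;
     0 1 0 1 0;
     0 0 1 0 0;
     1 0 0 0 0];

% Absolute pairwise price differences (n-by-n, symmetric, zero diagonal).
% bsxfun also works in releases without implicit expansion (before R2016b).
priceDiffMatrix = abs(bsxfun(@minus, prices, prices'));

% Large but finite penalty for unconnected pairs (assumed factor of 10).
penalty = 10 * max(priceDiffMatrix(:));

% Mark unconnected pairs, but leave the diagonal untouched so it stays zero.
notConnected = ~logical(A);
notConnected(logical(eye(size(A)))) = false;
priceDiffMatrix(notConnected) = penalty;
```

A larger penalty pushes unconnected nodes further apart, but it also dominates the merge costs, so it is worth trying a few values and comparing the resulting zones.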
Step 2: Hierarchical Clustering
Apply the 'ward' linkage method on your modified distance matrix, which includes the price differences (and large values for disconnected nodes).
Z = linkage(squareform(priceDiffMatrix), 'ward');
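Before calling squareform it can help to check the matrix, since squareform expects a symmetric matrix with a zero diagonal. The sketch below uses a small hypothetical 3-node matrix D (made-up values, with 210 standing in for the penalty). Because such a matrix is generally not Euclidean, linkage may issue a warning for 'ward'.

```matlab
% Hypothetical 3-node dissimilarity matrix; 210 plays the role of the penalty.
D = [  0   1 210;
       1   0  19;
     210  19   0];

% squareform needs a symmetric matrix with zeros on the diagonal.
assert(isequal(D, D'), 'Matrix must be symmetric.');
assert(all(diag(D) == 0), 'Diagonal must be zero.');

% Convert to the vector form that linkage expects (same layout as pdist output).
y = squareform(D);

% Ward linkage on the precomputed dissimilarities.
Z = linkage(y, 'ward');
```

Z has one row per merge, so for 3 nodes it is a 2-by-3 matrix.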
Step 3: Inspect the Dendrogram
Use the dendrogram function to visualize the merge order and to help choose a suitable number of clusters n.
dendrogram(Z)
Step 4: Form Clusters
Once you have decided on n, use the cluster function to assign each node to one of the n clusters.
clusters = cluster(Z, 'maxclust', n);
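To see whether the result avoids the "one big cluster plus singletons" pattern described in the question, the cluster sizes can be counted. The sketch below uses a hypothetical assignment vector in place of the real output of cluster.

```matlab
% Hypothetical cluster assignments, in the format returned by cluster(Z, 'maxclust', n).
clusters = [1; 1; 2; 2; 1; 3];

% Number of nodes in each cluster.
clusterSizes = accumarray(clusters, 1);

% Print the members of each zone.
for k = 1:max(clusters)
    fprintf('Zone %d: nodes %s\n', k, mat2str(find(clusters == k)'));
end
```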
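This approach clusters nodes by price difference while discouraging clusters that join unconnected nodes. The penalty does not strictly guarantee that every cluster is connected in the graph, so it is worth checking the resulting zones against the adjacency matrix.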
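One way to check the zones against the adjacency matrix is to test whether each cluster forms a connected subgraph. This is only a sketch: A and clusters are hypothetical example values, and it relies on the graph, subgraph and conncomp functions, which are available in R2018a.

```matlab
% Hypothetical symmetric adjacency matrix and cluster assignments for 5 nodes.
A = [0 1 0 0 1;
     1 0 1 0 0;
     0 1 0 1 0;
     0 0 1 0 0;
     1 0 0 0 0];
clusters = [1; 1; 2; 2; 1];

G = graph(A);                          % undirected graph from the adjacency matrix
isConnected = false(1, max(clusters));

for k = 1:max(clusters)
    members = find(clusters == k);     % node indices in zone k
    bins = conncomp(subgraph(G, members));
    isConnected(k) = max(bins) == 1;   % one component means the zone is connected
end
```

If a zone turns out to be disconnected, increasing the penalty for unconnected pairs or changing n are two things to try.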
Hope this helps!

Release

R2018a