Error while trying to perform hierarchical clustering
I have a problem with hierarchical clustering. I have been trying to cluster a dataset, and this is how the code goes: I import the dataset (Scaleddata), which consists of 82815 rows and 16 columns (82815x16 table). It contains both categorical and numerical data, so I select the numerical columns and convert them to an array.
data=Scaleddata(:,4:16);
data=table2array(data);
T = clusterdata(data,3)
Error using internal.stats.pdistmex
Out of memory. Type "help memory" for your options.
Error in pdist (line 242)
Y = internal.stats.pdistmex(X',dist,additionalArg);
Error in linkage (line 259)
Z = internal.stats.linkagemex(Y,method,pdistArg, memEff);
Error in clusterdata (line 130)
Z = linkage(X,linkageargs{1},pdistargs, 'savememory',savememoryargs);
I got the above error.
When I tried to follow the hierarchical clustering steps in the documentation, I also got errors. Any assistance would be appreciated, as I am new to MATLAB. Thanks.
Answers (1)
Omega
2025-4-6
Hey!
Hierarchical clustering is memory-intensive because it computes the distance between every pair of rows. With 82815 rows that is 82815 × 82814 / 2, about 3.43 billion distances, or roughly 27 GB in double precision, which is why pdist runs out of memory.
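Before calling clusterdata, it can help to estimate how large the pairwise distance vector would be for a given number of rows. The sketch below assumes double-precision storage (8 bytes per distance) and ignores any extra working memory that linkage needs; the 10000-row subsample size is only an illustrative value.

```matlab
% Estimate the memory pdist needs for an n-row dataset (double precision).
n = 82815;                       % rows in the dataset from the question
numPairs = n * (n - 1) / 2;      % one distance per unordered pair of rows
gigabytes = numPairs * 8 / 1e9;  % 8 bytes per double
fprintf('%d rows -> %.0f distances, about %.1f GB\n', n, numPairs, gigabytes);

% The same estimate for a hypothetical 10000-row subsample.
nSub = 10000;
subGigabytes = nSub * (nSub - 1) / 2 * 8 / 1e9;
fprintf('%d rows -> about %.2f GB\n', nSub, subGigabytes);
```

Because the count grows with the square of the number of rows, halving the rows cuts the requirement to about a quarter.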
One way to alleviate this is to cluster a random sample of the rows, if that aligns with your analysis goals, because the size of the distance computation depends on the number of rows. Dimensionality reduction with a technique such as Principal Component Analysis (PCA) reduces the number of features, which makes each distance cheaper to compute, but it does not reduce the number of pairwise distances.
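A minimal sketch of the sampling and PCA idea follows. The random matrix stands in for the numeric array from the question, and the sample size of 5000 and the 95% variance threshold are assumptions to tune, not recommendations. It uses pca and clusterdata from Statistics and Machine Learning Toolbox.

```matlab
rng(1);                                    % reproducible sample
data = randn(82815, 13);                   % stand-in for the numeric array in the question

% Draw a random subsample small enough for pairwise distances to fit in RAM.
nSample = 5000;                            % assumed size; adjust to available memory
rows = randperm(size(data, 1), nSample);   % row indices sampled without replacement
sample = data(rows, :);

% Optionally reduce the number of features with PCA.
[~, score, ~, ~, explained] = pca(sample);
nComp = find(cumsum(explained) >= 95, 1);  % components covering 95% of variance (assumed threshold)
reduced = score(:, 1:nComp);

% Hierarchical clustering on the subsample only, cut into at most 3 clusters.
T = clusterdata(reduced, 'MaxClust', 3);
```

Here T holds one cluster label per sampled row. Rows outside the sample are not labeled by this sketch and would need a separate assignment step.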
Moreover, hierarchical clustering might not be the most efficient choice for large datasets. Methods like "kmeans" or "kmedoids" are generally more memory-efficient and could be suitable alternatives depending on your analysis needs. If possible, running your code on a machine with more RAM could also help manage the memory load.
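As a rough illustration of the kmeans alternative, the sketch below clusters all rows at once without building a pairwise distance matrix. The random matrix again stands in for the real data, and the values for 'Replicates' and 'MaxIter' are assumptions rather than tuned settings.

```matlab
rng(1);                       % reproducible initialization
data = randn(82815, 13);      % stand-in for the numeric array in the question

k = 3;                        % number of clusters, matching clusterdata(data,3)
[idx, C] = kmeans(data, k, 'Replicates', 5, 'MaxIter', 200);

counts = accumarray(idx, 1);  % number of rows assigned to each cluster
disp(counts.');
```

idx contains a cluster label for every row and C the k cluster centroids. Unlike hierarchical clustering, kmeans produces no dendrogram, so it fits only if a flat partition is enough for the analysis.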
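You can also consider "tall" arrays, which are designed for datasets that don't fit into memory. They are part of MATLAB itself, and Parallel Computing Toolbox is optional and only speeds up their evaluation. Functions such as "kmeans" accept tall arrays, but check the documentation before relying on them with "linkage" or "clusterdata". Additionally, clearing unnecessary variables with ">> clearvars" and checking available memory with the ">> memory" command (available on Windows only) can help manage your workspace more effectively.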
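The workspace housekeeping mentioned above might look like the following sketch. The variable scratch is a hypothetical temporary array, and data is a small stand-in for the array you want to keep.

```matlab
data = randn(1000, 13);    % stand-in for the numeric array to keep
scratch = zeros(2000);     % hypothetical temporary variable (about 32 MB)

whos                       % list workspace variables and their sizes
clearvars -except data     % remove everything except the array being clustered

if ispc
    memory                 % the memory function is available on Windows only
end
```

Clearing variables frees only what they occupied, so it will not make a distance computation of tens of gigabytes fit on a machine with much less RAM.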