Confidence intervals/significance testing for dendrogram

Is there a simple and/or straightforward way to generate confidence intervals or significance measures for each of the branches when using the dendrogram function in Matlab?
I am just using the standard code for generating the dendrogram:
Y = pdist(Matrix, 'euclidean');
Z = linkage(Y, 'average');
[H, T] = dendrogram(Z, 20);
and would like to assess the clustering quality. From what I can tell, it would require some permutation-based inferences.
I was hoping for a built-in function, but can't find one.
There was a relatively recent publication doing this for gene research: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023458/
"we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division"
Could this be easily implemented?
Thanks

Answers (1)

Sebastien De Landtsheer
As far as I know, the standard way to assess support for nodes in a tree is to compute multiple trees from bootstrapped data, count how often each group of the original tree appears in the resampled trees, and then indicate that support on the original tree. I have some code doing that, and I might write a function eventually.
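For illustration, the bootstrap idea described above might be sketched roughly as follows. This is not the answerer's code, only a minimal sketch under stated assumptions: Matrix is n-by-p with rows as the observations being clustered, the columns (features) are what gets resampled, and a node counts as "recovered" only when a bootstrap tree contains a cluster with exactly the same set of leaves. The example data, nBoot, support, and the helper nodeLeafSets are hypothetical names introduced here. It needs the same toolbox as the question's code (pdist, linkage) and, because of the local function at the end of the script, R2016b or later.

```matlab
% Sketch: bootstrap support for each internal node of a linkage tree.
rng(1);                                      % fixed seed for reproducibility
Matrix = [randn(10,30); randn(10,30) + 3];   % hypothetical data: two groups of 10 rows
nBoot  = 200;                                % number of bootstrap replicates (assumed)
[n, p] = size(Matrix);

% Tree on the original data, as in the question.
Z = linkage(pdist(Matrix, 'euclidean'), 'average');
origSets = nodeLeafSets(Z, n);               % leaf set of each of the n-1 internal nodes
origKeys = cellfun(@(s) sprintf('%d,', s), origSets, 'UniformOutput', false);

support = zeros(n-1, 1);
for b = 1:nBoot
    cols = randi(p, 1, p);                   % resample features with replacement
    Zb   = linkage(pdist(Matrix(:, cols), 'euclidean'), 'average');
    bootSets = nodeLeafSets(Zb, n);
    bootKeys = cellfun(@(s) sprintf('%d,', s), bootSets, 'UniformOutput', false);
    % A node is supported by this replicate if the identical leaf set reappears.
    support = support + ismember(origKeys, bootKeys);
end
support = support / nBoot;                   % fraction of replicates recovering each node

% Row k of Z describes node n+k, so support(k) belongs to the merge in Z(k,:).
disp([Z(:, 1:2), support]);

function sets = nodeLeafSets(Z, n)
    % Return, for each row of the linkage matrix Z, the sorted list of
    % original observations (leaves 1..n) contained in that cluster.
    sets = cell(n-1, 1);
    for k = 1:n-1
        members = [];
        for c = Z(k, 1:2)
            if c <= n
                members = [members, c];            %#ok<AGROW> leaf
            else
                members = [members, sets{c - n}];  %#ok<AGROW> earlier merge
            end
        end
        sets{k} = sort(members);
    end
end
```

The last row of Z is the root, which contains every observation, so its support is always 1 and carries no information. Note also that dendrogram(Z,20) collapses the lower levels of the tree when there are more than 20 observations, so only some of these per-node values correspond to branches visible in that plot. These values are bootstrap recovery frequencies, not p-values or confidence intervals in the strict sense.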
