optimalleaforder
Optimal leaf ordering for hierarchical clustering
Description
returns an optimal leaf ordering for the hierarchical binary cluster tree,
leafOrder
= optimalleaforder(tree
,D
)tree
, using the distances, D
. An optimal
leaf ordering of a binary tree maximizes the sum of the similarities between
adjacent leaves by flipping tree branches without dividing the clusters.
returns the optimal leaf ordering using one or more name-value pair arguments.leafOrder
= optimalleaforder(tree
,D
,Name,Value
)
Examples
Plot Dendrogram with Optimal Leaf Order
Create a hierarchical binary cluster tree using linkage
. Then, compare the dendrogram plot with the default ordering to a dendrogram with an optimal leaf ordering.
Generate sample data.
rng(0,"twister") % For reproducibility X = rand(10,2);
Create a distance vector and a hierarchical binary clustering tree. Use the distances and clustering tree to determine an optimal leaf order.
D = pdist(X);
tree = linkage(D,"average");
leafOrder = optimalleaforder(tree,D);
Plot the dendrogram with the default ordering and the dendrogram with the optimal leaf ordering.
figure() subplot(2,1,1) dendrogram(tree) title("Default Leaf Order") subplot(2,1,2) dendrogram(tree,reorder=leafOrder) title("Optimal Leaf Order")
The order of the leaves in the bottom figure corresponds to the elements in leafOrder
. The optimal leaf order flips tree branches to maximize the sum of the similarities between adjacent leaves.
leafOrder
leafOrder = 1×10
1 4 9 10 2 5 8 3 7 6
Create a scatter plot of the sample data and label the points.
figure()
plot(X(:,1),X(:,2),".")
text(X(:,1),X(:,2),num2str((1:size(X,1))'))
In the default leaf ordering, points 1 and 4 are located next to points 3 and 8. In the optimal leaf ordering, points 1 and 4 are located next to point 9, which reflects their relative positions in the scatter plot.
Optimal Leaf Order Using Inverse Distance Similarity
Generate sample data.
rng('default') % For reproducibility X = rand(10,2);
Create a distance vector and a hierarchical binary clustering tree.
D = pdist(X);
tree = linkage(D,'average');
Use the inverse distance similarity transformation to determine an optimal leaf order.
leafOrder = optimalleaforder(tree,D,'Transformation','inverse')
leafOrder = 1×10
1 4 9 10 2 5 8 3 7 6
Input Arguments
tree
— Hierarchical binary cluster tree
matrix returned by linkage
Hierarchical binary cluster tree, specified as an (M – 1)-by-3 matrix that you generate using linkage
, where M is the number of leaves.
D
— Distances
matrix | vector
Distances for determining similarities between leaves, specified as a matrix or vector of distances. For example, you can generate distances using pdist
.
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: 'Criteria','group','Transformation','inverse'
specifies that the sum of similarities be maximized between every leaf and all other leaves in adjacent clusters, using an inverse similarity transformation.
Criteria
— Optimization criterion
'adjacent'
(default) | 'group'
Optimization criterion for determining an optimal leaf ordering, specified as the comma-separated pair consisting of 'criteria'
and one of these values:
'adjacent' | Maximize the sum of similarities between adjacent leaves. |
'group' | Maximize the sum of similarities between every leaf and all other leaves in the adjacent clusters at the same level of the dendrogram. |
Example: 'Criteria','group'
Transformation
— Method for transforming distances to similarities
'linear'
(default) | 'inverse'
| function handle
Method for transforming distances to similarities, specified as the comma-separated pair consisting of 'Transformation'
and one of 'linear'
, 'inverse'
, or a function handle.
Let di,j and Simi,j denote the distance and similarity between leaves i and j, respectively. The included similarity transformations are:
'linear' | Simi,j = maxi,j (di,j ) – di,j |
'inverse' | Simi,j = 1/di,j |
To use a custom transformation function, specify a handle to a function that accepts a matrix of distances, D
, and returns a matrix of similarities, S
. The function should be monotonic decreasing in the range of distance values. S
must have the same size as D
, with S(i,j)
being the similarity computed based on D(i,j)
.
Example: 'Transformation',@myTransform
Output Arguments
leafOrder
— Optimal leaf order
vector
Optimal leaf order, returned as a length-M vector, where M is the number of leaves. leafOrder
is a permutation of the vector 1:M
, giving an optimal leaf ordering based on the specified distances and similarity transformation.
References
[1] Bar-Joseph, Z., Gifford, D.K., and Jaakkola, T.S. (2001). "Fast optimal leaf ordering for hierarchical clustering." Bioinformatics Vol. 17, Suppl 1:S22–9. PMID: 11472989.
Version History
Introduced in R2012b
See Also
dendrogram
| linkage
| pdist
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)