Auto-CVI-Tool, An Automatic Cluster Validity Index Toolbox

Version 1.0.0 (131.9 KB) by Farhad Abedinzadeh

Automatic toolbox for Cluster Validity Indexes (CVI) to determine the number of clusters automatically

https://github.com/farhadabedinzadeh/Auto-CVI-Tool

Updated October 6, 2022

`Auto-CVI-Tool`

An Automatic Toolbox for Cluster Validity Indexes (CVI)

Cluster analysis involves identifying the optimal number of clusters and the natural division of the data into them. A cluster validity index (CVI) is a simple technique for estimating the number of clusters. Many CVIs have been proposed in the literature, each defined in terms of intra-cluster cohesiveness and inter-cluster separation, so it is crucial to identify the situations where each CVI works well and where it is limited. This toolbox provides 28 CVIs for estimating the number of clusters. It is designed to be user-friendly and requires no coding knowledge: you can compare the 28 CVIs and visualize the results side by side without writing a single line of code. Once the data is loaded, each parameter is either selected by the user or left at its default setting.

Part of the paper listed as reference (1) was used in developing this toolbox. If you use any part of the toolbox, I would appreciate a citation to both that article and to myself.

A cluster validity index (CVI) estimates the quality of a clustering solution by defining a relationship between intra-cluster cohesiveness (within-group scatter) and inter-cluster separation (between-group scatter). Table1 summarizes the 28 CVIs included in this toolbox. Each CVI is identified by an acronym in the table, followed by an up arrow ↑ or a down arrow ↓ to indicate whether the index is to be maximized or minimized, respectively.

`Table1`

No.	Index	Full Name & Acronym	Min/Max
1	chindex	Calinski-Harabasz index (ch).	`↑`
2	cindex	C index (cind).	`↓`
3	copindex	COP index (cop).	`↓`
4	csindex	CS index (cs).	`↓`
5	cvddindex	Index based on density-involved distance (cvdd).	`↑`
6	cvnnindex	Index based on nearest neighbors (cvnn).	`↓`
7	dbindex	Davies-Bouldin index (db).	`↓`
8	db2index	Enhanced Davies-Bouldin index (db2).	`↓`
9	dbcvindex	Density-based index (dbcv).	`↑`
10	dunnindex	Dunn index (dunn).	`↑`
11	gd31index	Dunn index variant 3,1 (gd31).	`↑`
12	gd33index	Dunn index variant 3,3 (gd33).	`↑`
13	gd41index	Dunn index variant 4,1 (gd41).	`↑`
14	gd43index	Dunn index variant 4,3 (gd43).	`↑`
15	gd51index	Dunn index variant 5,1 (gd51).	`↑`
16	gd53index	Dunn index variant 5,3 (gd53).	`↑`
17	lccvindex	Index based on local cores (lccv).	`↑`
18	pbmindex	PBM index (pbm).	`↑`
19	sdbwindex	S_Dbw validity index (sdbw).	`↓`
20	sfindex	Score Function index (sf).	`↑`
21	silindex	Silhouette index (sil).	`↑`
22	ssddindex	Index based on shapes, sizes, densities, and separation distances (ssdd).	`↓`
23	svindex	SV index (sv).	`↑`
24	symindex	Symmetry index (sym).	`↑`
25	symdbindex	Davies-Bouldin index based on symmetry (sdb).	`↓`
26	symdunnindex	Dunn index based on symmetry (sdunn).	`↑`
27	wbindex	WB index (wb).	`↓`
28	xbindex	Xie-Beni index (xb).	`↓`
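
To make the cohesiveness/separation trade-off concrete, here is a minimal sketch that computes the Calinski-Harabasz index by hand on a toy data set. It uses the standard textbook formula rather than the toolbox's own `chindex` implementation, and the data and labels are made up for illustration.

```matlab
% Toy data: two well-separated 2-D groups (illustrative values only).
data   = [0 0; 0 1; 1 0; 1 1; 10 10; 10 11; 11 10; 11 11];
labels = [1; 1; 1; 1; 2; 2; 2; 2];   % a candidate partition with K = 2

n = size(data, 1);
K = max(labels);
overallMean = mean(data, 1);

W = 0;  % within-group scatter (cohesiveness)
B = 0;  % between-group scatter (separation)
for k = 1:K
    members  = data(labels == k, :);
    centroid = mean(members, 1);
    diffs    = members - centroid;            % deviations from the centroid
    W = W + sum(diffs(:).^2);
    B = B + size(members, 1) * sum((centroid - overallMean).^2);
end

% Calinski-Harabasz: larger values mean compact, well-separated clusters.
ch = (B / (K - 1)) / (W / (n - K));
fprintf('W = %.2f, B = %.2f, CH = %.2f\n', W, B, ch);
% prints: W = 4.00, B = 400.00, CH = 600.00
```

Because the index divides by `K - 1`, it is undefined for a single cluster, which is one reason many CVIs are only meaningful for K of at least 2.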

`How to Use?`

There are two scripts, KMeans_Evaluation.m and Hierarchichal_Evaluation.m; they evaluate clustering solutions produced by k-means and hierarchical clustering, respectively.

KMeans_Evaluation.m parameter settings

data : Input data (one observation per row)
- load data
DistanceKMeans : Distance Type for k-means clustering (Table2)
- ```
DistanceKMeans = DistKMeans;
```

Kmax : Maximum Number of Clusters

Kmax = 6; % maximum number of clusters to evaluate
clust = zeros(size(data,1),Kmax); % one column of cluster labels per value of k
for k=1:Kmax
   clust(:,k) = kmeans(data,k,'distance',DistanceKMeans);
end

CVI : Select from (Table1)

%% Select CVI
CVI = Select_CVI_KMeans;
% Evaluation of the clustering solutions
eva = evalcvi(clust,CVI, data);
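
The arrows in Table1 decide how the index values are read once they have been computed. The sketch below shows that rule in isolation. The layout of `eva` is not documented here, so it uses a hypothetical vector `scores` with one made-up value per K, not actual toolbox output.

```matlab
% Hypothetical index values for K = 1..6 (made-up numbers, not toolbox output).
% Many indices are undefined for K = 1, so that entry is NaN here.
scores   = [NaN 0.41 0.73 0.58 0.52 0.47];
maximize = true;   % true for "up arrow" indices (e.g. sil), false for "down arrow" ones (e.g. db)

% max and min skip NaN entries by default, so the K = 1 slot is ignored.
if maximize
    [~, bestK] = max(scores);
else
    [~, bestK] = min(scores);
end

% Position in the vector equals K because the sweep starts at K = 1.
fprintf('Estimated number of clusters: %d\n', bestK);
% prints: Estimated number of clusters: 3
```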

`Table2`

No.	Distance
2	sqeuclidean
3	cityblock
4	hamming
5	correlation
6	cosine

You may compare multiple CVIs simultaneously by executing the following code:

CVIs = Select_Multiple_CVI_KMeans;
Multiple_Result = Do_Multiple(CVIs,clust,data);

The results can also be visualized automatically.

Hierarchichal_Evaluation.m parameter settings

data : Input data (one observation per row)
- load data
HierarchichalMethod : Method for Hierarchical Cluster Tree (Table3)
- ```
 Z = linkage(data, HierarchichalMethod);
```

Kmax : Maximum Number of Clusters

Kmax = 6; % maximum number of clusters to evaluate
clust = zeros(size(data,1),Kmax); % one column of cluster labels per value of k
for k=1:Kmax
   clust(:,k) = cluster(Z, 'maxclust', k);
end

DistanceType : Type of pairwise distance between two sets of observations (Table4)
- ```
 DistanceType = Distance_PDIST2;
 DXX = pdist2(data,data,DistanceType);
```

CVI : Select from (Table1)

 CVI = Select_CVI_Hierarchichal;
 eva = evalcvi(clust,CVI, DXX);

If you wish to compare multiple CVIs, run the following code:

CVIs = Select_Multiple_CVI_Hierarchichal;
Multiple_Result = Do_Multiple(CVIs,clust,DXX);
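
In the hierarchical workflow the indices receive the pairwise distance matrix `DXX` instead of the raw data. As an illustration of how an index can be computed from distances alone, the sketch below evaluates the standard silhouette definition on toy data. It is not the toolbox's `silindex` code. The distance matrix is built with plain loops so the example needs no toolbox, and it should match `pdist2(data,data,'euclidean')`. It assumes every cluster has at least two points.

```matlab
% Toy data and a candidate partition (illustrative values only).
data   = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
labels = [1; 1; 1; 2; 2; 2];

% Pairwise Euclidean distance matrix.
n   = size(data, 1);
DXX = zeros(n);
for i = 1:n
    for j = 1:n
        DXX(i, j) = norm(data(i, :) - data(j, :));
    end
end

K = max(labels);
s = zeros(n, 1);
for i = 1:n
    own    = (labels == labels(i));
    own(i) = false;                         % exclude the point itself
    a = mean(DXX(i, own));                  % mean distance within own cluster
    b = inf;                                % mean distance to nearest other cluster
    for k = 1:K
        if k ~= labels(i)
            b = min(b, mean(DXX(i, labels == k)));
        end
    end
    s(i) = (b - a) / max(a, b);             % silhouette of point i, in [-1, 1]
end

silValue = mean(s);   % values near 1 indicate compact, well-separated clusters
fprintf('Mean silhouette: %.3f\n', silValue);
```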

`Table3`

No.	Method
2	average
3	centroid
4	complete
5	median
6	single
7	ward

`Table4`


euclidean	seuclidean
squaredeuclidean	cityblock
minkowski	jaccard
chebychev	mahalanobis
correlation	cosine
spearman	hamming

`Visualization`

References

(1) A. José-García and W. Gómez-Flores.
    A survey of cluster validity indices for automatic data clustering using differential evolution.
    The Genetic and Evolutionary Computation Conference (GECCO '21), Lille, France, 2021.
    DOI: 10.1145/3449639.3459341
(2) Farhad Abedinzadeh (2022). Auto-CVI-Tool (https://github.com/farhadabedinzadeh/Auto-CVI-Tool/releases/tag/v1.0.0),
    GitHub. Retrieved October 6, 2022.

Further Question

Cite As

Farhad Abedinzadeh (2024). Auto-CVI-Tool, An Automatic Cluster Validity Index Toolbox (https://github.com/farhadabedinzadeh/Auto-CVI-Tool/releases/tag/v1.0.0), GitHub. Retrieved November 22, 2024.

MATLAB Release Compatibility

Created with R2022a

Compatible with R2020b to R2022b

Platform Compatibility

Windows macOS Linux


Version	Published	Release Notes
1.0.0	October 6, 2022	

To view or report issues with this GitHub add-on, visit its GitHub repository.
