GIT: Clustering Based on Graph of Intensity Topology
Zhangyang Gao, Haitao Lin, Cheng Tan, Lirong Wu, Stan Z. Li
Code
- github.com/gaozhangyang/git (official, in paper; ★ 10)
- github.com/gaozhangyang/DGC (PyTorch; ★ 10)
- github.com/2024-MindSpore-1/Code2/tree/main/model-1/git (MindSpore; ★ 0)
- github.com/yangyucheng000/University/tree/main/model-2/git (MindSpore; ★ 0)
Abstract
Accuracy, Robustness to noise and scale, Interpretability, Speed, and Ease of use (ARISE) are crucial requirements of a good clustering algorithm. However, achieving these goals simultaneously is challenging, and most advanced approaches focus on only some of them. To address all of these aspects together, we propose a novel clustering algorithm, GIT (Clustering Based on Graph of Intensity Topology). GIT considers both local and global data structures: it first forms local clusters based on the intensity peaks of samples, and then estimates the global topological graph (topo-graph) between these local clusters. We use the Wasserstein distance between the predicted and prior class proportions to automatically cut noisy edges in the topo-graph and merge connected local clusters into final clusters. We compare GIT with seven competing algorithms on five synthetic datasets and nine real-world datasets. With fast local cluster detection, robust topo-graph construction, and accurate edge-cutting, GIT shows attractive ARISE performance and significantly exceeds other non-convex clustering methods. For example, GIT outperforms its counterparts by about 10% (F1-score) on MNIST and FashionMNIST. Code is available at https://github.com/gaozhangyang/GIT.
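The edge-cutting step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`proportion_distance`, `merge_clusters`, `best_threshold`), the edge-weight convention (higher means more strongly connected), and the choice of a 1-D Wasserstein-1 distance over rank-ordered proportions are all assumptions made for illustration. The idea is to try each candidate threshold, drop the edges below it, merge the local clusters that remain connected, and keep the threshold whose cluster proportions are closest to the prior.

```python
from itertools import accumulate


def proportion_distance(pred, prior):
    """Assumed distance: 1-D Wasserstein-1 between two proportion vectors,
    each sorted in descending order and placed on ranks 0, 1, 2, ..."""
    n = max(len(pred), len(prior))
    p = sorted(pred, reverse=True) + [0.0] * (n - len(pred))
    q = sorted(prior, reverse=True) + [0.0] * (n - len(prior))
    # On a unit-spaced 1-D support, W1 is the sum of absolute CDF differences.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def merge_clusters(sizes, edges, threshold):
    """Keep edges with weight >= threshold, merge connected local clusters
    with union-find, and return the proportions of the merged clusters."""
    parent = list(range(len(sizes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for u, v, w in edges:
        if w >= threshold:
            parent[find(u)] = find(v)

    totals = {}
    for i, size in enumerate(sizes):
        root = find(i)
        totals[root] = totals.get(root, 0) + size
    n = sum(sizes)
    return [t / n for t in totals.values()]


def best_threshold(sizes, edges, prior):
    """Pick the candidate threshold whose merged proportions best match the prior."""
    candidates = sorted({w for _, _, w in edges})
    # One extra candidate above every weight, so "cut all edges" is also tried.
    candidates.append(max(candidates, default=0.0) + 1.0)
    return min(
        candidates,
        key=lambda t: proportion_distance(merge_clusters(sizes, edges, t), prior),
    )


# Hypothetical toy input: four local clusters and three weighted edges.
sizes = [40, 10, 30, 20]
edges = [(0, 1, 0.9), (1, 2, 0.1), (2, 3, 0.8)]
prior = [0.5, 0.5]
threshold = best_threshold(sizes, edges, prior)
```

In this toy input the weak edge between local clusters 1 and 2 is the only one cut, which leaves two final clusters of equal size that match the assumed prior exactly.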
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Fashion-MNIST | AE+GIT | ARI | 49 | — | Unverified |
| Fashion-MNIST | k-Means++ | ARI | 35 | — | Unverified |
| Fashion-MNIST | Spectral Clustering | ARI | 34 | — | Unverified |
| Fashion-MNIST | GIT | ARI | 32 | — | Unverified |
| Fashion-MNIST | SpectACI | ARI | 29 | — | Unverified |
| Fashion-MNIST | QuickShiftPP | ARI | 16 | — | Unverified |
| MNIST | AE+GIT | ARI | 77 | — | Unverified |
| MNIST | GIT | ARI | 42 | — | Unverified |
| MNIST | k-Means++ | ARI | 36 | — | Unverified |
| MNIST | Spectral Clustering | ARI | 33 | — | Unverified |
| MNIST | SpectACI | ARI | 17 | — | Unverified |
| MNIST | QuickShiftPP | ARI | 13 | — | Unverified |
| Olivetti face | GIT | F1-score | 62 | — | Unverified |
| Olivetti face | QuickShiftPP | F1-score | 60 | — | Unverified |
| Olivetti face | k-Means++ | F1-score | 52 | — | Unverified |
| Olivetti face | Spectral Clustering | F1-score | 37 | — | Unverified |
| Olivetti face | SpectACI | F1-score | 34 | — | Unverified |
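For readers who want to check ARI figures like those in the table, the following is a minimal pure-Python sketch of the standard adjusted Rand index, computed from the contingency counts of two labelings. The function name `adjusted_rand_index` is illustrative, and the table's values appear to be scaled by 100, which is an assumption here rather than something the page states.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # no sample pairs to disagree on

    # Pair counts within contingency cells, true classes, and predicted clusters.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (sum_cells - expected) / (max_index - expected)


# The score ignores label names: a relabeled but identical partition is a perfect match.
perfect = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that splits every true class evenly scores below zero.
worse_than_chance = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index is corrected for chance, it can go negative when the predicted clusters agree with the ground truth less than a random assignment would be expected to.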