
A High-Performance External Validity Index for Clustering with a Large Number of Clusters

2024-09-22

Mohammad Yasin Karbasian, Ramin Javadi


Abstract

This paper introduces the Stable Matching Based Pairing (SMBP) algorithm, a high-performance external validity index for evaluating clusterings of large-scale datasets with a large number of clusters. SMBP uses the stable matching framework to pair clusters across different clustering methods, reducing the computational complexity to O(N^2), compared with O(N^3) for traditional Maximum Weighted Matching (MWM). In comprehensive evaluations on real-world and synthetic datasets, SMBP achieves accuracy comparable to MWM at a lower computational cost. It is effective on balanced, unbalanced, and large-scale datasets with a large number of clusters, which makes it a scalable and practical solution for modern clustering tasks. SMBP is also straightforward to implement within machine learning frameworks such as PyTorch and TensorFlow, making it a robust tool for big data applications. Extensive experiments validate the algorithm and show its potential as an alternative to existing methods such as Maximum Match Measure (MMM) and Centroid Ratio (CR).
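To make the pairing idea concrete, the following is a minimal sketch of how clusters from two labelings could be paired with a Gale-Shapley style stable matching, where each cluster ranks the clusters of the other labeling by the number of points they share. It is not the authors' implementation: the function names (`contingency`, `stable_pairing`, `matched_fraction`), the use of raw overlap counts as preferences, and the final score (fraction of points covered by the matched pairs) are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import Counter


def contingency(labels_a, labels_b):
    """Count how many points each pair of clusters (a, b) shares."""
    return Counter(zip(labels_a, labels_b))


def stable_pairing(labels_a, labels_b):
    """Pair clusters of A with clusters of B via Gale-Shapley.

    Assumption: preferences on both sides are given by overlap counts.
    """
    overlap = contingency(labels_a, labels_b)
    clusters_a = sorted(set(labels_a))
    clusters_b = sorted(set(labels_b))
    # Each A-cluster ranks B-clusters by decreasing overlap
    # (the sort is stable, so ties keep label order).
    prefs = {
        a: sorted(clusters_b, key=lambda b: -overlap[(a, b)])
        for a in clusters_a
    }
    next_choice = {a: 0 for a in clusters_a}  # index of next proposal
    engaged = {}  # maps a B-cluster to its current A-cluster
    free = list(clusters_a)
    while free:
        a = free.pop()
        if next_choice[a] >= len(prefs[a]):
            continue  # a has proposed to every B-cluster; stays unmatched
        b = prefs[a][next_choice[a]]
        next_choice[a] += 1
        current = engaged.get(b)
        if current is None:
            engaged[b] = a
        elif overlap[(a, b)] > overlap[(current, b)]:
            engaged[b] = a        # b prefers the new proposer
            free.append(current)  # the displaced cluster proposes again
        else:
            free.append(a)        # b rejects a; a tries its next choice
    return {a: b for b, a in engaged.items()}


def matched_fraction(labels_a, labels_b):
    """Hypothetical score: share of points lying in matched cluster pairs."""
    overlap = contingency(labels_a, labels_b)
    pairing = stable_pairing(labels_a, labels_b)
    return sum(overlap[(a, b)] for a, b in pairing.items()) / len(labels_a)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 2, 2, 2]
    pred = [1, 1, 1, 0, 0, 2, 2, 0]
    print(stable_pairing(truth, pred))
    print(matched_fraction(truth, pred))
```

With K clusters on each side, the proposal loop makes at most K^2 proposals, which is where a quadratic bound for stable matching comes from. This sketch additionally sorts each preference list, so as written it costs O(K^2 log K) after the contingency counts are built. An optimal assignment solver would instead maximize the total overlap, typically at cubic cost in K.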
