
Clustering for Binary Featured Datasets

2018-10-25 · Transactions on Engineering Technologies 2018

Peter Taraba


Abstract

Clustering is one of the most important concepts in unsupervised machine learning. Although numerous clustering algorithms exist, many of them, including the popular k-means algorithm, require the number of clusters to be specified in advance, which is a major drawback. Some studies use the silhouette coefficient to determine the optimal number of clusters. In this study, we introduce a novel algorithm called Powered Outer Probabilistic Clustering, show how it works through back-propagation (starting with many clusters and ending with an optimal number of clusters), and show that the algorithm converges to the expected (optimal) number of clusters on theoretical examples.
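
The silhouette-based selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's Powered Outer Probabilistic Clustering algorithm, which the abstract does not specify; it only shows the baseline idea of trying several cluster counts and keeping the one with the highest mean silhouette coefficient. Several things here are assumptions made for illustration: k-modes with Hamming distance is swapped in for k-means because the features are binary, the farthest-first initialization is chosen only to keep the run deterministic, and the names `hamming`, `k_modes`, `silhouette`, `choose_k`, and the toy dataset `POINTS` are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(points, k, iters=20):
    """Illustrative k-modes clustering (assumed stand-in for k-means on binary data)."""
    # Farthest-first initialization: deterministic, assumed for this sketch.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(hamming(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center (ties go to the lowest index).
        new_labels = [min(range(k), key=lambda c: hamming(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update each center to the per-feature majority value of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(int(2 * sum(col) >= len(members)) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 in this sketch
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(hamming(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(hamming(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def choose_k(points, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, k_modes(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: three groups of binary vectors, each with one noisy member.
POINTS = [
    (1, 1, 0, 0, 0, 0), (1, 1, 0, 0, 0, 0), (1, 1, 0, 0, 0, 1),
    (0, 0, 1, 1, 0, 0), (0, 0, 1, 1, 0, 0), (1, 0, 1, 1, 0, 0),
    (0, 0, 0, 0, 1, 1), (0, 0, 0, 0, 1, 1), (0, 0, 1, 0, 1, 1),
]

best_k, scores = choose_k(POINTS, candidates=(2, 3, 4))
print(best_k, scores)
```

Note that this baseline still has to run the clustering once per candidate value of k, which is the cost that an approach starting with many clusters and reducing them, as described in the abstract, aims to avoid.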
