SOTAVerified

Text Clustering

Text clustering groups a set of texts so that texts in the same group (called a cluster) are more similar to each other, in some sense, than to texts in other clusters. (Source: Adapted from Wikipedia)
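As a rough illustration of the definition above, the sketch below clusters a few short texts by bag-of-words cosine similarity. It is a toy example, not the method of any paper listed on this page: the function names (`vectorize`, `cosine`, `cluster_texts`), the greedy single-pass strategy, and the `threshold` value of 0.3 are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one text (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_texts(texts, threshold=0.3):
    """Greedy single-pass clustering (hypothetical helper).

    Each text joins the existing cluster whose centroid is most similar,
    provided that similarity reaches `threshold`; otherwise it starts a
    new cluster. Centroids are kept as summed term counts.
    """
    centroids = []
    labels = []
    for text in texts:
        vec = vectorize(text)
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best].update(vec)  # fold the text into the centroid
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


texts = [
    "the cat sat on the mat",
    "a cat lay on the mat",
    "stock prices fell sharply today",
    "stock markets fell again today",
]
labels = cluster_texts(texts)
print(labels)
```

The two "cat" sentences share enough words to land in one cluster and the two "stock" sentences in another. The approaches listed below generally replace the word-count vectors with learned sentence embeddings, but the notion of grouping by similarity is the same.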

Papers

Showing 1-25 of 123 papers

Title | Status | Hype
MTEB: Massive Text Embedding Benchmark | Code | 4
Text Clustering as Classification with LLMs | Code | 1
Large Language Models Enable Few-Shot Clustering | Code | 1
ClusterLLM: Large Language Models as a Guide for Text Clustering | Code | 1
Robust Representation Learning with Reliable Pseudo-labels Generation via Self-Adaptive Optimal Transport for Short Text Clustering | Code | 1
DeepLens: Interactive Out-of-distribution Data Detection in NLP Models | Code | 1
Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases | Code | 1
EASE: Entity-Aware Contrastive Learning of Sentence Embedding | Code | 1
Proposition-Level Clustering for Multi-Document Summarization | Code | 1
Proposition-Level Clustering for Multi-Document Summarization | Code | 1
Supporting Clustering with Contrastive Learning | Code | 1
Discovering New Intents with Deep Aligned Clustering | Code | 1
ComStreamClust: a communicative multi-agent approach to text clustering in streaming data | Code | 1
Dissimilarity Mixture Autoencoder for Deep Clustering | Code | 1
Neural Topic Modeling with Bidirectional Adversarial Training | Code | 1
Enhancement of Short Text Clustering by Iterative Classification | Code | 1
Short Text Clustering via Convolutional Neural Networks | Code | 1
CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass | Code | 0
Moving Past Single Metrics: Exploring Short-Text Clustering Across Multiple Resolutions | - | 0
Advanced Text Analytics -- Graph Neural Network for Fake News Detection in Social Media | - | 0
k-LLMmeans: Scalable, Stable, and Interpretable Text Clustering via LLM-based Centroids | - | 0
Reliable Pseudo-labeling via Optimal Transport with Attention for Short Text Clustering | Code | 0
Discriminative Representation learning via Attention-Enhanced Contrastive Learning for Short Text Clustering | Code | 0
LITA: An Efficient LLM-assisted Iterative Topic Augmentation Framework | - | 0
Dial-In LLM: Human-Aligned LLM-in-the-loop Intent Clustering for Customer Service Dialogues | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ST5-XXL | V-Measure | 43.71 | - | Unverified
2 | MPNet | V-Measure | 43.69 | - | Unverified
3 | GTR-XXL | V-Measure | 42.42 | - | Unverified
4 | MiniLM-L6 | V-Measure | 42.35 | - | Unverified
5 | ST5-XL | V-Measure | 42.34 | - | Unverified
6 | MiniLM-L12 | V-Measure | 41.81 | - | Unverified
7 | ST5-Large | V-Measure | 41.65 | - | Unverified
8 | GTR-Large | V-Measure | 41.6 | - | Unverified
9 | GTR-XL | V-Measure | 41.51 | - | Unverified
10 | Contriever | V-Measure | 41.1 | - | Unverified
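
For readers unfamiliar with the V-Measure metric used in the table above, here is a minimal sketch of how it is commonly computed: the harmonic mean of homogeneity (each cluster contains only one class) and completeness (each class falls into one cluster), both derived from conditional entropies. The function names are illustrative, and the sketch returns a value in the range 0 to 1; the table's scores appear to be on a 0 to 100 scale, which is an assumption about this page rather than something it states.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) from two paired label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(true_labels, pred_labels):
    """V-Measure with equal weight on homogeneity and completeness."""
    h_true = entropy(true_labels)
    h_pred = entropy(pred_labels)
    # Homogeneity: 1 - H(class | cluster) / H(class)
    homogeneity = (
        1.0 if h_true == 0
        else 1.0 - conditional_entropy(true_labels, pred_labels) / h_true
    )
    # Completeness: 1 - H(cluster | class) / H(cluster)
    completeness = (
        1.0 if h_pred == 0
        else 1.0 - conditional_entropy(pred_labels, true_labels) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Cluster IDs are arbitrary, so a relabeled perfect clustering still scores 1.
perfect = v_measure([0, 0, 1, 1], [1, 1, 0, 0])
# Putting everything in one cluster is complete but not homogeneous.
collapsed = v_measure([0, 0, 1, 1], [0, 0, 0, 0])
```

Because the score depends only on how the two labelings co-occur, it does not require matching predicted cluster IDs to gold class IDs, which is one reason it is convenient for comparing embedding models on clustering tasks.
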

# | Model | Metric | Claimed | Verified | Status
1 | G-BAT | Accuracy | 41.25 | - | Unverified
2 | BAT | Accuracy | 35.66 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Vector Space Model | Related Headlines | 85 | - | Unverified