SOTAVerified

Semantic Textual Similarity

Semantic textual similarity (STS) is the task of determining how similar two pieces of text are in meaning. This can take the form of assigning a similarity score from 1 to 5. Related tasks include paraphrase identification and duplicate identification.
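
As a rough illustration of assigning a similarity score to a pair of texts, the sketch below computes a bag-of-words cosine similarity and maps it linearly onto the 1 to 5 range mentioned above. It is a toy baseline under stated assumptions, not the method of any paper listed on this page; the function names `bow_cosine` and `to_1_5_scale` and the linear mapping are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors (toy baseline)."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    # Dot product over the tokens the two texts share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)


def to_1_5_scale(cosine: float) -> float:
    """Assumed linear mapping from a cosine in [0, 1] to a score in [1, 5]."""
    return 1.0 + 4.0 * cosine


if __name__ == "__main__":
    s1 = "A man is playing a guitar."
    s2 = "A man plays the guitar."
    print(round(to_1_5_scale(bow_cosine(s1, s2)), 2))
```

Word-count overlap ignores synonyms and word order, so the models in the benchmark tables on this page typically compare learned sentence embeddings instead; the cosine step stays the same when the count vectors are swapped for embeddings.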

Image source: Learning Semantic Textual Similarity from Conversations

Papers

Showing 251–300 of 2381 papers

| Title | Status | Hype |
|---|---|---|
| Prompt Obfuscation for Large Language Models | | 0 |
| Cross-Lingual News Event Correlation for Stock Market Trend Prediction | | 0 |
| beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems | Code | 2 |
| Distilling Monolingual and Crosslingual Word-in-Context Representations | Code | 0 |
| Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization | Code | 0 |
| An Unsupervised Dialogue Topic Segmentation Model Based on Utterance Rewriting | | 0 |
| SubRegWeigh: Effective and Efficient Annotation Weighing with Subword Regularization | Code | 0 |
| Ethereum Fraud Detection via Joint Transaction Language Model and Graph Representation Learning | | 0 |
| Self-Judge: Selective Instruction Following with Alignment Self-Evaluation | Code | 0 |
| DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning | Code | 1 |
| LanguaShrink: Reducing Token Overhead with Psycholinguistics | | 0 |
| GMFL-Net: A Global Multi-geometric Feature Learning Network for Repetitive Action Counting | Code | 0 |
| FlowRetrieval: Flow-Guided Data Retrieval for Few-Shot Imitation Learning | Code | 0 |
| ConCSE: Unified Contrastive Learning and Augmentation for Code-Switched Embeddings | Code | 0 |
| Contrastive Learning Subspace for Text Clustering | | 0 |
| HTS-Attack: Heuristic Token Search for Jailbreaking Text-to-Image Models | | 0 |
| The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design | | 0 |
| GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation | Code | 0 |
| Improving embedding with contrastive fine-tuning on small datasets with expert-augmented scores | | 0 |
| Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge | Code | 1 |
| KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment | | 0 |
| Extracting Sentence Embeddings from Pretrained Transformer Models | | 0 |
| Unsupervised Episode Detection for Large-Scale News Events | Code | 1 |
| reCSE: Portable Reshaping Features for Sentence Embedding in Self-supervised Contrastive Learning | Code | 0 |
| Semantics or spelling? Probing contextual word embeddings with orthographic noise | Code | 0 |
| A Semi-supervised Multi-channel Graph Convolutional Network for Query Classification in E-commerce | | 0 |
| Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning | Code | 0 |
| Towards Flexible Evaluation for Generative Visual Question Answering | Code | 0 |
| Ontological Relations from Word Embeddings | | 0 |
| Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning | | 0 |
| Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent | | 0 |
| Urban Traffic Accident Risk Prediction Revisited: Regionality, Proximity, Similarity and Sparsity | Code | 0 |
| Enhancing Taobao Display Advertising with Multimodal Representations: Challenges, Approaches and Insights | | 0 |
| FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts | | 0 |
| NeuSemSlice: Towards Effective DNN Model Maintenance via Neuron-level Semantic Slicing | | 0 |
| Learning Robust Named Entity Recognizers From Noisy Data With Retrieval Augmentation | | 0 |
| A Large-Scale Sensitivity Analysis on Latent Embeddings and Dimensionality Reductions for Text Spatializations | Code | 0 |
| Robust Privacy Amidst Innovation with Large Language Models Through a Critical Assessment of the Risks | Code | 0 |
| Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models | Code | 0 |
| DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving | Code | 1 |
| Automatic Real-word Error Correction in Persian Text | | 0 |
| BERTer: The Efficient One | | 0 |
| Check-Eval: A Checklist-based Approach for Evaluating Text Quality | | 0 |
| Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation | Code | 2 |
| FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language based on Textually Represented Environments | Code | 0 |
| Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text | | 0 |
| Uncovering Semantics and Topics Utilized by Threat Actors to Deliver Malicious Attachments and URLs | | 0 |
| Label-anticipated Event Disentanglement for Audio-Visual Video Parsing | | 0 |
| HyCIR: Boosting Zero-Shot Composed Image Retrieval with Synthetic Labels | | 0 |
| Towards Bridging the Cross-modal Semantic Gap for Multi-modal Recommendation | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SMARTRoBERTa | Dev Pearson Correlation | 92.8 | | Unverified |
| 2 | DeBERTa (large) | Accuracy | 92.5 | | Unverified |
| 3 | SMART-BERT | Dev Pearson Correlation | 90 | | Unverified |
| 4 | MT-DNN-SMART | Pearson Correlation | 0.93 | | Unverified |
| 5 | StructBERTRoBERTa ensemble | Pearson Correlation | 0.93 | | Unverified |
| 6 | Mnet-Sim | Pearson Correlation | 0.93 | | Unverified |
| 7 | XLNet (single model) | Pearson Correlation | 0.93 | | Unverified |
| 8 | ALBERT | Pearson Correlation | 0.93 | | Unverified |
| 9 | T5-11B | Pearson Correlation | 0.93 | | Unverified |
| 10 | RoBERTa | Pearson Correlation | 0.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AnglE-UAE | Spearman Correlation | 84.54 | | Unverified |
| 2 | ST5-XXL | Spearman Correlation | 82.63 | | Unverified |
| 3 | ST5-Large | Spearman Correlation | 81.83 | | Unverified |
| 4 | ST5-XL | Spearman Correlation | 81.66 | | Unverified |
| 5 | ST5-Base | Spearman Correlation | 81.14 | | Unverified |
| 6 | MPNet-multilingual | Spearman Correlation | 80.73 | | Unverified |
| 7 | SGPT-5.8B-nli | Spearman Correlation | 80.53 | | Unverified |
| 8 | MPNet | Spearman Correlation | 80.28 | | Unverified |
| 9 | MiniLM-L12 | Spearman Correlation | 79.8 | | Unverified |
| 10 | SimCSE-BERT-sup | Spearman Correlation | 79.12 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MT-DNN-SMART | Accuracy | 93.7 | | Unverified |
| 2 | ALBERT | Accuracy | 93.4 | | Unverified |
| 3 | RoBERTa (ensemble) | Accuracy | 92.3 | | Unverified |
| 4 | BigBird | F1 | 91.5 | | Unverified |
| 5 | StructBERTRoBERTa ensemble | Accuracy | 91.5 | | Unverified |
| 6 | FLOATER-large | Accuracy | 91.4 | | Unverified |
| 7 | SMART | Accuracy | 91.3 | | Unverified |
| 8 | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | Accuracy | 91 | | Unverified |
| 9 | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 91 | | Unverified |
| 10 | SpanBERT | Accuracy | 90.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromCSE-RoBERTa-large (0.355B) | Spearman Correlation | 0.82 | | Unverified |
| 2 | PromptEOL+CSE+LLaMA-30B | Spearman Correlation | 0.82 | | Unverified |
| 3 | PromptEOL+CSE+OPT-13B | Spearman Correlation | 0.82 | | Unverified |
| 4 | SimCSE-RoBERTalarge | Spearman Correlation | 0.82 | | Unverified |
| 5 | PromptEOL+CSE+OPT-2.7B | Spearman Correlation | 0.81 | | Unverified |
| 6 | SentenceBERT | Spearman Correlation | 0.75 | | Unverified |
| 7 | SRoBERTa-NLI-base | Spearman Correlation | 0.74 | | Unverified |
| 8 | SRoBERTa-NLI-large | Spearman Correlation | 0.74 | | Unverified |
| 9 | Dino (STS/̄🦕) | Spearman Correlation | 0.74 | | Unverified |
| 10 | SBERT-NLI-large | Spearman Correlation | 0.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AnglE-LLaMA-7B | Spearman Correlation | 0.91 | | Unverified |
| 2 | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.91 | | Unverified |
| 3 | PromptEOL+CSE+LLaMA-30B | Spearman Correlation | 0.9 | | Unverified |
| 4 | PromptEOL+CSE+OPT-13B | Spearman Correlation | 0.9 | | Unverified |
| 5 | PromptEOL+CSE+OPT-2.7B | Spearman Correlation | 0.9 | | Unverified |
| 6 | PromCSE-RoBERTa-large (0.355B) | Spearman Correlation | 0.89 | | Unverified |
| 7 | Trans-Encoder-BERT-large-bi (unsup.) | Spearman Correlation | 0.89 | | Unverified |
| 8 | Trans-Encoder-BERT-large-cross (unsup.) | Spearman Correlation | 0.88 | | Unverified |
| 9 | Trans-Encoder-RoBERTa-large-cross (unsup.) | Spearman Correlation | 0.88 | | Unverified |
| 10 | SimCSE-RoBERTa-large | Spearman Correlation | 0.87 | | Unverified |
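
Most of the tables above report Pearson or Spearman correlation between predicted and gold similarity scores. The sketch below shows how the two coefficients could be computed with only the Python standard library; the helper names (`pearson`, `average_ranks`, `spearman`) and the sample scores are hypothetical and are not taken from any listed benchmark. Some rows appear to report the coefficient multiplied by 100 (for example 84.54 rather than 0.85), which this sketch does not do.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    gold = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical human scores
    pred = [0.1, 0.2, 0.4, 0.8, 1.6]   # hypothetical model scores
    print(pearson(gold, pred), spearman(gold, pred))
```

In this example the predictions preserve the gold ordering but are not linearly related to it, so the Spearman coefficient is essentially 1 while the Pearson coefficient is lower. That difference is one reason rank correlation is a common choice for embedding-based models, whose cosine scores are not calibrated to the annotation scale.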