SOTAVerified

Semantic Textual Similarity

Semantic textual similarity (STS) is the task of determining how similar two pieces of text are in meaning. A common formulation assigns each text pair a graded score, for example from 1 to 5. Related tasks include paraphrase identification and duplicate detection.
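To make the 1-to-5 scoring concrete, here is a minimal sketch that scores a text pair with cosine similarity over word counts. The bag-of-words representation, the linear mapping onto the 1-to-5 range, and the function names are all assumptions chosen for illustration; the models listed on this page use learned sentence embeddings instead.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a deliberately crude representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine between two sparse count vectors; 0.0 if either vector is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_score(text_a: str, text_b: str) -> float:
    """Map cosine similarity in [0, 1] linearly onto a 1-to-5 scale (assumed mapping)."""
    cos = cosine_similarity(bag_of_words(text_a), bag_of_words(text_b))
    return 1.0 + 4.0 * cos


if __name__ == "__main__":
    # Illustrative sentence pairs, not taken from any benchmark.
    print(similarity_score("A man is playing a guitar.", "A man plays the guitar."))
    print(similarity_score("A man is playing a guitar.", "The stock market fell today."))
```

Word overlap misses paraphrases that share no vocabulary, which is one reason the task is usually approached with learned embeddings.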

Image source: Learning Semantic Textual Similarity from Conversations

Papers

Showing 451-500 of 2381 papers

| Title | Status | Hype |
|---|---|---|
| Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs | | 0 |
| Quantifying Positional Biases in Text Embedding Models | Code | 0 |
| Single-View Graph Contrastive Learning with Soft Neighborhood Awareness | Code | 0 |
| jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images | | 0 |
| Multilingual LLMs Inherently Reward In-Language Time-Sensitive Semantic Alignment for Low-Resource Languages | Code | 0 |
| Generating Knowledge Graphs from Large Language Models: A Comparative Study of GPT-4, LLaMA 2, and BERT | | 0 |
| SiReRAG: Indexing Similar and Related Information for Multihop Reasoning | | 0 |
| Detecting Redundant Health Survey Questions Using Language-agnostic BERT Sentence Embedding (LaBSE) | | 0 |
| Human Variability vs. Machine Consistency: A Linguistic Analysis of Texts Generated by Humans and Large Language Models | | 0 |
| VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding | | 0 |
| Interpretable Company Similarity with Sparse Autoencoders | | 0 |
| TSCheater: Generating High-Quality Tibetan Adversarial Texts via Visual Similarity | Code | 0 |
| Quantifying perturbation impacts for large language models | | 0 |
| Generative Semantic Communication for Joint Image Transmission and Segmentation | | 0 |
| Isolating authorship from content with semantic embeddings and contrastive learning | | 0 |
| In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models | | 0 |
| BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques | | 0 |
| FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting | | 0 |
| HNCSE: Advancing Sentence Embeddings via Hybrid Contrastive Learning with Hard Negatives | | 0 |
| Advancing Large Language Models for Spatiotemporal and Semantic Association Mining of Similar Environmental Events | | 0 |
| Membership Inference Attack against Long-Context Large Language Models | | 0 |
| Everyone deserves their voice to be heard: Analyzing Predictive Gender Bias in ASR Models Applied to Dutch Speech Data | | 0 |
| Leveraging LLMs to Enable Natural Language Search on Go-to-market Platforms | | 0 |
| Securing from Unseen: Connected Pattern Kernels (CoPaK) for Zero-Day Intrusion Detection | | 0 |
| GASE: Generatively Augmented Sentence Encoding | | 0 |
| RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation | | 0 |
| Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation | | 0 |
| A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients | | 0 |
| Continual Audio-Visual Sound Separation | Code | 0 |
| HACD: Harnessing Attribute Semantics and Mesoscopic Structure for Community Detection | Code | 0 |
| NLP and Education: using semantic similarity to evaluate filled gaps in a large-scale Cloze test in the classroom | | 0 |
| FedDTPT: Federated Discrete and Transferable Prompt Tuning for Black-Box Large Language Models | | 0 |
| Phonology-Guided Speech-to-Speech Translation for African Languages | | 0 |
| Decoupling Semantic Similarity from Spatial Alignment for Neural Networks | Code | 0 |
| EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | | 0 |
| BIS: NL2SQL Service Evaluation Benchmark for Business Intelligence Scenarios | Code | 0 |
| Conjuring Semantic Similarity | | 0 |
| Toeing the Party Line: Election Manifestos as a Key to Understand Political Discourse on Twitter | Code | 0 |
| Improving General Text Embedding Model: Tackling Task Conflict and Data Imbalance through Model Merging | | 0 |
| Few-Shot Joint Multimodal Entity-Relation Extraction via Knowledge-Enhanced Cross-modal Prompt Model | | 0 |
| Optimizing Retrieval-Augmented Generation with Elasticsearch for Enhanced Question-Answering Systems | | 0 |
| Boosting Imperceptibility of Stable Diffusion-based Adversarial Examples Generation with Momentum | Code | 0 |
| SemSim: Revisiting Weak-to-Strong Consistency from a Semantic Similarity Perspective for Semi-supervised Medical Image Segmentation | | 0 |
| PromptExp: Multi-granularity Prompt Explanation of Large Language Models | | 0 |
| Back-of-the-Book Index Automation for Arabic Documents | | 0 |
| Improving Legal Entity Recognition Using a Hybrid Transformer Model and Semantic Filtering Approach | | 0 |
| VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks | | 0 |
| Graded Suspiciousness of Adversarial Texts to Human | | 0 |
| Metadata-based Data Exploration with Retrieval-Augmented Generation for Large Language Models | | 0 |
| Evaluating Deduplication Techniques for Economic Research Paper Titles with a Focus on Semantic Similarity using NLP and LLMs | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SMARTRoBERTa | Dev Pearson Correlation | 92.8 | | Unverified |
| 2 | DeBERTa (large) | Accuracy | 92.5 | | Unverified |
| 3 | SMART-BERT | Dev Pearson Correlation | 90 | | Unverified |
| 4 | MT-DNN-SMART | Pearson Correlation | 0.93 | | Unverified |
| 5 | StructBERTRoBERTa ensemble | Pearson Correlation | 0.93 | | Unverified |
| 6 | Mnet-Sim | Pearson Correlation | 0.93 | | Unverified |
| 7 | XLNet (single model) | Pearson Correlation | 0.93 | | Unverified |
| 8 | ALBERT | Pearson Correlation | 0.93 | | Unverified |
| 9 | T5-11B | Pearson Correlation | 0.93 | | Unverified |
| 10 | RoBERTa | Pearson Correlation | 0.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AnglE-UAE | Spearman Correlation | 84.54 | | Unverified |
| 2 | ST5-XXL | Spearman Correlation | 82.63 | | Unverified |
| 3 | ST5-Large | Spearman Correlation | 81.83 | | Unverified |
| 4 | ST5-XL | Spearman Correlation | 81.66 | | Unverified |
| 5 | ST5-Base | Spearman Correlation | 81.14 | | Unverified |
| 6 | MPNet-multilingual | Spearman Correlation | 80.73 | | Unverified |
| 7 | SGPT-5.8B-nli | Spearman Correlation | 80.53 | | Unverified |
| 8 | MPNet | Spearman Correlation | 80.28 | | Unverified |
| 9 | MiniLM-L12 | Spearman Correlation | 79.8 | | Unverified |
| 10 | SimCSE-BERT-sup | Spearman Correlation | 79.12 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MT-DNN-SMART | Accuracy | 93.7 | | Unverified |
| 2 | ALBERT | Accuracy | 93.4 | | Unverified |
| 3 | RoBERTa (ensemble) | Accuracy | 92.3 | | Unverified |
| 4 | BigBird | F1 | 91.5 | | Unverified |
| 5 | StructBERTRoBERTa ensemble | Accuracy | 91.5 | | Unverified |
| 6 | FLOATER-large | Accuracy | 91.4 | | Unverified |
| 7 | SMART | Accuracy | 91.3 | | Unverified |
| 8 | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | Accuracy | 91 | | Unverified |
| 9 | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 91 | | Unverified |
| 10 | SpanBERT | Accuracy | 90.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PromCSE-RoBERTa-large (0.355B) | Spearman Correlation | 0.82 | | Unverified |
| 2 | PromptEOL+CSE+LLaMA-30B | Spearman Correlation | 0.82 | | Unverified |
| 3 | PromptEOL+CSE+OPT-13B | Spearman Correlation | 0.82 | | Unverified |
| 4 | SimCSE-RoBERTalarge | Spearman Correlation | 0.82 | | Unverified |
| 5 | PromptEOL+CSE+OPT-2.7B | Spearman Correlation | 0.81 | | Unverified |
| 6 | SentenceBERT | Spearman Correlation | 0.75 | | Unverified |
| 7 | SRoBERTa-NLI-base | Spearman Correlation | 0.74 | | Unverified |
| 8 | SRoBERTa-NLI-large | Spearman Correlation | 0.74 | | Unverified |
| 9 | Dino (STS/̄🦕) | Spearman Correlation | 0.74 | | Unverified |
| 10 | SBERT-NLI-large | Spearman Correlation | 0.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AnglE-LLaMA-7B | Spearman Correlation | 0.91 | | Unverified |
| 2 | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.91 | | Unverified |
| 3 | PromptEOL+CSE+LLaMA-30B | Spearman Correlation | 0.9 | | Unverified |
| 4 | PromptEOL+CSE+OPT-13B | Spearman Correlation | 0.9 | | Unverified |
| 5 | PromptEOL+CSE+OPT-2.7B | Spearman Correlation | 0.9 | | Unverified |
| 6 | PromCSE-RoBERTa-large (0.355B) | Spearman Correlation | 0.89 | | Unverified |
| 7 | Trans-Encoder-BERT-large-bi (unsup.) | Spearman Correlation | 0.89 | | Unverified |
| 8 | Trans-Encoder-BERT-large-cross (unsup.) | Spearman Correlation | 0.88 | | Unverified |
| 9 | Trans-Encoder-RoBERTa-large-cross (unsup.) | Spearman Correlation | 0.88 | | Unverified |
| 10 | SimCSE-RoBERTa-large | Spearman Correlation | 0.87 | | Unverified |
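
Most of the results above are reported as Pearson or Spearman correlation between model-predicted and human-assigned similarity scores, with some entries apparently given as fractions (0.93) and others as percentages (92.8). The following is a minimal sketch of how the two correlations can be computed in plain Python; the function names and the sample scores are hypothetical, and a real evaluation would more likely rely on an established statistics library.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical gold and predicted scores, made up for illustration only.
    gold = [1.0, 2.5, 3.0, 4.2, 5.0]
    predicted = [0.10, 0.35, 0.30, 0.80, 0.95]
    print("Pearson: ", pearson(gold, predicted))
    print("Spearman:", spearman(gold, predicted))
```

Pearson measures linear agreement, while Spearman only rewards getting the ordering of pairs right, so a model whose scores are monotonic but non-linear in the gold scores can reach a perfect Spearman correlation with a lower Pearson correlation.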