Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych
Code
- github.com/UKPLab/sentence-transformers (official, in paper; pytorch; ★ 0)
- github.com/princeton-nlp/SimCSE (pytorch; ★ 3,646)
- github.com/InsaneLife/dssm (tf; ★ 667)
- github.com/law-ai/summarization (pytorch; ★ 220)
- github.com/valdecy/pybibx (tf; ★ 195)
- github.com/BinWang28/BERT_Sentence_Embedding (pytorch; ★ 182)
- github.com/BinWang28/SBERT-WK-Sentence-Embedding (pytorch; ★ 182)
- github.com/BM-K/KoSentenceBERT_ETRI (pytorch; ★ 162)
- github.com/BM-K/KoSentenceBERT (pytorch; ★ 162)
- github.com/BM-K/KoSentenceBERT_SKT (pytorch; ★ 142)
Abstract
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences be fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy of BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
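The "about 50 million" figure in the abstract is the number of unordered pairs among 10,000 sentences, n(n-1)/2. The sketch below reproduces that arithmetic and shows the kind of cosine-similarity search that becomes possible once each sentence has its own fixed-size embedding. It is a minimal illustration, not the authors' implementation: the function names and the toy 3-dimensional vectors are assumptions made up for this example, and real SBERT embeddings would come from the trained model.

```python
import math
from itertools import combinations


def num_pairs(n: int) -> int:
    """Number of unordered sentence pairs a pairwise model would have to score."""
    return n * (n - 1) // 2


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_pair(embeddings):
    """Return (i, j, score) for the pair with the highest cosine similarity."""
    best = None
    for i, j in combinations(range(len(embeddings)), 2):
        score = cosine(embeddings[i], embeddings[j])
        if best is None or score > best[2]:
            best = (i, j, score)
    return best


# Toy vectors standing in for sentence embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(num_pairs(10_000))                      # 49995000, i.e. about 50 million
print(most_similar_pair(toy_embeddings)[:2])  # (0, 1)
```

The number of pairs to compare is the same in both settings. The difference is cost per comparison: with precomputed embeddings, the encoder runs once per sentence (10,000 passes instead of roughly 50 million), and each pairwise comparison is only a cheap vector operation.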
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| SICK | SBERT-NLI-large | Spearman Correlation | 0.74 | — | Unverified |
| SICK | SBERT-NLI-base | Spearman Correlation | 0.73 | — | Unverified |
| SICK | SentenceBERT | Spearman Correlation | 0.75 | — | Unverified |
| SICK | SRoBERTa-NLI-base | Spearman Correlation | 0.74 | — | Unverified |
| SICK | SRoBERTa-NLI-large | Spearman Correlation | 0.74 | — | Unverified |
| STS12 | SRoBERTa-NLI-large | Spearman Correlation | 0.75 | — | Unverified |
| STS13 | SBERT-NLI-large | Spearman Correlation | 0.78 | — | Unverified |
| STS14 | SBERT-NLI-large | Spearman Correlation | 0.75 | — | Unverified |
| STS15 | SRoBERTa-NLI-large | Spearman Correlation | 0.82 | — | Unverified |
| STS16 | SRoBERTa-NLI-large | Spearman Correlation | 0.77 | — | Unverified |
| STS Benchmark | SBERT-NLI-base | Spearman Correlation | 0.77 | — | Unverified |
| STS Benchmark | SRoBERTa-NLI-base | Spearman Correlation | 0.78 | — | Unverified |
| STS Benchmark | SRoBERTa-NLI-STSb-large | Spearman Correlation | 0.86 | — | Unverified |
| STS Benchmark | SBERT-STSb-base | Spearman Correlation | 0.85 | — | Unverified |
| STS Benchmark | SBERT-NLI-large | Spearman Correlation | 0.79 | — | Unverified |
| STS Benchmark | SBERT-STSb-large | Spearman Correlation | 0.84 | — | Unverified |
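The metric in the table, Spearman correlation, measures how well the ranking of predicted similarity scores agrees with the ranking of gold similarity labels. A minimal sketch of the computation follows, assuming the predictions are cosine similarities between sentence embeddings. The helper names and the five score pairs are hypothetical and are not taken from any of the datasets listed.

```python
import math


def ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over any run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for illustration only.
predicted_cosine = [0.91, 0.15, 0.62, 0.40, 0.78]
gold_scores = [4.8, 0.5, 3.0, 3.4, 4.0]

print(round(spearman(predicted_cosine, gold_scores), 2))  # 0.9
```

Because only ranks matter, the predicted and gold scores do not need to share a scale: cosine similarities in [-1, 1] can be compared directly against gold labels on, for example, a 0 to 5 scale.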