| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| 2D Matryoshka Sentence Embeddings | Feb 22, 2024 | RAG, Representation Learning | Code Available | 4 |
| Bridging Language and Items for Retrieval and Recommendation | Mar 6, 2024 | Retrieval, Sentence | Code Available | 3 |
| Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models | Feb 19, 2025 | Contrastive Learning, Sentence | Code Available | 2 |
| SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Aug 22, 2023 | Decoder, Machine Translation | Code Available | 2 |
| RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder | May 24, 2022 | Decoder, Information Retrieval | Code Available | 2 |
| PromptBERT: Improving BERT Sentence Embeddings with Prompts | Jan 12, 2022 | Contrastive Learning, Denoising | Code Available | 2 |
| FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis | Feb 23, 2025 | Sentence, Sentence Embedding | Code Available | 1 |
| CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research | Nov 2, 2024 | Line Detection, Semantic Similarity | Code Available | 1 |
| Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Oct 8, 2024 | Articles, Classification | Code Available | 1 |
| Simple Techniques for Enhancing Sentence Embeddings in Generative Language Models | Apr 5, 2024 | Prompt Engineering, Sentence | Code Available | 1 |
| SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity | Apr 1, 2024 | Sentence, Sentence Embedding | Code Available | 1 |
| KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive Learning | Mar 26, 2024 | Contrastive Learning, Knowledge Distillation | Code Available | 1 |
| Some Like It Small: Czech Semantic Embedding Models for Industry Applications | Nov 23, 2023 | Image Retrieval, Knowledge Distillation | Code Available | 1 |
| An Efficient Self-Supervised Cross-View Training For Sentence Embedding | Nov 6, 2023 | Contrastive Learning, Language Modeling | Code Available | 1 |
| AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification | Nov 1, 2023 | Classification, Language Modelling | Code Available | 1 |
| Japanese SimCSE Technical Report | Oct 30, 2023 | Sentence, Sentence Embedding | Code Available | 1 |
| Sentence Embedding Models for Ancient Greek Using Multilingual Knowledge Distillation | Aug 24, 2023 | Authorship Attribution, Knowledge Distillation | Code Available | 1 |
| Scaling Sentence Embeddings with Large Language Models | Jul 31, 2023 | Contrastive Learning, In-Context Learning | Code Available | 1 |
| Whitening-based Contrastive Learning of Sentence Embeddings | May 28, 2023 | Contrastive Learning, Diversity | Code Available | 1 |
| Dual-Alignment Pre-training for Cross-lingual Sentence Embedding | May 16, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence | May 4, 2023 | Decoder, Language Modeling | Code Available | 1 |
| TagGPT: Large Language Models are Zero-shot Multimodal Taggers | Apr 6, 2023 | Optical Character Recognition (OCR), Prompt Engineering | Code Available | 1 |
| Improving Continuous Sign Language Recognition with Consistency Constraints and Signer Removal | Dec 26, 2022 | Disentanglement, Sentence | Code Available | 1 |
| Relational Sentence Embedding for Flexible Semantic Matching | Dec 17, 2022 | Relation, Semantic Textual Similarity | Code Available | 1 |