| Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs | Dec 16, 2024 | Prompt Engineering, Semantic Textual Similarity | Unverified | 0 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| Detecting Redundant Health Survey Questions Using Language-agnostic BERT Sentence Embedding (LaBSE) | Dec 5, 2024 | Computational Efficiency, Question Similarity | Unverified | 0 |
| LuxEmbedder: A Cross-Lingual Approach to Enhanced Luxembourgish Sentence Embeddings | Dec 4, 2024 | Recommendation Systems, Sentence | Code Available | 0 |
| Pralekha: An Indic Document Alignment Evaluation Benchmark | Nov 28, 2024 | Sentence, Sentence Embedding | Code Available | 0 |
| DoubleCCA: Improving Foundation Model Group Robustness with Random Sentence Embeddings | Nov 25, 2024 | Sentence, Sentence Embedding | Unverified | 0 |
| BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques | Nov 22, 2024 | Hate Speech Detection, Knowledge Distillation | Unverified | 0 |
| CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research | Nov 2, 2024 | Line Detection, Semantic Similarity | Code Available | 1 |
| Dialectal and Low-Resource Machine Translation for Aromanian | Oct 23, 2024 | Machine Translation, Sentence | Unverified | 0 |
| GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings | Oct 18, 2024 | Contrastive Learning, MTEB Benchmark | Code Available | 0 |