| Intrinsic vs. Extrinsic Evaluation of Czech Sentence Embeddings: Semantic Relevance Doesn't Help with MT Evaluation | Jun 25, 2025 | Machine Translation, Semantic Similarity | Unverified | 0 |
| Factors affecting the in-context learning abilities of LLMs for dialogue state tracking | Jun 10, 2025 | Dialogue State Tracking, In-Context Learning | Unverified | 0 |
| Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models | Jun 8, 2025 | Diversity, Red Teaming | Unverified | 0 |
| Mechanistic Decomposition of Sentence Representations | Jun 4, 2025 | Dictionary Learning, Sentence | Unverified | 0 |
| Rethinking the Understanding Ability across LLMs through Mutual Information | May 25, 2025 | Sentence, Sentence Embedding | Unverified | 0 |
| Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering | May 19, 2025 | Prompt Engineering, Semantic Textual Similarity | Code Available | 0 |
| Semantic Probabilistic Control of Language Models | May 4, 2025 | Attribute, Sentence | Unverified | 0 |
| Scalable Unit Harmonization in Medical Informatics Using Bi-directional Transformers and Bayesian-Optimized BM25 and Sentence Embedding Retrieval | May 1, 2025 | Bayesian Optimization, Information Retrieval | Unverified | 0 |
| Information Leakage of Sentence Embeddings via Generative Embedding Inversion Attacks | Apr 23, 2025 | Sentence, Sentence Embedding | Code Available | 0 |
| sEEG-based Encoding for Sentence Retrieval: A Contrastive Learning Approach to Brain-Language Alignment | Apr 20, 2025 | Contrastive Learning, Sentence | Unverified | 0 |
| MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters | Mar 28, 2025 | Clustering, Fact Checking | Unverified | 0 |
| CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement | Mar 21, 2025 | Dimensionality Reduction, Language Modeling | Unverified | 0 |
| IPCGRL: Language-Instructed Reinforcement Learning for Procedural Level Generation | Mar 16, 2025 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 0 |
| FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis | Feb 23, 2025 | Sentence, Sentence Embedding | Code Available | 1 |
| Evolutionary Algorithms Approach For Search Based On Semantic Document Similarity | Feb 20, 2025 | Cloud Computing, Distributed Computing | Unverified | 0 |
| Exploring RWKV for Sentence Embeddings: Layer-wise Analysis and Baseline Comparison for Semantic Similarity | Feb 20, 2025 | GPU, Language Modeling | Code Available | 0 |
| Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models | Feb 19, 2025 | Contrastive Learning, Sentence | Code Available | 2 |
| Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor | Feb 19, 2025 | Sentence, Sentence Embedding | Unverified | 0 |
| Performance Evaluation of Sentiment Analysis on Text and Emoji Data Using End-to-End, Transfer Learning, Distributed and Explainable AI Models | Feb 18, 2025 | Marketing, Sentence | Unverified | 0 |
| Optimizing Sentence Embedding with Pseudo-Labeling and Model Ensembles: A Hierarchical Framework for Enhanced NLP Tasks | Jan 27, 2025 | Data Augmentation, Pseudo Label | Unverified | 0 |
| Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs | Dec 16, 2024 | Prompt Engineering, Semantic Textual Similarity | Unverified | 0 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| Detecting Redundant Health Survey Questions Using Language-agnostic BERT Sentence Embedding (LaBSE) | Dec 5, 2024 | Computational Efficiency, Question Similarity | Unverified | 0 |
| LuxEmbedder: A Cross-Lingual Approach to Enhanced Luxembourgish Sentence Embeddings | Dec 4, 2024 | Recommendation Systems, Sentence | Code Available | 0 |
| Pralekha: An Indic Document Alignment Evaluation Benchmark | Nov 28, 2024 | Sentence, Sentence Embedding | Code Available | 0 |
| DoubleCCA: Improving Foundation Model Group Robustness with Random Sentence Embeddings | Nov 25, 2024 | Sentence, Sentence Embedding | Unverified | 0 |
| BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques | Nov 22, 2024 | Hate Speech Detection, Knowledge Distillation | Unverified | 0 |
| CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research | Nov 2, 2024 | Line Detection, Semantic Similarity | Code Available | 1 |
| Dialectal and Low-Resource Machine Translation for Aromanian | Oct 23, 2024 | Machine Translation, Sentence | Unverified | 0 |
| GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings | Oct 18, 2024 | Contrastive Learning, MTEB Benchmark | Code Available | 0 |
| A new approach for fine-tuning sentence transformers for intent classification and out-of-scope detection tasks | Oct 17, 2024 | Classification, intent-classification | Code Available | 0 |
| Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Oct 8, 2024 | Articles, Classification | Code Available | 1 |
| Black-Box Segmentation of Electronic Medical Records | Sep 29, 2024 | Segmentation, Sentence | Unverified | 0 |
| An Effective Approach to Embedding Source Code by Combining Large Language and Sentence Embedding Models | Sep 23, 2024 | Clone Detection, Domain Adaptation | Unverified | 0 |
| Towards Building Efficient Sentence BERT Models using Layer Pruning | Sep 21, 2024 | Natural Language Inference, Semantic Textual Similarity | Unverified | 0 |
| Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning | Sep 19, 2024 | Contrastive Learning, Data Augmentation | Unverified | 0 |
| ConCSE: Unified Contrastive Learning and Augmentation for Code-Switched Embeddings | Aug 28, 2024 | Contrastive Learning, Natural Language Inference | Code Available | 0 |
| Practical token pruning for foundation models in few-shot conversational virtual assistant systems | Aug 21, 2024 | Classification, Contrastive Learning | Unverified | 0 |
| Extracting Sentence Embeddings from Pretrained Transformer Models | Aug 15, 2024 | Clustering, Retrieval-augmented Generation | Unverified | 0 |
| Sign Language Translation with Sentence Embedding Supervision | Aug 14, 2024 | Gloss-free Sign Language Translation, Sentence | Code Available | 0 |
| reCSE: Portable Reshaping Features for Sentence Embedding in Self-supervised Contrastive Learning | Aug 9, 2024 | Contrastive Learning, Data Augmentation | Code Available | 0 |
| In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation | Aug 1, 2024 | Diversity, In-Context Learning | Code Available | 0 |
| QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval | Jul 29, 2024 | Answer Generation, Event Extraction | Unverified | 0 |
| Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification | Jul 25, 2024 | Sentence, Sentence Embedding | Code Available | 0 |
| Whitening Not Recommended for Classification Tasks in LLMs | Jul 16, 2024 | Classification, Sentence | Unverified | 0 |
| Unveiling the Potential of BERTopic for Multilingual Fake News Analysis -- Use Case: Covid-19 | Jul 11, 2024 | Articles, Clustering | Unverified | 0 |
| Are there identifiable structural parts in the sentence embedding whole? | Jun 24, 2024 | Sentence, Sentence Embedding | Unverified | 0 |
| Towards Understanding Domain Adapted Sentence Embeddings for Document Retrieval | Jun 18, 2024 | Domain Adaptation, Question Answering | Unverified | 0 |
| Space Decomposition for Sentence Embedding | Jun 5, 2024 | Semantic Textual Similarity, Sentence | Code Available | 0 |
| MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis | May 30, 2024 | Sentence, Sentence Embedding | Unverified | 0 |