SOTAVerified

Diversity

Diversity in data sampling matters in many applications, including search and recommendation systems. A diverse sample captures a wide range of variations and perspectives, which supports more robust, less biased, and more comprehensive models. In search, for instance, diversity reduces redundancy, so users see a broader set of relevant results rather than repeated near-duplicates.
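One common way to trade relevance against redundancy in search results is greedy maximal marginal relevance (MMR) re-ranking. The sketch below is only an illustration of that idea, not a method taken from any paper listed on this page. The function names `cosine` and `mmr_rerank`, the parameter `lam`, and the toy two-dimensional vectors are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(query, docs, k, lam=0.5):
    """Greedy MMR: return indices of up to k documents.

    lam is a hypothetical trade-off knob: 1.0 ranks by relevance only,
    smaller values penalize similarity to already-selected documents.
    """
    relevance = [cosine(query, d) for d in docs]
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (made up for illustration): doc 1 is a near-duplicate of doc 0.
query = [1.0, 0.0]
docs = [
    [1.0, 0.0],   # exact match for the query
    [1.0, 0.01],  # near-duplicate of the first document
    [0.6, 0.8],   # less relevant, but clearly different
]

relevance_only = mmr_rerank(query, docs, k=2, lam=1.0)
diversified = mmr_rerank(query, docs, k=2, lam=0.3)
```

In this toy setup, `lam=1.0` returns the exact match followed by its near-duplicate, while `lam=0.3` replaces the near-duplicate with the distinct third document. The right value of `lam` depends on the application and would need tuning on real data.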

Papers

Showing 651–700 of 9051 papers

Title | Status | Hype
Efficient Dataset Distillation via Minimax Diffusion | Code | 1
Metric Space Magnitude for Evaluating the Diversity of Latent Representations | Code | 1
Cerbero-7B: A Leap Forward in Language-Specific LLMs Through Enhanced Chat Corpus Generation and Evaluation | Code | 1
Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer | Code | 1
Generating Progressive Images from Pathological Transitions via Diffusion Model | Code | 1
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
Safer-Instruct: Aligning Language Models with Automated Preference Data | Code | 1
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy | Code | 1
Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration | Code | 1
MC^2: Towards Transparent and Culturally-Aware NLP for Minority Languages in China | Code | 1
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning | Code | 1
AI-generated text boundary detection with RoFT | Code | 1
Can LLMs Patch Security Issues? | Code | 1
IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models | Code | 1
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks | Code | 1
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation | Code | 1
Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis | Code | 1
DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation | Code | 1
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models | Code | 1
Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection | Code | 1
SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization | Code | 1
Controllable Group Choreography using Contrastive Diffusion | Code | 1
TarGEN: Targeted Data Generation with Large Language Models | Code | 1
Chain-of-Choice Hierarchical Policy Learning for Conversational Recommendation | Code | 1
Generative Fractional Diffusion Models | Code | 1
Semantic Generative Augmentations for Few-Shot Counting | Code | 1
AlpaCare:Instruction-tuned Large Language Models for Medical Application | Code | 1
Diversify Question Generation with Retrieval-Augmented Style Transfer | Code | 1
Invariant Feature Regularization for Fair Face Recognition | Code | 1
Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias | Code | 1
Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement | Code | 1
FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery | Code | 1
CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages | Code | 1
AutoMix: Automatically Mixing Language Models | Code | 1
Blending gradient boosted trees and neural networks for point and probabilistic forecasting of hierarchical time series | Code | 1
Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models | Code | 1
Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning | Code | 1
PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection | Code | 1
MixEdit: Revisiting Data Augmentation and Beyond for Grammatical Error Correction | Code | 1
Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization | Code | 1
Knowledge Extraction and Distillation from Large-Scale Image-Text Colonoscopy Records Leveraging Large Language and Vision Models | Code | 1
MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design | Code | 1
Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration | Code | 1
Towards Evaluating Generalist Agents: An Automated Benchmark in Open World | Code | 1
Learning a Cross-modality Anomaly Detector for Remote Sensing Imagery | Code | 1
GMOCAT: A Graph-Enhanced Multi-Objective Method for Computerized Adaptive Testing | Code | 1
TabLib: A Dataset of 627M Tables with Context | Code | 1
D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning | Code | 1
ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion | Code | 1
Score-Based Generative Models for Designing Binding Peptide Backbones | Code | 1
Page 14 of 182

No leaderboard results yet.