SOTAVerified

Diversity

Diversity in data sampling matters across many use cases, including search and recommendation systems. Drawing diverse samples means capturing a wide range of variations and perspectives, which tends to produce more robust, less biased, and more comprehensive models. In search, for instance, diversity reduces redundancy, so users see a broader set of relevant information rather than many near-identical results.

Papers

Showing 1751-1800 of 9051 papers

Title | Status | Hype
Hybrid Disagreement-Diversity Active Learning for Bioacoustic Sound Event Detection | Code | 0
Towards Pretraining Robust ASR Foundation Model with Acoustic-Aware Data Augmentation |  | 0
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning |  | 0
CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis | Code | 0
EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition |  | 0
ReDDiT: Rehashing Noise for Discrete Visual Generation |  | 0
DiffVLA: Vision-Language Guided Diffusion Planning for Autonomous Driving |  | 0
Token-Importance Guided Direct Preference Optimization |  | 0
We Need to Measure Data Diversity in NLP -- Better and Broader |  | 0
The Role of Diversity in In-Context Learning for Large Language Models |  | 0
Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals |  | 0
Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory |  | 0
Holes in Latent Space: Topological Signatures Under Adversarial Influence |  | 0
An Out-Of-Distribution Membership Inference Attack Approach for Cross-Domain Graph Attacks |  | 0
The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages |  | 0
Kuramoto-FedAvg: Using Synchronization Dynamics to Improve Federated Learning Optimization under Statistical Heterogeneity |  | 0
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection |  | 0
The Price of Format: Diversity Collapse in LLMs | Code | 0
PIGPVAE: Physics-Informed Gaussian Process Variational Autoencoders |  | 0
MGD^3: Mode-Guided Dataset Distillation using Diffusion Models |  | 0
Beyond Editing Pairs: Fine-Grained Instructional Image Editing via Multi-Scale Learnable Regions |  | 0
Less is More: Efficient Point Cloud Reconstruction via Multi-Head Decoders |  | 0
Pan-tropical plant functional trait variation from space |  | 0
SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs |  | 0
MoMBS: Mixed-order minibatch sampling enhances model training from diverse-quality images |  | 0
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs |  | 0
Voice of a Continent: Mapping Africa's Speech Technology Frontier |  | 0
ToDRE: Visual Token Pruning via Diversity and Task Awareness for Efficient Large Vision-Language Models |  | 0
Measuring diversity of synthetic prompts and data generated with fine-grained persona prompting |  | 0
High-Fidelity Functional Ultrasound Reconstruction via A Visual Auto-Regressive Framework |  | 0
Large language model as user daily behavior data generator: balancing population diversity and individual personality |  | 0
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | Code | 0
CrashAgent: Crash Scenario Generation via Multi-modal Reasoning |  | 0
Generative AI and Creativity: A Systematic Literature Review and Meta-Analysis | Code | 0
Exploring the Relationship Between Diversity and Quality in Ad Text Generation |  | 0
Position of Uncertainty: A Cross-Linguistic Study of Positional Bias in Large Language Models |  | 0
Robust Invariant Representation Learning by Distribution Extrapolation |  | 0
Sudoku-Bench: Evaluating creative reasoning with Sudoku variants | Code | 0
Diverse, not Short: A Length-Controlled Self-Learning Framework for Improving Response Diversity of Language Models |  | 0
Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task |  | 0
AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners |  | 0
LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions |  | 0
Swarm Intelligence Enhanced Reasoning: A Density-Driven Framework for LLM-Based Multi-Agent Optimization |  | 0
Aug2Search: Enhancing Facebook Marketplace Search with LLM-Generated Synthetic Data Augmentation |  | 0
GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation |  | 0
Multilingual Prompting for Improving LLM Generation Diversity |  | 0
FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion |  | 0
Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets |  | 0
GAMA++: Disentangled Geometric Alignment with Adaptive Contrastive Perturbation for Reliable Domain Transfer |  | 0
An Inclusive Foundation Model for Generalizable Cytogenetics in Precision Oncology |  | 0

No leaderboard results yet.