SOTAVerified

Diversity

Diversity in data sampling matters across many use cases, including search and recommendation systems. A diverse sample captures a wide range of variations and perspectives, which helps produce models that are more robust, less biased, and more comprehensive. In search, for instance, diversity reduces redundancy, so users see a broader set of relevant results rather than repeated near-duplicates.
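As an illustration of the search case described above, the following is a minimal sketch of one common way to trade relevance against redundancy: greedy maximal marginal relevance (MMR) re-ranking. This page does not prescribe a method, so the choice of MMR, the function names `cosine` and `mmr_select`, the `lam` parameter, and the toy vectors are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, candidates, k, lam=0.5):
    """Greedily pick up to k candidate indices.

    Each step picks the candidate maximizing
        lam * sim(query, candidate) - (1 - lam) * max sim(candidate, already selected).
    lam = 1.0 ranks by relevance only; smaller lam penalizes redundancy more.
    (Hypothetical helper written for this illustration.)
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): items 0 and 1 are near-duplicates, item 2 is different.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]

relevance_only = mmr_select(query, candidates, k=2, lam=1.0)  # [0, 1]
diversified = mmr_select(query, candidates, k=2, lam=0.3)     # [0, 2]
print(relevance_only, diversified)
```

With `lam=1.0` the two near-duplicate items fill both slots; lowering `lam` swaps the second slot for the less similar item, which is the redundancy-avoiding behavior the description refers to.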

Papers

Showing 201–250 of 9051 papers

Title | Status | Hype
MoMBS: Mixed-order minibatch sampling enhances model training from diverse-quality images | | 0
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs | | 0
ToDRE: Visual Token Pruning via Diversity and Task Awareness for Efficient Large Vision-Language Models | | 0
Large language model as user daily behavior data generator: balancing population diversity and individual personality | | 0
Measuring diversity of synthetic prompts and data generated with fine-grained persona prompting | | 0
High-Fidelity Functional Ultrasound Reconstruction via A Visual Auto-Regressive Framework | | 0
BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models | Code | 5
CrashAgent: Crash Scenario Generation via Multi-modal Reasoning | | 0
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | Code | 0
LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions | | 0
Generative AI and Creativity: A Systematic Literature Review and Meta-Analysis | Code | 0
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | Code | 4
Robust Invariant Representation Learning by Distribution Extrapolation | | 0
Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task | | 0
Sudoku-Bench: Evaluating creative reasoning with Sudoku variants | Code | 0
Exploring the Relationship Between Diversity and Quality in Ad Text Generation | | 0
Diverse, not Short: A Length-Controlled Self-Learning Framework for Improving Response Diversity of Language Models | | 0
AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners | | 0
Position of Uncertainty: A Cross-Linguistic Study of Positional Bias in Large Language Models | | 0
Swarm Intelligence Enhanced Reasoning: A Density-Driven Framework for LLM-Based Multi-Agent Optimization | | 0
OpenEthics: A Comprehensive Ethical Evaluation of Open-Source Generative Large Language Models | Code | 0
Aug2Search: Enhancing Facebook Marketplace Search with LLM-Generated Synthetic Data Augmentation | | 0
Ensembling Sparse Autoencoders | | 0
An Inclusive Foundation Model for Generalizable Cytogenetics in Precision Oncology | | 0
Multilingual Prompting for Improving LLM Generation Diversity | | 0
A Distributed Local Energy Market Clearing Framework Using a Two-Loop ADMM Method | | 0
Towards Pre-training an Effective Respiratory Audio Foundation Model | Code | 0
GAMA++: Disentangled Geometric Alignment with Adaptive Contrastive Perturbation for Reliable Domain Transfer | | 0
FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion | | 0
GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation | | 0
Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets | | 0
SSR: Speculative Parallel Scaling Reasoning in Test-time | | 0
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution | Code | 1
KO: Kinetics-inspired Neural Optimizer with PDE Simulation Approaches | | 0
The Achilles Heel of AI: Fundamentals of Risk-Aware Training Data for High-Consequence Models | | 0
Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models | | 0
CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring | | 0
Creative Preference Optimization | | 0
Success is in the Details: Evaluate and Enhance Details Sensitivity of Code LLMs through Counterfactuals | Code | 0
Algorithmic Hiring and Diversity: Reducing Human-Algorithm Similarity for Better Outcomes | | 0
ReactDiff: Latent Diffusion for Facial Reaction Generation | Code | 0
SQLForge: Synthesizing Reliable and Diverse Data to Enhance Text-to-SQL Reasoning in LLMs | | 0
GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization | | 0
The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation | | 0
Towards A Generalist Code Embedding Model Based On Massive Data Synthesis | Code | 0
Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping | | 0
AD-AGENT: A Multi-agent Framework for End-to-end Anomaly Detection | Code | 2
Active Learning on Synthons for Molecular Design | | 0
Few-Step Diffusion via Score identity Distillation | Code | 0
EuLearn: A 3D database for learning Euler characteristics | Code | 0
Page 5 of 182

No leaderboard results yet.