SOTAVerified

Diversity

Diversity in data sampling matters across many use cases, including search and recommendation systems. A diverse sample captures a wide range of variations and perspectives, which tends to make models more robust, less biased, and more comprehensive. In search, for instance, diversity reduces redundancy, so users see a broader set of relevant information rather than repeated, near-identical results.
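
To make the search example concrete, here is a minimal sketch of one common way to trade relevance against redundancy: greedy Maximal Marginal Relevance (MMR) re-ranking. It is an illustration chosen for this page, not a method taken from any paper listed here. The function names (`cosine`, `mmr_select`), the `trade_off` parameter, and the toy two-dimensional vectors are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, items, k, trade_off=0.5):
    """Greedily pick up to k item indices, balancing relevance and redundancy.

    trade_off=1.0 ranks purely by relevance to the query; lower values
    penalize items that resemble ones already selected.
    """
    remaining = list(range(len(items)))
    selected = []
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, items[i])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine(items[i], items[j]) for j in selected), default=0.0
            )
            return trade_off * relevance - (1.0 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy embeddings (hypothetical): items 0 and 1 are near-duplicates,
# item 2 is relevant but points in a different direction.
query = [1.0, 0.2]
items = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]

relevance_only = mmr_select(query, items, k=2, trade_off=1.0)
diversified = mmr_select(query, items, k=2, trade_off=0.5)
```

With these toy vectors, ranking by relevance alone returns the two near-duplicate items, while `trade_off=0.5` keeps the top item and replaces the second with the dissimilar one. In practice the vectors would come from whatever embedding model the system already uses, and the trade-off value would need tuning.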

Papers

Showing 1-50 of 9051 papers

Title | Status | Hype
MinerU: An Open-Source Solution for Precise Document Content Extraction | Code | 16
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Code | 11
Depth Anything V2 | Code | 9
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation | Code | 9
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation | Code | 9
Is Diversity All You Need for Scalable Robotic Manipulation? | Code | 7
Flow-GRPO: Training Flow Matching Models via Online RL | Code | 7
FoundationStereo: Zero-Shot Stereo Matching | Code | 7
Adaptive In-conversation Team Building for Language Model Agents | Code | 7
PromptWizard: Task-Aware Prompt Optimization Framework | Code | 7
Better Synthetic Data by Retrieving and Transforming Existing Datasets | Code | 7
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Code | 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Code | 7
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset | Code | 7
MaskSketch: Unpaired Structure-guided Masked Image Generation | Code | 7
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance | Code | 7
Automatic Chain of Thought Prompting in Large Language Models | Code | 6
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning | Code | 5
BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models | Code | 5
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations | Code | 5
GAPartManip: A Large-scale Part-centric Dataset for Material-Agnostic Articulated Object Manipulation | Code | 5
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models | Code | 5
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | Code | 5
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Code | 5
Fake News Detection: It's All in the Data! | Code | 5
MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation | Code | 5
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | Code | 5
MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts | Code | 5
Efficient Part-level 3D Object Generation via Dual Volume Packing | Code | 4
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | Code | 4
3D Scene Generation: A Survey | Code | 4
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models | Code | 4
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator | Code | 4
A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs | Code | 4
A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | Code | 4
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration | Code | 4
Expressive Whole-Body 3D Gaussian Avatar | Code | 4
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction | Code | 4
Quality-aware Masked Diffusion Transformer for Enhanced Music Generation | Code | 4
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | Code | 4
AlphaFold Meets Flow Matching for Generating Protein Ensembles | Code | 4
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations | Code | 4
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images | Code | 4
Elucidating the Design Space of Multimodal Protein Language Models | Code | 3
VideoGen-Eval: Agent-based System for Video Generation Evaluation | Code | 3
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait | Code | 3
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness | Code | 3
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | Code | 3
UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving | Code | 3
SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation | Code | 3

No leaderboard results yet.