Diversity

Diversity in data sampling matters in many use cases, including search and recommendation systems. A diverse sample captures a wide range of variations and perspectives, which helps produce more robust, less biased, and more comprehensive models. In search, for instance, diversity reduces redundancy, so users see a broader set of relevant information rather than many near-identical results.
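
To make the search example concrete, here is a minimal sketch of one common way to trade relevance against redundancy when picking results: maximal marginal relevance (MMR). The description above names no specific method, so MMR is an assumption chosen for illustration. The names `cosine`, `mmr_select`, and `trade_off`, and the toy vectors, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query, candidates, k, trade_off=0.5):
    """Greedily pick up to k candidate indices with maximal marginal relevance.

    trade_off=1.0 ranks by relevance only; lower values penalize candidates
    that are similar to results already selected. (Illustrative sketch only.)
    """
    relevance = [cosine(query, c) for c in candidates]
    remaining = list(range(len(candidates)))
    selected = []

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data: candidates 0 and 1 are near-duplicates, candidate 2 points elsewhere.
query = [1.0, 0.5]
candidates = [[1.0, 0.4], [1.0, 0.45], [0.5, 1.0]]

diverse = mmr_select(query, candidates, k=2, trade_off=0.5)
relevance_only = mmr_select(query, candidates, k=2, trade_off=1.0)
print(diverse, relevance_only)
```

With `trade_off=1.0` the two near-duplicate candidates are returned together. Lowering it makes the second pick the less similar candidate, which is the redundancy-avoiding behavior described above.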

Papers

Showing 51-100 of 9051 papers

Title | Status | Hype
Improving Model Evaluation using SMART Filtering of Benchmark Datasets | Code | 3
Results of the Big ANN: NeurIPS'23 competition | Code | 3
Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models | Code | 3
SkillMimic: Learning Basketball Interaction Skills from Demonstrations | Code | 3
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Code | 3
AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors | Code | 3
Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines | Code | 3
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation | Code | 3
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping | Code | 3
FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | Code | 3
Taming Diffusion Probabilistic Models for Character Control | Code | 3
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition | Code | 3
Addressing the Abstraction and Reasoning Corpus via Procedural Example Generation | Code | 3
UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction | Code | 3
ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars | Code | 3
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding | Code | 3
LongAlign: A Recipe for Long Context Alignment of Large Language Models | Code | 3
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning | Code | 3
Improved motif-scaffolding with SE(3) flow matching | Code | 3
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling | Code | 3
Improving Text Embeddings with Large Language Models | Code | 3
Sequential Modeling Enables Scalable Learning for Large Vision Models | Code | 3
Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model | Code | 3
SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks | Code | 3
CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving | Code | 3
Objaverse-XL: A Universe of 10M+ 3D Objects | Code | 3
SVIT: Scaling up Visual Instruction Tuning | Code | 3
Self-QA: Unsupervised Knowledge Guided Language Model Alignment | Code | 3
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | Code | 3
Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Code | 3
RT-1: Robotics Transformer for Real-World Control at Scale | Code | 3
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models | Code | 3
MiniViT: Compressing Vision Transformers with Weight Multiplexing | Code | 3
Hierarchical Text-Conditional Image Generation with CLIP Latents | Code | 3
MNN: A Universal and Efficient Inference Engine | Code | 3
Generating Long Sequences with Sparse Transformers | Code | 3
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation | Code | 2
MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents | Code | 2
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Code | 2
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting | Code | 2
MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming | Code | 2
ZIPA: A family of efficient models for multilingual phone recognition | Code | 2
AD-AGENT: A Multi-agent Framework for End-to-end Anomaly Detection | Code | 2
HISTAI: An Open-Source, Large-Scale Whole Slide Image Dataset for Computational Pathology | Code | 2
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation | Code | 2
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
Dereflection Any Image with Diffusion Priors and Diversified Data | Code | 2
Modifying Large Language Model Post-Training for Diverse Creative Writing | Code | 2
PET-MAD, a universal interatomic potential for advanced materials modeling | Code | 2

No leaderboard results yet.