SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers, 258,216 code links, 4,818 tasks

Papers

Showing 251-300 of 658,356 papers

Title | Status | Hype
Fast Timing-Conditioned Latent Audio Diffusion | Code | 7
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation | Code | 7
PuLID: Pure and Lightning ID Customization via Contrastive Alignment | Code | 7
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning | Code | 7
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains | Code | 7
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Code | 7
SageAttention2++: A More Efficient Implementation of SageAttention2 | Code | 7
NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking | Code | 7
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | Code | 7
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Code | 7
X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation | Code | 7
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Code | 7
CALE: Continuous Arcade Learning Environment | Code | 7
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Code | 7
ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval | Code | 7
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences | Code | 7
VMamba: Visual State Space Model | Code | 7
Dynamic data sampler for cross-language transfer learning in large language models | Code | 7
Rethinking the Sample Relations for Few-Shot Classification | Code | 7
Qwen2-Audio Technical Report | Code | 7
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations | Code | 7
M&M VTO: Multi-Garment Virtual Try-On and Editing | Code | 7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT | Code | 7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test | Code | 7
LLaMA-Omni: Seamless Speech Interaction with Large Language Models | Code | 7
MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Code | 7
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Code | 7
Is Diversity All You Need for Scalable Robotic Manipulation? | Code | 7
VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos | Code | 7
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | Code | 7
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction | Code | 7
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning | Code | 7
Agentless: Demystifying LLM-based Software Engineering Agents | Code | 7
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | Code | 7
SEW: Self-Evolving Agentic Workflows for Automated Code Generation | Code | 7
SoftTiger: A Clinical Foundation Model for Healthcare Workflows | Code | 7
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Code | 7
AFlow: Automating Agentic Workflow Generation | Code | 7
Enhancing Fourier Neural Operators with Local Spatial Features | Code | 7
MambaOut: Do We Really Need Mamba for Vision? | Code | 7
PowerPM: Foundation Model for Power Systems | Code | 7
Visual-RFT: Visual Reinforcement Fine-Tuning | Code | 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Code | 7
Open Deep Search: Democratizing Search with Open-source Reasoning Agents | Code | 7
Domain Expansion of Image Generators | Code | 7
Speechless: Speech Instruction Training Without Speech for Low Resource Languages | Code | 7
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | Code | 7
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents | Code | 7
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image | Code | 7
Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting | Code | 7