SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 301-350 of 474,278 papers

Title | Status | Hype
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations | Code | 7
M&M VTO: Multi-Garment Virtual Try-On and Editing | Code | 7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT | Code | 7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test | Code | 7
LLaMA-Omni: Seamless Speech Interaction with Large Language Models | Code | 7
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Code | 7
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Code | 7
Is Diversity All You Need for Scalable Robotic Manipulation? | Code | 7
VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos | Code | 7
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | Code | 7
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction | Code | 7
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning | Code | 7
Agentless: Demystifying LLM-based Software Engineering Agents | Code | 7
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | Code | 7
SEW: Self-Evolving Agentic Workflows for Automated Code Generation | Code | 7
SoftTiger: A Clinical Foundation Model for Healthcare Workflows | Code | 7
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Code | 7
AFlow: Automating Agentic Workflow Generation | Code | 7
Enhancing Fourier Neural Operators with Local Spatial Features | Code | 7
MambaOut: Do We Really Need Mamba for Vision? | Code | 7
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation | Code | 7
Open Deep Search: Democratizing Search with Open-source Reasoning Agents | Code | 7
Pyramidal Flow Matching for Efficient Video Generative Modeling | Code | 7
Speechless: Speech Instruction Training Without Speech for Low Resource Languages | Code | 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Code | 7
Visual-RFT: Visual Reinforcement Fine-Tuning | Code | 7
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | Code | 7
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents | Code | 7
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image | Code | 7
TextGrad: Automatic "Differentiation" via Text | Code | 7
Efficient multi-prompt evaluation of LLMs | Code | 7
TTRL: Test-Time Reinforcement Learning | Code | 7
Elixir: Train a Large Language Model on a Small GPU Cluster | Code | 7
Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond | Code | 7
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Code | 7
Tulu 3: Pushing Frontiers in Open Language Model Post-Training | Code | 7
Measuring Massive Multitask Chinese Understanding | Code | 7
In-Context LoRA for Diffusion Transformers | Code | 7
FoundationStereo: Zero-Shot Stereo Matching | Code | 7
Mirage: A Multi-Level Superoptimizer for Tensor Programs | Code | 7
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables | Code | 7
Visual Agentic Reinforcement Fine-Tuning | Code | 7
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | Code | 7
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | Code | 7
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Code | 7
Measuring short-form factuality in large language models | Code | 7
RedPajama: an Open Dataset for Training Large Language Models | Code | 7
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Code | 7
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents | Code | 7
xLSTM: Extended Long Short-Term Memory | Code | 7
Page 7 of 9,486