SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 701–725 of 659,983 papers

Title | Status | Hype
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining | Code | 5
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions | Code | 5
Continuous Thought Machines | Code | 5
Generating Physically Stable and Buildable LEGO Designs from Text | Code | 5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Code | 5
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation | Code | 5
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities | Code | 5
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition | Code | 5
WebThinker: Empowering Large Reasoning Models with Deep Research Capability | Code | 5
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Code | 5
Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis | Code | 5
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Code | 5
Reinforcement Learning from Human Feedback | Code | 5
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework | Code | 5
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding | Code | 5
Kimi-VL Technical Report | Code | 5
M-Prometheus: A Suite of Open Multilingual LLM Judges | Code | 5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5
PaperBench: Evaluating AI's Ability to Replicate AI Research | Code | 5
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Code | 5
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO | Code | 5
4th PVUW MeViS 3rd Place Report: Sa2VA | Code | 5
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness | Code | 5
Understanding R1-Zero-Like Training: A Critical Perspective | Code | 5
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning | Code | 5