SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1301-1350 of 659,983 papers

Title | Status | Hype
s3: You Don't Need That Much Data to Train a Search Agent via RL | Code | 4
Scaling Law for Quantization-Aware Training | Code | 4
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | Code | 4
DreamGen: Unlocking Generalization in Robot Learning through Video World Models | Code | 4
Mean Flows for One-step Generative Modeling | Code | 4
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | Code | 4
Multi-head Temporal Latent Attention | Code | 4
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models | Code | 4
Kornia-rs: A Low-Level 3D Computer Vision Library In Rust | Code | 4
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | Code | 4
Attention on the Sphere | Code | 4
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation | Code | 4
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | Code | 4
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | Code | 4
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models | Code | 4
FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4
3D Scene Generation: A Survey | Code | 4
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model | Code | 4
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning | Code | 4
Towards One-shot Federated Learning: Advances, Challenges, and Future Directions | Code | 4
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction | Code | 4
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality | Code | 4
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Code | 4
Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light | Code | 4
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Code | 4
High-performance training and inference for deep equivariant interatomic potentials | Code | 4
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation | Code | 4
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models | Code | 4
RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild | Code | 4
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer | Code | 4
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Code | 4
Revisiting Self-Attentive Sequential Recommendation | Code | 4
LLMMapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources | Code | 4
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay | Code | 4
MedSAM2: Segment Anything in 3D Medical Images and Videos | Code | 4
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Code | 4
SkyReels-A2: Compose Anything in Video Diffusion Transformers | Code | 4
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training | Code | 4
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model | Code | 4
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models | Code | 4
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework | Code | 4
Video-R1: Reinforcing Video Reasoning in MLLMs | Code | 4
X^2-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction | Code | 4
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages | Code | 4
TerraTorch: The Geospatial Foundation Models Toolkit | Code | 4
Your ViT is Secretly an Image Segmentation Model | Code | 4
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models | Code | 4
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination | Code | 4