SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 301-325 of 177,339 papers

| Title | Status | Hype |
| --- | --- | --- |
| PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Code | 7 |
| Tulu 3: Pushing Frontiers in Open Language Model Post-Training | Code | 7 |
| Measuring Massive Multitask Chinese Understanding | Code | 7 |
| YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | Code | 7 |
| FoundationStereo: Zero-Shot Stereo Matching | Code | 7 |
| Mirage: A Multi-Level Superoptimizer for Tensor Programs | Code | 7 |
| TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables | Code | 7 |
| Visual Agentic Reinforcement Fine-Tuning | Code | 7 |
| Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | Code | 7 |
| Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | Code | 7 |
| LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Code | 7 |
| Measuring short-form factuality in large language models | Code | 7 |
| RedPajama: an Open Dataset for Training Large Language Models | Code | 7 |
| Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Code | 7 |
| BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents | Code | 7 |
| Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout | Code | 7 |
| Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning | Code | 7 |
| On the Vulnerability of LLM/VLM-Controlled Robotics | Code | 7 |
| Grounding Image Matching in 3D with MASt3R | Code | 7 |
| PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides | Code | 7 |
| VACE: All-in-One Video Creation and Editing | Code | 7 |
| Revisiting PCA for time series reduction in temporal dimension | Code | 7 |
| Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | Code | 7 |
| RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning | Code | 7 |
| Flow-GRPO: Training Flow Matching Models via Online RL | Code | 7 |