SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2601-2625 of 661,570 papers

Title | Status | Hype
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents | Code | 3
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving | Code | 3
VideoGen-Eval: Agent-based System for Video Generation Evaluation | Code | 3
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos | Code | 3
ToRL: Scaling Tool-Integrated RL | Code | 3
From Panels to Prose: Generating Literary Narratives from Comics | Code | 3
Efficient Inference for Large Reasoning Models: A Survey | Code | 3
LSNet: See Large, Focus Small | Code | 3
WeatherMesh-3: Fast and accurate operational global weather forecasting | Code | 3
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond | Code | 3
Vision-to-Music Generation: A Survey | Code | 3
Exploring the Evolution of Physics Cognition in Video Generation: A Survey | Code | 3
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video | Code | 3
Optimal Stepsize for Diffusion Sampling | Code | 3
HyperGraphRAG: Retrieval-Augmented Generation with Hypergraph-Structured Knowledge Representation | Code | 3
Vision as LoRA | Code | 3
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs | Code | 3
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning | Code | 3
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Code | 3
Long-Context Autoregressive Video Modeling with Next-Frame Prediction | Code | 3
iNatAg: Multi-Class Classification Models Enabled by a Large-Scale Benchmark Dataset with 4.7M Images of 2,959 Crop and Weed Species | Code | 3
ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback | Code | 3
Frequency Dynamic Convolution for Dense Image Prediction | Code | 3
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective | Code | 3
Defeating Prompt Injections by Design | Code | 3