SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2601–2650 of 659,983 papers

Title | Status | Hype
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents | Code | 3
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving | Code | 3
VideoGen-Eval: Agent-based System for Video Generation Evaluation | Code | 3
From Panels to Prose: Generating Literary Narratives from Comics | Code | 3
ToRL: Scaling Tool-Integrated RL | Code | 3
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos | Code | 3
Efficient Inference for Large Reasoning Models: A Survey | Code | 3
LSNet: See Large, Focus Small | Code | 3
WeatherMesh-3: Fast and accurate operational global weather forecasting | Code | 3
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video | Code | 3
Vision-to-Music Generation: A Survey | Code | 3
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond | Code | 3
Optimal Stepsize for Diffusion Sampling | Code | 3
HyperGraphRAG: Retrieval-Augmented Generation with Hypergraph-Structured Knowledge Representation | Code | 3
Exploring the Evolution of Physics Cognition in Video Generation: A Survey | Code | 3
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning | Code | 3
Vision as LoRA | Code | 3
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs | Code | 3
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Code | 3
Long-Context Autoregressive Video Modeling with Next-Frame Prediction | Code | 3
ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback | Code | 3
iNatAg: Multi-Class Classification Models Enabled by a Large-Scale Benchmark Dataset with 4.7M Images of 2,959 Crop and Weed Species | Code | 3
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective | Code | 3
Frequency Dynamic Convolution for Dense Image Prediction | Code | 3
AdaWorld: Learning Adaptable World Models with Latent Actions | Code | 3
Defeating Prompt Injections by Design | Code | 3
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse | Code | 3
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models | Code | 3
PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos | Code | 3
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining | Code | 3
Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook | Code | 3
Multi-Modality Representation Learning for Antibody-Antigen Interactions Prediction | Code | 3
Halton Scheduler For Masked Generative Image Transformer | Code | 3
NdLinear Is All You Need for Representation Learning | Code | 3
Unreal-MAP: Unreal-Engine-Based General Platform for Multi-Agent Reinforcement Learning | Code | 3
XAttention: Block Sparse Attention with Antidiagonal Scoring | Code | 3
A Comprehensive Survey on Long Context Language Modeling | Code | 3
NeuralFoil: An Airfoil Aerodynamics Analysis Tool Using Physics-Informed Machine Learning | Code | 3
Unleashing Vecset Diffusion Model for Fast Shape Generation | Code | 3
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Code | 3
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | Code | 3
Vision-Speech Models: Teaching Speech Models to Converse about Images | Code | 3
TripNet: Learning Large-scale High-fidelity 3D Car Aerodynamics with Triplane Networks | Code | 3
Measuring AI Ability to Complete Long Tasks | Code | 3
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding | Code | 3
MoonCast: High-Quality Zero-Shot Podcast Generation | Code | 3
A Survey on Human Interaction Motion Generation | Code | 3
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait | Code | 3
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization | Code | 3
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | Code | 3
Page 53 of 13200