SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6276 to 6300 of 474,278 papers

Title | Status | Hype
LongCat-Image Technical Report | - | 0
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models | - | 0
DeepCode: Open Agentic Coding | - | 0
FLEX: Continuous Agent Evolution via Forward Learning from Experience | - | 0
Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts | Code | 0
Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability? | Code | 0
How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline | - | 0
Relational Visual Similarity | - | 0
ReasonBENCH: Benchmarking the (In)Stability of LLM Reasoning | Code | 0
A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning | Code | 0
FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment | Code | 0
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization | Code | 0
LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild | Code | 0
QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation | Code | 0
MCMoE: Completing Missing Modalities with Mixture of Experts for Incomplete Multimodal Action Quality Assessment | Code | 0
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models | Code | 0
MM-ACT: Learn from Multimodal Parallel Generation to Act | Code | 0
Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning | Code | 0
VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation | Code | 0
PlantBiMoE: A Bidirectional Foundation Model with SparseMoE for Plant Genomes | Code | 0
Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving | Code | 0
Unified Camera Positional Encoding for Controlled Video Generation | Code | 0
M-STAR: Multi-Scale Spatiotemporal Autoregression for Human Mobility Modeling | Code | 0
ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points | Code | 0
Unified Video Editing with Temporal Reasoner | Code | 0