SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 651-675 of 659,983 papers

Title | Status | Hype
Helios: Real Real-Time Long Video Generation Model | - | 5
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters | - | 5
Rethinking the Design of Reinforcement Learning-Based Deep Research Agents | - | 5
World Action Models are Zero-shot Policies | - | 5
OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data | - | 5
FireRed-Image-Edit-1.0 Technical Report | - | 5
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery | - | 5
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning | - | 5
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE | - | 5
Kimi K2.5: Visual Agentic Intelligence | - | 5
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey | - | 5
SAMTok: Representing Any Mask with Two Words | - | 5
UQLM: A Python Package for Uncertainty Quantification in Large Language Models | Code | 5
skfolio: Portfolio Optimization in Python | Code | 5
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers | Code | 5
RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism | Code | 5
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing | Code | 5
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning | Code | 5
Matrix-Game: Interactive World Foundation Model | Code | 5
YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception | Code | 5
Show-o2: Improved Native Unified Multimodal Models | Code | 5
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model | Code | 5
SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Code | 5
A quantum semantic framework for natural language processing | Code | 5
τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment | Code | 5
Page 27 of 26,400