SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,196 code links · 4,818 tasks

Papers

Showing 26-50 of 658,356 papers

Title | Status | Hype
Qwen2.5 Technical Report | Code | 13
Qwen2 Technical Report | Code | 13
Autonomous Agents for Collaborative Task under Information Asymmetry | Code | 13
Zep: A Temporal Knowledge Graph Architecture for Agent Memory | Code | 12
MiniCPM-V: A GPT-4V Level MLLM on Your Phone | Code | 12
OmniParser for Pure Vision Based GUI Agent | Code | 12
Qwen3-Coder-Next Technical Report | | 11
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints | | 11
WebSailor: Navigating Super-human Reasoning for Web Agent | Code | 11
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Code | 11
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation | Code | 11
WebDancer: Towards Autonomous Information Seeking Agency | Code | 11
CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training | Code | 11
Absolute Zero: Reinforced Self-play Reasoning with Zero Data | Code | 11
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation | Code | 11
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents | Code | 11
Wan: Open and Advanced Large-Scale Video Generative Models | Code | 11
VGGT: Visual Geometry Grounded Transformer | Code | 11
YOLOE: Real-Time Seeing Anything | Code | 11
Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Code | 11
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens | Code | 11
SCORE: Systematic COnsistency and Robustness Evaluation for Large Language Models | Code | 11
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Code | 11
Qwen2.5-VL Technical Report | Code | 11
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Code | 11