SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 31-40 of 474,278 papers

Title | Status | Hype
OmniParser for Pure Vision Based GUI Agent | Code | 12
SAM 2: Segment Anything in Images and Videos | Code | 12
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Code | 12
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Code | 12
Qwen3-Coder-Next Technical Report | | 11
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints | | 11
EAP4EMSIG -- Experiment Automation Pipeline for Event-Driven Microscopy to Smart Microfluidic Single-Cells Analysis | Code | 11
NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security | Code | 11
WebWalker: Benchmarking LLMs in Web Traversal | Code | 11
AgentScope: A Flexible yet Robust Multi-Agent Platform | Code | 11