SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3711–3720 of 474278 papers

Title | Status | Hype
AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications | Code | 3
On-Demand Earth System Data Cubes | Code | 3
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation | Code | 3
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild | Code | 3
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents | Code | 3
RVT-2: Learning Precise Manipulation from Few Demonstrations | Code | 3
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation | Code | 3
Adam-mini: Use Fewer Learning Rates To Gain More | Code | 3
Point-SAM: Promptable 3D Segmentation Model for Point Clouds | Code | 3
Retrieval-augmented generation in multilingual settings | Code | 3