SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4276-4300 of 177340 papers

Title | Status | Hype
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model | Code | 3
CRAG -- Comprehensive RAG Benchmark | Code | 3
Major TOM: Expandable Datasets for Earth Observation | Code | 3
Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction | Code | 3
Optimal Variable Speed Limit Control Strategy on Freeway Segments under Fog Conditions | Code | 3
Towards General-purpose Infrastructure for Protecting Scientific Data Under Study | Code | 3
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning | Code | 3
Genie: Generative Interactive Environments | Code | 3
Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Code | 3
Efficiently Serving LLM Reasoning Programs with Certaindex | Code | 3
SPO: Sequential Monte Carlo Policy Optimisation | Code | 3
AgentStudio: A Toolkit for Building General Virtual Agents | Code | 3
Is Value Learning Really the Main Bottleneck in Offline RL? | Code | 3
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy | Code | 3
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Code | 3
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM | Code | 3
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 | Code | 3
DPLM-2: A Multimodal Diffusion Protein Language Model | Code | 3
Automated Formulaic Alpha Generation for Quantitative Investing using Evolutionary Algorithms | Code | 3
The False Promise of Imitating Proprietary LLMs | Code | 3
Visual Geometry Grounded Deep Structure From Motion | Code | 3
A Foundation Model for the Earth System | Code | 3
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | Code | 3
Human-level play in the game of Diplomacy by combining language models with strategic reasoning | Code | 3
Improving Text Embeddings with Large Language Models | Code | 3