SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4151-4160 of 474,278 papers

Title | Status | Hype
AgentTuning: Enabling Generalized Agent Abilities for LLMs | Code | 3
Safe RLHF: Safe Reinforcement Learning from Human Feedback | Code | 3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research | Code | 3
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | Code | 3
MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Code | 3
NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration | Code | 3
CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving | Code | 3
MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents | Code | 3
Page 416 of 47,428