SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4151-4175 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Putting the Object Back into Video Object Segmentation | Code | 3 |
| AgentTuning: Enabling Generalized Agent Abilities for LLMs | Code | 3 |
| Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3 |
| Llemma: An Open Language Model For Mathematics | Code | 3 |
| MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Code | 3 |
| Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | Code | 3 |
| Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research | Code | 3 |
| NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration | Code | 3 |
| CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving | Code | 3 |
| MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents | Code | 3 |
| Text Embeddings Reveal (Almost) As Much As Text | Code | 3 |
| Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis | Code | 3 |
| How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition | Code | 3 |
| Evaluating Hallucinations in Chinese Large Language Models | Code | 3 |
| T^3Bench: Benchmarking Current Progress in Text-to-3D Generation | Code | 3 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Code | 3 |
| Conceptual Framework for Autonomous Cognitive Entities | Code | 3 |
| OceanGPT: A Large Language Model for Ocean Science Tasks | Code | 3 |
| UltraFeedback: Boosting Language Models with Scaled AI Feedback | Code | 3 |
| AutoAgents: A Framework for Automatic Agent Generation | Code | 3 |
| ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving | Code | 3 |
| Data Filtering Networks | Code | 3 |
| SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation | Code | 3 |
| Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | Code | 3 |
| Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction | Code | 3 |
Page 167 of 26,463