SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4126–4150 of 177,340 papers

Title | Status | Hype
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | Code | 3
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning | Code | 3
RiNALMo: General-Purpose RNA Language Models Can Generalize Well on Structure Prediction Tasks | Code | 3
Embodied Understanding of Driving Scenarios | Code | 3
Personalized Image Generation with Deep Generative Models: A Decade Survey | Code | 3
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO | Code | 3
Datasheet for the Pile | Code | 3
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition | Code | 3
imitation: Clean Imitation Learning Implementations | Code | 3
Efficient Video Action Detection with Token Dropout and Context Refinement | Code | 3
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization | Code | 3
LLM-Pruner: On the Structural Pruning of Large Language Models | Code | 3
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model | Code | 3
HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction | Code | 3
EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training | Code | 3
White-Box Transformers via Sparse Rate Reduction | Code | 3
SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures | Code | 3
Fine-Tuning Language Models from Human Preferences | Code | 3
GuardT2I: Defending Text-to-Image Models from Adversarial Prompts | Code | 3
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model | Code | 3
Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation | Code | 3
EvoTorch: Scalable Evolutionary Computation in Python | Code | 3
Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Code | 3
Are We Done with MMLU? | Code | 3
Does End-to-End Autonomous Driving Really Need Perception Tasks? | Code | 3