The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Recently Added | Most Hyped | Most Active | Needs Verification | Most Verified

Showing 876–900 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey	Feb 14, 2025	Autonomous Driving, Survey	Code Available	5	5
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning	Sep 9, 2024	AI Agent, Knowledge Graphs	Code Available	5	5
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions	Nov 21, 2024	Reinforcement Learning (RL)	Code Available	5	5
Fake News Detection: It's All in the Data!	Jul 2, 2024	All, Diversity	Code Available	5	5
The BrowserGym Ecosystem for Web Agent Research	Dec 6, 2024	Benchmarking	Code Available	5	5
SCBench: A KV Cache-Centric Analysis of Long-Context Methods	Dec 13, 2024	Mamba, Quantization	Code Available	5	5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation	Apr 7, 2025	Inference Optimization, Referring Video Object Segmentation	Code Available	5	5
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation	Jan 28, 2022	Image Captioning, Image-text matching	Code Available	5	5
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding	Jun 27, 2024	Decoder, Segmentation	Code Available	5	5
Can Foundation Models Wrangle Your Data?	May 20, 2022	Entity Resolution, Imputation	Code Available	5	5
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer	Apr 8, 2024	MuJoCo, Physical Simulations	Code Available	5	5
Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid	Aug 4, 2024	document understanding	Code Available	5	5
Tora: Trajectory-oriented Diffusion Transformer for Video Generation	Jul 31, 2024	Video Compression, Video Generation	Code Available	5	5
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation	Oct 16, 2024	Audio Generation, GPU	Code Available	5	5
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?	Mar 12, 2024	Language Modeling, Language Modelling	Code Available	5	5
SuperAnimal pretrained pose estimation models for behavioral analysis	Mar 14, 2022	2D Pose Estimation, Animal Pose Estimation	Code Available	5	5
Visual Identification of Problematic Bias in Large Label Spaces	Jan 17, 2022	Fairness	Code Available	5	5
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models	Jun 21, 2023		Code Available	5	5
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers	Jun 30, 2025	Multimodal Reasoning	Code Available	5	5
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research	Jan 31, 2024	Language Modeling, Language Modelling	Code Available	5	5
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning	Jun 2, 2025	AI Agent, Diversity	Code Available	5	5
FeatUp: A Model-Agnostic Framework for Features at Any Resolution	Mar 15, 2024	Depth Estimation, Depth Prediction	Code Available	5	5
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding	Nov 21, 2024	Long-tailed Object Detection, Object	Code Available	5	5
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data	Jun 24, 2024	Data Augmentation, Optical Character Recognition (OCR)	Code Available	5	5
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention	Mar 28, 2023	Instruction Following, Language Modelling	Code Available	5	5