The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

510,095 papers · 251,776 code links · 4,818 tasks

Papers

Showing 301–350 of 177,341 papers

Title	Date	Tasks	Status	Hype	Score
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test	Mar 3, 2025	Prediction	Code Available	7	5
LLaMA-Omni: Seamless Speech Interaction with Large Language Models	Sep 10, 2024		Code Available	7	5
MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Jul 10, 2024	Image Classification, Instance Segmentation	Code Available	7	5
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models	Feb 6, 2023	Scheduling	Code Available	7	5
Is Diversity All You Need for Scalable Robotic Manipulation?	Jul 8, 2025	All, Diversity	Code Available	7	5
VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos	Feb 3, 2025	Knowledge Graphs, RAG	Code Available	7	5
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation	May 28, 2025	Human Animation, Instruction Following	Code Available	7	5
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction	Jan 3, 2025		Code Available	7	5
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning	Nov 30, 2024		Code Available	7	5
Agentless: Demystifying LLM-based Software Engineering Agents	Jul 1, 2024	Program Repair	Code Available	7	5
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers	Mar 15, 2024	Text Generation, Video Generation	Code Available	7	5
SEW: Self-Evolving Agentic Workflows for Automated Code Generation	May 24, 2025	Code Generation	Code Available	7	5
SoftTiger: A Clinical Foundation Model for Healthcare Workflows	Mar 1, 2024	Language Modelling, Large Language Model	Code Available	7	5
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models	Feb 8, 2024	Benchmarking, Diversity	Code Available	7	5
AFlow: Automating Agentic Workflow Generation	Oct 14, 2024	Code Generation	Code Available	7	5
Enhancing Fourier Neural Operators with Local Spatial Features	Mar 22, 2025	Computational Efficiency	Code Available	7	5
MambaOut: Do We Really Need Mamba for Vision?	May 13, 2024	image-classification, Image Classification	Code Available	7	5
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation	Jun 11, 2025	4k	Code Available	7	5
The Road Less Scheduled	May 24, 2024	Scheduling	Code Available	7	5
Pyramidal Flow Matching for Efficient Video Generative Modeling	Oct 8, 2024	GPU, Text-to-Video Generation	Code Available	7	5
Speechless: Speech Instruction Training Without Speech for Low Resource Languages	May 23, 2025	speech-recognition, Speech Recognition	Code Available	7	5
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations	Jan 3, 2024	Diversity, Quantization	Code Available	7	5
Visual-RFT: Visual Reinforcement Fine-Tuning	Mar 3, 2025	Few-Shot Object Detection, Fine-Grained Image Classification	Code Available	7	5
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion	Feb 21, 2022	Binarization, Model Optimization	Code Available	7	5
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents	Apr 16, 2024	Fact Checking, Retrieval-augmented Generation	Code Available	7	5
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image	May 30, 2024	Image to 3D, Single-View 3D Reconstruction	Code Available	7	5
Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting	Jan 29, 2024	Depth Estimation, Dynamic Reconstruction	Code Available	7	5
Efficient multi-prompt evaluation of LLMs	May 27, 2024	MMLU	Code Available	7	5
TTRL: Test-Time Reinforcement Learning	Apr 22, 2025	Math, reinforcement-learning	Code Available	7	5
Elixir: Train a Large Language Model on a Small GPU Cluster	Dec 10, 2022	CPU, GPU	Code Available	7	5
Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond	Jan 19, 2025	Deep Learning, Multi-Task Learning	Code Available	7	5
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding	Apr 17, 2025	Video Question Answering, Video Understanding	Code Available	7	5
Tulu 3: Pushing Frontiers in Open Language Model Post-Training	Nov 22, 2024	Language Modeling, Language Modelling	Code Available	7	5
Measuring Massive Multitask Chinese Understanding	Apr 25, 2023	All	Code Available	7	5
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors	Jul 6, 2022	2D Object Detection, GPU	Code Available	7	5
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth Estimation, Diversity	Code Available	7	5
Mirage: A Multi-Level Superoptimizer for Tensor Programs	May 9, 2024	GPU, Navigate	Code Available	7	5
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables	Feb 29, 2024	Time Series, Time Series Forecasting	Code Available	7	5
Visual Agentic Reinforcement Fine-Tuning	May 20, 2025	Image Manipulation	Code Available	7	5
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	May 16, 2024	Edge-computing, Few-Shot Object Detection	Code Available	7	5
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback	Dec 20, 2024	All, Instruction Following	Code Available	7	5
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question Answering, Zero-Shot Video Question Answer	Code Available	7	5
Measuring short-form factuality in large language models	Nov 7, 2024	Form	Code Available	7	5
RedPajama: an Open Dataset for Training Large Language Models	Nov 19, 2024		Code Available	7	5
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library	Jun 6, 2025	Management	Code Available	7	5
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents	Apr 16, 2025		Code Available	7	5
Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout	Nov 28, 2022		Code Available	7	5
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning	Apr 24, 2025	Code Generation	Code Available	7	5
On the Vulnerability of LLM/VLM-Controlled Robotics	Feb 15, 2024	Language Modelling, Robot Manipulation	Code Available	7	5
Grounding Image Matching in 3D with MASt3R	Jun 14, 2024	3D Reconstruction	Code Available	7	5