The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 301–350 of 474,278 papers

Title	Date	Tasks	Status	Hype
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations	Jan 23, 2025		Code Available	7
M&M VTO: Multi-Garment Virtual Try-On and Editing	Jun 6, 2024	Denoising, Super-Resolution	Code Available	7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT	Jun 5, 2024	Image Generation, Point Cloud Generation	Code Available	7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test	Mar 3, 2025	Prediction	Code Available	7
LLaMA-Omni: Seamless Speech Interaction with Large Language Models	Sep 10, 2024		Code Available	7
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration	Oct 3, 2024	Image Generation, Quantization	Code Available	7
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models	Feb 6, 2023	Scheduling	Code Available	7
Is Diversity All You Need for Scalable Robotic Manipulation?	Jul 8, 2025	All, Diversity	Code Available	7
VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos	Feb 3, 2025	Knowledge Graphs, RAG	Code Available	7
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation	May 28, 2025	Human Animation, Instruction Following	Code Available	7
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction	Jan 3, 2025		Code Available	7
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning	Nov 30, 2024		Code Available	7
Agentless: Demystifying LLM-based Software Engineering Agents	Jul 1, 2024	Program Repair	Code Available	7
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers	Mar 15, 2024	Text Generation, Video Generation	Code Available	7
SEW: Self-Evolving Agentic Workflows for Automated Code Generation	May 24, 2025	Code Generation	Code Available	7
SoftTiger: A Clinical Foundation Model for Healthcare Workflows	Mar 1, 2024	Language Modelling, Large Language Model	Code Available	7
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models	Feb 8, 2024	Benchmarking, Diversity	Code Available	7
AFlow: Automating Agentic Workflow Generation	Oct 14, 2024	Code Generation	Code Available	7
Enhancing Fourier Neural Operators with Local Spatial Features	Mar 22, 2025	Computational Efficiency	Code Available	7
MambaOut: Do We Really Need Mamba for Vision?	May 13, 2024	image-classification, Image Classification	Code Available	7
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation	Jun 11, 2025	4k	Code Available	7
Open Deep Search: Democratizing Search with Open-source Reasoning Agents	Mar 26, 2025	10-shot image generation	Code Available	7
Pyramidal Flow Matching for Efficient Video Generative Modeling	Oct 8, 2024	GPU, Text-to-Video Generation	Code Available	7
Speechless: Speech Instruction Training Without Speech for Low Resource Languages	May 23, 2025	speech-recognition, Speech Recognition	Code Available	7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations	Jan 3, 2024	Diversity, Quantization	Code Available	7
Visual-RFT: Visual Reinforcement Fine-Tuning	Mar 3, 2025	Few-Shot Object Detection, Fine-Grained Image Classification	Code Available	7
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion	Feb 21, 2022	Binarization, Model Optimization	Code Available	7
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents	Apr 16, 2024	Fact Checking, Retrieval-augmented Generation	Code Available	7
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image	May 30, 2024	Image to 3D, Single-View 3D Reconstruction	Code Available	7
TextGrad: Automatic "Differentiation" via Text	Jun 11, 2024	Question Answering, Specificity	Code Available	7
Efficient multi-prompt evaluation of LLMs	May 27, 2024	MMLU	Code Available	7
TTRL: Test-Time Reinforcement Learning	Apr 22, 2025	Math, reinforcement-learning	Code Available	7
Elixir: Train a Large Language Model on a Small GPU Cluster	Dec 10, 2022	CPU, GPU	Code Available	7
Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond	Jan 19, 2025	Deep Learning, Multi-Task Learning	Code Available	7
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding	Apr 17, 2025	Video Question Answering, Video Understanding	Code Available	7
Tulu 3: Pushing Frontiers in Open Language Model Post-Training	Nov 22, 2024	Language Modeling, Language Modelling	Code Available	7
Measuring Massive Multitask Chinese Understanding	Apr 25, 2023	All	Code Available	7
In-Context LoRA for Diffusion Transformers	Oct 31, 2024	Image Generation	Code Available	7
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth Estimation, Diversity	Code Available	7
Mirage: A Multi-Level Superoptimizer for Tensor Programs	May 9, 2024	GPU, Navigate	Code Available	7
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables	Feb 29, 2024	Time Series, Time Series Forecasting	Code Available	7
Visual Agentic Reinforcement Fine-Tuning	May 20, 2025	Image Manipulation	Code Available	7
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	May 16, 2024	Edge-computing, Few-Shot Object Detection	Code Available	7
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback	Dec 20, 2024	All, Instruction Following	Code Available	7
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question Answering, Zero-Shot Video Question Answer	Code Available	7
Measuring short-form factuality in large language models	Nov 7, 2024	Form	Code Available	7
RedPajama: an Open Dataset for Training Large Language Models	Nov 19, 2024		Code Available	7
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library	Jun 6, 2025	Management	Code Available	7
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents	Apr 16, 2025		Code Available	7
xLSTM: Extended Long Short-Term Memory	May 7, 2024	Language Modeling, Language Modelling	Code Available	7