The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers, 247,172 code links, 4,818 tasks

Papers

Showing 251–275 of 658,356 papers

Title	Date	Tasks	Status	Hype
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation	Jun 24, 2024	parameter-efficient fine-tuning, Sentence	Code Available	7
PuLID: Pure and Lightning ID Customization via Contrastive Alignment	Apr 24, 2024	Image Generation, Text to Image Generation	Code Available	7
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning	Apr 23, 2025	Multimodal Reasoning, reinforcement-learning	Code Available	7
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains	Jul 18, 2024		Code Available	7
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Mar 17, 2025	Mamba, Math	Code Available	7
SageAttention2++: A More Efficient Implementation of SageAttention2	May 27, 2025	Quantization, Video Generation	Code Available	7
NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking	Jun 21, 2024	Autonomous Driving, Benchmarking	Code Available	7
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	Mar 22, 2024	Action Classification, Action Recognition	Code Available	7
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	Feb 14, 2025	Video Generation, Video Reconstruction	Code Available	7
X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation	Nov 26, 2024		Code Available	7
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining	Aug 5, 2024	Decoder, Depth Estimation	Code Available	7
CALE: Continuous Arcade Learning Environment	Oct 31, 2024	Atari Games, Benchmarking	Code Available	7
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets	Jun 17, 2025	Language Modeling, Language Modelling	Code Available	7
ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval	May 22, 2025	Retrieval	Code Available	7
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences	Mar 14, 2024	HumanEval	Code Available	7
VMamba: Visual State Space Model	Jan 18, 2024	Computational Efficiency, Language Modeling	Code Available	7
Dynamic data sampler for cross-language transfer learning in large language models	May 17, 2024	Language Modeling, Language Modelling	Code Available	7
Rethinking the Sample Relations for Few-Shot Classification	Jan 23, 2025	Classification, Contrastive Learning	Code Available	7
Qwen2-Audio Technical Report	Jul 15, 2024	Instruction Following, Language Modelling	Code Available	7
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations	Jan 23, 2025		Code Available	7
M&M VTO: Multi-Garment Virtual Try-On and Editing	Jun 6, 2024	Denoising, Super-Resolution	Code Available	7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT	Jun 5, 2024	Image Generation, Point Cloud Generation	Code Available	7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test	Mar 3, 2025	Prediction	Code Available	7
LLaMA-Omni: Seamless Speech Interaction with Large Language Models	Sep 10, 2024		Code Available	7
PowerPM: Foundation Model for Power Systems	Aug 7, 2024	Contrastive Learning, model	Code Available	7