The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 251–300 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
xLSTM: Extended Long Short-Term Memory	May 7, 2024	Language Modeling, Language Modelling	Code Available	7	5
Full Scaling Automation for Sustainable Development of Green Data Centers	May 1, 2023	Cloud Computing, CPU	Code Available	7	5
LLaMA: Open and Efficient Foundation Language Models	Feb 27, 2023	Arithmetic Reasoning, Code Generation	Code Available	7	5
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds	Mar 13, 2025	3D Human Reconstruction	Code Available	7	5
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization	Nov 17, 2024	Image Generation, Quantization	Code Available	7	5
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers	Jan 21, 2024	Image Generation	Code Available	7	5
Transparent Image Layer Diffusion using Latent Transparency	Feb 27, 2024		Code Available	7	5
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	Mar 22, 2024	Action Classification, Action Recognition	Code Available	7	5
AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline	Oct 28, 2024	RAG, Retrieval	Code Available	7	5
Robust Inverse Graphics via Probabilistic Inference	Feb 2, 2024	NeRF	Code Available	7	5
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback	Jun 13, 2024	Instruction Following, Math	Code Available	7	5
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets	Jun 17, 2025	Language Modeling, Language Modelling	Code Available	7	5
One-Step Image Translation with Text-to-Image Models	Mar 18, 2024	Denoising, Translation	Code Available	7	5
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation	Jan 20, 2025	Language Modeling, Language Modelling	Code Available	7	5
2D Gaussian Splatting for Geometrically Accurate Radiance Fields	Mar 26, 2024	3DGS, Novel View Synthesis	Code Available	7	5
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models	Apr 20, 2023	Image Description, Language Modelling	Code Available	7	5
In-Context LoRA for Diffusion Transformers	Oct 31, 2024	Image Generation	Code Available	7	5
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration	Oct 3, 2024	Image Generation, Quantization	Code Available	7	5
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism	Nov 4, 2024	GPU	Code Available	7	5
Domain Expansion of Image Generators	Jan 12, 2023		Code Available	7	5
CALE: Continuous Arcade Learning Environment	Oct 31, 2024	Atari Games, Benchmarking	Code Available	7	5
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	Feb 14, 2025	Video Generation, Video Reconstruction	Code Available	7	5
FourierKAN outperforms MLP on Text Classification Head Fine-tuning	Aug 16, 2024	Classification, Kolmogorov-Arnold Networks	Code Available	7	5
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models	Oct 12, 2023	Language Modelling, Large Language Model	Code Available	7	5
HealthBench: Evaluating Large Language Models Towards Improved Human Health	May 13, 2025	Instruction Following, Multiple-choice	Code Available	7	5
OmniGen: Unified Image Generation	Sep 17, 2024	Edge Detection, Image Generation	Code Available	7	5
Fast Timing-Conditioned Latent Audio Diffusion	Feb 7, 2024	Audio Generation, GPU	Code Available	7	5
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation	Jun 24, 2024	parameter-efficient fine-tuning, Sentence	Code Available	7	5
PuLID: Pure and Lightning ID Customization via Contrastive Alignment	Apr 24, 2024	Image Generation, Text to Image Generation	Code Available	7	5
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning	Apr 23, 2025	Multimodal Reasoning, reinforcement-learning	Code Available	7	5
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains	Jul 18, 2024		Code Available	7	5
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Mar 17, 2025	Mamba, Math	Code Available	7	5
SageAttention2++: A More Efficient Implementation of SageAttention2	May 27, 2025	Quantization, Video Generation	Code Available	7	5
NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking	Jun 21, 2024	Autonomous Driving, Benchmarking	Code Available	7	5
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization	Mar 26, 2025	CPU, GPU	Code Available	7	5
TextGrad: Automatic "Differentiation" via Text	Jun 11, 2024	Question Answering, Specificity	Code Available	7	5
Open Deep Search: Democratizing Search with Open-source Reasoning Agents	Mar 26, 2025	10-shot image generation	Code Available	7	5
PowerPM: Foundation Model for Power Systems	Aug 7, 2024	Contrastive Learning, model	Code Available	7	5
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining	Aug 5, 2024	Decoder, Depth Estimation	Code Available	7	5
X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation	Nov 26, 2024		Code Available	7	5
ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval	May 22, 2025	Retrieval	Code Available	7	5
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences	Mar 14, 2024	HumanEval	Code Available	7	5
VMamba: Visual State Space Model	Jan 18, 2024	Computational Efficiency, Language Modeling	Code Available	7	5
Dynamic data sampler for cross-language transfer learning in large language models	May 17, 2024	Language Modeling, Language Modelling	Code Available	7	5
Rethinking the Sample Relations for Few-Shot Classification	Jan 23, 2025	Classification, Contrastive Learning	Code Available	7	5
Qwen2-Audio Technical Report	Jul 15, 2024	Instruction Following, Language Modelling	Code Available	7	5
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations	Jan 23, 2025		Code Available	7	5
M&M VTO: Multi-Garment Virtual Try-On and Editing	Jun 6, 2024	Denoising, Super-Resolution	Code Available	7	5
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT	Jun 5, 2024	Image Generation, Point Cloud Generation	Code Available	7	5
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test	Mar 3, 2025	Prediction	Code Available	7	5