The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2401–2450 of 659,983 papers

Title	Date	Tasks	Status	Hype
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking	Jan 22, 2026		Unverified	3
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders	Jan 22, 2026		Unverified	3
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion	Jan 22, 2026		Unverified	3
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale	Jul 16, 2025	Computational Efficiency, GPU	Code Available	3
PhysX: Physical-Grounded 3D Asset Generation	Jul 16, 2025	3D Generation, Image to 3D	Code Available	3
Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI	Jul 16, 2025	GPU	Code Available	3
A Survey on Latent Reasoning	Jul 8, 2025	Survey	Code Available	3
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving	Jul 8, 2025	Code Repair, Transfer Learning	Code Available	3
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge	Jul 6, 2025	Image Generation, Multimodal Reasoning	Code Available	3
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents	Jul 3, 2025	Emotional Intelligence, reinforcement-learning	Code Available	3
No time to train! Training-Free Reference-Based Instance Segmentation	Jul 3, 2025	Cross-Domain Few-Shot Object Detection, Few-Shot Object Detection	Code Available	3
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams	Jun 30, 2025	cross-modal alignment, EgoSchema	Code Available	3
L0: Reinforcement Learning to Become General Agents	Jun 30, 2025	Question Answering, reinforcement-learning	Code Available	3
Epona: Autoregressive Diffusion World Model for Autonomous Driving	Jun 30, 2025	Autonomous Driving, model	Code Available	3
Ovis-U1 Technical Report	Jun 29, 2025	Image Generation, Text to Image Generation	Code Available	3
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language	Jun 26, 2025	All	Code Available	3
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas	Jun 25, 2025		Code Available	3
MMSearch-R1: Incentivizing LMMs to Search	Jun 25, 2025	RAG, Retrieval-augmented Generation	Code Available	3
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models	Jun 23, 2025	Domain Adaptation, GPU	Code Available	3
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation	Jun 22, 2025	GPU, Image Generation	Code Available	3
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens	Jun 20, 2025	Image Generation, Multimodal Reasoning	Code Available	3
Camera Calibration via Circular Patterns: A Comprehensive Framework with Measurement Uncertainty and Unbiased Projection Model	Jun 20, 2025	Camera Calibration	Code Available	3
TabArena: A Living Benchmark for Machine Learning on Tabular Data	Jun 20, 2025	Benchmarking	Code Available	3
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details	Jun 19, 2025	Texture Synthesis	Code Available	3
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning	Jun 16, 2025	Action Generation, Autonomous Driving	Code Available	3
Vine Copulas as Differentiable Computational Graphs	Jun 16, 2025	GPU, Scheduling	Code Available	3
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences	Jun 16, 2025	Document Summarization, GPU	Code Available	3
Discrete Diffusion in Large Language and Multimodal Models: A Survey	Jun 16, 2025	Denoising	Code Available	3
ANIRA: An Architecture for Neural Network Inference in Real-Time Audio Applications	Jun 14, 2025	Benchmarking	Code Available	3
FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation	Jun 14, 2025	Language Modeling, Language Modelling	Code Available	3
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications	Jun 14, 2025	Information Retrieval, Survey	Code Available	3
The Diffusion Duality	Jun 12, 2025	Text Generation	Code Available	3
Spurious Rewards: Rethinking Training Signals in RLVR	Jun 12, 2025	Math, Mathematical Reasoning	Code Available	3
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation	Jun 12, 2025	Video Generation	Code Available	3
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree	Jun 12, 2025	Continual Learning	Code Available	3
JAFAR: Jack up Any Feature at Any Resolution	Jun 10, 2025	Feature Upsampling	Code Available	3
Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models	Jun 10, 2025	3D Lane Detection, 3D Object Detection	Code Available	3
MagCache: Fast Video Generation with Magnitude-Aware Cache	Jun 10, 2025	SSIM, Video Generation	Code Available	3
Highly Compressed Tokenizer Can Generate Without Training	Jun 9, 2025	Image Generation, Quantization	Code Available	3
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems	Jun 9, 2025	Large Language Model	Code Available	3
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval	Jun 9, 2025	Dataset Generation, RAG	Code Available	3
Real-Time Execution of Action Chunking Flow Policies	Jun 9, 2025	Chunking, Vision-Language-Action	Code Available	3
Generalized Trajectory Scoring for End-to-end Multimodal Planning	Jun 7, 2025	Autonomous Driving, Domain Generalization	Code Available	3
When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation	Jun 6, 2025	RAG, Retrieval	Code Available	3
FlashDMoE: Fast Distributed MoE in a Single Kernel	Jun 5, 2025	16k, CPU	Code Available	3
SupeRANSAC: One RANSAC to Rule Them All	Jun 5, 2025	All, Pose Estimation	Code Available	3
INP-Former++: Advancing Universal Anomaly Detection via Intrinsic Normal Prototypes and Residual Learning	Jun 4, 2025	Anomaly Detection, Medical Diagnosis	Code Available	3
HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark	Jun 4, 2025	Federated Learning, Transfer Learning	Code Available	3
A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning	Jun 3, 2025	Decision Making, Diagnostic	Code Available	3
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation	Jun 2, 2025	4k, Descriptive	Code Available	3