The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2,576–2,600 of 661,570 papers

Title	Date	Tasks	Status	Hype
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction	Apr 10, 2025	3D Reconstruction, 4D reconstruction	Code Available	3
PixelFlow: Pixel-Space Generative Models with Flow	Apr 10, 2025	Conditional Image Generation, Image Generation	Code Available	3
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory	Apr 10, 2025	Math, MMLU	Code Available	3
Perception-R1: Pioneering Perception Policy with Reinforcement Learning	Apr 10, 2025	reinforcement-learning, Reinforcement Learning	Code Available	3
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning	Apr 9, 2025	MVBench, Object Tracking	Code Available	3
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution	Apr 9, 2025	2k, Decision Making	Code Available	3
PromptHMR: Promptable Human Mesh Recovery	Apr 8, 2025	3D Human Pose Estimation, Human Mesh Recovery	Code Available	3
DDT: Decoupled Diffusion Transformer	Apr 8, 2025	Denoising, Image Generation	Code Available	3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III	Apr 8, 2025	Computational Efficiency, CPU	Code Available	3
SEA-LION: Southeast Asian Languages in One Network	Apr 8, 2025		Code Available	3
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation	Apr 7, 2025	3D geometry, RGBD Semantic Segmentation	Code Available	3
Playing Non-Embedded Card-Based Games with Reinforcement Learning	Apr 7, 2025	Board Games, Decision Making	Code Available	3
Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization	Apr 5, 2025	3D Generation, Video Alignment	Code Available	3
TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation	Apr 5, 2025		Code Available	3
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving	Apr 3, 2025	Reinforcement Learning (RL)	Code Available	3
Affordable AI Assistants with Knowledge Graph of Thoughts	Apr 3, 2025	Knowledge Graphs, LLM real-life tasks	Code Available	3
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation	Apr 3, 2025	Image Generation, World Knowledge	Code Available	3
Scaling Analysis of Interleaved Speech-Text Language Models	Apr 3, 2025	Transfer Learning	Code Available	3
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation	Apr 3, 2025	Mamba, Talking Head Generation	Code Available	3
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning	Apr 3, 2025	Image Generation, Instruction Following	Code Available	3
End-to-End Driving with Online Trajectory Evaluation via BEV World Model	Apr 2, 2025	Autonomous Driving, Bench2Drive	Code Available	3
YourBench: Easy Custom Evaluation Sets for Everyone	Apr 2, 2025	MMLU	Code Available	3
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs	Apr 1, 2025	Knowledge Graphs, Mathematical Reasoning	Code Available	3
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction	Apr 1, 2025	Image Generation	Code Available	3
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB	Apr 1, 2025	Decision Making, RAG	Code Available	3