The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1,351–1,400 of 659,983 papers

Title	Date	Tasks	Status	Hype
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models	Mar 20, 2025	Benchmarking, Reinforcement Learning (RL)	Code Available	4
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning	Mar 20, 2025	Decision Making, Language Modeling	Code Available	4
UniK3D: Universal Camera Monocular 3D Estimation	Mar 20, 2025	3D Reconstruction, Disentanglement	Code Available	4
Sonata: Self-Supervised Learning of Reliable Point Representations	Mar 20, 2025	3D Semantic Segmentation, Self-Supervised Learning	Code Available	4
Cube: A Roblox View of 3D Intelligence	Mar 19, 2025	Scene Generation, Text Generation	Code Available	4
DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework	Mar 19, 2025	8k, Action Recognition	Code Available	4
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control	Mar 18, 2025		Code Available	4
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning	Mar 18, 2025	3D Face Animation, Common Sense Reasoning	Code Available	4
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey	Mar 16, 2025	Autonomous Driving, multimodal generation	Code Available	4
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain Generalization, Math	Code Available	4
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization	Mar 13, 2025	Multimodal Reasoning	Code Available	4
Retrieval-Augmented Generation with Hierarchical Knowledge	Mar 13, 2025	Multi-hop Question Answering, Question Answering	Code Available	4
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary	Mar 12, 2025	EgoSchema, Retrieval	Code Available	4
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models	Mar 12, 2025	Denoising, Language Modeling	Code Available	4
LocAgent: Graph-Guided LLM Agents for Code Localization	Mar 12, 2025	GitHub issue resolution, Navigate	Code Available	4
PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation	Mar 12, 2025	All, Denoising	Code Available	4
Towards All-in-One Medical Image Re-Identification	Mar 11, 2025	All	Code Available	4
Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models	Mar 11, 2025	Form, Information Retrieval	Code Available	4
LBM: Latent Bridge Matching for Fast Image-to-Image Translation	Mar 10, 2025	Depth Estimation, Image Relighting	Code Available	4
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning	Mar 10, 2025	Multimodal Reasoning, Reinforcement Learning (RL)	Code Available	4
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation	Mar 10, 2025	Common Sense Reasoning, Image Generation	Code Available	4
Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms	Mar 10, 2025		Code Available	4
Inductive Moment Matching	Mar 10, 2025		Code Available	4
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL	Mar 10, 2025	Logical Reasoning, Multimodal Reasoning	Code Available	4
PointVLA: Injecting the 3D World into Vision-Language-Action Models	Mar 10, 2025	Imitation Learning, Spatial Reasoning	Code Available	4
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement	Mar 9, 2025	Domain Generalization, Object Detection	Code Available	4
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control	Mar 7, 2025	Image Inpainting, Optical Flow Estimation	Code Available	4
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning	Mar 7, 2025	RAG, Reinforcement Learning (RL)	Code Available	4
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model	Mar 7, 2025	Multimodal Reasoning, reinforcement-learning	Code Available	4
Unified Reward Model for Multimodal Understanding and Generation	Mar 7, 2025	Image Generation, model	Code Available	4
Factorio Learning Environment	Mar 6, 2025	Program Synthesis, Spatial Reasoning	Code Available	4
ReasonGraph: Visualisation of Reasoning Paths	Mar 6, 2025		Code Available	4
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning	Feb 28, 2025	Information Retrieval, reinforcement-learning	Code Available	4
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels	Feb 27, 2025	Image Classification, Instance Segmentation	Code Available	4
UniTok: A Unified Tokenizer for Visual Generation and Understanding	Feb 27, 2025	Quantization	Code Available	4
HVI: A New color space for Low-light Image Enhancement	Feb 27, 2025	Image Enhancement, Low-Light Image Enhancement	Code Available	4
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator	Feb 26, 2025	Depth Estimation, Diversity	Code Available	4
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents	Feb 25, 2025	Question Answering, RAG	Code Available	4
SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference	Feb 25, 2025	model, Video Generation	Code Available	4
R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning	Feb 24, 2025	Language Modeling, Language Modelling	Code Available	4
LettuceDetect: A Hallucination Detection Framework for RAG Applications	Feb 24, 2025	8k, GPU	Code Available	4
TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control	Feb 24, 2025	reinforcement-learning, Reinforcement Learning	Code Available	4
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation	Feb 23, 2025	Benchmarking	Code Available	4
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming	Feb 22, 2025	backdoor defense	Code Available	4
Natural Language Generation	Feb 20, 2025	Text Generation	Code Available	4
SurveyX: Academic Survey Automation via Large Language Models	Feb 20, 2025	Survey	Code Available	4
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention	Feb 20, 2025		Code Available	4
Building reliable sim driving agents by scaling self-play	Feb 20, 2025	Autonomous Vehicles, Benchmarking	Code Available	4
Craw4LLM: Efficient Web Crawling for LLM Pretraining	Feb 19, 2025	10-shot image generation	Code Available	4
A deep learning framework for efficient pathology image analysis	Feb 18, 2025	Benchmarking, Deep Learning	Code Available	4