The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 2,426–2,450 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step	May 23, 2024	GSM8K	Code Available	3	5
Diffusion Feedback Helps CLIP See Better	Jul 29, 2024	Image Classification	Code Available	3	5
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems	Nov 5, 2024	Hallucination, RAG	Code Available	3	5
CAX: Cellular Automata Accelerated in JAX	Oct 3, 2024	ARC, Artificial Life	Code Available	3	5
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?	Jun 19, 2024	RAG, Retrieval	Code Available	3	5
Anything-3D: Towards Single-view Anything Reconstruction in the Wild	Apr 19, 2023	3D Reconstruction, Diversity	Code Available	3	5
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait	Mar 17, 2025	Computational Efficiency, Diversity	Code Available	3	5
Simplifying Deep Temporal Difference Learning	Jul 5, 2024	Q-Learning, Reinforcement Learning (RL)	Code Available	3	5
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation	Feb 3, 2025	Graph Neural Network, Knowledge Graphs	Code Available	3	5
XAttention: Block Sparse Attention with Antidiagonal Scoring	Mar 20, 2025	Video Generation, Video Understanding	Code Available	3	5
4M: Massively Multimodal Masked Modeling	Dec 11, 2023	Decoder	Code Available	3	5
Unifying Flow, Stereo and Depth Estimation	Nov 10, 2022	Depth Estimation, Optical Flow Estimation	Code Available	3	5
EgoLife: Towards Egocentric Life Assistant	Mar 5, 2025	Question Answering, Video Understanding	Code Available	3	5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback	May 22, 2023	Instruction Following	Code Available	3	5
Planning with Diffusion for Flexible Behavior Synthesis	May 20, 2022	Decision Making, Denoising	Code Available	3	5
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs	Oct 25, 2023	Autonomous Driving, GPU	Code Available	3	5
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs	Feb 24, 2025	Question Answering, Visual Question Answering	Code Available	3	5
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models	May 15, 2023	Multiple-choice	Code Available	3	5
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment	Apr 27, 2021	Analog Video Restoration, Snow Removal	Code Available	3	5
Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding	Feb 14, 2025	3D Object Detection, 3D visual grounding	Code Available	3	5
Data Engineering for Scaling Language Models to 128K Context	Feb 15, 2024	4k, Continual Pretraining	Code Available	3	5
A Multiscale Visualization of Attention in the Transformer Model	Jun 12, 2019		Code Available	3	5
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping	Feb 21, 2024	Decision Making, Decoder	Code Available	3	5
RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection	Mar 25, 2024	3D Object Detection, 3D Object Detection (RoI)	Code Available	3	5
Streaming Deep Reinforcement Learning Finally Works	Oct 18, 2024	Atari Games, Deep Reinforcement Learning	Code Available	3	5