The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers


Showing 301–350 of 658,356 papers

Title	Date	Tasks	Status	Hype
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning	Mar 12, 2025	Question Answering, RAG	Code Available	7
VACE: All-in-One Video Creation and Editing	Mar 10, 2025	All, Human-Domain Subject-to-Video	Code Available	7
HuixiangDou2: A Robustly Optimized GraphRAG Approach	Mar 9, 2025	Retrieval, Retrieval-augmented Generation	Code Available	7
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems	Mar 9, 2025		Code Available	7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test	Mar 3, 2025	Prediction	Code Available	7
Visual-RFT: Visual Reinforcement Fine-Tuning	Mar 3, 2025	Few-Shot Object Detection, Fine-Grained Image Classification	Code Available	7
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion	Mar 3, 2025	Music Generation	Code Available	7
LLM Post-Training: A Deep Dive into Reasoning Large Language Models	Feb 28, 2025		Code Available	7
Muon is Scalable for LLM Training	Feb 24, 2025	Computational Efficiency	Code Available	7
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	Feb 20, 2025	Math, reinforcement-learning	Code Available	7
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	Feb 20, 2025	Continual Learning, Knowledge Graphs	Code Available	7
S*: Test Time Scaling for Code Generation	Feb 20, 2025	Code Generation, Math	Code Available	7
YOLOv12: Attention-Centric Real-Time Object Detectors	Feb 18, 2025	GPU, Object	Code Available	7
MoBA: Mixture of Block Attention for Long-Context LLMs	Feb 18, 2025	Mixture-of-Experts	Code Available	7
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction	Feb 17, 2025	Instruction Following, Voice Cloning	Code Available	7
pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM	Feb 17, 2025	Depth Estimation, Depth Prediction	Code Available	7
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	Feb 14, 2025	Video Generation, Video Reconstruction	Code Available	7
Large Language Diffusion Models	Feb 14, 2025	In-Context Learning, Instruction Following	Code Available	7
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!	Feb 11, 2025	Large Language Model, Math	Code Available	7
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile	Feb 10, 2025	Video Generation	Code Available	7
Goku: Flow Based Video Generative Foundation Models	Feb 7, 2025	Image Generation, Text to Image Generation	Code Available	7
Fast Video Generation with Sliding Tile Attention	Feb 6, 2025	Video Generation	Code Available	7
VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos	Feb 3, 2025	Knowledge Graphs, RAG	Code Available	7
LLM-AutoDiff: Auto-Differentiate Any LLM Workflow	Jan 28, 2025	Prompt Engineering, Question Answering	Code Available	7
Training AI to be Loyal	Jan 27, 2025		Code Available	7
EvoRL: A GPU-accelerated Framework for Evolutionary Reinforcement Learning	Jan 25, 2025	Benchmarking, Evolutionary Algorithms	Code Available	7
Rethinking the Sample Relations for Few-Shot Classification	Jan 23, 2025	Classification, Contrastive Learning	Code Available	7
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations	Jan 23, 2025		Code Available	7
Kimi k1.5: Scaling Reinforcement Learning with LLMs	Jan 22, 2025	Math, reinforcement-learning	Code Available	7
A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models	Jan 21, 2025	RAG, Retrieval	Code Available	7
EvoGP: A GPU-accelerated Framework for Tree-based Genetic Programming	Jan 21, 2025	Feature Engineering, GPU	Code Available	7
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation	Jan 20, 2025	Language Modeling, Language Modelling	Code Available	7
Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond	Jan 19, 2025	Deep Learning, Multi-Task Learning	Code Available	7
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth Estimation, Diversity	Code Available	7
MiniMax-01: Scaling Foundation Models with Lightning Attention	Jan 14, 2025	Mixture-of-Experts	Code Available	7
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking	Jan 8, 2025	Math	Code Available	7
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides	Jan 7, 2025		Code Available	7
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction	Jan 3, 2025		Code Available	7
Simulating 500 million years of evolution with a language model	Dec 31, 2024	Language Modeling, Language Modelling	Code Available	7
Revisiting PCA for time series reduction in temporal dimension	Dec 27, 2024	Computational Efficiency, Dimensionality Reduction	Code Available	7
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback	Dec 20, 2024	All, Instruction Following	Code Available	7
Efficient MedSAMs: Segment Anything in Medical Images on Laptop	Dec 20, 2024	Image Segmentation, Medical Image Segmentation	Code Available	7
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis	Dec 19, 2024	Audio Generation, Audio Synthesis	Code Available	7
3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting	Dec 17, 2024	3DGS, Novel View Synthesis	Code Available	7
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors	Dec 16, 2024	3D Reconstruction, graph construction	Code Available	7
A Library for Learning Neural Operators	Dec 13, 2024	Operator learning	Code Available	7
Byte Latent Transformer: Patches Scale Better Than Tokens	Dec 13, 2024		Code Available	7
AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era	Dec 13, 2024	Image to Video Generation, Video Generation	Code Available	7
Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems	Dec 12, 2024		Code Available	7
Large Concept Models: Language Modeling in a Sentence Representation Space	Dec 11, 2024	Language Modeling, Language Modelling	Code Available	7