The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1401–1450 of 659,983 papers

Title	Date	Tasks	Status	Hype
Atom of Thoughts for Markov LLM Test-Time Scaling	Feb 17, 2025		Code Available	4
A-MEM: Agentic Memory for LLM Agents	Feb 17, 2025	Large Language Model	Code Available	4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention	Feb 16, 2025		Code Available	4
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers	Feb 15, 2025	Image Animation, Portrait Animation	Code Available	4
KernelBench: Can LLMs Write Efficient GPU Kernels?	Feb 14, 2025	GPU	Code Available	4
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models	Feb 13, 2025	Question Answering, RAG	Code Available	4
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion	Feb 12, 2025	Image Relighting	Code Available	4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society	Feb 12, 2025		Code Available	4
Enhance-A-Video: Better Generated Video for Free	Feb 11, 2025	Video Generation	Code Available	4
Training Sparse Mixture Of Experts Text Embedding Models	Feb 11, 2025	Mixture-of-Experts, RAG	Code Available	4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	Feb 11, 2025	Code Generation, Math	Code Available	4
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates	Feb 10, 2025	Hierarchical Reinforcement Learning, Language Modeling	Code Available	4
Accelerating Data Processing and Benchmarking of AI Models for Pathology	Feb 10, 2025	Benchmarking	Code Available	4
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM	Feb 10, 2025	Language Modeling, Language Modelling	Code Available	4
Self-Supervised Prompt Optimization	Feb 7, 2025		Code Available	4
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach	Feb 7, 2025	Language Modeling, Language Modelling	Code Available	4
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation	Feb 7, 2025	Audio Generation, Denoising	Code Available	4
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound	Feb 7, 2025	Benchmarking	Code Available	4
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective	Feb 6, 2025		Code Available	4
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis	Feb 6, 2025	Speech Synthesis	Code Available	4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation	Feb 4, 2025	Benchmarking, Information Retrieval	Code Available	4
Sundial: A Family of Highly Capable Time Series Foundation Models	Feb 2, 2025	Representation Learning, Time Series	Code Available	4
Transcoders Beat Sparse Autoencoders for Interpretability	Jan 31, 2025		Code Available	4
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models	Jan 31, 2025	Caption Generation, Language Modeling	Code Available	4
Molecular-driven Foundation Model for Oncologic Pathology	Jan 28, 2025	Benchmarking, Diagnostic	Code Available	4
A foundation model for human-AI collaboration in medical literature mining	Jan 27, 2025	Literature Mining, Systematic Literature Review	Code Available	4
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance	Jan 26, 2025	Autonomous Driving, Imitation Learning	Code Available	4
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step	Jan 23, 2025	Image Generation, Text-to-Image Generation	Code Available	4
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data	Jan 21, 2025	Fairness, Imputation	Code Available	4
Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models	Jan 20, 2025		Code Available	4
A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs	Jan 20, 2025	Diversity, Image Generation	Code Available	4
Generating Structured Outputs from Language Models: Benchmark and Studies	Jan 18, 2025		Code Available	4
DiffuEraser: A Diffusion Model for Video Inpainting	Jan 17, 2025	model, Optical Flow Estimation	Code Available	4
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment	Jan 16, 2025	Causal Inference, counterfactual	Code Available	4
MonSter: Marry Monodepth to Stereo Unleashes Power	Jan 15, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	4
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models	Jan 14, 2025	Benchmarking, Text-to-Video Generation	Code Available	4
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding	Jan 14, 2025	RAG, Retrieval	Code Available	4
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding	Jan 14, 2025	Embodied Question Answering, Hallucination	Code Available	4
3DGS-to-PC: Convert a 3D Gaussian Splatting Scene into a Dense Point Cloud or Mesh	Jan 13, 2025	3DGS, Surface Reconstruction	Code Available	4
EdgeTAM: On-Device Track Anything Model	Jan 13, 2025	model, Video Segmentation	Code Available	4
Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset	Jan 9, 2025	Human Mesh Recovery, Motion Generation	Code Available	4
The GAN is dead; long live the GAN! A Modern GAN Baseline	Jan 9, 2025	Image Generation	Code Available	4
RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark	Jan 8, 2025	object-detection, Object Detection	Code Available	4
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control	Jan 7, 2025	Video Generation	Code Available	4
Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection	Jan 7, 2025	Object, object-detection	Code Available	4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token	Jan 7, 2025	GPU, Visual Question Answering (VQA)	Code Available	4
TransPixeler: Advancing Text-to-Video Generation with Transparency	Jan 6, 2025	Text-to-Video Generation, Video Generation	Code Available	4
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges	Jan 4, 2025	Fairness, Hallucination	Code Available	4
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models	Jan 2, 2025	Scene Understanding, text annotation	Code Available	4
SVFR: A Unified Framework for Generalized Video Face Restoration	Jan 2, 2025	Colorization, Representation Learning	Code Available	4