The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 1,451–1,475 of 659,983 papers

Title	Date	Tasks	Status	Hype
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning	May 6, 2025	Image Generation	Code Available	4
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters	Oct 30, 2024	model	Code Available	4
A Closer Look at Deep Learning Methods on Tabular Datasets	Jul 1, 2024	Attribute, Deep Learning	Code Available	4
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking	Mar 14, 2024	GSM8K, Language Modelling	Code Available	4
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference	Jul 16, 2024		Code Available	4
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models	Apr 15, 2025	Humanoid Control, Reinforcement Learning (RL)	Code Available	4
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference	Oct 6, 2023	GPU, Image Generation	Code Available	4
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL	Jul 7, 2025	Text to SQL, Text-To-SQL	Code Available	4
VM-UNet: Vision Mamba UNet for Medical Image Segmentation	Feb 4, 2024	Image Segmentation, Mamba	Code Available	4
FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy	Jul 1, 2023	Federated Learning, Personalized Federated Learning	Code Available	4
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering	Feb 26, 2024	Evidence Selection, Open-Ended Question Answering	Code Available	4
NExT-GPT: Any-to-Any Multimodal LLM	Sep 11, 2023	AI Agent	Code Available	4
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning	Jun 3, 2025	Code Generation, reinforcement-learning	Code Available	4
Eliminating Domain Bias for Federated Learning in Representation Space	Nov 25, 2023	Federated Learning, Privacy Preserving	Code Available	4
MotionClone: Training-Free Motion Cloning for Controllable Video Generation	Jun 8, 2024	Denoising, Motion Generation	Code Available	4
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation	Feb 23, 2025	Benchmarking	Code Available	4
GIM: Learning Generalizable Image Matcher From Internet Videos	Feb 16, 2024	3D Reconstruction, Camera Pose Estimation	Code Available	4
Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach	Dec 4, 2024	Image Super-Resolution, Super-Resolution	Code Available	4
Pearl: A Production-ready Reinforcement Learning Agent	Dec 6, 2023	Benchmarking, reinforcement-learning	Code Available	4
Towards All-in-One Medical Image Re-Identification	Mar 11, 2025	All	Code Available	4
FinBen: A Holistic Financial Benchmark for Large Language Models	Feb 20, 2024	Question Answering, RAG	Code Available	4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation	Feb 4, 2025	Benchmarking, Information Retrieval	Code Available	4
Attention Mesh: High-fidelity Face Mesh Prediction in Real-time	Jun 19, 2020	Vocal Bursts Intensity Prediction	Code Available	4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token	Jan 7, 2025	GPU, Visual Question Answering (VQA)	Code Available	4
Scaling and evaluating sparse autoencoders	Jun 6, 2024	Language Modelling	Code Available	4