The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 4,751–4,800 of 661,570 papers

Title	Date	Tasks	Status	Hype
Is Value Learning Really the Main Bottleneck in Offline RL?	Jun 13, 2024	Imitation Learning, Offline RL	Code Available	3
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy	Sep 27, 2024	Financial Analysis	Code Available	3
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields	Aug 7, 2024	3DGS, Model Compression	Code Available	3
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM	Nov 25, 2024	Autonomous Driving, Novel View Synthesis	Code Available	3
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2	Aug 9, 2024	All	Code Available	3
DPLM-2: A Multimodal Diffusion Protein Language Model	Oct 17, 2024	Language Modeling, Language Modelling	Code Available	3
Automated Formulaic Alpha Generation for Quantitative Investing using Evolutionary Algorithms	Mar 13, 2022	Evolutionary Algorithms	Code Available	3
The False Promise of Imitating Proprietary LLMs	May 25, 2023	Language Modelling	Code Available	3
Visual Geometry Grounded Deep Structure From Motion	Dec 7, 2023	Point Tracking	Code Available	3
A Foundation Model for the Earth System	May 20, 2024	Computational Efficiency, Deep Learning	Code Available	3
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning	Jun 14, 2024	Offline RL	Code Available	3
Human-level play in the game of Diplomacy by combining language models with strategic reasoning	Nov 22, 2022	AI Agent, Language Modeling	Code Available	3
Improving Text Embeddings with Large Language Models	Dec 31, 2023	Decoder, Diversity	Code Available	3
Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes	Aug 29, 2017	BIG-bench Machine Learning, CPU	Code Available	3
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models	Oct 3, 2024		Code Available	3
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control	May 27, 2024		Code Available	3
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models	Dec 18, 2024	Representation Learning, Robot Manipulation	Code Available	3
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation	Mar 8, 2024	Code Generation, Hallucination	Code Available	3
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders	Jul 19, 2024		Code Available	3
DataDecide: How to Predict Best Pretraining Data with Small Experiments	Apr 15, 2025	ARC, HellaSwag	Code Available	3
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry	Feb 6, 2024		Code Available	3
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection	Jun 8, 2020	Dense Object Detection, General Classification	Code Available	3
Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs	Mar 8, 2025		Code Available	3
DRCT: Saving Image Super-resolution away from Information Bottleneck	Mar 31, 2024	Image Super-Resolution, Super-Resolution	Code Available	3
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains	Feb 4, 2024		Code Available	3
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia	Jan 23, 2025	Emotion Recognition, Event Detection	Code Available	3
Emu3: Next-Token Prediction is All You Need	Sep 27, 2024	All	Code Available	3
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving	Apr 3, 2025	Reinforcement Learning (RL)	Code Available	3
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation	Mar 3, 2026		Unverified	2
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data	Mar 10, 2026		Unverified	2
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images	Mar 3, 2026		Unverified	2
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models	Mar 4, 2026		Unverified	2
Phi-4-reasoning-vision-15B Technical Report	Mar 4, 2026		Unverified	2
Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator	Mar 5, 2026		Unverified	2
NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation	Mar 5, 2026		Unverified	2
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding	Mar 5, 2026		Unverified	2
From Word to World: Can Large Language Models be Implicit Text-based World Models?	Mar 5, 2026		Unverified	2
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling	Mar 4, 2026		Unverified	2
Physical Simulator In-the-Loop Video Generation	Mar 6, 2026		Unverified	2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation	Feb 9, 2026		Unverified	2
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight	Feb 23, 2026		Unverified	2
Endless Terminals: Scaling RL Environments for Terminal Agents	Feb 14, 2026		Unverified	2
Experiential Reinforcement Learning	Feb 15, 2026		Unverified	2
PyVision-RL: Forging Open Agentic Vision Models via RL	Feb 24, 2026		Unverified	2
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics	Feb 22, 2026		Unverified	2
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents	Feb 16, 2026		Unverified	2
Should We Still Pretrain Encoders with Masked Language Modeling?	Feb 24, 2026		Unverified	2
Streaming Autoregressive Video Generation via Diagonal Distillation	Mar 11, 2026		Unverified	2
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression	Feb 11, 2026		Unverified	2
LLM2Vec-Gen: Generative Embeddings from Large Language Models	Mar 11, 2026		Unverified	2