The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5201–5250 of 661,570 papers

Title	Date	Tasks	Status	Hype
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering	May 30, 2025	Denoising	Code Available	2
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation	May 30, 2025	Hallucination	Code Available	2
Optimal Density Functions for Weighted Convolution in Learning Models	May 30, 2025	Denoising, Image Denoising	Code Available	2
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents	May 30, 2025	Benchmarking, Blocking	Code Available	2
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations	May 30, 2025		Code Available	2
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL	May 30, 2025	Image Generation, Language Modeling	Code Available	2
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization	May 30, 2025	Story Visualization	Code Available	2
Tackling View-Dependent Semantics in 3D Language Gaussian Splatting	May 30, 2025	3D Scene Reconstruction, Scene Understanding	Code Available	2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models	May 30, 2025	Classification, Disaster Response	Code Available	2
Logits-Based Finetuning	May 30, 2025	Out of Distribution (OOD) Detection	Code Available	2
TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores	May 30, 2025	3DGS	Code Available	2
Optimal Weighted Convolution for Classification and Denosing	May 30, 2025	Classification, Denoising	Code Available	2
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways	May 30, 2025	Continual Learning, Image Augmentation	Code Available	2
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory	May 29, 2025	Contrastive Learning, Text Retrieval	Code Available	2
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation	May 29, 2025	Portrait Animation, Video Alignment	Code Available	2
SWE-bench Goes Live!	May 29, 2025		Code Available	2
ZeroGUI: Automating Online GUI Learning at Zero Human Cost	May 29, 2025		Code Available	2
Diffusion Guidance Is a Controllable Policy Improvement Operator	May 29, 2025	Offline RL	Code Available	2
D-AR: Diffusion via Autoregressive Models	May 29, 2025	Denoising	Code Available	2
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models	May 29, 2025	Self-Supervised Learning, Video Generation	Code Available	2
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks	May 29, 2025	Spatial Reasoning	Code Available	2
HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions	May 29, 2025	Image Animation, Video Generation	Code Available	2
MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming	May 29, 2025	Diversity, Efficient Exploration	Code Available	2
Vision Language Models are Biased	May 29, 2025	Board Games, counterfactual	Code Available	2
UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes	May 29, 2025	Texture Synthesis	Code Available	2
VERINA: Benchmarking Verifiable Code Generation	May 29, 2025	Benchmarking, Code Generation	Code Available	2
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning	May 29, 2025		Code Available	2
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model	May 29, 2025	Decoder, Image Generation	Code Available	2
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation	May 29, 2025		Code Available	2
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents	May 29, 2025		Code Available	2
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering	May 29, 2025	Large Language Model, Prompt Engineering	Code Available	2
TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models	May 29, 2025	Referring Expression, Referring Expression Comprehension	Code Available	2
Securing AI Agents with Information-Flow Control	May 29, 2025		Code Available	2
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning	May 29, 2025	Anomaly Detection, Descriptive	Code Available	2
Model-Preserving Adaptive Rounding	May 29, 2025	model, Quantization	Code Available	2
ZIPA: A family of efficient models for multilingual phone recognition	May 29, 2025	Diversity	Code Available	2
DRO: A Python Library for Distributionally Robust Optimization in Machine Learning	May 29, 2025		Code Available	2
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS	May 29, 2025	3DGS, GPU	Code Available	2
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO	May 28, 2025	Math, Reinforcement Learning (RL)	Code Available	2
Zero-Shot Vision Encoder Grafting via LLM Surrogates	May 28, 2025	Decoder, Language Modeling	Code Available	2
GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control	May 28, 2025	3D geometry, Autonomous Driving	Code Available	2
cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning	May 28, 2025	CAD Reconstruction, Large Language Model	Code Available	2
DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials	May 28, 2025	Drug Discovery, graph partitioning	Code Available	2
DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction	May 27, 2025	Image Generation	Code Available	2
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment	May 27, 2025	Adversarial Attack, Clustering	Code Available	2
Improved Representation Steering for Language Models	May 27, 2025	Language Modeling, Language Modelling	Code Available	2
SPA-RL: Reinforcing LLM Agents via Stepwise Progress Attribution	May 27, 2025	Reinforcement Learning (RL)	Code Available	2
Reinforcing General Reasoning without Verifiers	May 27, 2025	Math, Mathematical Reasoning	Code Available	2
TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state	May 27, 2025	Mamba, Time Series	Code Available	2
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents	May 27, 2025	16k	Code Available	2