The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 5,201–5,225 of 661,570 papers

Title	Date	Tasks	Status	Hype
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations	May 30, 2025		Code Available	2
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering	May 30, 2025	Denoising	Code Available	2
Optimal Weighted Convolution for Classification and Denosing	May 30, 2025	Classification, Denoising	Code Available	2
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation	May 30, 2025	Hallucination	Code Available	2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models	May 30, 2025	Classification, Disaster Response	Code Available	2
Optimal Density Functions for Weighted Convolution in Learning Models	May 30, 2025	Denoising, Image Denoising	Code Available	2
Logits-Based Finetuning	May 30, 2025	Out of Distribution (OOD) Detection	Code Available	2
Tackling View-Dependent Semantics in 3D Language Gaussian Splatting	May 30, 2025	3D Scene Reconstruction, Scene Understanding	Code Available	2
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents	May 30, 2025	Benchmarking, Blocking	Code Available	2
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways	May 30, 2025	Continual Learning, Image Augmentation	Code Available	2
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL	May 30, 2025	Image Generation, Language Modeling	Code Available	2
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization	May 30, 2025	Story Visualization	Code Available	2
TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores	May 30, 2025	3DGS	Code Available	2
TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models	May 29, 2025	Referring Expression, Referring Expression Comprehension	Code Available	2
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents	May 29, 2025		Code Available	2
UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes	May 29, 2025	Texture Synthesis	Code Available	2
Vision Language Models are Biased	May 29, 2025	Board Games, counterfactual	Code Available	2
Diffusion Guidance Is a Controllable Policy Improvement Operator	May 29, 2025	Offline RL	Code Available	2
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering	May 29, 2025	Large Language Model, Prompt Engineering	Code Available	2
SWE-bench Goes Live!	May 29, 2025		Code Available	2
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models	May 29, 2025	Self-Supervised Learning, Video Generation	Code Available	2
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation	May 29, 2025		Code Available	2
D-AR: Diffusion via Autoregressive Models	May 29, 2025	Denoising	Code Available	2
MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming	May 29, 2025	Diversity, Efficient Exploration	Code Available	2
Securing AI Agents with Information-Flow Control	May 29, 2025		Code Available	2