The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 3,251–3,275 of 661,570 papers

Title	Date	Tasks	Status	Hype
SAM-Med2D	Aug 30, 2023	Decoder, Image Segmentation	Code Available	3
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents	Feb 8, 2024	Autonomous Driving, Language Modeling	Code Available	3
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation	Apr 8, 2024	Image Generation, Image-to-Image Translation	Code Available	3
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations	Mar 11, 2024	Disentanglement	Code Available	3
GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation	Jun 10, 2024	3D Generation, NeRF	Code Available	3
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details	Jun 19, 2025	Texture Synthesis	Code Available	3
ResearchTown: Simulator of Human Research Community	Dec 23, 2024		Code Available	3
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision	Dec 15, 2024	Active Learning	Code Available	3
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs	Jan 12, 2024		Code Available	3
LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion	Nov 4, 2023	Benchmarking, Imitation Learning	Code Available	3
TorchDrug: A Powerful and Flexible Machine Learning Platform for Drug Discovery	Feb 16, 2022	BIG-bench Machine Learning, Drug Discovery	Code Available	3
MathArena: Evaluating LLMs on Uncontaminated Math Competitions	May 29, 2025	Math, Mathematical Reasoning	Code Available	3
Frequency-aware Feature Fusion for Dense Image Prediction	Aug 23, 2024	Prediction	Code Available	3
VoiceBench: Benchmarking LLM-Based Voice Assistants	Oct 22, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	3
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation	Mar 18, 2024	3D Generation, 3D Reconstruction	Code Available	3
MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents	Jan 24, 2025	Benchmarking	Code Available	3
GS-SDF: LiDAR-Augmented Gaussian Splatting and Neural SDF for Geometrically Consistent Rendering and Reconstruction	Mar 13, 2025	Autonomous Driving, Surface Reconstruction	Code Available	3
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals	Sep 29, 2022	Text Generation	Code Available	3
Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription	Apr 15, 2024	Music Transcription	Code Available	3
PointCNN: Convolution On X-Transformed Points	Jan 23, 2018	3D Instance Segmentation, 3D Part Segmentation	Code Available	3
OverleafCopilot: Empowering Academic Writing in Overleaf with Large Language Models	Mar 13, 2024		Code Available	3
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey	Nov 14, 2024		Code Available	3
Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption	Jan 18, 2025	Infrared And Visible Image Fusion	Code Available	3
Game-theoretic LLM: Agent Workflow for Negotiation Games	Nov 8, 2024	Decision Making	Code Available	3
Tracking Anything with Decoupled Video Segmentation	Sep 7, 2023	Open-Vocabulary Video Segmentation, Open-World Video Segmentation	Code Available	3