The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 13001–13050 of 177340 papers

Title	Date	Tasks	Status	Hype	Score
Safety Alignment Should Be Made More Than Just a Few Tokens Deep	Jun 10, 2024	Safety Alignment	Code Available	2	5
Transcoders Find Interpretable LLM Feature Circuits	Jun 17, 2024		Code Available	2	5
OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework	Feb 21, 2025	Autonomous Driving	Code Available	2	5
Humanity's Last Code Exam: Can Advanced LLMs Conquer Human's Hardest Code Competition?	Jun 15, 2025	Code Generation	Code Available	2	5
Early Detection and Localization of Pancreatic Cancer by Label-Free Tumor Synthesis	Aug 6, 2023	Specificity	Code Available	2	5
STAIR: Improving Safety Alignment with Introspective Reasoning	Feb 4, 2025	Safety Alignment	Code Available	2	5
EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education	Aug 5, 2023	Chatbot, Language Modeling	Code Available	2	5
Multi-modal Queried Object Detection in the Wild	May 30, 2023	Few-Shot Object Detection, Object	Code Available	2	5
3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data	Jul 6, 2018	General Classification	Code Available	2	5
Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators	Feb 19, 2024		Code Available	2	5
Conditional Image-to-Video Generation with Latent Flow Diffusion Models	Mar 24, 2023	Image to Video Generation, Motion Generation	Code Available	2	5
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound	Jun 6, 2024	Audio Generation	Code Available	2	5
Window Function-less DFT with Reduced Noise and Latency for Real-Time Music Analysis	Oct 10, 2024		Code Available	2	5
An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks	Jan 23, 2025	GPU	Code Available	2	5
OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems	Jun 19, 2023	Benchmarking, Decoder	Code Available	2	5
Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method	Dec 22, 2022	4k, 8k	Code Available	2	5
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity	Mar 14, 2024	In-Context Learning	Code Available	2	5
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion	Jul 3, 2023	Image Generation	Code Available	2	5
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey	Apr 19, 2025	Computational Efficiency, Domain Adaptation	Code Available	2	5
Hierarchical Integration Diffusion Model for Realistic Image Deblurring	May 22, 2023	Deblurring, Image Deblurring	Code Available	2	5
Instant Gaussian Stream: Fast and Generalizable Streaming of Dynamic Scene Reconstruction via Gaussian Splatting	Mar 21, 2025		Code Available	2	5
MeshLoc: Mesh-Based Visual Localization	Jul 21, 2022	Camera Pose Estimation, Neural Rendering	Code Available	2	5
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models	Sep 19, 2024	Semantic Similarity, Semantic Textual Similarity	Code Available	2	5
Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement	Apr 14, 2023	Image Enhancement, Low-Light Image Enhancement	Code Available	2	5
Agent AI: Surveying the Horizons of Multimodal Interaction	Jan 7, 2024	multimodal interaction	Code Available	2	5
β-DPO: Direct Preference Optimization with Dynamic β	Jul 11, 2024	Informativeness	Code Available	2	5
RedCode: Risky Code Execution and Generation Benchmark for Code Agents	Nov 12, 2024		Code Available	2	5
Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench	Oct 29, 2024	Language Modeling, Language Modelling	Code Available	2	5
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model	Jun 3, 2024	geo-localization, Language Modeling	Code Available	2	5
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech	Feb 8, 2023	Code Generation, Diversity	Code Available	2	5
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning	May 28, 2024		Code Available	2	5
FreeInit: Bridging Initialization Gap in Video Diffusion Models	Dec 12, 2023	Denoising, Text-to-Video Generation	Code Available	2	5
GUICourse: From General Vision Language Models to Versatile GUI Agents	Jun 17, 2024	Natural Language Visual Grounding, Optical Character Recognition (OCR)	Code Available	2	5
The CLRS Algorithmic Reasoning Benchmark	May 31, 2022	Learning to Execute	Code Available	2	5
Language Models are Realistic Tabular Data Generators	Oct 12, 2022	Tabular Data Generation	Code Available	2	5
Video Quality Assessment: A Comprehensive Survey	Dec 4, 2024	Benchmarking, Survey	Code Available	2	5
BEBLID: Boosted efficient binary local image descriptor	Feb 7, 2024	Computational Efficiency, Retrieval	Code Available	2	5
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention	May 17, 2023	Denoising, Diffusion Personalization	Code Available	2	5
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios	Oct 18, 2024	Anomaly Classification, Anomaly Detection	Code Available	2	5
End-to-End Ontology Learning with Large Language Models	Oct 31, 2024		Code Available	2	5
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection	Mar 31, 2025	Fraud Detection, Large Language Model	Code Available	2	5
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time	Oct 26, 2023	In-Context Learning	Code Available	2	5
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding	Apr 14, 2025		Code Available	2	5
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions	Feb 24, 2025	Data Augmentation, Image Generation	Code Available	2	5
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents	Apr 22, 2025	Knowledge Graphs, Minecraft	Code Available	2	5
Evaluating Large Language Models: A Comprehensive Survey	Oct 30, 2023	Survey	Code Available	2	5
Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions	Dec 20, 2022	Automated Theorem Proving, Code Generation	Code Available	2	5
MolTC: Towards Molecular Relational Modeling In Language Models	Feb 6, 2024	Relational Reasoning	Code Available	2	5
FastReID: A Pytorch Toolbox for General Instance Re-identification	Jun 4, 2020	Face Recognition, GPU	Code Available	2	5
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars	Aug 20, 2024	3DGS, Neural Rendering	Code Available	2	5