The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 2,501–2,550 of 659,983 papers

Title	Date	Tasks	Status	Hype
Retrieval-augmented generation in multilingual settings	Jul 1, 2024	Prompt Engineering, RAG	Code Available	3
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective	Jul 9, 2024	Information Retrieval, Retrieval	Code Available	3
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights	Jul 11, 2024	Motion Generation, Survey	Code Available	3
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models	Jul 12, 2024	Image Enhancement, Low-Light Image Enhancement	Code Available	3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases	Jul 15, 2024	Attribute, counterfactual	Code Available	3
Learning Dynamics of LLM Finetuning	Jul 15, 2024	Hallucination	Code Available	3
Reinforcement Learning Meets Visual Odometry	Jul 22, 2024	Decision Making, reinforcement-learning	Code Available	3
Comgra: A Tool for Analyzing and Debugging Neural Networks	Jul 31, 2024		Code Available	3
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models	Jul 31, 2024	Domain Generalization, Prompt Learning	Code Available	3
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents	Aug 12, 2024		Code Available	3
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners	Aug 29, 2024	Segmentation	Code Available	3
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters	Aug 30, 2024	Image Reconstruction, Time Series	Code Available	3
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching	Sep 5, 2024		Code Available	3
SpatialBot: Precise Spatial Understanding with Vision Language Models	Jun 19, 2024	Spatial Reasoning	Code Available	3
Colorful Diffuse Intrinsic Image Decomposition in the Wild	Sep 20, 2024	Color Constancy, Intrinsic Image Decomposition	Code Available	3
Generative Modeling of Molecular Dynamics Trajectories	Sep 26, 2024		Code Available	3
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios	Oct 2, 2024	Speech Enhancement, Speech Separation	Code Available	3
Multi-Level Speaker Representation for Target Speaker Extraction	Oct 21, 2024	Target Speaker Extraction	Code Available	3
PDL: A Declarative Prompt Programming Language	Oct 24, 2024	RAG	Code Available	3
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders	Oct 27, 2024	Language Modeling, Language Modelling	Code Available	3
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training	Nov 20, 2024	Computational Efficiency, Position	Code Available	3
OSDFace: One-Step Diffusion Model for Face Restoration	Nov 26, 2024	Face Recognition, Generative Adversarial Network	Code Available	3
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos	Nov 26, 2024	Common Sense Reasoning, Imitation Learning	Code Available	3
Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking	May 16, 2025	Benchmarking, Management	Code Available	3
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications	Dec 3, 2024	Benchmarking, Disaster Response	Code Available	3
Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization	Dec 11, 2024	Pose Estimation, Visual Localization	Code Available	3
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance	Dec 17, 2024	Image Generation, Object	Code Available	3
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up	Dec 20, 2024	8k, GPU	Code Available	3
UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility	Jan 4, 2025		Code Available	3
LLMs can see and hear without any training	Jan 30, 2025	Audio captioning, Image Generation	Code Available	3
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs	Jan 10, 2025	4k, Visual Reasoning	Code Available	3
PETR: Position Embedding Transformation for Multi-View 3D Object Detection	Mar 10, 2022	3D Object Detection, Object	Code Available	3
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models	Aug 14, 2023	knowledge editing	Code Available	3
Improved Denoising Diffusion Probabilistic Models	Feb 18, 2021	Denoising, Image Generation	Code Available	3
Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems	Jul 23, 2024	Recommendation Systems	Code Available	3
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	Feb 11, 2025	Automated Theorem Proving, Large Language Model	Code Available	3
Stonefish: Supporting Machine Learning Research in Marine Robotics	Feb 17, 2025	Optical Flow Estimation	Code Available	3
Soundwave: Less is More for Speech-Text Alignment in LLMs	Feb 18, 2025		Code Available	3
Slamming: Training a Speech Language Model on One GPU in a Day	Feb 19, 2025	GPU, Language Modeling	Code Available	3
AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay	Feb 24, 2025		Code Available	3
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs	Feb 24, 2025	Computer Security	Code Available	3
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction	Feb 24, 2025	Language Modeling, Language Modelling	Code Available	3
CrossOver: 3D Scene Cross-Modal Alignment	Feb 20, 2025	cross-modal alignment, Object	Code Available	3
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble	Feb 25, 2025	Survey	Code Available	3
BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction	Feb 26, 2025	Benchmarking, Time Series	Code Available	3
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving	Mar 7, 2025	Autonomous Driving, Denoising	Code Available	3
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering	Mar 14, 2025	Audio Question Answering, Question Answering	Code Available	3
Falcon: A Remote Sensing Vision-Language Foundation Model	Mar 14, 2025	Image Captioning, image-classification	Code Available	3
A Survey on Latent Reasoning	Jul 8, 2025	Survey	Code Available	3
Vision-Speech Models: Teaching Speech Models to Converse about Images	Mar 19, 2025	parameter-efficient fine-tuning	Code Available	3