The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers


Showing 3,301–3,350 of 659,983 papers

Title	Date	Tasks	Status	Hype
ACE2: Accurately learning subseasonal to decadal atmospheric variability and forced responses	Nov 18, 2024		Code Available	3
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech	Sep 24, 2024	Audio Generation	Code Available	3
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents	Mar 5, 2024	Hallucination, Self-Learning	Code Available	3
Scaling Analysis of Interleaved Speech-Text Language Models	Apr 3, 2025	Transfer Learning	Code Available	3
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	Aug 1, 2024	Math, MM-Vet	Code Available	3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III	Apr 8, 2025	Computational Efficiency, CPU	Code Available	3
Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models	Jun 5, 2024	Data Integration, graph construction	Code Available	3
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark	Jun 11, 2024	Cross-corpus, Emotion Recognition	Code Available	3
A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks	Jan 17, 2025	Survey	Code Available	3
PyThaiNLP: Thai Natural Language Processing in Python	Dec 7, 2023		Code Available	3
FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes	May 7, 2024	3D Point Cloud Classification, 3D Semantic Segmentation	Code Available	3
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge	Nov 9, 2023		Code Available	3
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation	Feb 16, 2023	Image Generation, Text to Image Generation	Code Available	3
Rule Based Rewards for Language Model Safety	Nov 2, 2024	Language Modeling, Language Modelling	Code Available	3
Hyper-parameter tuning for text guided image editing	Jul 31, 2024	text-guided-image-editing	Code Available	3
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph	Oct 3, 2024	Code Generation	Code Available	3
Efficient Large Language Models: A Survey	Dec 6, 2023	Natural Language Understanding, Survey	Code Available	3
Navigating Eukaryotic Genome Annotation Pipelines: A Route Map to BRAKER, Galba, and TSEBRA	Mar 28, 2024		Code Available	3
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World	Dec 23, 2024	AI Agent	Code Available	3
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series	Mar 22, 2024	Inductive Bias, Mamba	Code Available	3
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models	Apr 3, 2024	GPU, Math	Code Available	3
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL	May 22, 2025	Natural Language Understanding, Reinforcement Learning (RL)	Code Available	3
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input	Sep 19, 2024		Code Available	3
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree	Jun 12, 2025	Continual Learning	Code Available	3
VidTwin: Video VAE with Decoupled Structure and Dynamics	Dec 23, 2024	Decoder, Video Generation	Code Available	3
Probabilistic Weather Forecasting with Hierarchical Graph Neural Networks	Jun 7, 2024	graph construction, Weather Forecasting	Code Available	3
Dataset and Baseline System for Multi-lingual Extraction and Normalization of Temporal and Numerical Expressions	Mar 31, 2023	Date Understanding, Information Retrieval	Code Available	3
How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks?	Jan 20, 2025	Computed Tomography (CT), GPU	Code Available	3
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale	Aug 10, 2024	GPU, Language Modelling	Code Available	3
OctoPack: Instruction Tuning Code Large Language Models	Aug 14, 2023	Code Generation, Code Repair	Code Available	3
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies	Oct 15, 2024		Code Available	3
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape	Jan 20, 2022		Code Available	3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts	May 2, 2024	Combinatorial Optimization, Mixture-of-Experts	Code Available	3
On the use of deep learning for phase recovery	Aug 2, 2023	Deep Learning	Code Available	3
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models	Mar 19, 2024	Hallucination	Code Available	3
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer	May 24, 2024	Novel View Synthesis	Code Available	3
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model	Jan 28, 2022	Few-Shot Learning, Language Modeling	Code Available	3
MAPIE: an open-source library for distribution-free uncertainty quantification	Jul 25, 2022	Conformal Prediction, Multi-class Classification	Code Available	3
PhysX: Physical-Grounded 3D Asset Generation	Jul 16, 2025	3D Generation, Image to 3D	Code Available	3
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation	Apr 5, 2024	Decoder, Mamba	Code Available	3
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale	Sep 9, 2024	Code Generation, Fault localization	Code Available	3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages	May 7, 2023	Attribute, Instruction Following	Code Available	3
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes	Nov 18, 2024	Autonomous Driving, Surface Reconstruction	Code Available	3
Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning	Jan 26, 2023	Benchmarking, Deep Reinforcement Learning	Code Available	3
LLM4CP: Adapting Large Language Models for Channel Prediction	Jun 20, 2024	Prediction, Time Series Analysis	Code Available	3
Universal Actions for Enhanced Embodied Foundation Models	Jan 17, 2025		Code Available	3
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding	Nov 27, 2024		Code Available	3
DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting	Nov 26, 2024	Camera Calibration, Depth Estimation	Code Available	3
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection	Feb 27, 2025	Action Detection, Benchmarking	Code Available	3
Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting	Mar 14, 2024	3DGS, 3D Reconstruction	Code Available	3