The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 6901–6950 of 177340 papers

Title	Date	Tasks	Status	Hype	Score
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback	Feb 6, 2024	Video-based Generative Performance Benchmarking	Code Available	2	5
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain	Jan 30, 2024	Image Comprehension, Instruction Following	Code Available	2	5
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs	Oct 10, 2024	Active Learning, Language Modeling	Code Available	2	5
Evaluating Quantized Large Language Models	Feb 28, 2024	Mamba, Quantization	Code Available	2	5
Edu-ConvoKit: An Open-Source Library for Education Conversation Data	Feb 7, 2024		Code Available	2	5
Calibrated Self-Rewarding Vision Language Models	May 23, 2024	Hallucination, Language Modelling	Code Available	2	5
PERT: Pre-training BERT with Permuted Language Model	Mar 14, 2022	Language Modeling, Language Modelling	Code Available	2	5
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement	Mar 1, 2025	Language Modeling, Language Modelling	Code Available	2	5
Training Diffusion Models with Reinforcement Learning	May 22, 2023	Decision Making, Denoising	Code Available	2	5
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction	Oct 5, 2023	Event Argument Extraction, Event Extraction	Code Available	2	5
All in One: Exploring Unified Video-Language Pre-training	Mar 14, 2022	All, Language Modelling	Code Available	2	5
A Survey on Multimodal Large Language Models for Autonomous Driving	Nov 21, 2023	Autonomous Driving	Code Available	2	5
Towards A Unified Conformer Structure: from ASR to ASV Task	Nov 14, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2	5
DocPrompting: Generating Code by Retrieving the Docs	Jul 13, 2022	Code Generation	Code Available	2	5
AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation	Mar 4, 2024	Semantic Segmentation, Semi-Supervised Semantic Segmentation	Code Available	2	5
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents	May 30, 2025	Benchmarking, Blocking	Code Available	2	5
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives	Nov 9, 2022	Disentanglement, Video Generation	Code Available	2	5
Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models	Dec 26, 2022	Image Reconstruction, Representation Learning	Code Available	2	5
TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs	Mar 28, 2022	CPU, GPU	Code Available	2	5
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs	Oct 18, 2022	Deep Learning, Scheduling	Code Available	2	5
Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study	Mar 13, 2024	Code Generation	Code Available	2	5
REEF: Representation Encoding Fingerprints for Large Language Models	Oct 18, 2024		Code Available	2	5
Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation	Mar 20, 2024	Semantic Segmentation, Weakly supervised Semantic Segmentation	Code Available	2	5
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model	Aug 31, 2022	Denoising, Motion Generation	Code Available	2	5
Large language models surpass human experts in predicting neuroscience results	Mar 4, 2024		Code Available	2	5
Owl-1: Omni World Model for Consistent Long Video Generation	Dec 12, 2024	Video Generation	Code Available	2	5
Diving Deeper Into Pedestrian Behavior Understanding: Intention Estimation, Action Prediction, and Event Risk Assessment	Jun 29, 2024	Prediction	Code Available	2	5
K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization	Jun 8, 2023	Language Modeling, Language Modelling	Code Available	2	5
GenSim: A General Social Simulation Platform with Large Language Model based Agents	Oct 6, 2024	Language Modeling, Language Modelling	Code Available	2	5
Metric Flow Matching for Smooth Interpolations on the Data Manifold	May 23, 2024	Trajectory Prediction	Code Available	2	5
Harmonizer: Learning to Perform White-Box Image and Video Harmonization	Jul 4, 2022	Image Harmonization, Video Harmonization	Code Available	2	5
Android in the Zoo: Chain-of-Action-Thought for GUI Agents	Mar 5, 2024	Language Modeling, Language Modelling	Code Available	2	5
Knowledge Circuits in Pretrained Transformers	May 28, 2024	In-Context Learning, knowledge editing	Code Available	2	5
PyMIC: A deep learning toolkit for annotation-efficient medical image segmentation	Aug 19, 2022	Deep Learning, Image Segmentation	Code Available	2	5
PHemoNet: A Multimodal Network for Physiological Signals	Sep 13, 2024	Brain Computer Interface, EEG	Code Available	2	5
From Sparse to Soft Mixtures of Experts	Aug 2, 2023		Code Available	2	5
ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text	Jan 2, 2024	Colorization, Sketch Colorization	Code Available	2	5
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset	Jun 10, 2024	Instance Segmentation, Salient Object Detection	Code Available	2	5
DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution	Mar 3, 2025	Autonomous Driving, Image Super-Resolution	Code Available	2	5
nuScenes: A multimodal dataset for autonomous driving	Mar 26, 2019	3D Object Detection, Autonomous Driving	Code Available	2	5
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection	Jun 10, 2024	Backdoor Attack, Code Completion	Code Available	2	5
Shape, Light, and Material Decomposition from Images using Monte Carlo Rendering and Denoising	Jun 7, 2022	3D Reconstruction, Denoising	Code Available	2	5
Video Prediction Transformers without Recurrence or Convolution	Oct 7, 2024	Decoder, Prediction	Code Available	2	5
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning	Apr 13, 2025	Question Answering, reinforcement-learning	Code Available	2	5
DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering	Oct 11, 2021	Speech Enhancement	Code Available	2	5
PoseScript: Linking 3D Human Poses and Natural Language	Oct 21, 2022	Cross-Modal Retrieval, Image Captioning	Code Available	2	5
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations	Aug 2, 2021	Denoising, Image Generation	Code Available	2	5
Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift	Jul 10, 2024	Change Detection, Disaster Response	Code Available	2	5
LLaMEA: A Large Language Model Evolutionary Algorithm for Automatically Generating Metaheuristics	May 30, 2024	Language Modeling, Language Modelling	Code Available	2	5
Unsupervised Universal Image Segmentation	Dec 28, 2023	Image Segmentation, Instance Segmentation	Code Available	2	5