The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 2601–2650 of 659983 papers

Title	Date	Tasks	Status	Hype
Are EEG-to-Text Models Working?	May 10, 2024	Benchmarking, EEG	Code Available	3
Verdict: A Library for Scaling Judge-Time Compute	Feb 25, 2025	Fact Checking, Hallucination	Code Available	3
Compact 3D Scene Representation via Self-Organizing Gaussian Grids	Dec 19, 2023	3DGS	Code Available	3
StyleGAN-Human: A Data-Centric Odyssey of Human Generation	Apr 25, 2022	Image Generation	Code Available	3
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning	May 29, 2025	In-Context Learning, State Space Models	Code Available	3
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python	Apr 9, 2024	Decision Making, Language Modeling	Code Available	3
CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning	Nov 26, 2024		Code Available	3
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models	Jun 16, 2024	Hallucination, Hallucination Evaluation	Code Available	3
MaxViT: Multi-Axis Vision Transformer	Apr 4, 2022	image-classification, Image Classification	Code Available	3
A Survey of Large Language Models for Graphs	May 10, 2024	Graph Learning, Link Prediction	Code Available	3
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic	Mar 26, 2024	Motion Planning	Code Available	3
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios	Mar 28, 2024	Language Modeling, Language Modelling	Code Available	3
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models	Nov 14, 2023	Acoustic Scene Classification, Audio captioning	Code Available	3
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey	Apr 25, 2024	4k, Image Super-Resolution	Code Available	3
Panza: Design and Analysis of a Fully-Local Personalized Text Writing Assistant	Jun 24, 2024	RAG, Retrieval-augmented Generation	Code Available	3
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting	Apr 23, 2024		Code Available	3
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards	Jul 4, 2024	Code Completion	Code Available	3
RFUAV: A Benchmark Dataset for Unmanned Aerial Vehicle Detection and Identification	Mar 12, 2025	Audio Signal Recognition, Classification	Code Available	3
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts	Apr 17, 2025	Denoising	Code Available	3
Detect Anything 3D in the Wild	Apr 10, 2025	3D Object Detection, Autonomous Driving	Code Available	3
Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast	Nov 3, 2022		Code Available	3
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation	May 30, 2024	Diversity, Drug Design	Code Available	3
Unlimiformer: Long-Range Transformers with Unlimited Length Input	May 2, 2023	Book summarization, CPU	Code Available	3
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation	Dec 23, 2023	Emotion Recognition, Self-Supervised Learning	Code Available	3
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization	May 9, 2025	Benchmarking	Code Available	3
Dataset Distillation with Neural Characteristic Function: A Minmax Perspective	Jan 1, 2025	Computational Efficiency, Dataset Distillation	Code Available	3
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation	May 2, 2023	Code Generation, HumanEval	Code Available	3
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining	Mar 20, 2024	Aerial Scene Classification, Building change detection for remote sensing images	Code Available	3
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning	Apr 15, 2025	Mathematical Reasoning, Reinforcement Learning (RL)	Code Available	3
Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods	Jan 1, 2024	Image Manipulation, Image Manipulation Localization	Code Available	3
AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix	Dec 4, 2023	Recommendation Systems	Code Available	3
Language-based Audio Moment Retrieval	Sep 24, 2024	audio moment retrieval, Moment Retrieval	Code Available	3
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]	Aug 24, 2023	Management, Prediction	Code Available	3
A Chinese Dataset for Evaluating the Safeguards in Large Language Models	Feb 19, 2024		Code Available	3
Multi-Modality Representation Learning for Antibody-Antigen Interactions Prediction	Mar 22, 2025	Graph Attention, Prediction	Code Available	3
Improving Alignment and Robustness with Circuit Breakers	Jun 6, 2024	Adversarial Robustness	Code Available	3
The OpenLAM Challenges	Jan 20, 2025	valid	Code Available	3
TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments	Jun 5, 2023	3D Human Pose Estimation, regression	Code Available	3
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning	Jan 1, 2024	3D dense captioning, Dense Captioning	Code Available	3
InstructIE: A Bilingual Instruction-based Information Extraction Dataset	May 19, 2023		Code Available	3
Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model	Jun 2, 2024	Denoising, Mixture-of-Experts	Code Available	3
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally	Sep 12, 2024		Code Available	3
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook	May 4, 2024	Autonomous Driving, Prediction	Code Available	3
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models	Jun 8, 2023	Question Answering, VCGBench-Diverse	Code Available	3
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°	Jan 1, 2023	Image Generation, Image Segmentation	Code Available	3
Drone Data Analytics for Measuring Traffic Metrics at Intersections in High-Density Areas	Nov 4, 2024		Code Available	3
A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications	Jun 2, 2022	Action Recognition, Sports Analytics	Code Available	3
UrbanGPT: Spatio-Temporal Large Language Models	Feb 25, 2024	10-shot image generation	Code Available	3
Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models	May 5, 2025	Policy Gradient Methods, RAG	Code Available	3
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation	Apr 26, 2022	2D Human Pose Estimation, Keypoint Detection	Code Available	3