The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6,001–6,025 of 474,278 papers

Title	Date	Tasks	Status	Hype
Automatic database description generation for Text-to-SQL	Feb 28, 2025	Text to SQL, Text-To-SQL	Code Available	2
AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies	Feb 28, 2025		Code Available	2
SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models	Feb 28, 2025	Image Segmentation, Medical Image Segmentation	Code Available	2
UniNet: A Contrastive Learning-guided Unified Framework with Feature Selection for Anomaly Detection	Feb 28, 2025	Anomaly Detection, Image Classification	Code Available	2
Digital Player: Evaluating Large Language Models based Human-like Agent in Games	Feb 28, 2025	Decision Making	Code Available	2
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology	Feb 28, 2025	Multiple-choice, scientific discovery	Code Available	2
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video	Feb 27, 2025	3DGS, Talking Head Generation	Code Available	2
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation	Feb 27, 2025	Contrastive Learning, Diagnostic	Code Available	2
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR	Feb 27, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Image Referenced Sketch Colorization Based on Animation Creation Workflow	Feb 27, 2025	Colorization, Sketch Colorization	Code Available	2
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation	Feb 27, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model	Feb 27, 2025	Portrait Animation	Code Available	2
Mobius: Text to Seamless Looping Video Generation via Latent Shift	Feb 27, 2025	Denoising, Video Generation	Code Available	2
One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion	Feb 27, 2025	All	Code Available	2
Sanity Checking Causal Representation Learning on a Simple Real-World System	Feb 27, 2025	Representation Learning	Code Available	2
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction	Feb 27, 2025	Image Generation, Prediction	Code Available	2
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think	Feb 27, 2025	Image Generation, Text to Image Generation	Code Available	2
One-for-More: Continual Diffusion Model for Anomaly Detection	Feb 27, 2025	Anomaly Detection, continual anomaly detection	Code Available	2
AgentSociety Challenge: Designing LLM Agents for User Modeling and Recommendation on Web Platforms	Feb 26, 2025	Language Modeling, Language Modelling	Code Available	2
OntologyRAG: Better and Faster Biomedical Code Mapping with Retrieval-Augmented Generation (RAG) Leveraging Ontology Knowledge Graphs and Large Language Models	Feb 26, 2025	In-Context Learning, Knowledge Graphs	Code Available	2
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens	Feb 26, 2025		Code Available	2
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation	Feb 26, 2025	Code Generation, HumanEval	Code Available	2
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems	Feb 26, 2025	Instruction Following	Code Available	2
BIG-Bench Extra Hard	Feb 26, 2025		Code Available	2
FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting	Feb 26, 2025	Model Selection, Time Series	Code Available	2