The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 6001–6050 of 661,570 papers

Title	Date	Tasks	Status	Hype
Digital Player: Evaluating Large Language Models based Human-like Agent in Games	Feb 28, 2025	Decision Making	Code Available	2
AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies	Feb 28, 2025		Code Available	2
MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing	Feb 28, 2025	Image Generation, Transfer Learning	Code Available	2
Neural Posterior Estimation for Cataloging Astronomical Images with Spatially Varying Backgrounds and Point Spread Functions	Feb 28, 2025	Variational Inference	Code Available	2
UniNet: A Contrastive Learning-guided Unified Framework with Feature Selection for Anomaly Detection	Feb 28, 2025	Anomaly Detection, Image Classification	Code Available	2
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference	Feb 28, 2025		Code Available	2
Mobius: Text to Seamless Looping Video Generation via Latent Shift	Feb 27, 2025	Denoising, Video Generation	Code Available	2
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation	Feb 27, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction	Feb 27, 2025	Image Generation, Prediction	Code Available	2
Image Referenced Sketch Colorization Based on Animation Creation Workflow	Feb 27, 2025	Colorization, Sketch Colorization	Code Available	2
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model	Feb 27, 2025	Portrait Animation	Code Available	2
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation	Feb 27, 2025	Contrastive Learning, Diagnostic	Code Available	2
One-for-More: Continual Diffusion Model for Anomaly Detection	Feb 27, 2025	Anomaly Detection, continual anomaly detection	Code Available	2
Sanity Checking Causal Representation Learning on a Simple Real-World System	Feb 27, 2025	Representation Learning	Code Available	2
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video	Feb 27, 2025	3DGS, Talking Head Generation	Code Available	2
One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion	Feb 27, 2025	All	Code Available	2
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR	Feb 27, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think	Feb 27, 2025	Image Generation, Text to Image Generation	Code Available	2
ArtGS: Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting	Feb 26, 2025	parameter estimation	Code Available	2
FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting	Feb 26, 2025	Model Selection, Time Series	Code Available	2
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation	Feb 26, 2025	Code Generation, HumanEval	Code Available	2
BIG-Bench Extra Hard	Feb 26, 2025		Code Available	2
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens	Feb 26, 2025		Code Available	2
Medical Hallucinations in Foundation Models and Their Impact on Healthcare	Feb 26, 2025	Benchmarking, Hallucination	Code Available	2
AgentSociety Challenge: Designing LLM Agents for User Modeling and Recommendation on Web Platforms	Feb 26, 2025	Language Modeling, Language Modelling	Code Available	2
OntologyRAG: Better and Faster Biomedical Code Mapping with Retrieval-Augmented Generation (RAG) Leveraging Ontology Knowledge Graphs and Large Language Models	Feb 26, 2025	In-Context Learning, Knowledge Graphs	Code Available	2
NeoBERT: A Next-Generation BERT	Feb 26, 2025	In-Context Learning, MTEB Benchmark	Code Available	2
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems	Feb 26, 2025	Instruction Following	Code Available	2
Rank1: Test-Time Compute for Reranking in Information Retrieval	Feb 25, 2025	Information Retrieval, Instruction Following	Code Available	2
SPECTRE: An FFT-Based Efficient Drop-In Replacement to Self-Attention for Long Contexts	Feb 25, 2025	Language Modeling, Language Modelling	Code Available	2
LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers	Feb 25, 2025	Multi-hop Question Answering, Question Answering	Code Available	2
Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support	Feb 25, 2025	Decision Making, Diagnostic	Code Available	2
RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts	Feb 25, 2025	RAG, Reranking	Code Available	2
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference	Feb 25, 2025	Visual Question Answering (VQA)	Code Available	2
WebGames: Challenging General-Purpose Web-Browsing AI Agents	Feb 25, 2025		Code Available	2
MegaLoc: One Retrieval to Place Them All	Feb 24, 2025	3D Reconstruction, All	Code Available	2
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions	Feb 24, 2025	Data Augmentation, Image Generation	Code Available	2
Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts	Feb 24, 2025	Benchmarking, Fact Verification	Code Available	2
Delta Decompression for MoE-based LLMs Compression	Feb 24, 2025	Diversity, Mixture-of-Experts	Code Available	2
PointSea: Point Cloud Completion via Self-structure Augmentation	Feb 24, 2025	Point Cloud Completion	Code Available	2
Introducing Visual Perception Token into Multimodal Large Language Model	Feb 24, 2025	Language Modeling, Language Modelling	Code Available	2
The GigaMIDI Dataset with Features for Expressive Music Performance Detection	Feb 24, 2025	Information Retrieval, Music Information Retrieval	Code Available	2
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models	Feb 24, 2025	GSM8K, Math	Code Available	2
Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems	Feb 24, 2025	Computational Efficiency, PDE Surrogate Modeling	Code Available	2
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment	Feb 24, 2025	image-classification, Image Classification	Code Available	2
LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification	Feb 24, 2025	Code Completion	Code Available	2
Audio-FLAN: A Preliminary Release	Feb 23, 2025	Zero-Shot Learning	Code Available	2
FreeTumor: Large-Scale Generative Tumor Synthesis in Computed Tomography Images for Improving Tumor Recognition	Feb 23, 2025	Computed Tomography (CT)	Code Available	2
A Survey on Industrial Anomalies Synthesis	Feb 23, 2025	Survey	Code Available	2
SalM2: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention	Feb 22, 2025	Mamba	Code Available	2