The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers


Showing 3151–3200 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration	Jun 15, 2023	Language Modeling, Language Modelling	Code Available	3	5
AgentTuning: Enabling Generalized Agent Abilities for LLMs	Oct 19, 2023	Memorization	Code Available	3	5
Hawk: Learning to Understand Open-World Video Anomalies	May 27, 2024	Anomaly Detection, Question Answering	Code Available	3	5
PhoWhisper: Automatic Speech Recognition for Vietnamese	Mar 27, 2024	Automatic Speech Recognition, speech-recognition	Code Available	3	5
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	Jun 26, 2024	Arithmetic Reasoning, GSM8K	Code Available	3	5
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model	Apr 15, 2024	Decoder, Image Segmentation	Code Available	3	5
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning	Feb 26, 2024	GPU, Minecraft	Code Available	3	5
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection	Jun 8, 2020	Dense Object Detection, General Classification	Code Available	3	5
Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs	Mar 8, 2025		Code Available	3	5
DRCT: Saving Image Super-resolution away from Information Bottleneck	Mar 31, 2024	Image Super-Resolution, Super-Resolution	Code Available	3	5
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains	Feb 4, 2024		Code Available	3	5
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia	Jan 23, 2025	Emotion Recognition, Event Detection	Code Available	3	5
Emu3: Next-Token Prediction is All You Need	Sep 27, 2024	All	Code Available	3	5
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving	Apr 3, 2025	Reinforcement Learning (RL)	Code Available	3	5
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion	Nov 18, 2023	Video Generation	Code Available	3	5
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries	Jan 27, 2024	Benchmarking, RAG	Code Available	3	5
NerfAcc: A General NeRF Acceleration Toolbox	Oct 10, 2022	NeRF	Code Available	3	5
Llemma: An Open Language Model For Mathematics	Oct 16, 2023	Arithmetic Reasoning, Automated Theorem Proving	Code Available	3	5
Datasets: A Community Library for Natural Language Processing	Sep 7, 2021	Image Classification, Object Recognition	Code Available	3	5
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction	Feb 15, 2023	3D Semantic Scene Completion, Autonomous Driving	Code Available	3	5
ResNeSt: Split-Attention Networks	Apr 19, 2020	image-classification, Image Classification	Code Available	3	5
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer	Jan 19, 2023	Image Generation, Image Segmentation	Code Available	3	5
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus	Feb 22, 2024	Zero-shot Generalization	Code Available	3	5
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs	Mar 26, 2025	Benchmarking	Code Available	3	5
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory	Apr 10, 2025	Math, MMLU	Code Available	3	5
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs	Jan 11, 2024	Representation Learning, Self-Supervised Learning	Code Available	3	5
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling	Jan 9, 2023	2D Object Detection, Contrastive Learning	Code Available	3	5
Inferring Articulated Rigid Body Dynamics from RGBD Video	Mar 20, 2022	Contact mechanics, Inverse Rendering	Code Available	3	5
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension	Apr 25, 2024	Benchmarking, Multiple-choice	Code Available	3	5
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters	Mar 18, 2024	Continual Learning, Incremental Learning	Code Available	3	5
Neural Network Verification with Branch-and-Bound for General Nonlinearities	May 31, 2024		Code Available	3	5
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation	Apr 4, 2023	Cross-Modal Retrieval, Image-text Retrieval	Code Available	3	5
DrivAerNet: A Parametric Car Dataset for Data-Driven Aerodynamic Design and Prediction	Mar 12, 2024		Code Available	3	5
Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection	Mar 4, 2025	Anomaly Detection, Multi-class Anomaly Detection	Code Available	3	5
Diffusion Model-Based Video Editing: A Survey	Jun 26, 2024	model, Survey	Code Available	3	5
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer	Mar 7, 2022		Code Available	3	5
BoT-SORT: Robust Associations Multi-Pedestrian Tracking	Jun 29, 2022	Multi-Object Tracking, Object	Code Available	3	5
TopoBench: A Framework for Benchmarking Topological Deep Learning	Jun 9, 2024	Benchmarking, Deep Learning	Code Available	3	5
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation	Sep 12, 2023	GPU, Image Generation	Code Available	3	5
Impact of architecture on robustness and interpretability of multispectral deep neural networks	Sep 21, 2023	Deep Learning	Code Available	3	5
Are Language Models Actually Useful for Time Series Forecasting?	Jun 22, 2024	Time Series, Time Series Forecasting	Code Available	3	5
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning	Oct 13, 2022		Code Available	3	5
Activating More Pixels in Image Super-Resolution Transformer	May 9, 2022	Image Super-Resolution, Super-Resolution	Code Available	3	5
The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results	Aug 18, 2024		Code Available	3	5
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system	Jan 12, 2025	Chatbot	Code Available	3	5
The Manga Whisperer: Automatically Generating Transcriptions for Comics	Jan 18, 2024		Code Available	3	5
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis	Jun 10, 2024	2k, 3DGS	Code Available	3	5
Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives	Nov 30, 2024	3D Scene Reconstruction, NeRF	Code Available	3	5
Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation	Jun 13, 2024	Multi-agent Reinforcement Learning	Code Available	3	5
Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities	Nov 17, 2021	regression	Code Available	3	5