The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers


Showing 851–900 of 659,983 papers

Title	Date	Tasks	Status	Hype
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation	Sep 25, 2024	text-to-speech, Text to Speech	Code Available	5
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2	Sep 25, 2024	Object, Object Tracking	Code Available	5
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models	Sep 21, 2024	Language Modeling, Language Modelling	Code Available	5
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion	Sep 19, 2024		Code Available	5
FuXi-2.0: Advancing machine learning weather forecasting model for practical applications	Sep 11, 2024	Weather Forecasting	Code Available	5
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning	Sep 9, 2024	AI Agent, Knowledge Graphs	Code Available	5
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model	Sep 4, 2024	Language Modeling, Language Modelling	Code Available	5
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos	Sep 3, 2024	Depth Estimation, Diversity	Code Available	5
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis	Sep 3, 2024	3D Generation, 3D Reconstruction	Code Available	5
rerankers: A Lightweight Python Library to Unify Ranking Methods	Aug 30, 2024	Re-Ranking, Retrieval	Code Available	5
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling	Aug 29, 2024	Language Modeling, Language Modelling	Code Available	5
OmniRe: Omni Urban Scene Reconstruction	Aug 29, 2024	3DGS	Code Available	5
3D Reconstruction with Spatial Memory	Aug 28, 2024	3D Reconstruction	Code Available	5
Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning	Aug 26, 2024	Denoising, reinforcement-learning	Code Available	5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey	Aug 23, 2024	Image Segmentation, Segmentation	Code Available	5
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation	Aug 22, 2024	10-shot image generation	Code Available	5
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale	Aug 22, 2024	Chatbot, Instruction Following	Code Available	5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Aug 21, 2024	GPU, Quantization	Code Available	5
The Vizier Gaussian Process Bandit Algorithm	Aug 21, 2024	Bayesian Optimization	Code Available	5
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey	Aug 19, 2024	Autonomous Driving, Decision Making	Code Available	5
Automated Design of Agentic Systems	Aug 15, 2024		Code Available	5
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation	Aug 15, 2024	Diagnostic, RAG	Code Available	5
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs	Aug 13, 2024		Code Available	5
ControlNeXt: Powerful and Efficient Control for Image and Video Generation	Aug 12, 2024	Video Generation	Code Available	5
A Survey of Text-to-SQL in the Era of LLMs: Where are we, and where are we going?	Aug 9, 2024	Natural Language Queries, Text to SQL	Code Available	5
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More	Aug 8, 2024	Image Segmentation, Medical Image Segmentation	Code Available	5
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters	Aug 6, 2024		Code Available	5
Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid	Aug 4, 2024	document understanding	Code Available	5
Active Learning for Neural PDE Solvers	Aug 2, 2024	Active Learning	Code Available	5
Penzai + Treescope: A Toolkit for Interpreting, Visualizing, and Editing Models As Data	Aug 1, 2024		Code Available	5
MuJoCo MPC for Humanoid Control: Evaluation on HumanoidBench	Aug 1, 2024	Humanoid Control, MuJoCo	Code Available	5
Segment Anything for Videos: A Systematic Survey	Jul 31, 2024	Image Segmentation, Robot Manipulation Generalization	Code Available	5
Tora: Trajectory-oriented Diffusion Transformer for Video Generation	Jul 31, 2024	Video Compression, Video Generation	Code Available	5
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget	Jul 22, 2024	Mixture-of-Experts	Code Available	5
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models	Jul 21, 2024	All, Fashion Synthesis	Code Available	5
Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems	Jul 17, 2024	Autonomous Web Navigation, Denoising	Code Available	5
IMAGDressing-v1: Customizable Virtual Dressing	Jul 17, 2024	Denoising, Image Generation	Code Available	5
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5
Semantic Operators: A Declarative Model for Rich, AI-based Data Processing	Jul 16, 2024	Extreme Multi-Label Classification, Fact Checking	Code Available	5
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval	Jul 16, 2024	Question Answering, Retrieval	Code Available	5
GRUtopia: Dream General Robots in a City at Scale	Jul 15, 2024	Language Modelling, Large Language Model	Code Available	5
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients	Jul 11, 2024	Quantization	Code Available	5
OffsetBias: Leveraging Debiased Data for Tuning Evaluators	Jul 9, 2024		Code Available	5
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI	Jul 9, 2024	Survey	Code Available	5
TAPVid-3D: A Benchmark for Tracking Any Point in 3D	Jul 8, 2024	Point Tracking	Code Available	5
Fast On-device LLM Inference with NPUs	Jul 8, 2024	CPU, GPU	Code Available	5
Structural Generalization in Autonomous Cyber Incident Response with Message-Passing Neural Networks and Reinforcement Learning	Jul 8, 2024		Code Available	5
Learning to (Learn at Test Time): RNNs with Expressive Hidden States	Jul 5, 2024	16k, 8k	Code Available	5
BM25S: Orders of magnitude faster lexical search via eager sparse scoring	Jul 4, 2024	Passage Retrieval, Retrieval	Code Available	5
Fake News Detection: It's All in the Data!	Jul 2, 2024	All, Diversity	Code Available	5