The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Recently Added | Most Hyped | Most Active | Needs Verification | Most Verified

Showing 976–1000 of 177,339 papers

Title	Date	Tasks	Status	Hype	Score
EfficientRep: An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design	Feb 1, 2023	GPU, object-detection	Code Available	5	5
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers	Jan 16, 2024	Image Generation	Code Available	5	5
Long-context LLMs Struggle with Long In-context Learning	Apr 2, 2024	2k, In-Context Learning	Code Available	5	5
Track Anything: Segment Anything Meets Videos	Apr 24, 2023	Image Segmentation, Object Tracking	Code Available	5	5
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling	Jun 4, 2024		Code Available	5	5
AppAgent: Multimodal Agents as Smartphone Users	Dec 21, 2023	Navigate	Code Available	5	5
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes	Nov 1, 2024	3DGS, Novel View Synthesis	Code Available	5	5
High-Fidelity Simultaneous Speech-To-Speech Translation	Feb 5, 2025	Decoder, Simultaneous Speech-to-Speech Translation	Code Available	5	5
ReFT: Representation Finetuning for Language Models	Apr 4, 2024	Arithmetic Reasoning	Code Available	5	5
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models	Nov 8, 2023	8k, GPU	Code Available	5	5
Kimi-VL Technical Report	Apr 10, 2025	Long-Context Understanding, Mathematical Reasoning	Code Available	5	5
WebThinker: Empowering Large Reasoning Models with Deep Research Capability	Apr 30, 2025	Navigate	Code Available	5	5
Segment Anything for Videos: A Systematic Survey	Jul 31, 2024	Image Segmentation, Robot Manipulation Generalization	Code Available	5	5
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey	Aug 19, 2024	Autonomous Driving, Decision Making	Code Available	5	5
Point Transformer V3: Simpler Faster Stronger	Jan 1, 2024	Representation Learning	Code Available	5	5
DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting	Dec 14, 2024	Clustering, energy management	Code Available	5	5
TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods	Mar 29, 2024	Benchmarking, Multivariate Time Series Forecasting	Code Available	5	5
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline	Jun 17, 2024	Chatbot	Code Available	5	5
Watermark Anything with Localized Messages	Nov 11, 2024		Code Available	5	5
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression	May 23, 2024	Quantization	Code Available	5	5
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration	May 30, 2025	Mathematical Reasoning	Code Available	5	5
Differentiable Tree Search Network	Jan 22, 2024	Decision Making, Inductive Bias	Code Available	5	5
A Survey of Text-to-SQL in the Era of LLMs: Where are we, and where are we going?	Aug 9, 2024	Natural Language Queries, Text to SQL	Code Available	5	5
LeVo: High-Quality Song Generation with Multi-Preference Alignment	Jun 9, 2025	Instruction Following, Music Generation	Code Available	5	5
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models	Dec 10, 2024		Code Available	5	5