The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 151–175 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts	May 20, 2024	Machine Translation, Translation	Code Available	9	5
LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model	Jun 7, 2024	Language Modeling, Language Modelling	Code Available	9	5
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack	Jun 14, 2024	Question Answering, Retrieval-augmented Generation	Code Available	9	5
NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context?	Jul 16, 2024	4k, 8k	Code Available	9	5
YuE: Scaling Open Foundation Models for Long-Form Music Generation	Mar 11, 2025	Form, In-Context Learning	Code Available	9	5
Depth Anything V2	Jun 13, 2024	Depth Estimation, Diversity	Code Available	9	5
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning	Mar 26, 2024	GPU, GSM8K	Code Available	9	5
Visually Descriptive Language Model for Vector Graphics Reasoning	Apr 9, 2024	Descriptive, Language Modeling	Code Available	9	5
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation	Sep 10, 2024	Knowledge Graphs, Question Answering	Code Available	9	5
World Model on Million-Length Video And Language With Blockwise RingAttention	Feb 13, 2024	4k, Video Understanding	Code Available	9	5
UFO2: The Desktop AgentOS	Apr 20, 2025		Code Available	9	5
LLM4Decompile: Decompiling Binary Code with Large Language Models	Mar 8, 2024	HumanEval	Code Available	9	5
Do Large Language Models Need a Content Delivery Network?	Sep 16, 2024	In-Context Learning	Code Available	9	5
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding	Dec 13, 2024	Chart Understanding, Mixture-of-Experts	Code Available	9	5
LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync	Dec 12, 2024	Portrait Animation	Code Available	9	5
FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models	May 23, 2024	AI Agent, Decision Making	Code Available	9	5
MiniCPM4: Ultra-Efficient LLMs on End Devices	Jun 9, 2025	Large Language Model	Code Available	9	5
Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding	Jul 14, 2025	Code Generation, Language Modeling	Code Available	9	5
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models	Dec 23, 2024	CPU	Code Available	9	5
OLMo: Accelerating the Science of Language Models	Feb 1, 2024	Language Modeling, Language Modelling	Code Available	9	5
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies	Apr 9, 2024	Domain Adaptation	Code Available	9	5
UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation	Mar 31, 2025	RAG, Retrieval	Code Available	9	5
Model Stock: All we need is just a few fine-tuned models	Mar 28, 2024	All	Code Available	9	5
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion	May 26, 2024	Language Modeling, Language Modelling	Code Available	9	5
Large Action Models: From Inception to Implementation	Dec 13, 2024	Action Generation	Code Available	9	5