SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 141-150 of 177,339 papers

Title | Status | Hype
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack | Code | 9
NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Code | 9
YuE: Scaling Open Foundation Models for Long-Form Music Generation | Code | 9
Depth Anything V2 | Code | 9
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Code | 9
Visually Descriptive Language Model for Vector Graphics Reasoning | Code | 9
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation | Code | 9
World Model on Million-Length Video And Language With Blockwise RingAttention | Code | 9
UFO2: The Desktop AgentOS | Code | 9
LLM4Decompile: Decompiling Binary Code with Large Language Models | Code | 9