SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers

Showing 126-150 of 658,356 papers

Title | Status | Hype
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Code | 9
ORPO: Monolithic Preference Optimization without Reference Model | Code | 9
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Code | 9
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration | Code | 9
Symbolic Learning Enables Self-Evolving Agents | Code | 9
Aviary: training language agents on challenging scientific tasks | Code | 9
Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research | Code | 9
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Code | 9
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting | Code | 9
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark | Code | 9
YOLO-World: Real-Time Open-Vocabulary Object Detection | Code | 9
Yi: Open Foundation Models by 01.AI | Code | 9
Steering Language Models with Game-Theoretic Solvers | Code | 9
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild | Code | 9
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts | Code | 9
LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model | Code | 9
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack | Code | 9
NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Code | 9
YuE: Scaling Open Foundation Models for Long-Form Music Generation | Code | 9
Depth Anything V2 | Code | 9
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Code | 9
Visually Descriptive Language Model for Vector Graphics Reasoning | Code | 9
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation | Code | 9
World Model on Million-Length Video And Language With Blockwise RingAttention | Code | 9
UFO2: The Desktop AgentOS | Code | 9