SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers

Showing 101–150 of 658,356 papers

| Title | Status | Hype |
|---|---|---|
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Code | 11 |
| InstantID: Zero-shot Identity-Preserving Generation in Seconds | Code | 11 |
| TinyLlama: An Open-Source Small Language Model | Code | 11 |
| PaperBanana: Automating Academic Illustration for AI Scientists | | 9 |
| Qwen3-TTS Technical Report | | 9 |
| Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Code | 9 |
| MiniCPM4: Ultra-Efficient LLMs on End Devices | Code | 9 |
| MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Code | 9 |
| SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers | Code | 9 |
| Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting | Code | 9 |
| Emerging Properties in Unified Multimodal Pretraining | Code | 9 |
| UFO2: The Desktop AgentOS | Code | 9 |
| SkyReels-V2: Infinite-length Film Generative Model | Code | 9 |
| VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Code | 9 |
| UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation | Code | 9 |
| PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition | Code | 9 |
| AgentRxiv: Towards Collaborative Autonomous Research | Code | 9 |
| PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Code | 9 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Code | 9 |
| A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications | Code | 9 |
| PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC | Code | 9 |
| AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | Code | 9 |
| Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Code | 9 |
| s1: Simple test-time scaling | Code | 9 |
| SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Code | 9 |
| Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation | Code | 9 |
| Overview of the Amphion Toolkit (v0.2) | Code | 9 |
| Agent Laboratory: Using LLM Agents as Research Assistants | Code | 9 |
| FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Code | 9 |
| 2 OLMo 2 Furious | Code | 9 |
| Aviary: training language agents on challenging scientific tasks | Code | 9 |
| LTX-Video: Realtime Video Latent Diffusion | Code | 9 |
| Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models | Code | 9 |
| FastVLM: Efficient Vision Encoding for Vision Language Models | Code | 9 |
| Large Action Models: From Inception to Implementation | Code | 9 |
| DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Code | 9 |
| LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync | Code | 9 |
| Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Code | 9 |
| SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | Code | 9 |
| FinRobot: AI Agent for Equity Research and Valuation with Large Language Models | Code | 9 |
| Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research | Code | 9 |
| SkyServe: Serving AI Models across Regions and Clouds with Spot Instances | Code | 9 |
| SimpleFSDP: Simpler Fully Sharded Data Parallel with torch.compile | Code | 9 |
| Soft Condorcet Optimization for Ranking of General Agents | Code | 9 |
| Moonshine: Speech Recognition for Live Transcription and Voice Commands | Code | 9 |
| Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework | Code | 9 |
| DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception | Code | 9 |
| HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Code | 9 |
| Liger Kernel: Efficient Triton Kernels for LLM Training | Code | 9 |
Page 3 of 13,168