SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers | 258,152 code links | 4,818 tasks

Papers

Showing 151-200 of 658,356 papers

Title | Status | Hype
LLM4Decompile: Decompiling Binary Code with Large Language Models | Code | 9
Do Large Language Models Need a Content Delivery Network? | Code | 9
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Code | 9
LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync | Code | 9
FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models | Code | 9
MiniCPM4: Ultra-Efficient LLMs on End Devices | Code | 9
Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Code | 9
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models | Code | 9
OLMo: Accelerating the Science of Language Models | Code | 9
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | Code | 9
UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation | Code | 9
Model Stock: All we need is just a few fine-tuned models | Code | 9
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion | Code | 9
Large Action Models: From Inception to Implementation | Code | 9
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications | Code | 9
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer | Code | 9
2 OLMo 2 Furious | Code | 9
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models | Code | 9
LTX-Video: Realtime Video Latent Diffusion | Code | 9
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Code | 9
s1: Simple test-time scaling | Code | 9
FastVLM: Efficient Vision Encoding for Vision Language Models | Code | 9
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Code | 9
Arcee's MergeKit: A Toolkit for Merging Large Language Models | Code | 9
SkyServe: Serving AI Models across Regions and Clouds with Spot Instances | Code | 9
PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition | Code | 9
When Do We Not Need Larger Vision Models? | Code | 9
garak: A Framework for Security Probing Large Language Models | Code | 9
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Code | 9
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment | Code | 9
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Code | 9
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | Code | 9
InternLM2 Technical Report | Code | 9
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception | Code | 9
PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Code | 9
UFO: A UI-Focused Agent for Windows OS Interaction | Code | 9
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation | Code | 9
RULER: What's the Real Context Size of Your Long-Context Language Models? | Code | 9
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher | Code | 9
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation | Code | 9
Overview of the Amphion Toolkit (v0.2) | Code | 9
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Code | 9
Agent Laboratory: Using LLM Agents as Research Assistants | Code | 9
Divide and Conquer: High-Resolution Industrial Anomaly Detection via Memory Efficient Tiled Ensemble | Code | 9
OpenVLA: An Open-Source Vision-Language-Action Model | Code | 9
Transformer Explainer: Interactive Learning of Text-Generative Models | Code | 9
SimpleFSDP: Simpler Fully Sharded Data Parallel with torch.compile | Code | 9
Emerging Properties in Unified Multimodal Pretraining | Code | 9
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters | Code | 9
Page 4 of 13168