SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 501-525 of 659,983 papers

Title | Status | Hype
Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases | Code | 7
Robust Inverse Graphics via Probabilistic Inference | Code | 7
Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting | Code | 7
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Code | 7
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Code | 7
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Code | 7
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads | Code | 7
VMamba: Visual State Space Model | Code | 7
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering | Code | 7
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Code | 7
Exploring Compressed Image Representation as a Perceptual Proxy: A Study | Code | 7
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Code | 7
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference | Code | 7
Bilateral Reference for High-Resolution Dichotomous Image Segmentation | Code | 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Code | 7
OpenVoice: Versatile Instant Voice Cloning | Code | 7
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | Code | 7
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | Code | 7
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | Code | 7
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset | Code | 7
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | Code | 7
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold | Code | 7
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | Code | 7
Full Scaling Automation for Sustainable Development of Green Data Centers | Code | 7
EasySpider: A No-Code Visual System for Crawling the Web | Code | 7