SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,199 code links · 4,818 tasks

Papers

Showing 201-250 of 658,356 papers

Title | Status | Hype
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers | Code | 9
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Code | 9
AgentRxiv: Towards Collaborative Autonomous Research | Code | 9
Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Code | 9
Soft Condorcet Optimization for Ranking of General Agents | Code | 9
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model | Code | 9
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation | Code | 9
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Code | 9
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation | Code | 9
PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Code | 9
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC | Code | 9
Moonshine: Speech Recognition for Live Transcription and Voice Commands | Code | 9
TripoSR: Fast 3D Object Reconstruction from a Single Image | Code | 9
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Code | 9
AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | Code | 9
Moshi: a speech-text foundation model for real-time dialogue | Code | 9
MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling | Code | 9
RWKV-7 "Goose" with Expressive Dynamic State Evolution | Code | 9
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection | Code | 9
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | Code | 9
Perception Encoder: The best visual embeddings are not at the output of the network | Code | 8
Llama 2: Open Foundation and Fine-Tuned Chat Models | Code | 8
Robust Speech Recognition via Large-Scale Weak Supervision | Code | 8
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition | Code | 8
GPT4All: An Ecosystem of Open Source Compressed Language Models | Code | 8
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models | Code | 8
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis | Code | 8
DETRs Beat YOLOs on Real-time Object Detection | Code | 8
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem | | 7
Pretraining Large Language Models with NVFP4 | | 7
Qwen3-ASR Technical Report | | 7
SAM 3D Body: Robust Full-Body Human Mesh Recovery | | 7
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning | | 7
Advancing Open-source World Models | | 7
Attention Residuals | | 7
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning | | 7
dLLM: Simple Diffusion Language Modeling | | 7
Robust Inverse Graphics via Probabilistic Inference | Code | 7
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | Code | 7
2D Gaussian Splatting for Geometrically Accurate Radiance Fields | Code | 7
In-Context LoRA for Diffusion Transformers | Code | 7
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Code | 7
AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline | Code | 7
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Code | 7
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Code | 7
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | Code | 7
FourierKAN outperforms MLP on Text Classification Head Fine-tuning | Code | 7
One-Step Image Translation with Text-to-Image Models | Code | 7
HealthBench: Evaluating Large Language Models Towards Improved Human Health | Code | 7
OmniGen: Unified Image Generation | Code | 7
Page 5 of 13,168