SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers | 258,216 code links | 4,818 tasks

Papers

Showing 201–250 of 180,343 papers

Title | Status | Hype
Soft Condorcet Optimization for Ranking of General Agents | Code | 9
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model | Code | 9
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation | Code | 9
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Code | 9
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation | Code | 9
PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Code | 9
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC | Code | 9
Moonshine: Speech Recognition for Live Transcription and Voice Commands | Code | 9
TripoSR: Fast 3D Object Reconstruction from a Single Image | Code | 9
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Code | 9
AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | Code | 9
Moshi: a speech-text foundation model for real-time dialogue | Code | 9
MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling | Code | 9
RWKV-7 "Goose" with Expressive Dynamic State Evolution | Code | 9
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection | Code | 9
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | Code | 9
Perception Encoder: The best visual embeddings are not at the output of the network | Code | 8
Llama 2: Open Foundation and Fine-Tuned Chat Models | Code | 8
Robust Speech Recognition via Large-Scale Weak Supervision | Code | 8
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition | Code | 8
GPT4All: An Ecosystem of Open Source Compressed Language Models | Code | 8
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models | Code | 8
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis | Code | 8
DETRs Beat YOLOs on Real-time Object Detection | Code | 8
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds | Code | 7
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Code | 7
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Code | 7
Transparent Image Layer Diffusion using Latent Transparency | Code | 7
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | Code | 7
AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline | Code | 7
Robust Inverse Graphics via Probabilistic Inference | Code | 7
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback | Code | 7
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Code | 7
One-Step Image Translation with Text-to-Image Models | Code | 7
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Code | 7
2D Gaussian Splatting for Geometrically Accurate Radiance Fields | Code | 7
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | Code | 7
In-Context LoRA for Diffusion Transformers | Code | 7
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Code | 7
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism | Code | 7
Domain Expansion of Image Generators | Code | 7
CALE: Continuous Arcade Learning Environment | Code | 7
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Code | 7
FourierKAN outperforms MLP on Text Classification Head Fine-tuning | Code | 7
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | Code | 7
HealthBench: Evaluating Large Language Models Towards Improved Human Health | Code | 7
OmniGen: Unified Image Generation | Code | 7
Fast Timing-Conditioned Latent Audio Diffusion | Code | 7
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation | Code | 7
PuLID: Pure and Lightning ID Customization via Contrastive Alignment | Code | 7
Page 5 of 3607