SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers

Showing 101-125 of 180,343 papers

Title | Status | Hype
Gymnasium: A Standard Interface for Reinforcement Learning Environments | Code | 11
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer | Code | 9
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models | Code | 9
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Code | 9
Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework | Code | 9
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models | Code | 9
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | Code | 9
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Code | 9
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Code | 9
Sapiens: Foundation for Human Vision Models | Code | 9
SkyReels-V2: Infinite-length Film Generative Model | Code | 9
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Code | 9
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | Code | 9
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Code | 9
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | Code | 9
Language agents achieve superhuman synthesis of scientific knowledge | Code | 9
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training | Code | 9
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion | Code | 9
Liger Kernel: Efficient Triton Kernels for LLM Training | Code | 9
CogVLM2: Visual Language Models for Image and Video Understanding | Code | 9
SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection | Code | 9
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | Code | 9
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation | Code | 9
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Code | 9
ORPO: Monolithic Preference Optimization without Reference Model | Code | 9