SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9276–9300 of 177340 papers

| Title | Status | Hype |
| --- | --- | --- |
| Improved StyleGAN Embedding: Where are the Good Latents? | Code | 2 |
| Few-Shot Text Generation with Pattern-Exploiting Training | Code | 2 |
| CodeT: Code Generation with Generated Tests | Code | 2 |
| VinVL: Revisiting Visual Representations in Vision-Language Models | Code | 2 |
| MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation | Code | 2 |
| Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting | Code | 2 |
| Differentially Private Synthetic Data via Foundation Model APIs 2: Text | Code | 2 |
| Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue | Code | 2 |
| CodeR: Issue Resolving with Multi-Agent and Task Graphs | Code | 2 |
| Global Context Vision Transformers | Code | 2 |
| RL-X: A Deep Reinforcement Learning Library (not only) for RoboCup | Code | 2 |
| Learning to Fly -- a Gym Environment with PyBullet Physics for Reinforcement Learning of Multi-agent Quadcopter Control | Code | 2 |
| OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association | Code | 2 |
| MedViT: A Robust Vision Transformer for Generalized Medical Image Classification | Code | 2 |
| FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores | Code | 2 |
| H3T: Efficient Integration of Memory Optimization and Parallelism for Large-scale Transformer Training | Code | 2 |
| Training-free CryoET Tomogram Segmentation | Code | 2 |
| Beyond Next Token Prediction: Patch-Level Training for Large Language Models | Code | 2 |
| Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation | Code | 2 |
| BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields | Code | 2 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Code | 2 |
| FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios | Code | 2 |
| ReDel: A Toolkit for LLM-Powered Recursive Multi-Agent Systems | Code | 2 |
| A Review of Graph Neural Networks in Epidemic Modeling | Code | 2 |
| LongEmbed: Extending Embedding Models for Long Context Retrieval | Code | 2 |
Page 372 of 7094