SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4176–4200 of 661570 papers

Title | Status | Hype
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Code | 3
Impact of architecture on robustness and interpretability of multispectral deep neural networks | Code | 3
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model | Code | 3
FreeU: Free Lunch in Diffusion U-Net | Code | 3
SlimPajama-DC: Understanding Data Combinations for LLM Training | Code | 3
Amplifying Pathological Detection in EEG Signaling Pathways through Cross-Dataset Transfer Learning | Code | 3
Multimodal Foundation Models: From Specialists to General-Purpose Assistants | Code | 3
Sparse Autoencoders Find Highly Interpretable Features in Language Models | Code | 3
AudioSR: Versatile Audio Super-resolution at Scale | Code | 3
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Code | 3
HAT: Hybrid Attention Transformer for Image Restoration | Code | 3
Anatomy-informed Data Augmentation for Enhanced Prostate Cancer Detection | Code | 3
Tracking Anything with Decoupled Video Segmentation | Code | 3
Matcha-TTS: A fast TTS architecture with conditional flow matching | Code | 3
nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources | Code | 3
Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering | Code | 3
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models | Code | 3
Emergence of Segmentation with Minimalistic White-Box Transformers | Code | 3
SAM-Med2D | Code | 3
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models | Code | 3
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Code | 3
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding | Code | 3
Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions | Code | 3
Pixel-Aware Stable Diffusion for Realistic Image Super-resolution and Personalized Stylization | Code | 3
How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection | Code | 3
Page 168 of 26463