SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3801-3825 of 177340 papers

| Title | Status | Hype |
| --- | --- | --- |
| Mambular: A Sequential Model for Tabular Deep Learning | Code | 3 |
| Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization | Code | 3 |
| WHAC: World-grounded Humans and Cameras | Code | 3 |
| GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations | Code | 3 |
| Generative AI Act II: Test Time Scaling Drives Cognition Engineering | Code | 3 |
| ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models | Code | 3 |
| Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning | Code | 3 |
| Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI | Code | 3 |
| Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Code | 3 |
| AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models | Code | 3 |
| From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models | Code | 3 |
| DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models | Code | 3 |
| Chain of Draft: Thinking Faster by Writing Less | Code | 3 |
| Data Augmentation for Sequential Recommendation: A Survey | Code | 3 |
| Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale | Code | 3 |
| MLVU: Benchmarking Multi-task Long Video Understanding | Code | 3 |
| UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Code | 3 |
| ECON: Explicit Clothed humans Optimized via Normal integration | Code | 3 |
| Partially Rewriting a Transformer in Natural Language | Code | 3 |
| A Clean Slate for Offline Reinforcement Learning | Code | 3 |
| MarioGPT: Open-Ended Text2Level Generation through Large Language Models | Code | 3 |
| PINGS: Gaussian Splatting Meets Distance Fields within a Point-Based Implicit Neural Map | Code | 3 |
| VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Code | 3 |
| OS-ATLAS: A Foundation Action Model for Generalist GUI Agents | Code | 3 |
| HadaCore: Tensor Core Accelerated Hadamard Transform Kernel | Code | 3 |