SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3451-3475 of 177340 papers

Title | Status | Hype
YourBench: Easy Custom Evaluation Sets for Everyone | Code | 3
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality | Code | 3
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Code | 3
Spurious Rewards: Rethinking Training Signals in RLVR | Code | 3
MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Code | 3
River: machine learning for streaming data in Python | Code | 3
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design | Code | 3
Personalized Benchmarking with the Ludwig Benchmarking Toolkit | Code | 3
Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws | Code | 3
Large Language Models for Generative Information Extraction: A Survey | Code | 3
The Rise of Diffusion Models in Time-Series Forecasting | Code | 3
Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot | Code | 3
Segment Anything Model for Road Network Graph Extraction | Code | 3
RS-Mamba for Large Remote Sensing Image Dense Prediction | Code | 3
Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions | Code | 3
Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer | Code | 3
SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction | Code | 3
MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds | Code | 3
Generative AI for Autonomous Driving: Frontiers and Opportunities | Code | 3
Understanding and Minimising Outlier Features in Neural Network Training | Code | 3
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation | Code | 3
LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Code | 3
Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Code | 3
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization | Code | 3
OpenResearcher: Unleashing AI for Accelerated Scientific Research | Code | 3
Page 139 of 7094