SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 701-725 of 659,983 papers

Title | Status | Hype
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | Code | 5
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools | Code | 5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | Code | 5
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement | Code | 5
Time-series attribution maps with regularized contrastive learning | Code | 5
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives | Code | 5
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs | Code | 5
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control | Code | 5
MobileSAMv2: Faster Segment Anything to Everything | Code | 5
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer | Code | 5
BlackJAX: Composable Bayesian inference in JAX | Code | 5
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Code | 5
Multimodal Autoregressive Pre-training of Large Vision Encoders | Code | 5
Active Learning for Neural PDE Solvers | Code | 5
Cosmos World Foundation Model Platform for Physical AI | Code | 5
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | Code | 5
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding | Code | 5
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks | Code | 5
Information Flow Routes: Automatically Interpreting Language Models at Scale | Code | 5
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Code | 5
UniDepth: Universal Monocular Metric Depth Estimation | Code | 5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Code | 5
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Code | 5
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ | Code | 5
Noisereduce: Domain General Noise Reduction for Time Series Signals | Code | 5