SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6826–6850 of 474,278 papers

Title | Status | Hype
Step-Audio-R1 Technical Report | - | 0
TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding | - | 0
Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination | - | 0
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research | - | 0
DeepRFTv2: Kernel-level Learning for Image Deblurring | Code | 0
Learning When to Stop: Adaptive Latent Reasoning via Reinforcement Learning | Code | 0
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory | - | 0
Revisiting Generalization Across Difficulty Levels: It's Not So Easy | - | 0
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration | - | 0
Interpretable Multimodal Cancer Prototyping with Whole Slide Images and Incompletely Paired Genomics | Code | 0
MLPMoE: Zero-Shot Architectural Metamorphosis of Dense LLM MLPs into Static Mixture-of-Experts | Code | 0
VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment | Code | 0
Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation | Code | 0
PoETa v2: Toward More Robust Evaluation of Large Language Models in Portuguese | Code | 0
EvilGenie: A Reward Hacking Benchmark | Code | 0
GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction | Code | 0
Co-NAML-LSTUR: A Combined Model with Attentive Multi-View Learning and Long- and Short-term User Representations for News Recommendation | Code | 0
DEMIST: Decoupled Multi-stream latent diffusion for Quantitative Myelin Map Synthesis | Code | 0
ConceptGuard: Proactive Safety in Text-and-Image-to-Video Generation through Multimodal Risk Detection | Code | 0
UniGame: Turning a Unified Multimodal Model Into Its Own Adversary | Code | 0
BUSTR: Breast Ultrasound Text Reporting with a Descriptor-Aware Vision-Language Model | Code | 0
RefOnce: Distilling References into a Prototype Memory for Referring Camouflaged Object Detection | Code | 0
A Probabilistic Framework for Temporal Distribution Generalization in Industry-Scale Recommender Systems | Code | 0
G-Net: A Provably Easy Construction of High-Accuracy Random Binary Neural Networks | Code | 0
Context-Aware Pragmatic Metacognitive Prompting for Sarcasm Detection | Code | 0