SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5301-5325 of 661,570 papers

Title | Status | Hype
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback | Code | 2
Ranked Entropy Minimization for Continual Test-Time Adaptation | Code | 2
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | Code | 2
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning | Code | 2
Training Long-Context LLMs Efficiently via Chunk-wise Optimization | Code | 2
GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent | Code | 2
SEED: Speaker Embedding Enhancement Diffusion Model | Code | 2
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning | Code | 2
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models | Code | 2
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design | Code | 2
Seeing through Satellite Images at Street Views | Code | 2
DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution | Code | 2
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Code | 2
Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation | Code | 2
Structure-Aligned Protein Language Model | Code | 2
SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | Code | 2
ARPO:End-to-End Policy Optimization for GUI Agents with Experience Replay | Code | 2
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | Code | 2
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | Code | 2
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization | Code | 2
The P^3 dataset: Pixels, Points and Polygons for Multimodal Building Vectorization | Code | 2
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping | Code | 2
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition | Code | 2
Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes | Code | 2
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning | Code | 2
Page 213 of 26463