SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5326-5350 of 661,570 papers

Title | Status | Hype
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning | Code | 2
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping | Code | 2
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization | Code | 2
Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes | Code | 2
PhyX: Does Your Model Have the "Wits" for Physical Reasoning? | Code | 2
iPad: Iterative Proposal-centric End-to-End Autonomous Driving | Code | 2
dKV-Cache: The Cache for Diffusion Language Models | Code | 2
MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models | Code | 2
Scaling Diffusion Transformers Efficiently via μP | Code | 2
Learning Spatio-Temporal Dynamics for Trajectory Recovery via Time-Aware Transformer | Code | 2
Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers | Code | 2
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | Code | 2
Quartet: Native FP4 Training Can Be Optimal for Large Language Models | Code | 2
CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation | Code | 2
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens | Code | 2
Let LLMs Break Free from Overthinking via Self-Braking Tuning | Code | 2
Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models | Code | 2
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis | Code | 2
PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks | Code | 2
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank | Code | 2
Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning | Code | 2
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation | Code | 2
Place Recognition: A Comprehensive Review, Current Challenges and Future Directions | Code | 2
Rethinking Features-Fused-Pyramid-Neck for Object Detection | Code | 2
Temporal Query Network for Efficient Multivariate Time Series Forecasting | Code | 2