SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8151-8175 of 177340 papers

Title | Status | Hype
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy | Code | 2
RBF++: Quantifying and Optimizing Reasoning Boundaries across Measurable and Unmeasurable Capabilities for Chain-of-Thought Reasoning | Code | 2
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming | Code | 2
Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos | Code | 2
Neurosymbolic Diffusion Models | Code | 2
Temporal Query Network for Efficient Multivariate Time Series Forecasting | Code | 2
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space | Code | 2
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens | Code | 2
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation | Code | 2
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis | Code | 2
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design | Code | 2
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models | Code | 2
LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS | Code | 2
Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility | Code | 2
Shifting AI Efficiency From Model-Centric to Data-Centric Compression | Code | 2
DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue | Code | 2
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression | Code | 2
Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects | Code | 2
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model | Code | 2
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning | Code | 2
Vision Language Models are Biased | Code | 2
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting | Code | 2
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Code | 2
GSCodec Studio: A Modular Framework for Gaussian Splat Compression | Code | 2
MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation | Code | 2