SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4951-5000 of 661,570 papers

Title | Status | Hype
SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation | - | 2
A Survey on Efficient Vision-Language-Action Models | - | 2
BPMN Assistant: An LLM-Based Approach to Business Process Modeling | - | 2
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling | - | 2
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection | - | 2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | - | 2
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing | - | 2
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | - | 2
Physical Simulator In-the-Loop Video Generation | - | 2
On Predictability of Reinforcement Learning Dynamics for Large Language Models | - | 2
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models | - | 2
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents | - | 2
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video | - | 2
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | - | 2
Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle | - | 2
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation | - | 2
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation | - | 2
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | - | 2
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering | - | 2
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing | - | 2
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising | - | 2
Bolmo: Byteifying the Next Generation of Language Models | - | 2
Spanning the Visual Analogy Space with a Weight Basis of LoRAs | - | 2
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents | - | 2
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data | - | 2
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning | - | 2
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models | - | 2
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding | - | 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models | - | 2
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | - | 2
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning | - | 2
Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation | - | 2
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models | - | 2
RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data | - | 2
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training | - | 2
VLANeXt: Recipes for Building Strong VLA Models | - | 2
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels | - | 2
Olaf-World: Orienting Latent Actions for Video World Modeling | - | 2
Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism | - | 2
LARGE: Legal Retrieval Augmented Generation Evaluation Tool | Code | 2
Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off | Code | 2
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution | Code | 2
Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow | Code | 2
AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval | Code | 2
Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation | Code | 2
DiffMM: Multi-Modal Diffusion Model for Recommendation | Code | 2
Blockwise Parallel Transformer for Large Context Models | Code | 2
VkD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Code | 2
Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Code | 2