SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2351-2400 of 659,983 papers

Title | Status | Hype
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents |  | 3
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence |  | 3
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence |  | 3
Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision |  | 3
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making |  | 3
LongCat-Flash-Thinking-2601 Technical Report |  | 3
Tree Search for LLM Agent Reinforcement Learning |  | 3
Light of Normals: Unified Feature Representation for Universal Photometric Stereo |  | 3
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents |  | 3
Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing |  | 3
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence |  | 3
Grounding World Simulation Models in a Real-World Metropolis |  | 3
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections |  | 3
EO-1: An Open Unified Embodied Foundation Model for General Robot Control |  | 3
MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources |  | 3
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation |  | 3
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation |  | 3
LaTeXTrans: Structured LaTeX Translation with Multi-Agent Coordination |  | 3
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels |  | 3
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders |  | 3
Human3R: Everyone Everywhere All at Once |  | 3
Geometry-Grounded Gaussian Splatting |  | 3
ReactMotion: Generating Reactive Listener Motions from Speaker Utterance |  | 3
Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution |  | 3
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing |  | 3
Scaling Multiagent Systems with Process Rewards |  | 3
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering |  | 3
RLP: Reinforcement as a Pretraining Objective |  | 3
VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency |  | 3
Latent Diffusion Model without Variational Autoencoder |  | 3
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data |  | 3
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights |  | 3
DVD: Deterministic Video Depth Estimation with Generative Priors |  | 3
SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis |  | 3
Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars |  | 3
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security |  | 3
SparkVSR: Interactive Video Super-Resolution via Sparse Keyframe Propagation |  | 3
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows |  | 3
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion |  | 3
FireRed-OCR Technical Report |  | 3
AnyUp: Universal Feature Upsampling |  | 3
PartUV: Part-Based UV Unwrapping of 3D Meshes |  | 3
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling |  | 3
Much Ado About Noising: Dispelling the Myths of Generative Robotic Control |  | 3
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding |  | 3
χ_0: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies |  | 3
InstantSfM: Towards GPU-Native SfM for the Deep Learning Era |  | 3
Simulating the Visual World with Artificial Intelligence: A Roadmap |  | 3
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience |  | 3
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution |  | 3
Page 48 of 13,200