SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4801-4850 of 661570 papers

Title | Status | Hype
ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation | | 2
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors | | 2
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? | | 2
Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models | | 2
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training | | 2
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs | | 2
Esoteric Language Models: Bridging Autoregressive and Masked Diffusion LLMs | | 2
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs | | 2
DeFM: Learning Foundation Representations from Depth for Robotics | | 2
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding | | 2
RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents | | 2
Rethinking the Trust Region in LLM Reinforcement Learning | | 2
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression | | 2
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents | | 2
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight | | 2
Should We Still Pretrain Encoders with Masked Language Modeling? | | 2
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | | 2
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | | 2
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | | 2
Physical Simulator In-the-Loop Video Generation | | 2
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation | | 2
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion | | 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models | | 2
Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange | | 2
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse | | 2
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents | | 2
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | | 2
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery | | 2
RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind | | 2
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model | | 2
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation | | 2
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | | 2
Evolving Interactive Diagnostic Agents in a Virtual Clinical Environment | | 2
The Trinity of Consistency as a Defining Principle for General World Models | | 2
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs | | 2
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising | | 2
OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams | | 2
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering | | 2
VLANeXt: Recipes for Building Strong VLA Models | | 2
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation | | 2
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents | | 2
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding | | 2
Self-Refining Video Sampling | | 2
SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks | | 2
BuildArena: A Physics-Aligned Interactive Benchmark of LLMs for Engineering Construction | | 2
Residual Context Diffusion Language Models | | 2
SERA: Soft-Verified Efficient Repository Agents | | 2
Exploring Reasoning Reward Model for Agents | | 2
RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data | | 2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | | 2
Page 97 of 13232