SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4901–4950 of 661,570 papers

Title | Status | Hype
Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs | | 2
OmniGAIA: Towards Native Omni-Modal AI Agents | | 2
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories | | 2
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering | | 2
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? | | 2
Deforming Videos to Masks: Flow Matching for Referring Video Segmentation | | 2
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories | | 2
Enhancing Spatial Understanding in Image Generation via Reward Modeling | | 2
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing | | 2
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion | | 2
Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator | | 2
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | | 2
Latent Denoising Makes Good Tokenizers | | 2
VLANeXt: Recipes for Building Strong VLA Models | | 2
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents | | 2
Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation | | 2
WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories | | 2
Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play? | | 2
How to Correctly Report LLM-as-a-Judge Evaluations | | 2
The Trinity of Consistency as a Defining Principle for General World Models | | 2
Kanade: A Simple Disentangled Tokenizer for Spoken Language Modeling | | 2
XSkill: Continual Learning from Experience and Skills in Multimodal Agents | | 2
OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot | | 2
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention | | 2
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation | | 2
Unified Multimodal Models as Auto-Encoders | | 2
OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams | | 2
Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding | | 2
Streaming Autoregressive Video Generation via Diagonal Distillation | | 2
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs | | 2
Experiential Reinforcement Learning | | 2
SimVLA: A Simple VLA Baseline for Robotic Manipulation | | 2
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | | 2
SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation | | 2
Efficient Reasoning with Balanced Thinking | | 2
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs | | 2
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion | | 2
MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation | | 2
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings | | 2
Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing | | 2
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation | | 2
RealWonder: Real-Time Physical Action-Conditioned Video Generation | | 2
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests | | 2
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors | | 2
Towards Pixel-Level VLM Perception via Simple Points Prediction | | 2
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs | | 2
Learning a Generative Meta-Model of LLM Activations | | 2
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration? | | 2
ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation | | 2
EEG Foundation Models: Progresses, Benchmarking, and Open Problems | | 2