| ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation | Mar 12, 2026 | | —Unverified | 2 |
| From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors | Feb 27, 2026 | | —Unverified | 2 |
| StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? | Mar 2, 2026 | | —Unverified | 2 |
| Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models | Jan 29, 2026 | | —Unverified | 2 |
| Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training | Mar 12, 2026 | | —Unverified | 2 |
| SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs | Mar 1, 2026 | | —Unverified | 2 |
| Esoteric Language Models: Bridging Autoregressive and Masked Diffusion LLMs | Feb 21, 2026 | | —Unverified | 2 |
| Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs | Feb 12, 2026 | | —Unverified | 2 |
| DeFM: Learning Foundation Representations from Depth for Robotics | Jan 26, 2026 | | —Unverified | 2 |
| The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding | Jan 23, 2026 | | —Unverified | 2 |
| RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents | Feb 2, 2026 | | —Unverified | 2 |
| Rethinking the Trust Region in LLM Reinforcement Learning | Feb 4, 2026 | | —Unverified | 2 |
| Accelerating Streaming Video Large Language Models via Hierarchical Token Compression | Feb 11, 2026 | | —Unverified | 2 |
| AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents | Feb 16, 2026 | | —Unverified | 2 |
| Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight | Feb 23, 2026 | | —Unverified | 2 |
| Should We Still Pretrain Encoders with Masked Language Modeling? | Feb 24, 2026 | | —Unverified | 2 |
| InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | Mar 3, 2026 | | —Unverified | 2 |
| Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | Mar 4, 2026 | | —Unverified | 2 |
| ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | Mar 5, 2026 | | —Unverified | 2 |
| Physical Simulator In-the-Loop Video Generation | Mar 6, 2026 | | —Unverified | 2 |
| OmniForcing: Unleashing Real-time Joint Audio-Visual Generation | Mar 13, 2026 | | —Unverified | 2 |
| Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion | Mar 18, 2026 | | —Unverified | 2 |
| The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models | Mar 19, 2026 | | —Unverified | 2 |
| Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange | Mar 15, 2026 | | —Unverified | 2 |
| IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse | Mar 12, 2026 | | —Unverified | 2 |
| REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents | Feb 15, 2026 | | —Unverified | 2 |
| MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | Mar 5, 2026 | | —Unverified | 2 |
| Innovator-VL: A Multimodal Large Language Model for Scientific Discovery | Jan 27, 2026 | | —Unverified | 2 |
| RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind | Feb 25, 2026 | | —Unverified | 2 |
| Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model | Jan 23, 2026 | | —Unverified | 2 |
| Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation | Mar 13, 2026 | | —Unverified | 2 |
| Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | Feb 9, 2026 | | —Unverified | 2 |
| Evolving Interactive Diagnostic Agents in a Virtual Clinical Environment | Feb 10, 2026 | | —Unverified | 2 |
| The Trinity of Consistency as a Defining Principle for General World Models | Feb 26, 2026 | | —Unverified | 2 |
| OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs | Mar 5, 2026 | | —Unverified | 2 |
| HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising | Mar 9, 2026 | | —Unverified | 2 |
| OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams | Mar 12, 2026 | | —Unverified | 2 |
| CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering | Feb 28, 2026 | | —Unverified | 2 |
| VLANeXt: Recipes for Building Strong VLA Models | Feb 20, 2026 | | —Unverified | 2 |
| DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation | Jan 29, 2026 | | —Unverified | 2 |
| EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents | Feb 26, 2026 | | —Unverified | 2 |
| HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding | Jan 26, 2026 | | —Unverified | 2 |
| Self-Refining Video Sampling | Jan 26, 2026 | | —Unverified | 2 |
| SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks | Jan 22, 2026 | | —Unverified | 2 |
| BuildArena: A Physics-Aligned Interactive Benchmark of LLMs for Engineering Construction | Jan 24, 2026 | | —Unverified | 2 |
| Residual Context Diffusion Language Models | Jan 30, 2026 | | —Unverified | 2 |
| SERA: Soft-Verified Efficient Repository Agents | Feb 2, 2026 | | —Unverified | 2 |
| Exploring Reasoning Reward Model for Agents | Jan 29, 2026 | | —Unverified | 2 |
| RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data | Feb 7, 2026 | | —Unverified | 2 |
| ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | Feb 9, 2026 | | —Unverified | 2 |