| Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper | Mar 11, 2026 | | —Unverified | 1 |
| Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference | Mar 11, 2026 | | —Unverified | 1 |
| ResearchGym: Evaluating Language Model Agents on Real-World AI Research | Mar 11, 2026 | | —Unverified | 1 |
| MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents | Mar 11, 2026 | | —Unverified | 1 |
| World Models That Know When They Don't Know - Controllable Video Generation with Calibrated Uncertainty | Mar 10, 2026 | | —Unverified | 1 |
| Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition | Mar 10, 2026 | | —Unverified | 1 |
| AlphaApollo: A System for Deep Agentic Reasoning | Mar 10, 2026 | | —Unverified | 1 |
| TinyNav: End-to-End TinyML for Real-Time Autonomous Navigation on Microcontrollers | Mar 10, 2026 | | —Unverified | 1 |
| Reward Prediction with Factorized World States | Mar 10, 2026 | | —Unverified | 1 |
| From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors | Mar 10, 2026 | | —Unverified | 1 |
| Video-Based Reward Modeling for Computer-Use Agents | Mar 10, 2026 | | —Unverified | 1 |
| Monocular Normal Estimation via Shading Sequence Estimation | Mar 10, 2026 | | —Unverified | 1 |
| Can Vision-Language Models Solve the Shell Game? | Mar 9, 2026 | | —Unverified | 1 |
| Scale Space Diffusion | Mar 9, 2026 | | —Unverified | 1 |
| TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size | Mar 9, 2026 | | —Unverified | 1 |
| OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer | Mar 9, 2026 | | —Unverified | 1 |
| Modular Neural Image Signal Processing | Mar 9, 2026 | | —Unverified | 1 |
| FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use | Mar 9, 2026 | | —Unverified | 1 |
| VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning? | Mar 9, 2026 | | —Unverified | 1 |
| CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing | Mar 9, 2026 | | —Unverified | 1 |
| RedSage: A Cybersecurity Generalist LLM | Mar 9, 2026 | | —Unverified | 1 |
| In-Context Reinforcement Learning for Tool Use in Large Language Models | Mar 9, 2026 | | —Unverified | 1 |
| LatentMem: Customizing Latent Memory for Multi-Agent Systems | Mar 9, 2026 | | —Unverified | 1 |
| FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach | Mar 9, 2026 | | —Unverified | 1 |
| \$OneMillion-Bench: How Far are Language Agents from Human Experts? | Mar 9, 2026 | | —Unverified | 1 |
| π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs | Mar 9, 2026 | | —Unverified | 1 |
| NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving | Mar 9, 2026 | | —Unverified | 1 |
| HiconAgent: History Context-aware Policy Optimization for GUI Agents | Mar 8, 2026 | | —Unverified | 1 |
| TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events | Mar 8, 2026 | | —Unverified | 1 |
| Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning | Mar 7, 2026 | | —Unverified | 1 |
| DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving | Mar 7, 2026 | | —Unverified | 1 |
| CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization | Mar 6, 2026 | | —Unverified | 1 |
| PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction | Mar 6, 2026 | | —Unverified | 1 |
| LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference | Mar 6, 2026 | | —Unverified | 1 |
| FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling | Mar 6, 2026 | | —Unverified | 1 |
| UniVBench: Towards Unified Evaluation for Video Foundation Models | Mar 6, 2026 | | —Unverified | 1 |
| U6G XL-MIMO Radiomap Prediction: Multi-Config Dataset and Beam Map Approach | Mar 6, 2026 | | —Unverified | 1 |
| CASA: Cross-Attention over Self-Attention for Efficient Vision-Language Fusion | Mar 6, 2026 | | —Unverified | 1 |
| Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model | Mar 5, 2026 | | —Unverified | 1 |
| -Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space | Mar 5, 2026 | | —Unverified | 1 |
| BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning | Mar 5, 2026 | | —Unverified | 1 |
| KLASS: KL-Guided Fast Inference in Masked Diffusion Models | Mar 5, 2026 | | —Unverified | 1 |
| DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval | Mar 5, 2026 | | —Unverified | 1 |
| Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs | Mar 5, 2026 | | —Unverified | 1 |
| Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline | Mar 5, 2026 | | —Unverified | 1 |
| VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL | Mar 5, 2026 | | —Unverified | 1 |
| LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery | Mar 5, 2026 | | —Unverified | 1 |
| Factuality Matters: When Image Generation and Editing Meet Structured Visuals | Mar 4, 2026 | | —Unverified | 1 |
| ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems | Mar 4, 2026 | | —Unverified | 1 |
| ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors | Mar 4, 2026 | | —Unverified | 1 |