| TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events | Mar 8, 2026 | | —Unverified | 1 |
| QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models | Feb 27, 2026 | | —Unverified | 1 |
| VLS: Steering Pretrained Robot Policies via Vision-Language Models | Feb 3, 2026 | | —Unverified | 1 |
| Privileged Information Distillation for Language Models | Feb 16, 2026 | | —Unverified | 1 |
| Learning Self-Correction in Vision-Language Models via Rollout Augmentation | Feb 9, 2026 | | —Unverified | 1 |
| Large Multimodal Models as General In-Context Classifiers | Feb 26, 2026 | | —Unverified | 1 |
| Coarse-Guided Visual Generation via Weighted h-Transform Sampling | Mar 12, 2026 | | —Unverified | 1 |
| HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions | Mar 16, 2026 | | —Unverified | 1 |
| DREAM: Where Visual Understanding Meets Text-to-Image Generation | Mar 3, 2026 | | —Unverified | 1 |
| How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing | Feb 2, 2026 | | —Unverified | 1 |
| AlphaApollo: A System for Deep Agentic Reasoning | Mar 10, 2026 | | —Unverified | 1 |
| SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents | Feb 13, 2026 | | —Unverified | 1 |
| Nacrith: Neural Lossless Compression via Ensemble Context Modeling and High-Precision CDF Coding | Feb 24, 2026 | | —Unverified | 1 |
| Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models | Mar 16, 2026 | | —Unverified | 1 |
| ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems | Mar 4, 2026 | | —Unverified | 1 |
| Stereo World Model: Camera-Guided Stereo Video Generation | Mar 18, 2026 | | —Unverified | 1 |
| MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants | Mar 16, 2026 | | —Unverified | 1 |
| MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning | Mar 2, 2026 | | —Unverified | 1 |
| SK-Adapter: Skeleton-Based Structural Control for Native 3D Generation | Mar 14, 2026 | | —Unverified | 1 |
| Rethinking Selective Knowledge Distillation | Feb 1, 2026 | | —Unverified | 1 |
| Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry | Jan 30, 2026 | | —Unverified | 1 |
| Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning | Feb 12, 2026 | | —Unverified | 1 |
| Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models | Mar 18, 2026 | | —Unverified | 1 |
| LatentMem: Customizing Latent Memory for Multi-Agent Systems | Mar 9, 2026 | | —Unverified | 1 |
| Mano: Restriking Manifold Optimization for LLM Training | Jan 30, 2026 | | —Unverified | 1 |
| Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing | Feb 10, 2026 | | —Unverified | 1 |
| Demystifying Video Reasoning | Mar 17, 2026 | | —Unverified | 1 |
| Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs | Feb 18, 2026 | | —Unverified | 1 |
| MediX-R1: Open Ended Medical Reinforcement Learning | Feb 26, 2026 | | —Unverified | 1 |
| Show, Don't Tell: Morphing Latent Reasoning into Image Generation | Feb 2, 2026 | | —Unverified | 1 |
| Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets | Feb 25, 2026 | | —Unverified | 1 |
| Safety Alignment of LMs via Non-cooperative Games | Feb 7, 2026 | | —Unverified | 1 |
| Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening | Feb 6, 2026 | | —Unverified | 1 |
| Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data | Feb 24, 2026 | | —Unverified | 1 |
| Same or Not? Enhancing Visual Perception in Vision-Language Models | Feb 4, 2026 | | —Unverified | 1 |
| MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models | Mar 17, 2026 | | —Unverified | 1 |
| MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation | Feb 23, 2026 | | —Unverified | 1 |
| Learning While Staying Curious: Entropy-Preserving Supervised Fine-Tuning via Adaptive Self-Distillation for Large Reasoning Models | Feb 8, 2026 | | —Unverified | 1 |
| One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers | Mar 12, 2026 | | —Unverified | 1 |
| Reinforced Fast Weights with Next-Sequence Prediction | Feb 18, 2026 | | —Unverified | 1 |
| VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining | Mar 19, 2026 | | —Unverified | 1 |
| InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem | Feb 16, 2026 | | —Unverified | 1 |
| PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction | Mar 6, 2026 | | —Unverified | 1 |
| AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models | Mar 1, 2026 | | —Unverified | 1 |
| General Agent Evaluation | Feb 26, 2026 | | —Unverified | 1 |
| OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG | Jan 24, 2026 | | —Unverified | 1 |
| Glance and Focus Reinforcement for Pan-cancer Screening | Feb 2, 2026 | | —Unverified | 1 |
| Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following | Mar 12, 2026 | | —Unverified | 1 |
| AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts | Jan 30, 2026 | | —Unverified | 1 |
| Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings | Mar 12, 2026 | | —Unverified | 1 |