| SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation | Feb 24, 2026 | | Unverified | 2 |
| A Survey on Efficient Vision-Language-Action Models | Feb 2, 2026 | | Unverified | 2 |
| BPMN Assistant: An LLM-Based Approach to Business Process Modeling | Jan 22, 2026 | | Unverified | 2 |
| Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling | Mar 4, 2026 | | Unverified | 2 |
| VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection | Mar 1, 2026 | | Unverified | 2 |
| ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | Feb 9, 2026 | | Unverified | 2 |
| UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing | Feb 20, 2026 | | Unverified | 2 |
| InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | Mar 3, 2026 | | Unverified | 2 |
| Physical Simulator In-the-Loop Video Generation | Mar 6, 2026 | | Unverified | 2 |
| On Predictability of Reinforcement Learning Dynamics for Large Language Models | Feb 22, 2026 | | Unverified | 2 |
| WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models | Jan 28, 2026 | | Unverified | 2 |
| NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents | Feb 24, 2026 | | Unverified | 2 |
| CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video | Mar 4, 2026 | | Unverified | 2 |
| MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | Mar 5, 2026 | | Unverified | 2 |
| Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle | Mar 3, 2026 | | Unverified | 2 |
| ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation | Feb 12, 2026 | | Unverified | 2 |
| FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation | Feb 6, 2026 | | Unverified | 2 |
| Weak-Driven Learning: How Weak Agents make Strong Agents Stronger | Feb 9, 2026 | | Unverified | 2 |
| GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering | Mar 16, 2026 | | Unverified | 2 |
| Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing | Mar 3, 2026 | | Unverified | 2 |
| HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising | Mar 9, 2026 | | Unverified | 2 |
| Bolmo: Byteifying the Next Generation of Language Models | Feb 9, 2026 | | Unverified | 2 |
| Spanning the Visual Analogy Space with a Weight Basis of LoRAs | Feb 17, 2026 | | Unverified | 2 |
| REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents | Feb 15, 2026 | | Unverified | 2 |
| compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data | Feb 6, 2026 | | Unverified | 2 |
| End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning | Feb 1, 2026 | | Unverified | 2 |
| SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models | Mar 17, 2026 | | Unverified | 2 |
| EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding | Mar 4, 2026 | | Unverified | 2 |
| The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models | Mar 19, 2026 | | Unverified | 2 |
| Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | Mar 4, 2026 | | Unverified | 2 |
| ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning | Feb 28, 2026 | | Unverified | 2 |
| Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation | Feb 5, 2026 | | Unverified | 2 |
| InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models | Feb 9, 2026 | | Unverified | 2 |
| RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data | Feb 7, 2026 | | Unverified | 2 |
| Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training | Mar 12, 2026 | | Unverified | 2 |
| VLANeXt: Recipes for Building Strong VLA Models | Feb 20, 2026 | | Unverified | 2 |
| Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels | Mar 5, 2026 | | Unverified | 2 |
| Olaf-World: Orienting Latent Actions for Video World Modeling | Feb 10, 2026 | | Unverified | 2 |
| Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism | Jan 27, 2026 | | Unverified | 2 |
| LARGE: Legal Retrieval Augmented Generation Evaluation Tool | Apr 2, 2025 | RAG, Retrieval | Code Available | 2 |
| Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off | Apr 17, 2025 | Garment Reconstruction, Image Generation | Code Available | 2 |
| Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution | Feb 21, 2022 | | Code Available | 2 |
| Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow | Aug 5, 2024 | | Code Available | 2 |
| AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval | Apr 9, 2024 | All, Information Retrieval | Code Available | 2 |
| Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation | Dec 18, 2024 | Image Segmentation, Knowledge Distillation | Code Available | 2 |
| DiffMM: Multi-Modal Diffusion Model for Recommendation | Jun 17, 2024 | Contrastive Learning, model | Code Available | 2 |
| Blockwise Parallel Transformer for Large Context Models | May 30, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| VkD: Improving Knowledge Distillation using Orthogonal Projections | Jan 1, 2024 | Image Generation, Knowledge Distillation | Code Available | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Jan 12, 2024 | Object Recognition, Road Segmentation | Code Available | 2 |