| SAW: Toward a Surgical Action World Model via Controllable and Scalable Video Generation | Mar 13, 2026 | | —Unverified | 0 |
| PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses | Mar 13, 2026 | | Code Available | 0 |
| Human-Centered Evaluation of an LLM-Based Process Modeling Copilot: A Mixed-Methods Study with Domain Experts | Mar 13, 2026 | | —Unverified | 0 |
| IROSA: Interactive Robot Skill Adaptation using Natural Language | Mar 13, 2026 | | —Unverified | 0 |
| Fourier Angle Alignment for Oriented Object Detection in Remote Sensing | Mar 13, 2026 | | Code Available | 0 |
| AI Model Modulation with Logits Redistribution | Mar 13, 2026 | | —Unverified | 0 |
| FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control | Mar 13, 2026 | | —Unverified | 0 |
| DRIFT-Net: A Spectral-Coupled Neural Operator for PDEs Learning | Mar 13, 2026 | | Code Available | 0 |
| A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering | Mar 13, 2026 | | —Unverified | 0 |
| HSEmotion Team at ABAW-10 Competition: Facial Expression Recognition, Valence-Arousal Estimation, Action Unit Detection and Fine-Grained Violence Classification | Mar 13, 2026 | | —Unverified | 0 |
| CognitionCapturerPro: Towards High-Fidelity Visual Decoding from EEG/MEG via Multi-modal Information and Asymmetric Alignment | Mar 13, 2026 | | Code Available | 0 |
| DiffProxy: Multi-View Human Mesh Recovery via Diffusion-Generated Dense Proxies | Mar 13, 2026 | | —Unverified | 0 |
| Beyond Static Instruction: A Multi-agent AI Framework for Adaptive Augmented Reality Robot Training | Mar 13, 2026 | | —Unverified | 0 |
| Mask2Flow-TSE: Two-Stage Target Speaker Extraction with Masking and Flow Matching | Mar 13, 2026 | | —Unverified | 0 |
| 98× Faster LLM Routing Without a Dedicated GPU: Flash Attention, Prompt Compression, and Near-Streaming for the vLLM Semantic Router | Mar 13, 2026 | | —Unverified | 0 |
| Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs | Mar 13, 2026 | | —Unverified | 0 |
| GA-Drive: Geometry-Appearance Decoupled Modeling for Free-viewpoint Driving Scene Generation | Mar 13, 2026 | | —Unverified | 0 |
| ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning | Mar 13, 2026 | | —Unverified | 0 |
| The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts | Mar 13, 2026 | | —Unverified | 0 |
| TubeMLLM: A Foundation Model for Topology Knowledge Exploration in Vessel-like Anatomy | Mar 13, 2026 | | —Unverified | 0 |
| Thinking in Dynamics: How Multimodal Large Language Models Perceive, Track, and Reason Dynamics in Physical 4D World | Mar 13, 2026 | | —Unverified | 0 |
| AccelAes: Accelerating Diffusion Transformers for Training-Free Aesthetic-Enhanced Image Generation | Mar 13, 2026 | | Code Available | 0 |
| SPELL: Self-Play Reinforcement Learning for Evolving Long-Context Language Models | Mar 13, 2026 | | Code Available | 0 |
| GeoZero: Incentivizing Reasoning from Scratch on Geospatial Scenes | Mar 13, 2026 | | Code Available | 0 |
| Mitigating Latent Mismatch in cVAE-Based Singing Voice Synthesis via Flow Matching | Mar 13, 2026 | | Code Available | 0 |
| VLM4Rec: Multimodal Semantic Representation for Recommendation with Large Vision-Language Models | Mar 13, 2026 | | Code Available | 0 |
| HIFICL: High-Fidelity In-Context Learning for Multimodal Tasks | Mar 13, 2026 | | Code Available | 0 |
| CM-Bench: A Comprehensive Cross-Modal Feature Matching Benchmark Bridging Visible and Infrared Images | Mar 13, 2026 | | Code Available | 0 |
| A protocol for evaluating robustness to H&E staining variation in computational pathology models | Mar 13, 2026 | | Code Available | 0 |
| FedBPrompt: Federated Domain Generalization Person Re-Identification via Body Distribution Aware Visual Prompts | Mar 13, 2026 | | Code Available | 0 |
| Fair Lung Disease Diagnosis from Chest CT via Gender-Adversarial Attention Multiple Instance Learning | Mar 13, 2026 | | Code Available | 0 |
| SortScrews: A Dataset and Baseline for Real-time Screw Classification | Mar 13, 2026 | | Code Available | 0 |
| Think and Answer ME: Benchmarking and Exploring Multi-Entity Reasoning Grounding in Remote Sensing | Mar 13, 2026 | | Code Available | 0 |
| Vision Verification Enhanced Fusion of VLMs for Efficient Visual Reasoning | Mar 13, 2026 | | Code Available | 0 |
| HFP-SAM: Hierarchical Frequency Prompted SAM for Efficient Marine Animal Segmentation | Mar 13, 2026 | | Code Available | 0 |
| UNIStainNet: Foundation-Model-Guided Virtual Staining of H&E to IHC | Mar 13, 2026 | | Code Available | 0 |
| IGASA: Integrated Geometry-Aware and Skip-Attention Modules for Enhanced Point Cloud Registration | Mar 13, 2026 | | Code Available | 0 |
| CVGL: Causal Learning and Geometric Topology | Mar 13, 2026 | | Code Available | 0 |
| Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages | Mar 13, 2026 | | Code Available | 0 |
| Multiscale Structure-Guided Latent Diffusion for Multimodal MRI Translation | Mar 13, 2026 | | Code Available | 0 |
| Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback | Mar 13, 2026 | | Code Available | 0 |
| Automatic Labelling for Low-Light Pedestrian Detection | Mar 13, 2026 | | Code Available | 0 |
| Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views | Mar 13, 2026 | | Code Available | 0 |
| Parameterized Prompt for Incremental Object Detection | Mar 13, 2026 | | Code Available | 0 |
| GraphPilot: Grounded Scene Graph Conditioning for Language-Based Autonomous Driving | Mar 13, 2026 | | Code Available | 0 |
| AnatomiX, an Anatomy-Aware Grounded Multimodal Large Language Model for Chest X-Ray Interpretation | Mar 13, 2026 | | Code Available | 0 |
| BitDance: Scaling Autoregressive Generative Models with Binary Tokens | Mar 13, 2026 | | Code Available | 0 |
| Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning | Mar 13, 2026 | | Code Available | 0 |
| SODA: Sensitivity-Oriented Dynamic Acceleration for Diffusion Transformer | Mar 13, 2026 | | Code Available | 0 |
| CMHANet: A Cross-Modal Hybrid Attention Network for Point Cloud Registration | Mar 13, 2026 | | Code Available | 0 |