GeoMag: A Vision-Language Model for Pixel-level Fine-Grained Remote Sensing Image Parsing Jul 8, 2025 Language Modeling Language Modelling
— Unverified 0 Out-of-distribution detection in 3D applications: a review Jul 1, 2025 Autonomous Driving Navigate
— Unverified 0 SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds Jun 16, 2025 3D Object Recognition Object Recognition
Code Available 0 Continual Hyperbolic Learning of Instances and Classes Jun 12, 2025 Continual Learning Object Recognition
— Unverified 0 DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects Jun 11, 2025 Depth Completion Depth Estimation
— Unverified 0 Aligning Text, Images, and 3D Structure Token-by-Token Jun 9, 2025 3D Object Recognition Instruction Following
— Unverified 0 STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving Jun 6, 2025 Autonomous Driving Autonomous Vehicles
Code Available 1 Feature-Based Lie Group Transformer for Real-World Applications Jun 5, 2025 Object Object Recognition
— Unverified 0 EV-Flying: an Event-based Dataset for In-The-Wild Recognition of Flying Objects Jun 4, 2025 Event-based vision Object Recognition
— Unverified 0 Explicitly Modeling Subcortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness Jun 3, 2025 Data Augmentation Object Recognition
— Unverified 0 Efficient Estimation of Regularized Tyler's M-Estimator Using Approximate LOOCV May 30, 2025 Face Recognition Object Recognition
— Unverified 0 TrackVLA: Embodied Visual Tracking in the Wild May 29, 2025 Language Modeling Language Modelling
— Unverified 0 SHTOcc: Effective 3D Occupancy Prediction with Sparse Head and Tail Voxels May 28, 2025 Autonomous Driving GPU
Code Available 0 ADD-SLAM: Adaptive Dynamic Dense SLAM with Gaussian Splatting May 26, 2025 NeRF object-detection
— Unverified 0 Detailed Evaluation of Modern Machine Learning Approaches for Optic Plastics Sorting May 22, 2025 Instance Segmentation Object Recognition
— Unverified 0 RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation May 21, 2025 GPU Natural Language Queries
— Unverified 0 InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition May 21, 2025 Earth Observation Object
Code Available 2 Refining Neural Activation Patterns for Layer-Level Concept Discovery in Neural Network-Based Receivers May 21, 2025 Clustering Object Recognition
— Unverified 0 PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI May 19, 2025 Benchmarking Minecraft
— Unverified 0 ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding May 18, 2025 Brain Decoding Contrastive Learning
— Unverified 0 Model alignment using inter-modal bridges May 18, 2025 Image Generation model
— Unverified 0 AW-GATCN: Adaptive Weighted Graph Attention Convolutional Network for Event Camera Data Joint Denoising and Object Recognition May 16, 2025 Denoising Event Segmentation
— Unverified 0 A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision May 16, 2025 Large Language Model Navigate
— Unverified 0 MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence May 15, 2025 Attribute Object
— Unverified 0 Improving Unsupervised Task-driven Models of Ventral Visual Stream via Relative Position Predictivity May 13, 2025 Contrastive Learning Object
Code Available 0 Topology-Guided Knowledge Distillation for Efficient Point Cloud Processing May 12, 2025 3D Object Recognition Autonomous Driving
Code Available 0 Visually Interpretable Subtask Reasoning for Visual Question Answering May 12, 2025 Attribute Object Recognition
Code Available 0 ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding May 9, 2025 Image Captioning Object Recognition
— Unverified 0 Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models May 3, 2025 Diagnostic Object Recognition
— Unverified 0 Transferable Adversarial Attacks on Black-Box Vision-Language Models May 2, 2025 Image Captioning Object Recognition
— Unverified 0 Zoomer: Adaptive Image Focus Optimization for Black-box MLLM Apr 30, 2025 Image Captioning Object Recognition
— Unverified 0 LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition Apr 27, 2025 3D Object Recognition Object
— Unverified 0 Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency Apr 24, 2025 Benchmarking Math
Code Available 1 Disaggregated Deep Learning via In-Physics Computing at Radio Frequency Apr 24, 2025 Autonomous Navigation Deep Learning
— Unverified 0 V^2R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations Apr 23, 2025 Dataset Generation Object Recognition
— Unverified 0 Naturally Computed Scale Invariance in the Residual Stream of ResNet18 Apr 22, 2025 Object Recognition
Code Available 0 Quantum Doubly Stochastic Transformers Apr 22, 2025 Inductive Bias Object Recognition
— Unverified 0 Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation Apr 17, 2025 GPU Object Recognition
Code Available 2 DVLTA-VQA: Decoupled Vision-Language Modeling with Text-Guided Adaptation for Blind Video Quality Assessment Apr 16, 2025 Language Modeling Language Modelling
— Unverified 0 Visual Language Models show widespread visual deficits on neuropsychological tests Apr 15, 2025 Object Recognition Visual Reasoning
— Unverified 0 MASSeg : 2nd Technical Report for 4th PVUW MOSE Track Apr 14, 2025 Data Augmentation Object
Code Available 0 Hardware, Algorithms, and Applications of the Neuromorphic Vision Sensor: a Review Apr 11, 2025 Object Recognition Optical Flow Estimation
— Unverified 0 P2Object: Single Point Supervised Object Detection and Instance Segmentation Apr 10, 2025 Instance Segmentation Multiple Instance Learning
Code Available 2 D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition Apr 8, 2025 Image Generation Object
— Unverified 0 Advancing Egocentric Video Question Answering with Multimodal Large Language Models Apr 6, 2025 Object Recognition Question Answering
— Unverified 0 ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection Mar 28, 2025 Action Recognition Human-Object Interaction Detection
— Unverified 0 Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users Mar 28, 2025 Object Recognition Reading Comprehension
— Unverified 0 Foveated Instance Segmentation Mar 27, 2025 Instance Segmentation Object Recognition
Code Available 0 DuckSegmentation: A segmentation model based on the AnYue Hemp Duck Dataset Mar 27, 2025 Knowledge Distillation Object Recognition
— Unverified 0 Leveraging 3D Geometric Priors in 2D Rotation Symmetry Detection Mar 26, 2025 Object Recognition Symmetry Detection
— Unverified 0