SDOF-Tracker: Fast and Accurate Multiple Human Tracking by Skipped-Detection and Optical-Flow Jun 27, 2021 Human Detection Optical Flow Estimation
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation Aug 21, 2024 3D Semantic Segmentation Data Augmentation
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions Sep 19, 2024 Audio captioning Language Modeling
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation Apr 12, 2018 Optical Flow Estimation Scene Flow Estimation
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment Jun 12, 2024 3D Reconstruction Scene Understanding
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamics Audio-Visual Scenarios May 21, 2023 Audio-visual Question Answering Audio-Visual Question Answering (AVQA)
Aerial Scene Understanding in The Wild: Multi-Scene Recognition via Prototype-based Memory Networks Apr 22, 2021 Retrieval Scene Recognition
Task-Aware Asynchronous Multi-Task Model with Class Incremental Contrastive Learning for Surgical Scene Understanding Nov 28, 2022 Contrastive Learning Decision Making
Evaluating Compositional Scene Understanding in Multimodal Generative Models Mar 29, 2025 Scene Understanding
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning Mar 5, 2023 Answer Generation Entity Alignment
ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation Oct 9, 2017 GPU Real-Time Semantic Segmentation
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding Jul 28, 2024 Contrastive Learning Intention-oriented Segmentation
SeGAN: Segmenting and Generating the Invisible Mar 29, 2017 Depth Estimation Scene Understanding
Artificial Color Constancy via GoogLeNet with Angular Loss Function Nov 20, 2018 Color Constancy Object Recognition
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation Oct 2, 2023 Benchmarking Continual Learning
Temporally Consistent Horizon Lines Jul 23, 2019 3D Reconstruction Autonomous Vehicles
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation Feb 17, 2022 2D Object Detection Autonomous Driving
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions Feb 13, 2018 BIG-bench Machine Learning Management
Efficient ConvNet for Real-time Semantic Segmentation Jun 1, 2017 GPU Real-Time Semantic Segmentation
Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence May 22, 2019 Optical Flow Estimation Scene Understanding
Segmenting the Future Apr 24, 2019 Autonomous Driving Decision Making
Learning Regional Purity for Instance Segmentation on 3D Point Clouds Nov 3, 2020 3D Instance Segmentation 3D Semantic Segmentation
SeG-SR: Integrating Semantic Knowledge into Remote Sensing Image Super-Resolution via Vision-Language Model May 29, 2025 Image Super-Resolution Language Modeling
Learning Panoptic Segmentation from Instance Contours Oct 16, 2020 Clustering Instance Segmentation
Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning Nov 26, 2024 Object object-detection
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding Apr 20, 2025 Autonomous Driving Image Captioning
Efficient Computation Sharing for Multi-Task Visual Scene Understanding Mar 16, 2023 Multi-Task Learning Scene Understanding
DualMLP: a two-stream fusion model for 3D point cloud classification Oct 10, 2023 3D Point Cloud Classification Point Cloud Classification
Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation Mar 31, 2019 Autonomous Driving road scene understanding
Self-Supervised Partial Cycle-Consistency for Multi-View Matching Jan 10, 2025 Scene Understanding
Learning Monocular Depth by Distilling Cross-domain Stereo Networks Aug 20, 2018 Autonomous Driving Depth Estimation
Boundary-Seeking Generative Adversarial Networks Feb 27, 2017 Scene Understanding Text Generation
Dual-Glance Model for Deciphering Social Relationships Aug 2, 2017 model object-detection
Self-Supervised Road Layout Parsing with Graph Auto-Encoding Mar 21, 2022 Image Reconstruction Scene Understanding
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors May 30, 2025 3D geometry Large Language Model
Self-supervised Vision Transformers for 3D Pose Estimation of Novel Objects May 31, 2023 3D Pose Estimation Contrastive Learning
Zoom in on the Plant: Fine-grained Analysis of Leaf, Stem and Vein Instances Dec 14, 2023 Scene Understanding
Language-based Colorization of Scene Sketches Nov 17, 2019 Colorization Image Generation
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning Sep 16, 2021 Decoder Image Captioning
Adversarial Attacks on Monocular Pose Estimation Jul 14, 2022 Depth Estimation Monocular Depth Estimation
Visually Grounded VQA by Lattice-based Retrieval Nov 15, 2022 Information Retrieval Question Answering
The ADUULM-360 Dataset -- A Multi-Modal Dataset for Depth Estimation in Adverse Weather Nov 18, 2024 Autonomous Driving Depth Estimation
DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection May 14, 2025 object-detection Object Detection
Doubly Contrastive End-to-End Semantic Segmentation for Autonomous Driving under Adverse Weather Nov 21, 2022 Autonomous Driving GPU
A Review on Deep Learning Techniques Applied to Semantic Segmentation Apr 22, 2017 Autonomous Driving Deep Learning
Semantic Foreground Inpainting from Weak Supervision Sep 10, 2019 Scene Understanding Semantic Segmentation
BOLD5000: A public fMRI dataset of 5000 images Sep 5, 2018 Diversity Scene Understanding
DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding Mar 25, 2024 Decoder Object
UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks Aug 10, 2021 Depth Estimation Depth Prediction
Knowledge-Guided Object Discovery with Acquired Deep Impressions Mar 19, 2021 Object Object Discovery