Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 801–850 of 1723 papers

Title	Date	Tasks	Status
Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation	Apr 2, 2021	Autonomous Driving, Decoder	Unverified
A Multiple-View Geometric Model for Specularity Prediction on General Curved Surfaces	Aug 20, 2021	3D Reconstruction, Prediction	Unverified
Learning to Detect Human-Object Interactions With Knowledge	Jun 1, 2019	Human-Object Interaction Detection, Object	Unverified
Learning to Exploit Stability for 3D Scene Parsing	Dec 1, 2018	Scene Parsing, Scene Understanding	Unverified
Learning to Interpret and Describe Abstract Scenes	May 1, 2015	Attribute, Image Retrieval	Unverified
Multimodal 3D Object Detection on Unseen Domains	Apr 17, 2024	3D Object Detection, Autonomous Driving	Unverified
IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement	Jun 29, 2021	2D Semantic Segmentation, 3D Semantic Scene Completion	Unverified
Image-to-Height Domain Translation for Synthetic Aperture Sonar	Dec 12, 2021	Generative Adversarial Network, Scene Understanding	Unverified
Deep cross-domain building extraction for selective depth estimation from oblique aerial imagery	Apr 23, 2018	3D Reconstruction, Depth Estimation	Unverified
Leverage Cross-Attention for End-to-End Open-Vocabulary Panoptic Reconstruction	Jan 2, 2025	Instance Segmentation, Scene Understanding	Unverified
Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization	Jul 27, 2015	Scene Understanding, Semantic Similarity	Unverified
An Exemplar-based CRF for Multi-instance Object Segmentation	Jun 1, 2014	Instance Segmentation, Object	Unverified
Leveraging Auxiliary Text for Deep Recognition of Unseen Visual Relationships	Oct 27, 2019	Graph Generation, Relationship Detection	Unverified
Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems	Jun 17, 2025	Autonomous Driving, Image Segmentation	Unverified
Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation	Nov 3, 2023	3D Semantic Segmentation, Point Cloud Segmentation	Unverified
A Comprehensive Review of Modern Object Segmentation Approaches	Jan 13, 2023	Image Segmentation, Object	Unverified
Image Parsing with Stochastic Scene Grammar	Dec 1, 2011	Clustering, Scene Labeling	Unverified
Deep Contextual Attention for Human-Object Interaction Detection	Oct 17, 2019	Human-Object Interaction Detection, Object	Unverified
Lifting GIS Maps into Strong Geometric Context for Scene Understanding	Jul 14, 2015	Depth Estimation, object-detection	Unverified
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding	Mar 16, 2016	Object, Scene Understanding	Unverified
Image-Graph-Image Translation via Auto-Encoding	Dec 10, 2020	Scene Understanding, Translation	Unverified
A model of saliency-based visual attention for rapid scene analysis	Nov 1, 1998	Saliency Prediction, Scene Understanding	Unverified
Multimodal 3D Reasoning Segmentation with Complex Scenes	Nov 21, 2024	Reasoning Segmentation, Scene Understanding	Unverified
LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment	Feb 27, 2024	Scene Understanding	Unverified
Living in a Material World: Learning Material Properties from Full-Waveform Flash Lidar Data for Semantic Segmentation	May 7, 2023	Scene Understanding, Semantic Segmentation	Unverified
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding	Jun 16, 2019	Caption Generation, Image Captioning	Unverified
Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks	Aug 23, 2021	Face Recognition, Object Recognition	Unverified
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness	Sep 26, 2024	3D Question Answering (3D-QA), Position	Unverified
LLaVA-4D: Embedding SpatioTemporal Prompt into LMMs for 4D Scene Understanding	May 18, 2025	Scene Understanding	Unverified
IM2CAD	Aug 18, 2016	Scene Understanding	Unverified
A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features	Jan 17, 2025	Language Modeling, Language Modelling	Unverified
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding	Apr 5, 2022	Autonomous Vehicles, Scene Understanding	Unverified
AVD2: Accident Video Diffusion for Accident Video Description	Feb 20, 2025	Autonomous Driving, Scene Understanding	Unverified
Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation	Feb 14, 2024	Decoder, Object	Unverified
Identifying First-person Camera Wearers in Third-person Videos	Apr 20, 2017	Activity Recognition, Object Tracking	Unverified
AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving	Sep 12, 2023	Autonomous Driving, Benchmarking	Unverified
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation	Jun 17, 2023	Decision Making, Instruction Following	Unverified
Long Range Pooling for 3D Large-Scale Scene Understanding	Jan 17, 2023	Scene Understanding	Unverified
DAWN: Vehicle Detection in Adverse Weather Nature Dataset	Aug 12, 2020	Autonomous Driving, Scene Understanding	Unverified
MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors	Sep 21, 2024	2D Semantic Segmentation, 3D Semantic Segmentation	Unverified
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs	Jun 5, 2025	cross-modal alignment, Dense Captioning	Unverified
Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation	Mar 25, 2023	Domain Adaptation, ERP	Unverified
DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation	May 28, 2025	Autonomous Navigation, RAG	Unverified
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding	Aug 1, 2023	3D geometry, 3D Open-Vocabulary Instance Segmentation	Unverified
Data-Driven Scene Understanding with Adaptively Retrieved Exemplars	Feb 3, 2015	Scene Understanding, Semantic Segmentation	Unverified
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition	Jul 9, 2024	Instruction Following, Representation Learning	Unverified
A Variational Observation Model of 3D Object for Probabilistic Semantic SLAM	Sep 14, 2018	Bayesian Inference, Object	Unverified
Movies2Scenes: Using Movie Metadata to Learn Scene Representation	Feb 22, 2022	Contrastive Learning, Scene Understanding	Unverified
HyKo: A Spectral Dataset for Scene Understanding	Oct 22, 2017	Autonomous Driving, Scene Understanding	Unverified
A Comparative Evaluation of Approximate Probabilistic Simulation and Deep Neural Networks as Accounts of Human Physical Scene Understanding	May 4, 2016	Scene Understanding	Unverified


Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified