Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 1101–1150 of 1723 papers

Title	Date	Tasks	Status
Parametric Exponential Linear Unit for Deep Convolutional Neural Networks	May 30, 2016	Object Recognition, Scene Understanding	Unverified
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences	Jul 10, 2024	Multi-Task Learning, Scene Understanding	Unverified
A Memory System of a Robot Cognitive Architecture and its Implementation in ArmarX	Jun 5, 2022	Scene Understanding	Unverified
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation	Apr 10, 2023	Panoptic Segmentation, Scene Understanding	Unverified
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding	Feb 24, 2025	Question Answering, Response Generation	Unverified
Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art	Apr 18, 2017	Autonomous Driving, Autonomous Vehicles	Unverified
Compositional Scene Understanding through Inverse Generative Modeling	May 27, 2025	Scene Understanding	Unverified
4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives	Dec 30, 2024	Novel View Synthesis, Scene Understanding	Unverified
Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image	Aug 20, 2021	Retrieval, Scene Understanding	Unverified
Compositional 3D Human-Object Neural Animation	Apr 27, 2023	Human-Object Interaction Detection, NeRF	Unverified
PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds	Feb 29, 2024	Depth Estimation, Depth Prediction	Unverified
CompNVS: Novel View Synthesis with Scene Completion	Jul 23, 2022	Novel View Synthesis, Scene Understanding	Unverified
Pedestrian Travel Time Estimation in Crowded Scenes	Dec 1, 2015	Blocking, Scene Understanding	Unverified
Peek-a-Boo: Occlusion Reasoning in Indoor Scenes With Plane Representations	Jun 1, 2020	Scene Understanding	Unverified
Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving	Sep 16, 2024	Autonomous Driving, Logical Reasoning	Unverified
Complete 3d relationships extraction modality alignment network for 3d dense captioning	Aug 1, 2024	3D dense captioning, 3D Object Detection	Unverified
PerspectiveNet: Multi-View Perception for Dynamic Scene Understanding	Oct 22, 2024	Scene Understanding, Text Generation	Unverified
Competitive Simplicity for Multi-Task Learning for Real-Time Foggy Scene Understanding via Domain Adaptation	Dec 9, 2020	Depth Estimation, Domain Adaptation	Unverified
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly	Jun 10, 2025	Question Answering, Scene Understanding	Unverified
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding	Jan 27, 2025	Benchmarking, Common Sense Reasoning	Unverified
Combining Implicit-Explicit View Correlation for Light Field Semantic Segmentation	Jan 1, 2023	Scene Understanding, Segmentation	Unverified
Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks	Dec 22, 2016	Boundary Detection, Edge Detection	Unverified
4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding	Dec 6, 2021	3D Instance Segmentation, 3D Semantic Segmentation	Unverified
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties	Jun 27, 2023	Friction, Scene Understanding	Unverified
PhysPose: Refining 6D Object Poses with Physical Constraints	Mar 30, 2025	6D Pose Estimation using RGB, Pose Estimation	Unverified
Picture: A Probabilistic Programming Language for Scene Perception	Jun 1, 2015	3D Human Pose Estimation, 3D Object Reconstruction	Unverified
CoMatcher: Multi-View Collaborative Feature Matching	Apr 2, 2025	Scene Understanding, set matching	Unverified
Cognitive Interpretation of Everyday Activities: Toward Perceptual Narrative Based Visuo-Spatial Scene Interpretation	Jun 22, 2013	Position, Scene Understanding	Unverified
Places: An Image Database for Deep Scene Understanding	Oct 6, 2016	BIG-bench Machine Learning, Classification	Unverified
COFGA: Classification Of Fine-Grained Features In Aerial Images	Aug 27, 2018	Classification, General Classification	Unverified
PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN	Mar 29, 2021	Instance Segmentation, Scene Understanding	Unverified
3D Vision-Language Gaussian Splatting	Oct 10, 2024	3D Reconstruction, Autonomous Driving	Unverified
Plausible Uncertainties for Human Pose Regression	Jan 1, 2023	Autonomous Driving, Pose Estimation	Unverified
PointCA: Evaluating the Robustness of 3D Point Cloud Completion Models Against Adversarial Examples	Nov 22, 2022	Adversarial Attack, Point Cloud Classification	Unverified
CodeMapping: Real-Time Dense Mapping for Sparse SLAM using Compact Scene Representations	Jul 19, 2021	3D Reconstruction, Depth Estimation	Unverified
Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection	May 25, 2025	cross-modal alignment, Scene Understanding	Unverified
ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings	Mar 29, 2020	Autonomous Driving, Clustering	Unverified
CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction	Apr 22, 2024	3D Point Cloud Classification, Autonomous Vehicles	Unverified
Cloud-Device Collaborative Learning for Multimodal Large Language Models	Dec 26, 2023	Device-Cloud Collaboration, Knowledge Distillation	Unverified
Clock-Modeled Ternary Spatial Relations for Visual Scene Analysis	Mar 1, 2013	Scene Understanding	Unverified
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP	Mar 8, 2023	Scene Understanding, Semantic Segmentation	Unverified
Polarimetric Spatio-Temporal Light Transport Probing	May 25, 2021	Metamerism, Scene Understanding	Unverified
Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization	Jan 13, 2024	Pseudo Label, Representation Learning	Unverified
Classification of Geographical Land Structure Using Convolution Neural Network and Transfer Learning	Nov 19, 2024	Scene Understanding, Transfer Learning	Unverified
Classification of Aerial Photogrammetric 3D Point Clouds	May 23, 2017	Classification, General Classification	Unverified
Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments	Mar 21, 2017	Motion Planning, Scene Understanding	Unverified
3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing	Aug 25, 2024	Data Augmentation, Diversity	Unverified
Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation	Sep 30, 2024	Image Retrieval, Scene Understanding	Unverified
PoSeg: Pose-Aware Refinement Network for Human Instance Segmentation	Jan 7, 2020	Human Instance Segmentation, Instance Segmentation	Unverified
PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation	Jun 28, 2024	Decoder, Image Segmentation	Unverified



Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified