Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 426–450 of 1723 papers

Title	Date	Tasks	Status	Hype
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models	Jul 23, 2022	Scene Understanding	Code Available	1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection	Jul 30, 2021	3D Object Detection, object-detection	Code Available	1
Semantic Segmentation-Assisted Instance Feature Fusion for Multi-Level 3D Part Instance Segmentation	Aug 9, 2022	3D Instance Segmentation, 3D Part Segmentation	Code Available	1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment	Aug 30, 2021	Blocking, Graph Generation	Code Available	1
SemSegDepth: A Combined Model for Semantic Segmentation and Depth Completion	Sep 1, 2022	Depth Completion, Scene Understanding	Code Available	1
Global Aggregation then Local Distribution in Fully Convolutional Networks	Sep 16, 2019	Instance Segmentation, object-detection	Code Available	1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving	May 13, 2025	3D visual grounding, Autonomous Driving	Code Available	1
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding	Dec 5, 2020	image-classification, Image Classification	Code Available	1
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction	May 9, 2024	Contrastive Learning, Scene Understanding	Code Available	1
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding	Apr 16, 2020	Human Part Segmentation, Panoptic Segmentation	Code Available	1
Dual-Hybrid Attention Network for Specular Highlight Removal	Jul 17, 2024	highlight removal, Object Recognition	Code Available	1
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving	Aug 14, 2023	Autonomous Driving, Optical Flow Estimation	Code Available	1
Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds	Sep 1, 2021	3D Object Detection, 3D Point Cloud Classification	Code Available	1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning	May 31, 2022	Common Sense Reasoning, Graph Generation	Code Available	1
Dynamic Graph Message Passing Networks	Aug 19, 2019	Image Classification, object-detection	Code Available	1
Dynamic Graph Message Passing Networks for Visual Recognition	Sep 20, 2022	image-classification, Image Classification	Code Available	1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models	May 15, 2023	3D Object Detection, Image Captioning	Code Available	1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images	Feb 16, 2021	Decision Making, Scene Understanding	Code Available	1
Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild	Jul 23, 2020	Few-Shot Object Detection, Meta-Learning	Code Available	1
Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation	Mar 8, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	1
FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation	Mar 1, 2021	3D Semantic Segmentation, Decoder	Code Available	1
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data	Nov 17, 2021	3D Object Detection, object-detection	Code Available	1
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding	Jan 14, 2025	Language Modeling, Language Modelling	Code Available	1
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation	Dec 24, 2021	Depth Estimation, Depth Prediction	Code Available	1
FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions	Oct 4, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified