Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
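As a rough illustration of going from recognized objects to the relationships between them, the sketch below derives coarse spatial relations from 2D bounding boxes and lists them as (subject, relation, object) triples. The Box class, the spatial_relation and scene_graph helpers, and the sample detections are all hypothetical names and values chosen for this example; real systems typically use learned models and much richer relations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D bounding box in image coordinates (y grows downward)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self) -> tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def spatial_relation(a: Box, b: Box) -> str:
    """Return a coarse relation of `a` with respect to `b`, based on box centers."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Use whichever axis has the larger displacement between the two centers.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_graph(boxes: list[Box]) -> list[tuple[str, str, str]]:
    """List (subject, relation, object) triples for every ordered pair of boxes."""
    return [
        (a.label, spatial_relation(a, b), b.label)
        for a in boxes
        for b in boxes
        if a is not b
    ]


# Hypothetical detections, as an object detector might output them.
detections = [
    Box("lamp", 40, 10, 60, 50),
    Box("table", 20, 60, 120, 100),
]
triples = scene_graph(detections)
print(triples)
```

Object recognition alone would stop at the labels "lamp" and "table"; the triples add the relational context that the description above refers to.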

Papers


Showing 451–475 of 1723 papers

Title	Date	Tasks	Status	Hype
The Cityscapes Dataset for Semantic Urban Scene Understanding	Apr 6, 2016	object-detection, Object Detection	Code Available	1
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs	Mar 25, 2025	Benchmarking, Scene Segmentation	Code Available	1
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation	Jul 9, 2020	Scene Understanding	Code Available	1
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding	Apr 16, 2020	Human Part Segmentation, Panoptic Segmentation	Code Available	1
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments	Jul 10, 2022	Instance Segmentation, Panoptic Segmentation	Code Available	1
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks	Mar 28, 2020	3D Medical Imaging Segmentation, Action Recognition	Code Available	1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection	Jul 30, 2021	3D Object Detection, object-detection	Code Available	1
F-ViTA: Foundation Model Guided Visible to Thermal Translation	Apr 3, 2025	Scene Understanding, Style Transfer	Code Available	1
Egocentric Scene Understanding via Multimodal Spatial Rectifier	Jul 14, 2022	Scene Understanding, Surface Normal Estimation	Code Available	1
Towards In-context Scene Understanding	Jun 2, 2023	Depth Estimation, In-Context Learning	Code Available	1
CamContextI2V: Context-aware Controllable Video Generation	Apr 8, 2025	Diversity, Scene Understanding	Code Available	1
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data	Nov 17, 2021	3D Object Detection, object-detection	Code Available	1
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding	Jan 14, 2025	Language Modeling, Language Modelling	Code Available	1
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation	Dec 24, 2021	Depth Estimation, Depth Prediction	Code Available	1
TPSeNCE: Towards Artifact-Free Realistic Rain Generation for Deraining and Object Detection in Rain	Nov 1, 2023	Contrastive Learning, Image-to-Image Translation	Code Available	1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment	Aug 30, 2021	Blocking, Graph Generation	Code Available	1
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding	Mar 14, 2024	Contrastive Learning, Representation Learning	Code Available	1
Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs	Apr 17, 2025	3D geometry, 3DGS	Code Available	1
Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene	Aug 11, 2020	Instance Segmentation, Point Cloud Segmentation	Code Available	1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation	Feb 27, 2022	Autonomous Driving, Knowledge Distillation	Code Available	1
TransRadar: Adaptive-Directional Transformer for Real-Time Multi-View Radar Semantic Segmentation	Oct 3, 2023	Autonomous Driving, Scene Understanding	Code Available	1
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks	Aug 17, 2021	3D Instance Segmentation, Instance Segmentation	Code Available	1
Uncertainty-aware Panoptic Segmentation	Jun 29, 2022	Panoptic Segmentation, Scene Understanding	Code Available	1
Uncertainty-Driven Active Vision for Implicit Scene Reconstruction	Oct 3, 2022	Scene Understanding	Code Available	1
Microsoft COCO: Common Objects in Context	May 1, 2014	Instance Segmentation, Object	Code Available	1


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	-	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	-	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	-	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	-	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	-	Unverified
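For readers unfamiliar with the Mean IoU metric reported for ADE20K val, the following is a minimal sketch of how mean intersection-over-union is commonly computed for semantic segmentation, using flat lists of per-pixel class labels. The function name mean_iou, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions of this example, not the benchmark's official evaluation code. Leaderboards usually report the value as a percentage, which appears to be the case for the 46.3 figure.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # assumption: classes absent from both are ignored
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.3f} ({100 * score:.1f}%)")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.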