Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 201–225 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction	May 9, 2024	Contrastive Learning, Scene Understanding	Code Available	1	5
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering	Jul 30, 2024	Inverse Rendering, NeRF	Code Available	1	5
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks	Aug 17, 2021	3D Instance Segmentation, Instance Segmentation	Code Available	1	5
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos	Apr 17, 2023	3D Reconstruction, Camera Pose Estimation	Code Available	1	5
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments	Sep 22, 2020	Domain Adaptation, Scene Understanding	Code Available	1	5
Dynamic Graph Message Passing Networks	Aug 19, 2019	Image Classification, object-detection	Code Available	1	5
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model	Mar 30, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	1	5
Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding	Nov 29, 2024	3D geometry, 3DGS	Code Available	1	5
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments	Jul 10, 2022	Instance Segmentation, Panoptic Segmentation	Code Available	1	5
IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation	Dec 20, 2019	Disparity Estimation, Scene Understanding	Code Available	1	5
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning	Mar 10, 2025	Object, Scene Understanding	Code Available	1	5
Boundary-induced and scene-aggregated network for monocular depth prediction	Feb 26, 2021	Depth Estimation, Depth Prediction	Code Available	1	5
Holistic 3D Scene Understanding from a Single Image with Implicit Representation	Mar 11, 2021	3D Object Detection, 3D Shape Reconstruction	Code Available	1	5
HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans	Sep 12, 2023	3D Object Retrieval, 3D Scene Reconstruction	Code Available	1	5
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge	May 31, 2019	object-detection, Object Detection	Code Available	1	5
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge	Sep 12, 2022	Instance Segmentation, Object Detection	Code Available	1	5
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge	Nov 21, 2023	Large Language Model, Multimodal Deep Learning	Code Available	1	5
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery	Jan 20, 2025	Language Modeling, Language Modelling	Code Available	1	5
Estimating Generic 3D Room Structures from 2D Annotations	Jun 15, 2023	Scene Understanding	Code Available	1	5
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models	May 15, 2023	3D Object Detection, Image Captioning	Code Available	1	5
AVSegFormer: Audio-Visual Segmentation with Transformer	Jul 3, 2023	Decoder, Scene Understanding	Code Available	1	5
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian	Aug 7, 2024	Autonomous Driving, object-detection	Code Available	1	5
Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis	Mar 9, 2021	3d scene graph generation, graph construction	Code Available	1	5
Explainable Object-induced Action Decision for Autonomous Vehicles	Mar 20, 2020	Autonomous Driving, Autonomous Vehicles	Code Available	1	5
Human-centric Scene Understanding for 3D Large-scale Scenarios	Jul 26, 2023	Action Recognition, Scene Understanding	Code Available	1	5


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified