Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
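
As a rough illustration of the step from recognising objects to relating them, the sketch below derives coarse spatial relations from labelled bounding boxes. It is a toy example, not a method from any paper listed here: the `Box` class, the `relation` and `scene_relations` helpers, and the center-comparison rule are all hypothetical and chosen only for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical detection: a label plus an axis-aligned box.

    Coordinates are image coordinates, so y grows downward.
    """
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relation(a: Box, b: Box) -> str:
    """Return a coarse relation of `a` with respect to `b`, using box centers only."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Pick the axis along which the two centers are further apart.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_relations(boxes: List[Box]) -> List[Tuple[str, str, str]]:
    """List (subject, relation, object) triples for every ordered pair of distinct boxes."""
    return [
        (a.label, relation(a, b), b.label)
        for a in boxes
        for b in boxes
        if a is not b
    ]


if __name__ == "__main__":
    # Made-up boxes for illustration only.
    scene = [
        Box("lamp", 40, 10, 60, 40),
        Box("table", 20, 50, 80, 90),
        Box("chair", 100, 55, 130, 95),
    ]
    for triple in scene_relations(scene):
        print(triple)
```

Real systems reason over far richer cues (depth, occlusion, semantics, learned relation classifiers), so the triples here only show the kind of output that separates scene understanding from plain object recognition.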

Papers

Showing 476–500 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
Semantic Segmentation-Assisted Instance Feature Fusion for Multi-Level 3D Part Instance Segmentation	Aug 9, 2022	3D Instance Segmentation, 3D Part Segmentation	Code Available	1	5
Panoptic Video Scene Graph Generation	Nov 28, 2023	Graph Generation, Panoptic Scene Graph Generation	Code Available	1	5
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning	Jun 21, 2022	Contrastive Learning, Domain Generalization	Code Available	1	5
Towards Efficient Scene Understanding via Squeeze Reasoning	Nov 6, 2020	Instance Segmentation, object-detection	Code Available	1	5
Predicting Deeper into the Future of Semantic Segmentation	Mar 22, 2017	Attribute, Autonomous Driving	Code Available	0	5
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment	Jun 12, 2024	3D Reconstruction, Scene Understanding	Code Available	0	5
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding	Apr 20, 2025	Autonomous Driving, Image Captioning	Code Available	0	5
Pose-aware Multi-level Feature Network for Human Object Interaction Detection	Sep 18, 2019	Human-Object Interaction Detection, Object	Code Available	0	5
Planning Safety Trajectories with Dual-Phase, Physics-Informed, and Transportation Knowledge-Driven Large Language Models	Apr 6, 2025	Computational Efficiency, General Knowledge	Code Available	0	5
Evaluating Compositional Scene Understanding in Multimodal Generative Models	Mar 29, 2025	Scene Understanding	Code Available	0	5
A Review on Deep Learning Techniques Applied to Semantic Segmentation	Apr 22, 2017	Autonomous Driving, Deep Learning	Code Available	0	5
ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation	Oct 9, 2017	GPU, Real-Time Semantic Segmentation	Code Available	0	5
Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video	May 27, 2019	Inductive Bias, Model Predictive Control	Code Available	0	5
PENet: A Joint Panoptic Edge Detection Network	Mar 15, 2023	Edge Detection, Multi-Task Learning	Code Available	0	5
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation	Feb 17, 2022	2D Object Detection, Autonomous Driving	Code Available	0	5
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding	Oct 19, 2024	Autonomous Driving, object-detection	Code Available	0	5
Parsing Geometry Using Structure-Aware Shape Templates	Aug 3, 2018	Object, Object Recognition	Code Available	0	5
Parsing Natural Scenes and Natural Language with Recursive Neural Networks	Jun 1, 2011	General Classification, Scene Classification	Code Available	0	5
Panoramic Depth Estimation via Supervised and Unsupervised Learning in Indoor Scenes	Aug 18, 2021	Camera Calibration, Depth Estimation	Code Available	0	5
PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video	Jan 1, 2024	3D Panoptic Segmentation, 3D Reconstruction	Code Available	0	5
OVeNet: Offset Vector Network for Semantic Segmentation	Mar 25, 2023	Optical Character Recognition (OCR), Scene Understanding	Code Available	0	5
OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies	Dec 31, 2024	3DGS, 3D Semantic Segmentation	Code Available	0	5
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding	Jul 10, 2025	Scene Understanding, Spatial Reasoning	Code Available	0	5
P2AT: Pyramid Pooling Axial Transformer for Real-time Semantic Segmentation	Oct 23, 2023	Autonomous Driving, Decoder	Code Available	0	5
Parallel Neural Computing for Scene Understanding from LiDAR Perception in Autonomous Racing	Dec 24, 2024	Autonomous Driving, Autonomous Racing	Code Available	0	5

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified