Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 351–375 of 1723 papers

Title	Date	Tasks	Status	Hype
Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality	Mar 11, 2021	Scene Understanding, Time Series	Code Available	1
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding	Jan 28, 2022	Graph Attention, Knowledge Distillation	Code Available	1
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision	Apr 3, 2025	3D Object Detection, cross-modal alignment	Code Available	1
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks	Feb 17, 2023	Deblurring, Deep Learning	Code Available	1
F-ViTA: Foundation Model Guided Visible to Thermal Translation	Apr 3, 2025	Scene Understanding, Style Transfer	Code Available	1
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding	Nov 9, 2015	Decision Making, Decoder	Code Available	1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection	Jul 30, 2021	3D Object Detection, object-detection	Code Available	1
Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior	Jun 17, 2023	3D Object Reconstruction, Object	Code Available	1
Distilled Semantics for Comprehensive Scene Understanding from Videos	Mar 31, 2020	Depth Estimation, Knowledge Distillation	Code Available	1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization	Aug 24, 2021	Diversity, Graph Neural Network	Code Available	1
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection	Dec 4, 2021	3D Object Detection, Object	Code Available	1
NODIS: Neural Ordinary Differential Scene Understanding	Jan 14, 2020	All, Graph Generation	Code Available	1
OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation	Jul 28, 2023	Autonomous Driving, Scene Understanding	Code Available	1
DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization	Apr 30, 2023	Decoder, NeRF	Code Available	1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects	Mar 27, 2018	General Classification, Object	Code Available	1
Occlusion-Aware Depth Estimation with Adaptive Normal Constraints	Apr 2, 2020	3D Reconstruction, Depth Estimation	Code Available	1
AeroRIT: A New Scene for Hyperspectral Image Analysis	Dec 17, 2019	Hyperspectral image analysis, Image Super-Resolution	Code Available	1
FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation	Mar 1, 2021	3D Semantic Segmentation, Decoder	Code Available	1
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection	Nov 17, 2022	3D Object Detection, Depth Estimation	Code Available	1
One-Shot Object Affordance Detection in the Wild	Aug 8, 2021	Action Recognition, Affordance Detection	Code Available	1
FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions	Oct 4, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP	Jan 12, 2023	3D Semantic Segmentation, Contrastive Learning	Code Available	1
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration	Dec 17, 2024	audio-visual event localization, audio-visual learning	Code Available	1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation	Dec 22, 2021	Common Sense Reasoning, Question Answering	Code Available	1
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving	Aug 14, 2023	Autonomous Driving, Optical Flow Estimation	Code Available	1



Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified