Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 1501–1550 of 1723 papers

Title	Date	Tasks	Status
On the iterative refinement of densely connected representation levels for semantic segmentation	Apr 30, 2018	Image Segmentation, Scene Understanding	Code Available
One model to use them all: Training a segmentation model with complementary datasets	Feb 29, 2024	All, Anatomy	Code Available
Image interpretation by iterative bottom-up top-down processing	May 12, 2021	Scene Understanding	Code Available
Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection	Oct 1, 2019	RGB-D Salient Object Detection, Saliency Detection	Code Available
PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds	Mar 18, 2025	3D Object Detection, 3D Semantic Segmentation	Code Available
Adapting Deep Network Features to Capture Psychological Representations	Aug 6, 2016	Object Recognition, Scene Understanding	Code Available
Single Image 3D Object Estimation with Primitive Graph Networks	Sep 9, 2021	Graph Neural Network, Object	Code Available
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity	Mar 8, 2025	Depth Estimation, Scene Understanding	Code Available
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields	Mar 17, 2024	3D Reconstruction, NeRF	Code Available
Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction	Jan 28, 2022	3D Reconstruction, Depth Estimation	Code Available
Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency	Jan 1, 2024	3D visual grounding, Relation	Code Available
Quantitative Depth Quality Assessment of RGBD Cameras At Close Range Using 3D Printed Fixtures	Mar 21, 2019	Scene Understanding	Code Available
Single Network Panoptic Segmentation for Street Scene Understanding	Feb 7, 2019	Instance Segmentation, Panoptic Segmentation	Code Available
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation	Feb 2, 2022	PointGoal Navigation, Scene Understanding	Code Available
Object-aware Sound Source Localization via Audio-Visual Scene Understanding	Jan 1, 2025	Scene Understanding, Sound Source Localization	Code Available
Single Shot Scene Text Retrieval	Aug 27, 2018	Image Retrieval, Retrieval	Code Available
Beyond Human Perception: Understanding Multi-Object World from Monocular View	Jan 1, 2025	3D visual grounding, Denoising	Code Available
The Ikshana Hypothesis of Human Scene Understanding	Jan 21, 2021	Representation Learning, Scene Understanding	Code Available
Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection	Jan 25, 2019	Anomaly Detection, Decoder	Code Available
DenseASPP for Semantic Segmentation in Street Scenes	Jun 1, 2018	Autonomous Driving, Image Segmentation	Code Available
Weakly Supervised Segmentation on Outdoor 4D Point Clouds With Temporal Matching and Spatial Graph Propagation	Jan 1, 2022	Point Cloud Segmentation, Scene Understanding	Code Available
Towards Improving the Generation Quality of Autoregressive Slot VAEs	Jun 3, 2022	Image Generation, Object	Code Available
SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences	Apr 2, 2019	3D Semantic Segmentation, Scene Understanding	Code Available
Deep Video Deblurring for Hand-Held Cameras	Jul 1, 2017	Deblurring, Image Deblurring	Code Available
RCNet: Deep Recurrent Collaborative Network for Multi-View Low-Light Image Enhancement	Sep 6, 2024	Image Enhancement, Low-Light Image Enhancement	Code Available
Deep Video Deblurring	Nov 25, 2016	Deblurring, Image Deblurring	Code Available
IGFNet: Illumination-Guided Fusion Network for Semantic Scene Understanding using RGB-Thermal Images	Dec 4, 2023	Autonomous Driving, Scene Understanding	Code Available
Real-time 3D Traffic Cone Detection for Autonomous Driving	Feb 6, 2019	3D Object Detection, Autonomous Driving	Code Available
Towards Global Localization using Multi-Modal Object-Instance Re-Identification	Sep 18, 2024	Camera Localization, Object	Code Available
Object Attribute Matters in Visual Question Answering	Dec 20, 2023	Attribute, Graph Neural Network	Code Available
IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments	Nov 26, 2018	Autonomous Navigation, Domain Adaptation	Code Available
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation	May 4, 2025	Benchmarking, Feature Upsampling	Code Available
Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images	Sep 4, 2018	Attribute, Dynamic Time Warping	Code Available
A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap	Jul 31, 2024	Human-Object Interaction Detection, Image Reconstruction	Code Available
Deep Surface Normal Estimation with Hierarchical RGB-D Fusion	Apr 6, 2019	Scene Understanding, Surface Normal Estimation	Code Available
Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer	Apr 3, 2019	Deep Reinforcement Learning, Reinforcement Learning	Code Available
Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection	Feb 15, 2019	Relationship Detection, Scene Understanding	Code Available
Preferences Prediction using a Gallery of Mobile Device based on Scene Recognition and Object Detection	Jul 10, 2019	Face Recognition, object-detection	Code Available
Spatial As Deep: Spatial CNN for Traffic Scene Understanding	Dec 17, 2017	Lane Detection, Scene Understanding	Code Available
ICGNet: A Unified Approach for Instance-Centric Grasping	Jan 18, 2024	Object, Object Reconstruction	Code Available
Non-central panorama indoor dataset	Jan 30, 2024	Scene Understanding	Code Available
Weighted Intersection over Union (wIoU) for Evaluating Image Segmentation	Jul 21, 2021	Image Segmentation, object-detection	Code Available
Deep Learning based Switching Filter for Impulsive Noise Removal in Color Images	Dec 3, 2019	Denoising, Image Denoising	Code Available
Deep Learning-Based Scene Simplification for Bionic Vision	Jan 30, 2021	Deep Learning, Depth Estimation	Code Available
NextStop: An Improved Tracker For Panoptic LIDAR Segmentation Data	Jan 8, 2025	Autonomous Driving, Instance Segmentation	Code Available
Neural RGB->D Sensing: Depth and Uncertainty from a Video Camera	Jan 9, 2019	3D Reconstruction, 3D Scene Reconstruction	Code Available
Neural Radiance Field Codebooks	Jan 10, 2023	Object, Representation Learning	Code Available
DeepIPCv2: LiDAR-powered Robust Environmental Perception and Navigational Control for Autonomous Vehicle	Jul 13, 2023	Autonomous Driving, Scene Understanding	Code Available
IAM: Enhancing RGB-D Instance Segmentation with New Benchmarks	Jan 3, 2025	Data Integration, Image Segmentation	Code Available
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-modal Knowledge Transfer	Dec 12, 2023	Action Recognition, Action Segmentation	Code Available


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN (ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified
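
For readers unfamiliar with the Mean IoU metric used in the ADE20K val results above, the following is a minimal sketch of how mean intersection-over-union is commonly computed for semantic segmentation. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions; actual benchmark code typically accumulates counts over the whole dataset and reports the value as a percentage.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU between two equal-length label sequences.

    Hypothetical helper for illustration only; not taken from any
    benchmark's evaluation code.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"Mean IoU: {score:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5, which would appear as 50.0 on a percentage scale.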