Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
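
As a rough illustration of "how objects relate to each other", the sketch below turns a few bounding boxes into (subject, relation, object) triples, a toy version of a scene graph. The `Box` class, the `spatial_relation` and `scene_graph` helpers, the center-based rule, and the example detections are all assumptions made for this illustration; none of them comes from a listed paper or benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box in image coordinates (y grows downward)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def spatial_relation(a: Box, b: Box) -> str:
    """Coarse relation of a with respect to b, decided by the dominant center offset."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_graph(boxes):
    """Build (subject, relation, object) triples for every ordered pair of boxes."""
    return [
        (a.label, spatial_relation(a, b), b.label)
        for a in boxes
        for b in boxes
        if a is not b
    ]


# Made-up detections, used only to exercise the helpers above.
detections = [
    Box("lamp", 40, 10, 60, 40),
    Box("table", 30, 60, 90, 100),
    Box("chair", 100, 55, 130, 100),
]
triples = scene_graph(detections)
for triple in triples:
    print(triple)
```

Published scene graph methods learn far richer predicates (for example "sitting on" or "holding") from data; the geometric rule here only shows the shape of the output such systems produce.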

Papers

Showing 301–325 of 1723 papers

Title	Date	Tasks	Status	Hype
Uncertainty-aware Panoptic Segmentation	Jun 29, 2022	Panoptic Segmentation, Scene Understanding	Code Available	1
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving	Jun 27, 2022	Autonomous Driving, Depth Estimation	Code Available	1
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments	Jun 27, 2022	Autonomous Vehicles, Scene Segmentation	Code Available	1
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning	Jun 21, 2022	Contrastive Learning, Domain Generalization	Code Available	1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning	May 31, 2022	Common Sense Reasoning, Graph Generation	Code Available	1
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds	Apr 22, 2022	3D dense captioning, 3D Object Detection	Code Available	1
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior	Apr 5, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
Online panoptic 3D reconstruction as a Linear Assignment Problem	Apr 1, 2022	3D Reconstruction, Image Segmentation	Code Available	1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction	Mar 31, 2022	Retrieval, Scene Understanding	Code Available	1
Collaborative Transformers for Grounded Situation Recognition	Mar 30, 2022	Grounded Situation Recognition, Image Classification	Code Available	1
Learning to Answer Questions in Dynamic Audio-Visual Scenarios	Mar 26, 2022	audio-visual learning, Audio-visual Question Answering	Code Available	1
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	Mar 17, 2022	Implicit Relations, Question Answering	Code Available	1
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection	Mar 16, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation	Feb 27, 2022	Autonomous Driving, Knowledge Distillation	Code Available	1
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning	Feb 26, 2022	3D Point Cloud Classification, Point Cloud Segmentation	Code Available	1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment	Feb 24, 2022	Scene Understanding, Segmentation	Code Available	1
ReorientBot: Learning Object Reorientation for Specific-Posed Placement	Feb 22, 2022	Motion Generation, Motion Planning	Code Available	1
3DRM:Pair-wise relation module for 3D object detection	Feb 20, 2022	3D Object Detection, Object	Code Available	1
SafePicking: Learning Safe Object Extraction via Object-Level Mapping	Feb 11, 2022	Motion Planning, Object	Code Available	1
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics	Feb 7, 2022	Autonomous Driving, Depth Estimation	Code Available	1
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding	Jan 28, 2022	Graph Attention, Knowledge Distillation	Code Available	1
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection	Jan 26, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Point Cloud Pre-Training With Natural 3D Structures	Jan 1, 2022	3D Object Detection, object-detection	Code Available	1

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified