Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 301–350 of 1723 papers

Title	Date	Tasks	Status	Hype
Uncertainty-aware Panoptic Segmentation	Jun 29, 2022	Panoptic Segmentation, Scene Understanding	Code Available	1
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments	Jun 27, 2022	Autonomous Vehicles, Scene Segmentation	Code Available	1
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving	Jun 27, 2022	Autonomous Driving, Depth Estimation	Code Available	1
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning	Jun 21, 2022	Contrastive Learning, Domain Generalization	Code Available	1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning	May 31, 2022	Common Sense Reasoning, Graph Generation	Code Available	1
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds	Apr 22, 2022	3D dense captioning, 3D Object Detection	Code Available	1
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior	Apr 5, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
Online panoptic 3D reconstruction as a Linear Assignment Problem	Apr 1, 2022	3D Reconstruction, Image Segmentation	Code Available	1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction	Mar 31, 2022	Retrieval, Scene Understanding	Code Available	1
Collaborative Transformers for Grounded Situation Recognition	Mar 30, 2022	Grounded Situation Recognition, Image Classification	Code Available	1
Learning to Answer Questions in Dynamic Audio-Visual Scenarios	Mar 26, 2022	audio-visual learning, Audio-visual Question Answering	Code Available	1
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	Mar 17, 2022	Implicit Relations, Question Answering	Code Available	1
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection	Mar 16, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation	Feb 27, 2022	Autonomous Driving, Knowledge Distillation	Code Available	1
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning	Feb 26, 2022	3D Point Cloud Classification, Point Cloud Segmentation	Code Available	1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment	Feb 24, 2022	Scene Understanding, Segmentation	Code Available	1
ReorientBot: Learning Object Reorientation for Specific-Posed Placement	Feb 22, 2022	Motion Generation, Motion Planning	Code Available	1
3DRM: Pair-wise relation module for 3D object detection	Feb 20, 2022	3D Object Detection, Object	Code Available	1
SafePicking: Learning Safe Object Extraction via Object-Level Mapping	Feb 11, 2022	Motion Planning, Object	Code Available	1
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics	Feb 7, 2022	Autonomous Driving, Depth Estimation	Code Available	1
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding	Jan 28, 2022	Graph Attention, Knowledge Distillation	Code Available	1
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection	Jan 26, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Point Cloud Pre-Training With Natural 3D Structures	Jan 1, 2022	3D Object Detection, object-detection	Code Available	1
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation	Dec 27, 2021	Computational Efficiency, Instance Segmentation	Code Available	1
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation	Dec 24, 2021	Depth Estimation, Depth Prediction	Code Available	1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation	Dec 22, 2021	Common Sense Reasoning, Question Answering	Code Available	1
ScanQA: 3D Question Answering for Spatial Scene Understanding	Dec 20, 2021	3D Question Answering (3D-QA), Object	Code Available	1
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation	Dec 16, 2021	Feature Importance, Scene Understanding	Code Available	1
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation	Dec 5, 2021	Depth-aware Video Panoptic Segmentation, Depth Estimation	Code Available	1
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection	Dec 4, 2021	3D Object Detection, Object	Code Available	1
AirObject: A Temporally Evolving Graph Embedding for Object Identification	Nov 30, 2021	Graph Attention, Graph Embedding	Code Available	1
Instance-wise Occlusion and Depth Orders in Natural Scenes	Nov 29, 2021	Depth Estimation, Depth Prediction	Code Available	1
Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing	Nov 24, 2021	Attribute, Scene Understanding	Code Available	1
Grounded Situation Recognition with Transformers	Nov 19, 2021	Decoder, Grounded Situation Recognition	Code Available	1
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data	Nov 17, 2021	3D Object Detection, object-detection	Code Available	1
Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views	Nov 13, 2021	Object, Scene Understanding	Code Available	1
Panoptic 3D Scene Reconstruction From a Single RGB Image	Nov 3, 2021	2D Panoptic Segmentation, 3D Instance Segmentation	Code Available	1
3DP3: 3D Scene Perception via Probabilistic Programming	Oct 30, 2021	Object, Pose Estimation	Code Available	1
A Versatile and Efficient Reinforcement Learning Framework for Autonomous Driving	Oct 22, 2021	Autonomous Driving, reinforcement-learning	Code Available	1
PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image	Oct 21, 2021	Decoder, Depth Estimation	Code Available	1
Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images	Oct 5, 2021	Autonomous Navigation, Lane Detection	Code Available	1
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D	Sep 28, 2021	Multiple Object Tracking, Novel View Synthesis	Code Available	1
Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds	Sep 23, 2021	3D Semantic Scene Completion, 3D Semantic Segmentation	Code Available	1
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation	Sep 20, 2021	Decoder, Prediction	Code Available	1
PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds	Sep 12, 2021	object-detection, Object Detection	Code Available	1
Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds	Sep 1, 2021	3D Object Detection, 3D Point Cloud Classification	Code Available	1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment	Aug 30, 2021	Blocking, Graph Generation	Code Available	1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization	Aug 24, 2021	Diversity, Graph Neural Network	Code Available	1

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified