Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
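
As a rough illustration of the difference between recognizing objects and relating them to each other, the sketch below derives coarse pairwise spatial relations from labeled 2D bounding boxes and returns them as (subject, relation, object) triples. It is a minimal toy under stated assumptions, not a method from any paper listed here: the `Box` type, the `relation` and `scene_graph` helpers, the four relation names, and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned 2D box in image coordinates (y grows downward)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relation(subject, obj):
    """Return a coarse spatial relation of `subject` with respect to `obj`.

    Assumption: the relation is decided by the dominant axis of the offset
    between box centers; real systems use far richer geometry and context.
    """
    sx, sy = subject.center
    ox, oy = obj.center
    dx, dy = sx - ox, sy - oy
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_graph(boxes):
    """Build (subject, relation, object) triples for every ordered pair of boxes."""
    return [(a.label, relation(a, b), b.label) for a, b in permutations(boxes, 2)]


# Made-up detections: a lamp positioned directly above a table.
boxes = [Box("lamp", 40, 10, 60, 40), Box("table", 20, 50, 80, 90)]
triples = scene_graph(boxes)
```

Object recognition alone would stop at the labels "lamp" and "table"; the triples add the relational context that the description above refers to.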

Papers

Showing 151–200 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization	Jul 22, 2022	3D Instance Segmentation, 3D Object Detection	Code Available	1	5
3DRM: Pair-wise relation module for 3D object detection	Feb 20, 2022	3D Object Detection, Object	Code Available	1	5
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction	May 9, 2024	Contrastive Learning, Scene Understanding	Code Available	1	5
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D	Sep 28, 2021	Multiple Object Tracking, Novel View Synthesis	Code Available	1	5
LayoutMP3D: Layout Annotation of Matterport3D	Mar 30, 2020	Scene Understanding	Code Available	1	5
Instance-wise Occlusion and Depth Orders in Natural Scenes	Nov 29, 2021	Depth Estimation, Depth Prediction	Code Available	1	5
Distilled Semantics for Comprehensive Scene Understanding from Videos	Mar 31, 2020	Depth Estimation, Knowledge Distillation	Code Available	1	5
Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images	Aug 6, 2021	Depth Estimation, Panoptic Segmentation	Code Available	1	5
DIP: Unsupervised Dense In-Context Post-training of Visual Representations	Jun 23, 2025	GPU, Meta-Learning	Code Available	1	5
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks	Aug 17, 2021	3D Instance Segmentation, Instance Segmentation	Code Available	1	5
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping	Apr 1, 2024	image-classification, Image Classification	Code Available	1	5
Digging Into Self-Supervised Monocular Depth Estimation	Jun 4, 2018	Camera Pose Estimation, Depth Estimation	Code Available	1	5
RGB-D Indiscernible Object Counting in Underwater Scenes	Apr 23, 2023	Benchmarking, Depth Estimation	Code Available	1	5
Bidirectional Projection Network for Cross Dimension Scene Understanding	Mar 26, 2021	2D Semantic Segmentation, 3D Semantic Segmentation	Code Available	1	5
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection	Dec 25, 2023	3D Object Detection, object-detection	Code Available	1	5
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration	Dec 17, 2024	audio-visual event localization, audio-visual learning	Code Available	1	5
Explainable Object-induced Action Decision for Autonomous Vehicles	Mar 20, 2020	Autonomous Driving, Autonomous Vehicles	Code Available	1	5
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery	Jul 7, 2020	Edge Classification, Graph Generation	Code Available	1	5
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos	Apr 7, 2021	Scene Understanding	Code Available	1	5
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects	Mar 27, 2018	General Classification, Object	Code Available	1	5
Human-centric Scene Understanding for 3D Large-scale Scenarios	Jul 26, 2023	Action Recognition, Scene Understanding	Code Available	1	5
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments	Jun 27, 2022	Autonomous Vehicles, Scene Segmentation	Code Available	1	5
Holistic 3D Scene Understanding from a Single Image with Implicit Representation	Mar 11, 2021	3D Object Detection, 3D Shape Reconstruction	Code Available	1	5
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization	Aug 24, 2021	Diversity, Graph Neural Network	Code Available	1	5
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge	Sep 12, 2022	Instance Segmentation, Object Detection	Code Available	1	5
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving	Jun 1, 2020	3D Object Detection, Autonomous Driving	Code Available	1	5
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization	May 8, 2025	Scene Understanding, Sound Source Localization	Code Available	1	5
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection	Dec 4, 2021	3D Object Detection, Object	Code Available	1	5
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks	Feb 17, 2023	Deblurring, Deep Learning	Code Available	1	5
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation	Mar 28, 2023	Panoptic Scene Graph Generation, Scene Graph Generation	Code Available	1	5
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames	Nov 1, 2019	Autonomous Navigation, GPU	Code Available	1	5
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding	Nov 9, 2015	Decision Making, Decoder	Code Available	1	5
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1	5
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection	Nov 17, 2022	3D Object Detection, Depth Estimation	Code Available	1	5
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery	May 17, 2024	Material Classification, Material Recognition	Code Available	1	5
HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans	Sep 12, 2023	3D Object Retrieval, 3D Scene Reconstruction	Code Available	1	5
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding	Mar 14, 2024	Contrastive Learning, Representation Learning	Code Available	1	5
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation	Dec 16, 2021	Feature Importance, Scene Understanding	Code Available	1	5
DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency	Apr 16, 2025	Few-Shot Learning, Interactive Segmentation	Code Available	1	5
3DP3: 3D Scene Perception via Probabilistic Programming	Oct 30, 2021	Object, Pose Estimation	Code Available	1	5
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1	5
Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding	Jan 5, 2019	Domain Adaptation, Scene Understanding	Code Available	1	5
AVSegFormer: Audio-Visual Segmentation with Transformer	Jul 3, 2023	Decoder, Scene Understanding	Code Available	1	5
Detecting Human-Object Interaction via Fabricated Compositional Learning	Mar 15, 2021	Affordance Recognition, Human-Object Interaction Detection	Code Available	1	5
Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and Beyond	May 11, 2023	Scene Understanding	Code Available	1	5
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes	Jul 1, 2024	Autonomous Vehicles, Image Segmentation	Code Available	1	5
DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion	Sep 18, 2024	Infrared And Visible Image Fusion, Scene Understanding	Code Available	1	5
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection	Dec 5, 2023	3D Object Detection, Denoising	Code Available	1	5
Grounded Situation Recognition with Transformers	Nov 19, 2021	Decoder, Grounded Situation Recognition	Code Available	1	5
Global Aggregation then Local Distribution in Fully Convolutional Networks	Sep 16, 2019	Instance Segmentation, object-detection	Code Available	1	5

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified