Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 176–200 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving	Jun 1, 2020	3D Object Detection, Autonomous Driving	Code Available	1	5
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization	May 8, 2025	Scene Understanding, Sound Source Localization	Code Available	1	5
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection	Dec 4, 2021	3D Object Detection, Object	Code Available	1	5
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks	Feb 17, 2023	Deblurring, Deep Learning	Code Available	1	5
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation	Mar 28, 2023	Panoptic Scene Graph Generation, Scene Graph Generation	Code Available	1	5
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames	Nov 1, 2019	Autonomous Navigation, GPU	Code Available	1	5
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding	Nov 9, 2015	Decision Making, Decoder	Code Available	1	5
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1	5
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection	Nov 17, 2022	3D Object Detection, Depth Estimation	Code Available	1	5
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery	May 17, 2024	Material Classification, Material Recognition	Code Available	1	5
HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans	Sep 12, 2023	3D Object Retrieval, 3D Scene Reconstruction	Code Available	1	5
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding	Mar 14, 2024	Contrastive Learning, Representation Learning	Code Available	1	5
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation	Dec 16, 2021	Feature Importance, Scene Understanding	Code Available	1	5
DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency	Apr 16, 2025	Few-Shot Learning, Interactive Segmentation	Code Available	1	5
3DP3: 3D Scene Perception via Probabilistic Programming	Oct 30, 2021	Object, Pose Estimation	Code Available	1	5
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1	5
Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding	Jan 5, 2019	Domain Adaptation, Scene Understanding	Code Available	1	5
AVSegFormer: Audio-Visual Segmentation with Transformer	Jul 3, 2023	Decoder, Scene Understanding	Code Available	1	5
Detecting Human-Object Interaction via Fabricated Compositional Learning	Mar 15, 2021	Affordance Recognition, Human-Object Interaction Detection	Code Available	1	5
Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and Beyond	May 11, 2023	Scene Understanding	Code Available	1	5
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes	Jul 1, 2024	Autonomous Vehicles, Image Segmentation	Code Available	1	5
DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion	Sep 18, 2024	Infrared And Visible Image Fusion, Scene Understanding	Code Available	1	5
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection	Dec 5, 2023	3D Object Detection, Denoising	Code Available	1	5
Grounded Situation Recognition with Transformers	Nov 19, 2021	Decoder, Grounded Situation Recognition	Code Available	1	5
Global Aggregation then Local Distribution in Fully Convolutional Networks	Sep 16, 2019	Instance Segmentation, object-detection	Code Available	1	5


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified