Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 251–275 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection	Dec 5, 2023	3D Object DetectionDenoising	CodeCode Available	1	5
Digging Into Self-Supervised Monocular Depth Estimation	Jun 4, 2018	Camera Pose EstimationDepth Estimation	CodeCode Available	1	5
Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups	Jan 12, 2021	Scene Understanding	CodeCode Available	1	5
Detecting Human-Object Interaction via Fabricated Compositional Learning	Mar 15, 2021	Affordance RecognitionHuman-Object Interaction Detection	CodeCode Available	1	5
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans	Mar 24, 2024	3D Instance SegmentationInstance Segmentation	CodeCode Available	1	5
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning	Oct 8, 2020	Natural Language Visual GroundingScene Understanding	CodeCode Available	1	5
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction	May 9, 2024	Contrastive LearningScene Understanding	CodeCode Available	1	5
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian	Aug 7, 2024	Autonomous Drivingobject-detection	CodeCode Available	1	5
Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction	Mar 28, 2025	Autonomous DrivingScene Understanding	CodeCode Available	1	5
Learning to Answer Questions in Dynamic Audio-Visual Scenarios	Mar 26, 2022	audio-visual learningAudio-visual Question Answering	CodeCode Available	1	5
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation	Jul 15, 2025	Large Language ModelScene Understanding	CodeCode Available	1	5
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects	Mar 27, 2018	General ClassificationObject	CodeCode Available	1	5
LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data	Sep 19, 2023	Anomaly DetectionAutonomous Driving	CodeCode Available	1	5
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos	Apr 7, 2021	Scene Understanding	CodeCode Available	1	5
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion	Jun 14, 2024	3D ReconstructionAutonomous Driving	CodeCode Available	1	5
AirObject: A Temporally Evolving Graph Embedding for Object Identification	Nov 30, 2021	Graph AttentionGraph Embedding	CodeCode Available	1	5
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous DrivingDeep Learning	CodeCode Available	1	5
A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving	Aug 17, 2021	Autonomous DrivingDepth Estimation	CodeCode Available	1	5
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization	Aug 24, 2021	DiversityGraph Neural Network	CodeCode Available	1	5
Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection	Dec 19, 2022	3D Object DetectionKnowledge Distillation	CodeCode Available	1	5
Learning Visual Commonsense for Robust Scene Graph Generation	Jun 17, 2020	Graph GenerationScene Graph Generation	CodeCode Available	1	5
Learning Human-Object Interaction Detection using Interaction Points	Mar 31, 2020	Human-Object Interaction DetectionKeypoint Detection	CodeCode Available	1	5
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition	Aug 10, 2021	Action ClassificationAction Recognition	CodeCode Available	1	5
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery	Jul 7, 2020	Edge ClassificationGraph Generation	CodeCode Available	1	5
3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation	Feb 7, 2023	6D Pose Estimation6D Pose Estimation using RGB	CodeCode Available	1	5

Show:10 25 50

← PrevPage 11 of 69Next →

All datasets Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)ADE20K val Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified