Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1451–1500 of 1723 papers

Title	Date	Tasks	Status
Assessing the generalization performance of SAM for ureteroscopy scene understanding	May 22, 2025	Scene Understanding, Segmentation	Unverified
A Sentence Is Worth a Thousand Pixels	Jun 1, 2013	Re-Ranking, Scene Understanding	Unverified
A Semantic Communication System for Real-time 3D Reconstruction Tasks	Dec 2, 2024	3D Reconstruction, Scene Understanding	Unverified
Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph	Aug 28, 2024	Autonomous Driving, Graph Neural Network	Unverified
Semantic and structural image segmentation for prosthetic vision	Sep 25, 2018	Image Segmentation, Object	Unverified
Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning	Sep 5, 2023	Graph Attention, Object Rearrangement	Unverified
You Only Speak Once to See	Sep 27, 2024	Contrastive Learning, Object	Unverified
Structured agents for physical construction	Apr 5, 2019	Deep Reinforcement Learning, Reinforcement Learning	Unverified
Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization	Jul 3, 2023	object-detection, Object Detection	Unverified
Structured Generative Models for Scene Understanding	Feb 7, 2023	Scene Understanding	Unverified
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning	Mar 2, 2023	Human-Object Interaction Detection, Knowledge Distillation	Unverified
Neural Language of Thought Models	Feb 2, 2024	Image Generation, Object	Unverified
A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation	Oct 19, 2016	object-detection, Object Detection	Unverified
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos	Jun 27, 2023	Multi-Task Learning, Scene Understanding	Unverified
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing	Dec 1, 2018	Scene Understanding	Unverified
A Robotic 3D Perception System for Operating Room Environment Awareness	Mar 20, 2020	3D Semantic Segmentation, Scene Segmentation	Unverified
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite	Jun 1, 2015	3D Reconstruction, Scene Understanding	Unverified
SUPER: A Novel Lane Detection System	May 14, 2020	Lane Detection, Scene Understanding	Unverified
ArK: Augmented Reality with Knowledge Interactive Emergent Ability	May 1, 2023	AI Agent, Mixed Reality	Unverified
SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians	Dec 13, 2024	GPU, Object Localization	Unverified
Weakly Supervised Learning of Affordances	May 10, 2016	Human-Object Interaction Detection, Image Segmentation	Unverified
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review	Feb 16, 2025	Scene Understanding	Unverified
SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection	Mar 5, 2025	Anatomy, Scene Segmentation	Unverified
SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks	Aug 24, 2023	Scene Understanding	Unverified
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models	Jul 17, 2025	3D Point Cloud Reconstruction, Point cloud reconstruction	Unverified
A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-based Semantic Scene Understanding	Sep 12, 2022	Scene Understanding	Unverified
SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field	Mar 21, 2024	3D Scene Reconstruction, Autonomous Driving	Unverified
Survey of Action Recognition, Spotting and Spatio-Temporal Localization in Soccer -- Current Trends and Research Perspectives	Sep 21, 2023	Action Localization, Action Recognition	Unverified
A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with Traditional/Learned 3D Descriptors	Dec 3, 2023	Active Learning, Instance Segmentation	Unverified
A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators	Aug 1, 2018	Attribute, Natural Questions	Unverified
3D-Aware Instance Segmentation and Tracking in Egocentric Videos	Aug 19, 2024	3D Object Reconstruction, Instance Segmentation	Unverified
Symbolic Graph Inference for Compound Scene Understanding	Oct 30, 2024	Question Answering, Scene Understanding	Unverified
Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation	Aug 27, 2023	Contrastive Learning, Domain Adaptation	Unverified
Syn-Mediverse: A Multimodal Synthetic Dataset for Intelligent Scene Understanding of Healthcare Facilities	Aug 6, 2023	Depth Estimation, Instance Segmentation	Unverified
A Reinforcement Learning Approach to Target Tracking in a Camera Network	Jul 26, 2018	Q-Learning, reinforcement-learning	Unverified
SynthCam3D: Semantic Understanding With Synthetic Indoor Scenes	May 1, 2015	Scene Understanding, Segmentation	Unverified
Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery	Jul 17, 2020	Deep Learning, Scene Understanding	Unverified
A Reflectance Based Method For Shadow Detection and Removal	Jul 11, 2018	Detecting Shadows, Scene Understanding	Unverified
Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander	Jul 15, 2025	Language Modeling, Language Modelling	Unverified
Tactile MNIST: Benchmarking Active Tactile Perception	Jun 3, 2025	Benchmarking, Scene Understanding	Unverified
TADFormer : Task-Adaptive Dynamic Transformer for Efficient Multi-Task Learning	Jan 8, 2025	Multi-Task Learning, parameter-efficient fine-tuning	Unverified
TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning	Jan 1, 2025	Multi-Task Learning, parameter-efficient fine-tuning	Unverified
Are Cars Just 3D Boxes? - Jointly Estimating the 3D Shape of Multiple Objects	Jun 1, 2014	3D geometry, 3D Shape Modeling	Unverified
AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis	Feb 3, 2025	Object Counting, Scene Understanding	Unverified
Talk-to-Resolve: Combining scene understanding and spatial dialogue to resolve granular task ambiguity for a collocated robot	Nov 22, 2021	Scene Understanding	Unverified
TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs	Sep 8, 2024	Depth Estimation, Monocular Depth Estimation	Unverified
Weakly Supervised Point Clouds Transformer for 3D Object Detection	Sep 8, 2023	3D Object Detection, Object	Unverified
TARS: Traffic-Aware Radar Scene Flow Estimation	Mar 13, 2025	Autonomous Driving, object-detection	Unverified
Zero-Shot 4D Lidar Panoptic Segmentation	Apr 1, 2025	Diversity, Panoptic Segmentation	Unverified
A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance	May 16, 2024	LIDAR Semantic Segmentation, Scene Understanding	Unverified


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified