Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 1326–1350 of 1723 papers

Title	Date	Tasks	Status
Segment Any 3D Gaussians	Dec 1, 2023	Interactive Segmentation, Scene Understanding	Unverified
Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation	Mar 16, 2024	Instance Segmentation, Object	Unverified
Segment Any RGB-Thermal Model with Language-aided Distillation	May 4, 2025	Instance Segmentation, Knowledge Distillation	Unverified
Segment Anything, Even Occluded	Mar 8, 2025	Amodal Instance Segmentation, Autonomous Driving	Unverified
Segmentation Guided Attention Networks for Visual Question Answering	Jul 1, 2017	Common Sense Reasoning, Question Answering	Unverified
Segmentation-guided Domain Adaptation for Efficient Depth Completion	Oct 14, 2022	Depth Completion, Domain Adaptation	Unverified
Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation	Jan 1, 2022	3D Semantic Segmentation, Autonomous Driving	Unverified
YETI (YET to Intervene) Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks	Jan 16, 2025	AI Agent, Scene Understanding	Unverified
Binaural SoundNet: Predicting Semantics, Depth and Motion with Binaural Sounds	Sep 6, 2021	Scene Understanding, Super-Resolution	Unverified
Visual Vibrometry: Estimating Material Properties from Small Motions in Video	Apr 15, 2017	Object, Scene Understanding	Unverified
Bilateral Adaptation for Human-Object Interaction Detection with Occlusion-Robustness	Jan 1, 2024	Human-Object Interaction Detection, object-detection	Unverified
Self-Supervised and Generalizable Tokenization for CLIP-Based 3D Understanding	May 24, 2025	Domain Generalization, Representation Learning	Unverified
Self-supervised Learning of Occlusion Aware Flow Guided 3D Geometry Perception with Adaptive Cross Weighted Loss from Monocular Videos	Aug 9, 2021	3D geometry, 3D Geometry Perception	Unverified
Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness	Jul 7, 2024	Activity Recognition, Scene Understanding	Unverified
Self-Supervised Object Detection from Egocentric Videos	Jan 1, 2023	Class-agnostic Object Detection, Object	Unverified
Visual Vibrometry: Estimating Material Properties From Small Motion in Video	Jun 1, 2015	Scene Understanding	Unverified
Visual Vibrometry: Estimating Material Properties from Small Motions in Video	Apr 15, 2017	Scene Understanding	Unverified
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer	Oct 9, 2020	Decoder, image-classification	Unverified
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding	May 8, 2023	Prediction, Scene Understanding	Unverified
Self-Supervised Relative Depth Learning for Urban Scene Understanding	Dec 13, 2017	Depth Estimation, Monocular Depth Estimation	Unverified
Visuomotor Understanding for Representation Learning of Driving Scenes	Sep 16, 2019	Optical Flow Estimation, Representation Learning	Unverified
Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields	Jun 9, 2022	Data Augmentation, Edge Detection	Unverified
VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion	Feb 25, 2025	Autonomous Driving, Navigate	Unverified
SELMA: SEmantic Large-scale Multimodal Acquisitions in Variable Weather, Daytime and Viewpoints	Apr 20, 2022	Autonomous Driving, Scene Understanding	Unverified
Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models	May 3, 2025	Diagnostic, Object Recognition	Unverified



Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified