Scene Understanding

Scene understanding is the task of interpreting the visual content of a scene: the objects present, their spatial relationships, and the overall layout. It goes beyond object recognition by accounting for context, that is, how objects relate to one another and to their environment.

Papers

Showing 1351–1400 of 1723 papers

Title	Date	Tasks	Status
Visual Lexicon: Rich Image Features in Language Space	Dec 9, 2024	Image Generation, Image Reconstruction	Unverified
Visual Semantic Parsing: From Images to Abstract Meaning Representation	Oct 26, 2022	Abstract Meaning Representation, Scene Understanding	Unverified
Visual-Semantic Scene Understanding by Sharing Labels in a Context Network	Sep 16, 2013	Data Augmentation, Object	Unverified
Visual Traffic Knowledge Graph Generation from Scene Images	Jan 1, 2023	Graph Attention, Graph Generation	Unverified
Visual Vibrometry: Estimating Material Properties from Small Motions in Video	Apr 15, 2017	Object, Scene Understanding	Unverified
Visual Vibrometry: Estimating Material Properties From Small Motion in Video	Jun 1, 2015	Scene Understanding	Unverified
Visual Vibrometry: Estimating Material Properties from Small Motions in Video	Apr 15, 2017	Scene Understanding	Unverified
Visuomotor Understanding for Representation Learning of Driving Scenes	Sep 16, 2019	Optical Flow Estimation, Representation Learning	Unverified
VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion	Feb 25, 2025	Autonomous Driving, Navigate	Unverified
VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry	Apr 23, 2018	Outdoor Localization, Scene Understanding	Unverified
VLP: Vision Language Planning for Autonomous Driving	Jan 10, 2024	Autonomous Driving, Motion Planning	Unverified
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding	Dec 14, 2023	Scene Understanding, Transfer Learning	Unverified
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding	Jun 28, 2025	3DGS, Instance Segmentation	Unverified
vS-Graphs: Integrating Visual SLAM and Situational Graphs through Multi-level Scene Understanding	Mar 3, 2025	Scene Understanding, Simultaneous Localization and Mapping	Unverified
Waymo Open Dataset: Panoramic Video Panoptic Segmentation	Jun 15, 2022	3D Multi-Object Tracking, Autonomous Driving	Unverified
Weakly Supervised 3D Instance Segmentation without Instance-level Annotations	Aug 3, 2023	3D Instance Segmentation, Instance Segmentation	Unverified
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment	Dec 15, 2023	3D visual grounding, Natural Language Queries	Unverified
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning	Mar 2, 2023	Human-Object Interaction Detection, Knowledge Distillation	Unverified
Weakly Supervised Learning of Affordances	May 10, 2016	Human-Object Interaction Detection, Image Segmentation	Unverified
Weakly Supervised Point Clouds Transformer for 3D Object Detection	Sep 8, 2023	3D Object Detection, Object	Unverified
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots	Jan 29, 2016	image-classification, Image Classification	Unverified
What Demands Attention in Urban Street Scenes? From Scene Understanding towards Road Safety: A Survey of Vision-driven Datasets and Studies	Jul 9, 2025	Scene Understanding, Survey	Unverified
What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework?	May 19, 2017	Object Recognition, Scene Recognition	Unverified
When Neural Networks Using Different Sensors Create Similar Features	Nov 4, 2021	Autonomous Driving, Classification	Unverified
When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach	Jan 1, 2024	Scene Understanding, Visual Grounding	Unverified
Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks	Dec 4, 2017	Object Recognition, Scene Understanding	Unverified
Wireless Sensing With Deep Spectrogram Network and Primitive Based Autoregressive Hybrid Channel Model	Apr 21, 2021	Dataset Generation, Scene Understanding	Unverified
YETI (YET to Intervene) Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks	Jan 16, 2025	AI Agent, Scene Understanding	Unverified
You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects	Apr 4, 2024	Object, Pose Tracking	Unverified
You Only Speak Once to See	Sep 27, 2024	Contrastive Learning, Object	Unverified
Zero-Shot 4D Lidar Panoptic Segmentation	Apr 1, 2025	Diversity, Panoptic Segmentation	Unverified
Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models	Oct 10, 2023	Object, Object Tracking	Unverified
Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models	Jan 13, 2025	Scene Understanding	Unverified
Zero-Shot Semantic Segmentation via Spatial and Multi-Scale Aware Visual Class Embedding	Nov 30, 2021	Domain Adaptation, Language Modeling	Unverified
ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding	Apr 26, 2023	Scene Understanding	Unverified
Polarimetric Spatio-Temporal Light Transport Probing	May 25, 2021	Metamerism, Scene Understanding	Unverified
Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments	Mar 21, 2017	Motion Planning, Scene Understanding	Unverified
PoSeg: Pose-Aware Refinement Network for Human Instance Segmentation	Jan 7, 2020	Human Instance Segmentation, Instance Segmentation	Unverified
PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation	Jun 28, 2024	Decoder, Image Segmentation	Unverified
Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos	Dec 15, 2020	Autonomous Vehicles, Camera Auto-Calibration	Unverified
Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps	May 19, 2025	Scene Understanding	Unverified
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving	Mar 24, 2025	Autonomous Driving, Knowledge Graphs	Unverified
Prediction of Scene Plausibility	Dec 2, 2022	Prediction, Scene Understanding	Unverified
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network	Apr 16, 2024	Autonomous Driving, Feature Engineering	Unverified
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario	Apr 8, 2025	3D Object Detection, Autonomous Driving	Unverified
Probabilistic Future Prediction for Video Scene Understanding	Mar 13, 2020	Future prediction, Optical Flow Estimation	Unverified
ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation	Jun 5, 2025	3D Reconstruction, NeRF	Unverified
Prospective Role of Foundation Models in Advancing Autonomous Vehicles	Dec 8, 2023	Autonomous Driving, Autonomous Vehicles	Unverified
PSDR-Room: Single Photo to Scene using Differentiable Rendering	Jul 6, 2023	Scene Understanding	Unverified
Pseudo Label-Guided Multi Task Learning for Scene Understanding	Jan 1, 2021	Depth Estimation, Monocular Depth Estimation	Unverified

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified