Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1301–1325 of 1723 papers

Title	Date	Tasks	Status
Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames	Nov 28, 2023	Clustering, Diversity	Unverified
SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments	Nov 28, 2024	Adversarial Text, Scene Understanding	Unverified
Scene Text Detection for Augmented Reality -- Character Bigram Approach to reduce False Positive Rate	Dec 26, 2020	Scene Text Detection, Scene Understanding	Unverified
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text	Apr 25, 2022	Image Retrieval, Retrieval	Unverified
Scene Understanding Enabled Semantic Communication with Open Channel Coding	Jan 24, 2025	Question Answering, Scene Understanding	Unverified
Scene Understanding for Autonomous Manipulation with Deep Learning	Mar 23, 2019	Action Understanding, Affordance Detection	Unverified
Scene Understanding for Autonomous Driving	May 11, 2021	Autonomous Driving, Scene Understanding	Unverified
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes	Sep 26, 2024	object-detection, Object Detection	Unverified
Scene Understanding Networks for Autonomous Driving based on Around View Monitoring System	May 18, 2018	3D Object Detection, Autonomous Driving	Unverified
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding	Jan 17, 2024	3D visual grounding, Scene Understanding	Unverified
Visual Semantic Parsing: From Images to Abstract Meaning Representation	Oct 26, 2022	Abstract Meaning Representation, Scene Understanding	Unverified
SDNet: Semantically Guided Depth Estimation Network	Jul 24, 2019	Autonomous Vehicles, Depth Estimation	Unverified
Visual-Semantic Scene Understanding by Sharing Labels in a Context Network	Sep 16, 2013	Data Augmentation, Object	Unverified
SE(3) Equivariant Ray Embeddings for Implicit Multi-View Depth Estimation	Nov 11, 2024	Data Augmentation, Decoder	Unverified
SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles	Nov 20, 2023	Change Detection, Motion Planning	Unverified
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction	Sep 17, 2021	Representation Learning, Saliency Prediction	Unverified
SeasoNet: A Seasonal Scene Classification, segmentation and Retrieval dataset for satellite Imagery over Germany	Jul 19, 2022	Image Retrieval, Retrieval	Unverified
Second-order Democratic Aggregation	Aug 22, 2018	General Classification, Material Classification	Unverified
Neural Groundplans: Persistent Neural Scene Representations from a Single Image	Jul 22, 2022	Disentanglement, Instance Segmentation	Unverified
Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer	Apr 24, 2024	Grounded Situation Recognition, Scene Understanding	Unverified
Seeing Beyond the Scene: Enhancing Vision-Language Models with Interactional Reasoning	May 14, 2025	Relation Extraction, Scene Understanding	Unverified
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis	Jul 15, 2025	Marketing, Optical Character Recognition	Unverified
Seeing with Humans: Gaze-Assisted Neural Image Captioning	Aug 18, 2016	Image Captioning, Object	Unverified
Seeing With Sound: Long-range Acoustic Beamforming for Multimodal Scene Understanding	Jan 1, 2023	Autonomous Vehicles, object-detection	Unverified
Visual Traffic Knowledge Graph Generation from Scene Images	Jan 1, 2023	Graph Attention, Graph Generation	Unverified

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified