Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151–1175 of 1723 papers

Title	Date	Tasks	Status
ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling	Mar 22, 2025	Panoptic Segmentation, Scene Understanding	Unverified
Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos	Dec 15, 2020	Autonomous Vehicles, Camera Auto-Calibration	Unverified
Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding	May 15, 2018	Scene Classification, Scene Understanding	Unverified
Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps	May 19, 2025	Scene Understanding	Unverified
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving	Mar 24, 2025	Autonomous Driving, Knowledge Graphs	Unverified
Prediction of Scene Plausibility	Dec 2, 2022	Prediction, Scene Understanding	Unverified
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network	Apr 16, 2024	Autonomous Driving, Feature Engineering	Unverified
CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds	Jan 7, 2025	Contrastive Learning, Language Modeling	Unverified
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning	Jul 17, 2025	Question Answering, Scene Understanding	Unverified
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario	Apr 8, 2025	3D Object Detection, Autonomous Driving	Unverified
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation	Jul 29, 2021	Depth Estimation, Monocular Depth Estimation	Unverified
Probabilistic Future Prediction for Video Scene Understanding	Mar 13, 2020	Future prediction, Optical Flow Estimation	Unverified
ChatSplat: 3D Conversational Gaussian Splatting	Dec 1, 2024	Large Language Model, Scene Understanding	Unverified
ChatBEV: A Visual Language Model that Understands BEV Maps	Mar 18, 2025	Autonomous Driving, Language Modeling	Unverified
ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation	Jun 5, 2025	3D Reconstruction, NeRF	Unverified
Prospective Role of Foundation Models in Advancing Autonomous Vehicles	Dec 8, 2023	Autonomous Driving, Autonomous Vehicles	Unverified
Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation	Apr 20, 2025	Attribute, Foreground Segmentation	Unverified
PSDR-Room: Single Photo to Scene using Differentiable Rendering	Jul 6, 2023	Scene Understanding	Unverified
Pseudo Label-Guided Multi Task Learning for Scene Understanding	Jan 1, 2021	Depth Estimation, Monocular Depth Estimation	Unverified
PT-ResNet: Perspective Transformation-Based Residual Network for Semantic Road Image Segmentation	Oct 29, 2019	Image Segmentation, road scene understanding	Unverified
Challenges for Monocular 6D Object Pose Estimation in Robotics	Jul 22, 2023	6D Pose Estimation using RGB, Object	Unverified
Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM	Apr 29, 2024	Phrase Grounding, Scene Understanding	Unverified
Quantifying the synthetic and real domain gap in aerial scene understanding	Nov 29, 2024	Domain Adaptation, Scene Understanding	Unverified
Vision-Language Embodiment for Monocular Depth Estimation	Jan 1, 2025	3D Reconstruction, Depth Estimation	Unverified
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding	Apr 9, 2024	Scene Understanding, Segmentation	Unverified

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified
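
For readers unfamiliar with the Mean IoU metric reported for ADE20K val, the following is a minimal sketch of how a mean intersection-over-union score is commonly computed from per-pixel class labels. The function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. Benchmark implementations typically accumulate counts over a whole dataset and may handle ignore labels differently, so this is not the evaluation code behind the numbers above.

```python
from collections import Counter


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class labels
    (for example, a flattened segmentation map), of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    pred_count = Counter(pred)      # pixels predicted as each class
    target_count = Counter(target)  # pixels labelled as each class
    intersection = Counter()        # pixels where both agree, per class
    for p, t in zip(pred, target):
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that appear in neither map
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"Mean IoU: {score:.3f}")  # per-class IoUs are 1/3, 2/3 and 1/2
```

Leaderboards usually report this value as a percentage, so a score of 0.5 here would appear as 50.0.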