Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151–1200 of 1723 papers

Title	Date	Tasks	Status
ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling	Mar 22, 2025	Panoptic Segmentation, Scene Understanding	Unverified
Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos	Dec 15, 2020	Autonomous Vehicles, Camera Auto-Calibration	Unverified
Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding	May 15, 2018	Scene Classification, Scene Understanding	Unverified
Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps	May 19, 2025	Scene Understanding	Unverified
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving	Mar 24, 2025	Autonomous Driving, Knowledge Graphs	Unverified
Prediction of Scene Plausibility	Dec 2, 2022	Prediction, Scene Understanding	Unverified
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network	Apr 16, 2024	Autonomous Driving, Feature Engineering	Unverified
CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds	Jan 7, 2025	Contrastive Learning, Language Modeling	Unverified
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning	Jul 17, 2025	Question Answering, Scene Understanding	Unverified
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario	Apr 8, 2025	3D Object Detection, Autonomous Driving	Unverified
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation	Jul 29, 2021	Depth Estimation, Monocular Depth Estimation	Unverified
Probabilistic Future Prediction for Video Scene Understanding	Mar 13, 2020	Future prediction, Optical Flow Estimation	Unverified
ChatSplat: 3D Conversational Gaussian Splatting	Dec 1, 2024	Large Language Model, Scene Understanding	Unverified
ChatBEV: A Visual Language Model that Understands BEV Maps	Mar 18, 2025	Autonomous Driving, Language Modeling	Unverified
ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation	Jun 5, 2025	3D Reconstruction, NeRF	Unverified
Prospective Role of Foundation Models in Advancing Autonomous Vehicles	Dec 8, 2023	Autonomous Driving, Autonomous Vehicles	Unverified
Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation	Apr 20, 2025	Attribute, Foreground Segmentation	Unverified
PSDR-Room: Single Photo to Scene using Differentiable Rendering	Jul 6, 2023	Scene Understanding	Unverified
Pseudo Label-Guided Multi Task Learning for Scene Understanding	Jan 1, 2021	Depth Estimation, Monocular Depth Estimation	Unverified
PT-ResNet: Perspective Transformation-Based Residual Network for Semantic Road Image Segmentation	Oct 29, 2019	Image Segmentation, road scene understanding	Unverified
Challenges for Monocular 6D Object Pose Estimation in Robotics	Jul 22, 2023	6D Pose Estimation using RGB, Object	Unverified
Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM	Apr 29, 2024	Phrase Grounding, Scene Understanding	Unverified
Quantifying the synthetic and real domain gap in aerial scene understanding	Nov 29, 2024	Domain Adaptation, Scene Understanding	Unverified
Vision-Language Embodiment for Monocular Depth Estimation	Jan 1, 2025	3D Reconstruction, Depth Estimation	Unverified
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding	Apr 9, 2024	Scene Understanding, Segmentation	Unverified
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding	Mar 18, 2024	Object, Relation Prediction	Unverified
Category-Level and Open-Set Object Pose Estimation for Robotics	Apr 28, 2025	6D Pose Estimation, 6D Pose Estimation using RGB	Unverified
Radiation Search Operations using Scene Understanding with Autonomous UAV and UGV	Aug 31, 2016	Scene Segmentation, Scene Understanding	Unverified
Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images	Apr 5, 2016	Scene Understanding	Unverified
RAFT: Robust Augmentation of FeaTures for Image Segmentation	May 7, 2025	Active Learning, Domain Adaptation	Unverified
RailSem19: A Dataset for Semantic Rail Scene Understanding	Jun 16, 2019	Scene Understanding, Semantic Segmentation	Unverified
RangeSeg: Range-Aware Real Time Segmentation of 3D LiDAR Point Clouds	May 2, 2022	Autonomous Driving, Decoder	Unverified
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning	Sep 12, 2023	Autonomous Vehicles, Question Answering	Unverified
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry	Mar 14, 2022	Monocular Visual Odometry, Motion Estimation	Unverified
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration	Apr 9, 2025	3D Semantic Segmentation, Benchmarking	Unverified
RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation	May 21, 2025	GPU, Natural Language Queries	Unverified
Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding	Jan 9, 2025	Autonomous Driving, In-Context Learning	Unverified
REACT: Recognize Every Action Everywhere All At Once	Nov 27, 2023	Action Recognition, Activity Recognition	Unverified
RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation	Jan 1, 2023	Graph Generation, Scene Understanding	Unverified
Vision-Language Models Struggle to Align Entities across Modalities	Mar 5, 2025	Attribute, Code Generation	Unverified
Real time backbone for semantic segmentation	Mar 16, 2019	Autonomous Driving, Model Compression	Unverified
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL)	Feb 5, 2022	object-detection, Object Detection	Unverified
Real-Time Semantic Stereo Matching	Oct 1, 2019	Scene Understanding, Semantic Segmentation	Unverified
Reasoning About Physical Interactions with Object-Centric Models	May 1, 2019	Object, Scene Understanding	Unverified
Reasoning About Physical Interactions with Object-Oriented Prediction and Planning	Dec 28, 2018	Object, Scene Understanding	Unverified
Reasoning with shapes: profiting cognitive susceptibilities to infer linear mapping transformations between shapes	Sep 1, 2017	Scene Understanding	Unverified
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation	Oct 24, 2023	Autonomous Driving, Scene Understanding	Unverified
Recognizing Dynamic Scenes with Deep Dual Descriptor based on Key Frames and Key Segments	Feb 15, 2017	Scene Recognition, Scene Understanding	Unverified
Recognizing Material Properties from Images	Jan 9, 2018	Material Classification, Material Recognition	Unverified
Reconstructing Animals and the Wild	Nov 27, 2024	3D Reconstruction, Scene Understanding	Unverified

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified