Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
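One common way to make "objects plus their spatial relationships" concrete is a scene graph, where nodes are objects and edges are relations. The sketch below is illustrative only: the class names, the `left_of` predicate, the helper `build_example`, and the box coordinates are all assumptions made for this example and do not come from any paper listed here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """A detected object with a 2D bounding box (hypothetical representation)."""
    name: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class SceneGraph:
    """Objects plus (subject, predicate, object) relation triples."""
    objects: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def add_object(self, obj):
        self.objects.append(obj)

    def infer_left_of(self):
        """Add a 'left_of' edge when one box lies entirely left of another."""
        for a in self.objects:
            for b in self.objects:
                if a is not b and a.box[2] <= b.box[0]:
                    self.relations.append((a.name, "left_of", b.name))


def build_example():
    """Build a two-object scene with made-up coordinates."""
    graph = SceneGraph()
    graph.add_object(SceneObject("chair", (10, 50, 60, 120)))
    graph.add_object(SceneObject("table", (80, 40, 200, 130)))
    graph.infer_left_of()
    return graph
```

Calling `build_example().relations` yields a single triple relating the chair to the table. Real systems typically predict many relation types from learned features rather than from a single geometric rule.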

Papers


Showing 751–800 of 1723 papers

Title	Date	Tasks	Status
Deep Learned Full-3D Object Completion from Single View	Aug 21, 2018	3D geometry, 3D Reconstruction	Unverified
Improving Online Lane Graph Extraction by Object-Lane Clustering	Jul 20, 2023	3D Object Detection, Autonomous Driving	Unverified
Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds	Jul 23, 2021	Point Cloud Segmentation, Scene Understanding	Unverified
Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation	Mar 10, 2020	Domain Adaptation, Scene Understanding	Unverified
A Weakly-Supervised Depth Estimation Network Using Attention Mechanism	Jul 10, 2021	Depth Estimation, Monocular Depth Estimation	Unverified
Improving Human-Object Interaction Detection via Phrase Learning and Label Composition	Dec 14, 2021	Human-Object Interaction Detection, Scene Understanding	Unverified
LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding	Dec 23, 2024	3D Semantic Segmentation, Scene Understanding	Unverified
Improving Building Segmentation for Off-Nadir Satellite Imagery	Sep 8, 2021	Scene Understanding, Segmentation	Unverified
Improving 6D Object Pose Estimation of metallic Household and Industry Objects	Mar 5, 2025	6D Pose Estimation using RGB, Pose Estimation	Unverified
Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation	Apr 2, 2021	Autonomous Driving, Decoder	Unverified
A Multiple-View Geometric Model for Specularity Prediction on General Curved Surfaces	Aug 20, 2021	3D Reconstruction, Prediction	Unverified
Memory-Augmented Multimodal LLMs for Surgical VQA via Self-Contained Inquiry	Nov 17, 2024	Question Answering, Scene Understanding	Unverified
Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding	Sep 26, 2023	Scene Understanding, Simultaneous Localization and Mapping	Unverified
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving	Jan 7, 2025	Autonomous Driving, Contrastive Learning	Unverified
Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Experiments, and Challenges	Oct 20, 2024	Autonomous Driving, Decision Making	Unverified
Large Language Models (LLMs) as Traffic Control Systems at Urban Intersections: A New Paradigm	Nov 16, 2024	Autonomous Vehicles, Decision Making	Unverified
IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement	Jun 29, 2021	2D Semantic Segmentation, 3D Semantic Scene Completion	Unverified
Image-to-Height Domain Translation for Synthetic Aperture Sonar	Dec 12, 2021	Generative Adversarial Network, Scene Understanding	Unverified
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning	Jun 29, 2016	General Classification, Pedestrian Detection	Unverified
Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment	Jun 17, 2025	Autonomous Driving, Instance Segmentation	Unverified
Deep cross-domain building extraction for selective depth estimation from oblique aerial imagery	Apr 23, 2018	3D Reconstruction, Depth Estimation	Unverified
Learning 3D Robotics Perception using Inductive Priors	May 30, 2024	3D Reconstruction, Image Generation	Unverified
Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems	Jun 17, 2025	Autonomous Driving, Image Segmentation	Unverified
Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions	Apr 8, 2020	3d scene graph generation, 3D Semantic Segmentation	Unverified
A Comprehensive Review of Modern Object Segmentation Approaches	Jan 13, 2023	Image Segmentation, Object	Unverified
Image Parsing with Stochastic Scene Grammar	Dec 1, 2011	Clustering, Scene Labeling	Unverified
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey	Mar 17, 2025	3D Reconstruction, Autonomous Driving	Unverified
Learning-based Relational Object Matching Across Views	May 3, 2023	Graph Neural Network, Image Retrieval	Unverified
Learning Category- and Instance-Aware Pixel Embedding for Fast Panoptic Segmentation	Sep 28, 2020	Instance Segmentation, Panoptic Segmentation	Unverified
Learning Densities in Feature Space for Reliable Segmentation of Indoor Scenes	Aug 1, 2019	Scene Understanding, Semantic Segmentation	Unverified
Learning Depth from Single Images with Deep Neural Network Embedding Focal Length	Mar 27, 2018	Depth Estimation, Network Embedding	Unverified
Learning Direct Optimization for Scene Understanding	Dec 18, 2018	Scene Understanding	Unverified
Deep Contextual Attention for Human-Object Interaction Detection	Oct 17, 2019	Human-Object Interaction Detection, Object	Unverified
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding	Mar 16, 2016	Object, Scene Understanding	Unverified
Image-Graph-Image Translation via Auto-Encoding	Dec 10, 2020	Scene Understanding, Translation	Unverified
Learning Geometric-Aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs	Jan 1, 2023	Scene Understanding	Unverified
A model of saliency-based visual attention for rapid scene analysis	Nov 1, 1998	Saliency Prediction, Scene Understanding	Unverified
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation	Mar 11, 2025	Image Segmentation, Panoptic Segmentation	Unverified
Learning in Audio-visual Context: A Review, Analysis, and New Perspective	Aug 20, 2022	audio-visual learning, Scene Understanding	Unverified
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data	Mar 22, 2024	Denoising, Scene Understanding	Unverified
Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding	Apr 28, 2025	3D Semantic Segmentation, Contrastive Learning	Unverified
Meta Learning with Differentiable Closed-form Solver for Fast Video Object Segmentation	Sep 28, 2019	Form, Meta-Learning	Unverified
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding	Jun 16, 2019	Caption Generation, Image Captioning	Unverified
Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks	Aug 23, 2021	Face Recognition, Object Recognition	Unverified
Mapping High-level Semantic Regions in Indoor Environments without Object Recognition	Mar 11, 2024	Graph Generation, Language Modeling	Unverified
IM2CAD	Aug 18, 2016	Scene Understanding	Unverified
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction	Sep 17, 2021	Representation Learning, Saliency Prediction	Unverified
3D Question Answering for City Scene Understanding	Jul 24, 2024	Autonomous Driving, Question Answering	Unverified
A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features	Jan 17, 2025	Language Modeling, Language Modelling	Unverified
MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report	Jun 14, 2024	Autonomous Driving, Scene Understanding	Unverified



Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)
#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val
#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified
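
For readers unfamiliar with the Mean IoU metric, the following is a minimal sketch of how mean intersection-over-union is commonly computed for semantic segmentation, assuming flat lists of integer class labels. The function name `mean_iou` and the toy labels are assumptions for illustration; the exact evaluation protocol behind the score listed here is not stated on this page, and leaderboards usually report the value as a percentage.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are equal-length sequences of integer class labels,
    one per pixel (assumed layout for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class absent from both; skip rather than count as 0 or 1
        ious.append(inter / union)

    if not ious:
        return float("nan")
    return sum(ious) / len(ious)


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12. Skipping classes with an empty union is one convention among several; some evaluators instead accumulate intersections and unions over the whole dataset before dividing.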

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)
#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified