Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1601–1625 of 1723 papers

Title	Date	Tasks	Status
Matterport3D: Learning from RGB-D Data in Indoor Environments	Sep 18, 2017	General Classification, Scene Understanding	Code Available
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection	Feb 15, 2022	Generalized Zero-Shot Object Detection, Scene Understanding	Code Available
CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation	Jan 16, 2025	Novel View Synthesis, Scene Understanding	Code Available
Structured Label Inference for Visual Understanding	Feb 18, 2018	Action Detection, General Classification	Code Available
AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding	Feb 27, 2024	3D Object Detection, 3D Part Segmentation	Code Available
From Feature Importance to Natural Language Explanations Using LLMs with RAG	Jul 30, 2024	counterfactual, Counterfactual Reasoning	Code Available
m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks	Aug 23, 2020	Anatomy, Data Augmentation	Code Available
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation	Oct 31, 2018	3D Object Detection, Camera Pose Estimation	Code Available
Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans	Dec 1, 2021	4D Panoptic Segmentation, Autonomous Navigation	Code Available
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding	Apr 4, 2023	Autonomous Driving, Domain Adaptation	Code Available
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics	Apr 16, 2018	Navigate, Scene Understanding	Code Available
Loss Switching Fusion with Similarity Search for Video Classification	Jun 27, 2019	Classification, Clustering	Code Available
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance	Sep 10, 2024	Bilevel Optimization, Point Cloud Completion	Code Available
AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning	Jan 1, 2025	Audio-visual Question Answering, Continual Learning	Code Available
Continual Learning of Unsupervised Monocular Depth from Videos	Nov 4, 2023	Autonomous Driving, Continual Learning	Code Available
FlowGrad: Using Motion for Visual Sound Source Localization	Nov 15, 2022	Optical Flow Estimation, Scene Understanding	Code Available
An Information-Theoretic Metric of Transferability for Task Transfer Learning	May 1, 2019	General Classification, Scene Understanding	Code Available
SceneAware: Scene-Constrained Pedestrian Trajectory Prediction with LLM-Guided Walkability	Jun 17, 2025	Pedestrian Trajectory Prediction, Scene Understanding	Code Available
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	Nov 27, 2024	Action Recognition, Graph Attention	Code Available
Lightweight integration of 3D features to improve 2D image segmentation	Dec 16, 2022	Image Segmentation, Scene Understanding	Code Available
Surgical Scene Segmentation by Transformer With Asymmetric Feature Enhancement	Oct 23, 2024	Anatomy, Scene Segmentation	Code Available
Constructing a Visual Relationship Authenticity Dataset	Oct 11, 2020	Relationship Detection, Scene Understanding	Code Available
Confidence-Aware Paced-Curriculum Learning by Label Smoothing for Surgical Scene Understanding	Dec 22, 2022	Multi-Label Classification	Code Available
Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations	Nov 21, 2022	Domain Adaptation, Scene Understanding	Code Available
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding	Apr 18, 2025	Deep Learning, Point Cloud Completion	Code Available

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified