Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 651–675 of 1723 papers

Title	Date	Tasks	Status	Score
MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities	Aug 14, 2020	Representation Learning, Scene Understanding	Code Available	5
Matterport3D: Learning from RGB-D Data in Indoor Environments	Sep 18, 2017	General Classification, Scene Understanding	Code Available	5
MC-PanDA: Mask Confidence for Panoptic Domain Adaptation	Jul 19, 2024	Domain Adaptation, Panoptic Segmentation	Code Available	5
Gated Driver Attention Predictor	Aug 1, 2023	Driver Attention Monitoring, Prediction	Code Available	5
Gated2Depth: Real-time Dense Lidar from Gated Images	Feb 13, 2019	Scene Understanding	Code Available	5
METEOR Guided Divergence for Video Captioning	Dec 20, 2022	Hierarchical Reinforcement Learning, Scene Understanding	Code Available	5
GaIA: Graphical Information Gain based Attention Network for Weakly Supervised Point Cloud Semantic Segmentation	Oct 2, 2022	Scene Understanding, Segmentation	Code Available	5
m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks	Aug 23, 2020	Anatomy, Data Augmentation	Code Available	5
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance	Sep 10, 2024	Bilevel Optimization, Point Cloud Completion	Code Available	5
Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory	Jul 4, 2021	Question Answering, Scene Understanding	Code Available	5
Loss Switching Fusion with Similarity Search for Video Classification	Jun 27, 2019	Classification, Clustering	Code Available	5
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics	Apr 16, 2018	Navigate, Scene Understanding	Code Available	5
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation	Nov 16, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	5
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild	Jan 8, 2024	Language Modelling, Large Language Model	Code Available	5
Lightweight integration of 3D features to improve 2D image segmentation	Dec 16, 2022	Image Segmentation, Scene Understanding	Code Available	5
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images	Jan 26, 2016	Diversity, General Classification	Code Available	5
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection	Feb 15, 2022	Generalized Zero-Shot Object Detection, Scene Understanding	Code Available	5
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding	Apr 18, 2025	Deep Learning, Point Cloud Completion	Code Available	5
From Feature Importance to Natural Language Explanations Using LLMs with RAG	Jul 30, 2024	Counterfactual, Counterfactual Reasoning	Code Available	5
CNN-based Lidar Point Cloud De-Noising in Adverse Weather	Dec 9, 2019	Autonomous Vehicles, Scene Understanding	Code Available	5
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning	Aug 1, 2020	Cross-Modal Retrieval, Representation Learning	Code Available	5
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	Nov 27, 2024	Action Recognition, Graph Attention	Code Available	5
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding	Apr 4, 2023	Autonomous Driving, Domain Adaptation	Code Available	5
Learning Regional Purity for Instance Segmentation on 3D Point Clouds	Nov 3, 2020	3D Instance Segmentation, 3D Semantic Segmentation	Code Available	5
Learning Panoptic Segmentation from Instance Contours	Oct 16, 2020	Clustering, Instance Segmentation	Code Available	5

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified