Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 401–425 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond	Oct 13, 2024	Autonomous Driving, Autonomous Vehicles	Code Available	1	5
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding	Mar 16, 2025	Autonomous Driving, RAG	Code Available	1	5
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering	Jul 30, 2024	Inverse Rendering, NeRF	Code Available	1	5
SemSegDepth: A Combined Model for Semantic Segmentation and Depth Completion	Sep 1, 2022	Depth Completion, Scene Understanding	Code Available	1	5
Online panoptic 3D reconstruction as a Linear Assignment Problem	Apr 1, 2022	3D Reconstruction, Image Segmentation	Code Available	1	5
Dual-Hybrid Attention Network for Specular Highlight Removal	Jul 17, 2024	highlight removal, Object Recognition	Code Available	1	5
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding	Apr 16, 2020	Human Part Segmentation, Panoptic Segmentation	Code Available	1	5
Grounded Situation Recognition with Transformers	Nov 19, 2021	Decoder, Grounded Situation Recognition	Code Available	1	5
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction	May 9, 2024	Contrastive Learning, Scene Understanding	Code Available	1	5
Distilled Semantics for Comprehensive Scene Understanding from Videos	Mar 31, 2020	Depth Estimation, Knowledge Distillation	Code Available	1	5
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments	Sep 22, 2020	Domain Adaptation, Scene Understanding	Code Available	1	5
Mask4D: End-to-End Mask-Based 4D Panoptic Segmentation for LiDAR Sequences	Sep 18, 2023	3D Panoptic Segmentation, 4D Panoptic Segmentation	Code Available	1	5
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection	Dec 25, 2023	3D Object Detection, object-detection	Code Available	1	5
MassMIND: Massachusetts Maritime INfrared Dataset	Sep 9, 2022	Instance Segmentation, Scene Understanding	Code Available	1	5
Dynamic Graph Message Passing Networks	Aug 19, 2019	Image Classification, object-detection	Code Available	1	5
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model	Mar 30, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	1	5
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge	Nov 21, 2023	Large Language Model, Multimodal Deep Learning	Code Available	1	5
Monte Carlo Scene Search for 3D Scene Understanding	Mar 14, 2021	Scene Understanding	Code Available	1	5
You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding	Mar 26, 2023	3D Instance Segmentation, Instance Segmentation	Code Available	1	5
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving	Jun 27, 2022	Autonomous Driving, Depth Estimation	Code Available	1	5
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data	Nov 17, 2021	3D Object Detection, object-detection	Code Available	1	5
Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction	Mar 28, 2025	Autonomous Driving, Scene Understanding	Code Available	1	5
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding	Jan 14, 2025	Language Modeling, Language Modelling	Code Available	1	5
DPF: Learning Dense Prediction Fields with Weak Supervision	Mar 29, 2023	Intrinsic Image Decomposition, Prediction	Code Available	1	5
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation	Dec 24, 2021	Depth Estimation, Depth Prediction	Code Available	1	5


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified