Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 226–250 of 1723 papers

Title	Date	Tasks	Status	Hype	Score
MLRSNet: A Multi-label High Spatial Resolution Remote Sensing Dataset for Semantic Scene Understanding	Oct 1, 2020	Deep Learning, image-classification	Code Available	1	5
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks	Mar 28, 2020	3D Medical Imaging Segmentation, Action Recognition	Code Available	1	5
Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation	Apr 22, 2023	Autonomous Driving, Knowledge Distillation	Code Available	1	5
AVSegFormer: Audio-Visual Segmentation with Transformer	Jul 3, 2023	Decoder, Scene Understanding	Code Available	1	5
CamContextI2V: Context-aware Controllable Video Generation	Apr 8, 2025	Diversity, Scene Understanding	Code Available	1	5
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects	Mar 27, 2018	General Classification, Object	Code Available	1	5
F-ViTA: Foundation Model Guided Visible to Thermal Translation	Apr 3, 2025	Scene Understanding, Style Transfer	Code Available	1	5
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving	Aug 14, 2023	Autonomous Driving, Optical Flow Estimation	Code Available	1	5
Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene	Aug 11, 2020	Instance Segmentation, Point Cloud Segmentation	Code Available	1	5
Joint 2D-3D-Semantic Data for Indoor Scene Understanding	Feb 3, 2017	Scene Understanding	Code Available	1	5
From General to Specific: Informative Scene Graph Generation via Balance Adjustment	Aug 30, 2021	Blocking, Graph Generation	Code Available	1	5
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection	Jul 30, 2021	3D Object Detection, object-detection	Code Available	1	5
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation	Dec 27, 2021	Computational Efficiency, Instance Segmentation	Code Available	1	5
Lane Graph Estimation for Scene Understanding in Urban Driving	May 1, 2021	Autonomous Driving, Autonomous Vehicles	Code Available	1	5
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks	Feb 17, 2023	Deblurring, Deep Learning	Code Available	1	5
Instance-wise Occlusion and Depth Orders in Natural Scenes	Nov 29, 2021	Depth Estimation, Depth Prediction	Code Available	1	5
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	Mar 17, 2022	Implicit Relations, Question Answering	Code Available	1	5
A Review of Panoptic Segmentation for Mobile Mapping Point Clouds	Apr 27, 2023	Instance Segmentation, Panoptic Segmentation	Code Available	1	5
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1	5
IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation	Dec 20, 2019	Disparity Estimation, Scene Understanding	Code Available	1	5
General Geometry-aware Weakly Supervised 3D Object Detection	Jul 18, 2024	3D Object Detection, Object	Code Available	1	5
Language-Assisted 3D Feature Learning for Semantic Scene Understanding	Nov 25, 2022	Descriptive, Instance Segmentation	Code Available	1	5
All-Day Multi-Camera Multi-Target Tracking	Jan 1, 2025	All, Mamba	Code Available	1	5
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames	Nov 1, 2019	Autonomous Navigation, GPU	Code Available	1	5
Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation	Oct 30, 2020	Instance Segmentation, Panoptic Segmentation	Code Available	1	5



Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified