Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1001–1025 of 1723 papers

Title	Date	Tasks	Status	Hype
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding	Apr 5, 2022	Autonomous Vehicles, Scene Understanding	Unverified	0
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior	Apr 5, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation	Apr 3, 2022	Decoder, Depth Estimation	Code Available	2
Online panoptic 3D reconstruction as a Linear Assignment Problem	Apr 1, 2022	3D Reconstruction, Image Segmentation	Code Available	1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction	Mar 31, 2022	Retrieval, Scene Understanding	Code Available	1
Collaborative Transformers for Grounded Situation Recognition	Mar 30, 2022	Grounded Situation Recognition, Image Classification	Code Available	1
Multi-Task Learning for Visual Scene Understanding	Mar 28, 2022	Multi-Task Learning, Scene Understanding	Unverified	0
Learning to Answer Questions in Dynamic Audio-Visual Scenarios	Mar 26, 2022	audio-visual learning, Audio-visual Question Answering	Code Available	1
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification	Mar 25, 2022	Retrieval, Scene Understanding	Unverified	0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering	Mar 24, 2022	Optical Character Recognition, Optical Character Recognition (OCR)	Unverified	0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding	Mar 21, 2022	Image Reconstruction, Scene Understanding	Code Available	0
Towards 3D Scene Understanding by Referring Synthetic Models	Mar 20, 2022	Scene Understanding, Transfer Learning	Unverified	0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows	Mar 20, 2022	Human-Object Interaction Detection, Object	Unverified	0
Deep Point Cloud Simplification for High-quality Surface Reconstruction	Mar 17, 2022	Scene Understanding, Surface Reconstruction	Unverified	0
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	Mar 17, 2022	Implicit Relations, Question Answering	Code Available	1
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans	Mar 17, 2022	3D Object Recognition, global-optimization	Unverified	0
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection	Mar 16, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1
InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding	Mar 15, 2022	Boundary Detection, Human Parsing	Code Available	2
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry	Mar 14, 2022	Monocular Visual Odometry, Motion Estimation	Unverified	0
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers	Mar 9, 2022	3D Object Detection, Autonomous Vehicles	Code Available	2
On Steering Multi-Annotations per Sample for Multi-Task Learning	Mar 6, 2022	Instance Segmentation, Multi-Task Learning	Unverified	0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks	Mar 3, 2022	Depth Estimation, Image Super-Resolution	Unverified	0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection	Mar 2, 2022	Content-Based Image Retrieval, Deep Learning	Unverified	0
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified