Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
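As a rough illustration of what "how objects relate to each other" can mean in practice, the sketch below derives coarse pairwise spatial relations from 2D bounding boxes. It is only a toy example: the `Box` class, the `relation` and `scene_relations` helpers, and the four relation labels are assumptions made for this illustration and are not taken from any paper listed on this page.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """Hypothetical detected object: a label plus an axis-aligned box in pixels."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relation(a: Box, b: Box) -> str:
    """Coarse relation of a relative to b, in image coordinates (y grows downward)."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Pick the dominant axis of displacement between the two box centers.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_relations(boxes):
    """Return (subject, relation, object) triples for every ordered pair of boxes."""
    return [(a.label, relation(a, b), b.label) for a, b in permutations(boxes, 2)]


boxes = [Box("lamp", 40, 10, 60, 40), Box("table", 20, 50, 80, 90)]
for triple in scene_relations(boxes):
    print(triple)
```

Object recognition alone would stop at the labels "lamp" and "table"; the triples add the relational layer that the description above refers to. Real systems typically learn such relations rather than using a hand-written rule like this one.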

Papers


Showing 1001–1050 of 1723 papers

Title	Date	Tasks	Status	Hype
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding	Apr 5, 2022	Autonomous Vehicles, Scene Understanding	Unverified	0
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior	Apr 5, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation	Apr 3, 2022	Decoder, Depth Estimation	Code Available	2
Online panoptic 3D reconstruction as a Linear Assignment Problem	Apr 1, 2022	3D Reconstruction, Image Segmentation	Code Available	1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction	Mar 31, 2022	Retrieval, Scene Understanding	Code Available	1
Collaborative Transformers for Grounded Situation Recognition	Mar 30, 2022	Grounded Situation Recognition, Image Classification	Code Available	1
Multi-Task Learning for Visual Scene Understanding	Mar 28, 2022	Multi-Task Learning, Scene Understanding	Unverified	0
Learning to Answer Questions in Dynamic Audio-Visual Scenarios	Mar 26, 2022	audio-visual learning, Audio-visual Question Answering	Code Available	1
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification	Mar 25, 2022	Retrieval, Scene Understanding	Unverified	0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering	Mar 24, 2022	Optical Character Recognition, Optical Character Recognition (OCR)	Unverified	0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding	Mar 21, 2022	Image Reconstruction, Scene Understanding	Code Available	0
Towards 3D Scene Understanding by Referring Synthetic Models	Mar 20, 2022	Scene Understanding, Transfer Learning	Unverified	0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows	Mar 20, 2022	Human-Object Interaction Detection, Object	Unverified	0
Deep Point Cloud Simplification for High-quality Surface Reconstruction	Mar 17, 2022	Scene Understanding, Surface Reconstruction	Unverified	0
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering	Mar 17, 2022	Implicit Relations, Question Answering	Code Available	1
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans	Mar 17, 2022	3D Object Recognition, global-optimization	Unverified	0
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection	Mar 16, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Deep learning for radar data exploitation of autonomous vehicle	Mar 15, 2022	Autonomous Driving, Deep Learning	Code Available	1
InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding	Mar 15, 2022	Boundary Detection, Human Parsing	Code Available	2
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry	Mar 14, 2022	Monocular Visual Odometry, Motion Estimation	Unverified	0
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers	Mar 9, 2022	3D Object Detection, Autonomous Vehicles	Code Available	2
On Steering Multi-Annotations per Sample for Multi-Task Learning	Mar 6, 2022	Instance Segmentation, Multi-Task Learning	Unverified	0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks	Mar 3, 2022	Depth Estimation, Image Super-Resolution	Unverified	0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection	Mar 2, 2022	Content-Based Image Retrieval, Deep Learning	Unverified	0
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation	Mar 2, 2022	Domain Adaptation, Scene Understanding	Code Available	1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation	Feb 27, 2022	Autonomous Driving, Knowledge Distillation	Code Available	1
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning	Feb 26, 2022	3D Point Cloud Classification, Point Cloud Segmentation	Code Available	1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment	Feb 24, 2022	Scene Understanding, Segmentation	Code Available	1
GroupViT: Semantic Segmentation Emerges from Text Supervision	Feb 22, 2022	Object Detection, Scene Understanding	Code Available	2
ReorientBot: Learning Object Reorientation for Specific-Posed Placement	Feb 22, 2022	Motion Generation, Motion Planning	Code Available	1
Movies2Scenes: Using Movie Metadata to Learn Scene Representation	Feb 22, 2022	Contrastive Learning, Scene Understanding	Unverified	0
3DRM: Pair-wise relation module for 3D object detection	Feb 20, 2022	3D Object Detection, Object	Code Available	1
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation	Feb 17, 2022	2D Object Detection, Autonomous Driving	Code Available	0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection	Feb 15, 2022	Generalized Zero-Shot Object Detection, Scene Understanding	Code Available	0
HAKE: A Knowledge Engine Foundation for Human Activity Understanding	Feb 14, 2022	Action Recognition, Human-Object Interaction Detection	Code Available	2
SafePicking: Learning Safe Object Extraction via Object-Level Mapping	Feb 11, 2022	Motion Planning, Object	Code Available	1
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics	Feb 7, 2022	Autonomous Driving, Depth Estimation	Code Available	1
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL)	Feb 5, 2022	object-detection, Object Detection	Unverified	0
StandardSim: A Synthetic Dataset For Retail Environments	Feb 4, 2022	Change Detection, Depth Estimation	Unverified	0
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation	Feb 2, 2022	PointGoal Navigation, Scene Understanding	Code Available	0
Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction	Jan 28, 2022	3D Reconstruction, Depth Estimation	Code Available	0
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding	Jan 28, 2022	Graph Attention, Knowledge Distillation	Code Available	1
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection	Jan 26, 2022	3D Object Detection, Monocular 3D Object Detection	Code Available	1
Moving Beyond Navigation with Active Neural SLAM	Jan 17, 2022	Domain Generalization, motion prediction	Unverified	0
Towards holistic scene understanding: Semantic segmentation and beyond	Jan 16, 2022	object-detection, Object Detection	Unverified	0
Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety	Jan 4, 2022	Decoder, Deep Learning	Unverified	0
Scene Graph Generation: A Comprehensive Survey	Jan 3, 2022	Graph Generation, object-detection	Unverified	0
Weakly Supervised Segmentation on Outdoor 4D Point Clouds With Temporal Matching and Spatial Graph Propagation	Jan 1, 2022	Point Cloud Segmentation, Scene Understanding	Code Available	0
Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation	Jan 1, 2022	3D Semantic Segmentation, Autonomous Driving	Unverified	0
Glass Segmentation Using Intensity and Spectral Polarization Cues	Jan 1, 2022	Camouflaged Object Segmentation, Scene Understanding	Unverified	0


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified
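
For readers unfamiliar with the Mean IoU metric that appears in the benchmark results, the following is a minimal sketch of how mean intersection-over-union is commonly computed for semantic segmentation. It assumes flat lists of integer class labels per pixel; the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions of this sketch, not the evaluation protocol of any benchmark listed here. Leaderboards usually report the value as a percentage, which is presumably the case for the 46.3 above.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat per-pixel label lists."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both the predicted and the true class.
            union[p] += 1
            union[t] += 1
    # Assumption: classes that appear in neither pred nor target are ignored.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou(pred=[0, 1, 1, 1], target=[0, 0, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f} ({100 * score:.1f}%)")
```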