Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 801–850 of 1723 papers

Title	Date	Tasks	Status	Hype
TransAVS: End-To-End Audio-Visual Segmentation With Transformer	May 12, 2023	Scene Understanding, Segmentation	Unverified	0
Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and Beyond	May 11, 2023	Scene Understanding	Code Available	1
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs	May 10, 2023	Scene Understanding, Visual Reasoning	Unverified	0
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding	May 8, 2023	Prediction, Scene Understanding	Unverified	0
Living in a Material World: Learning Material Properties from Full-Waveform Flash Lidar Data for Semantic Segmentation	May 7, 2023	Scene Understanding, Semantic Segmentation	Unverified	0
Learning-based Relational Object Matching Across Views	May 3, 2023	Graph Neural Network, Image Retrieval	Unverified	0
ArK: Augmented Reality with Knowledge Interactive Emergent Ability	May 1, 2023	AI Agent, Mixed Reality	Unverified	0
TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding	May 1, 2023	3D Object Detection, Monocular Depth Estimation	Code Available	2
DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization	Apr 30, 2023	Decoder, NeRF	Code Available	1
Neural Implicit Dense Semantic SLAM	Apr 27, 2023	3D geometry, Scene Understanding	Unverified	0
A Review of Panoptic Segmentation for Mobile Mapping Point Clouds	Apr 27, 2023	Instance Segmentation, Panoptic Segmentation	Code Available	1
Compositional 3D Human-Object Neural Animation	Apr 27, 2023	Human-Object Interaction Detection, NeRF	Unverified	0
ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding	Apr 26, 2023	Scene Understanding	Unverified	0
RGB-D Indiscernible Object Counting in Underwater Scenes	Apr 23, 2023	Benchmarking, Depth Estimation	Code Available	1
Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation	Apr 22, 2023	Autonomous Driving, Knowledge Distillation	Code Available	1
Advances in Deep Concealed Scene Understanding	Apr 21, 2023	Scene Understanding, Semantic Segmentation	Code Available	1
Factored Neural Representation for Scene Understanding	Apr 21, 2023	Novel View Synthesis, Object	Unverified	0
RS2G: Data-Driven Scene-Graph Extraction and Embedding for Robust Autonomous Perception and Scenario Understanding	Apr 17, 2023	Autonomous Vehicles, Graph Learning	Code Available	1
360° High-Resolution Depth Estimation via Uncertainty-aware Structural Knowledge Transfer	Apr 17, 2023	Depth Estimation, Monocular Depth Estimation	Unverified	0
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos	Apr 17, 2023	3D Reconstruction, Camera Pose Estimation	Code Available	1
STRAP: Structured Object Affordance Segmentation with Point Supervision	Apr 17, 2023	Object, Scene Understanding	Code Available	1
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection	Apr 17, 2023	Human-Object Interaction Detection, Quantization	Code Available	1
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding	Apr 14, 2023	3D Object Detection, Scene Understanding	Code Available	2
iDisc: Internal Discretization for Monocular Depth Estimation	Apr 13, 2023	Autonomous Driving, Depth Estimation	Code Available	3
Graph-based Topology Reasoning for Driving Scenes	Apr 11, 2023	3D Lane Detection, Autonomous Driving	Code Available	2
Semantic Segmentation with High Inference Speed in Off-Road Environments	Apr 10, 2023	2D Semantic Segmentation, Autonomous Vehicles	Code Available	0
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation	Apr 10, 2023	Panoptic Segmentation, Scene Understanding	Unverified	0
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding	Apr 4, 2023	Autonomous Driving, Domain Adaptation	Code Available	0
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding	Apr 3, 2023	Contrastive Learning, Instance Segmentation	Code Available	2
Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings	Mar 30, 2023	Object, Scene Understanding	Unverified	0
Complementary Random Masking for RGB-Thermal Semantic Segmentation	Mar 30, 2023	Scene Understanding, Semantic Segmentation	Code Available	1
DPF: Learning Dense Prediction Fields with Weak Supervision	Mar 29, 2023	Intrinsic Image Decomposition, Prediction	Code Available	1
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation	Mar 28, 2023	Panoptic Scene Graph Generation, Scene Graph Generation	Code Available	1
Real-Time Semantic Segmentation using Hyperspectral Images for Mapping Unstructured and Unknown Environments	Mar 27, 2023	Autonomous Navigation, Real-Time Semantic Segmentation	Code Available	1
You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding	Mar 26, 2023	3D Instance Segmentation, Instance Segmentation	Code Available	1
Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation	Mar 25, 2023	Domain Adaptation, ERP	Unverified	0
Viewpoint Equivariance for Multi-View 3D Object Detection	Mar 25, 2023	3D Object Detection, Object	Code Available	1
OVeNet: Offset Vector Network for Semantic Segmentation	Mar 25, 2023	Optical Character Recognition (OCR), Scene Understanding	Code Available	0
Self-distillation for surgical action recognition	Mar 22, 2023	Action Recognition, Medical Image Analysis	Code Available	1
Uni-Fusion: Universal Continuous Mapping	Mar 22, 2023	Scene Understanding	Unverified	0
Semantic segmentation of surgical hyperspectral images under geometric domain shifts	Mar 20, 2023	Organ Segmentation, Scene Segmentation	Unverified	0
Constructing Metric-Semantic Maps using Floor Plan Priors for Long-Term Indoor Localization	Mar 20, 2023	3D Object Detection, Indoor Localization	Code Available	1
CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition	Mar 20, 2023	Retrieval, Scene Understanding	Code Available	2
Content Adaptive Front End For Audio Classification	Mar 18, 2023	Audio Classification, Audio Signal Processing	Unverified	0
Efficient Computation Sharing for Multi-Task Visual Scene Understanding	Mar 16, 2023	Multi-Task Learning, Scene Understanding	Code Available	0
Shifted-Windows Transformers for the Detection of Cerebral Aneurysms in Microsurgery	Mar 16, 2023	Scene Understanding	Unverified	0
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving	Mar 16, 2023	3D Object Detection, Autonomous Driving	Code Available	3
PENet: A Joint Panoptic Edge Detection Network	Mar 15, 2023	Edge Detection, Multi-Task Learning	Code Available	0
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection	Mar 14, 2023	3D Object Detection, Decoder	Code Available	1
Generalized 3D Self-supervised Learning Framework via Prompted Foreground-Aware Feature Contrast	Mar 11, 2023	3D Semantic Segmentation, Contrastive Learning	Unverified	0

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified