Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
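One common way to make "objects plus their spatial relationships" concrete is a scene graph. The sketch below is illustrative only and is not taken from any paper listed on this page; the class names (`SceneObject`, `SceneGraph`), the method names, and the example objects and predicates are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """A detected object: a unique name plus a category label (hypothetical schema)."""
    name: str
    category: str


@dataclass
class SceneGraph:
    """Objects are nodes; spatial relationships are (subject, predicate, object) triples."""
    objects: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_object(self, name, category):
        self.objects[name] = SceneObject(name, category)

    def relate(self, subject, predicate, obj):
        # Both endpoints must already exist as nodes in the graph.
        if subject not in self.objects or obj not in self.objects:
            raise KeyError("both objects must be added before relating them")
        self.relations.append((subject, predicate, obj))

    def related_to(self, name):
        """Return (predicate, other) pairs where `name` is the subject."""
        return [(p, o) for s, p, o in self.relations if s == name]


# Example scene (assumed for illustration): a mug on a desk, with a lamp beside the mug.
scene = SceneGraph()
scene.add_object("mug", "cup")
scene.add_object("desk", "table")
scene.add_object("lamp", "lamp")
scene.relate("mug", "on", "desk")
scene.relate("lamp", "next_to", "mug")

print(scene.related_to("mug"))
```

Object recognition alone would stop at the three nodes; the triples are what carry the context that the definition above describes.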

Papers

Showing 576–600 of 1723 papers

Title	Date	Tasks	Status	Hype
Mapping High-level Semantic Regions in Indoor Environments without Object Recognition	Mar 11, 2024	Graph Generation, Language Modeling	Unverified	0
Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer	Mar 11, 2024	Anatomy, Disentanglement	Code Available	1
Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation	Mar 8, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	1
Embodied Understanding of Driving Scenarios	Mar 7, 2024	Autonomous Driving, Language Modeling	Code Available	3
Out of the Room: Generalizing Event-Based Dynamic Motion Segmentation for Complex Scenes	Mar 7, 2024	Motion Segmentation, Optical Flow Estimation	Unverified	0
GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding	Mar 6, 2024	NeRF, Scene Understanding	Unverified	0
HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes	Mar 5, 2024	Scene Understanding	Unverified	0
FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything	Feb 29, 2024	3D Object Reconstruction, Instance Segmentation	Code Available	2
WHU-Synthetic: A Synthetic Perception Dataset for 3-D Multitask Model Research	Feb 29, 2024	3D Reconstruction, Attribute	Code Available	1
One model to use them all: Training a segmentation model with complementary datasets	Feb 29, 2024	All, Anatomy	Code Available	0
PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds	Feb 29, 2024	Depth Estimation, Depth Prediction	Unverified	0
LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment	Feb 27, 2024	Scene Understanding	Unverified	0
AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding	Feb 27, 2024	3D Object Detection, 3D Part Segmentation	Code Available	0
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding	Feb 23, 2024	Scene Understanding	Unverified	0
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding	Feb 22, 2024	Diversity, Scene Understanding	Code Available	3
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models	Feb 19, 2024	Autonomous Driving, Scene Understanding	Unverified	0
Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review	Feb 17, 2024	Panoptic Segmentation, Scene Segmentation	Code Available	1
Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation	Feb 14, 2024	Decoder, Object	Unverified	0
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models	Feb 12, 2024	Hallucination, Object Localization	Code Available	4
InCoRo: In-Context Learning for Robotics Control with Feedback Loops	Feb 7, 2024	In-Context Learning, Scene Understanding	Unverified	0
Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives	Feb 5, 2024	Continual Learning, Multi-Task Learning	Code Available	2
SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM	Feb 5, 2024	3D Semantic Segmentation, Camera Pose Estimation	Code Available	3
Neural Language of Thought Models	Feb 2, 2024	Image Generation, Object	Unverified	0
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data	Jan 31, 2024	Benchmarking, Change Detection	Code Available	0
Non-central panorama indoor dataset	Jan 30, 2024	Scene Understanding	Code Available	0

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified
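
For readers unfamiliar with the Mean IoU metric reported for CPN(ResNet-101) above, here is a minimal sketch of how mean intersection-over-union is commonly computed for semantic segmentation. It is not the evaluation code behind this leaderboard. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. The sketch returns a fraction in [0, 1]; the leaderboard value is presumably the same quantity expressed as a percentage.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over flat lists of integer class labels.

    Assumption: classes that appear in neither `pred` nor `target`
    are skipped rather than counted as IoU = 0.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Intersection: pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Union: pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes (values made up for illustration).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"{score:.3f}")  # per-class IoUs are 1/3, 2/3, 1/2, so the mean is 0.500
```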