Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 321–330 of 1723 papers

Title	Date	Tasks	Status	Hype
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	Nov 27, 2024	Action Recognition, Graph Attention	Code Available	0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents	Nov 27, 2024	Autonomous Navigation, Object Recognition	Code Available	0
Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning	Nov 26, 2024	Object, object-detection	Code Available	0
HSI-Drive v2.0: More Data for New Challenges in Scene Understanding for Autonomous Driving	Nov 26, 2024	Autonomous Driving, Image Segmentation	Unverified	0
An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models	Nov 25, 2024	Denoising, Scene Understanding	Code Available	2
Open-Vocabulary Octree-Graph for 3D Scene Understanding	Nov 25, 2024	Object, Scene Understanding	Unverified	0
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics	Nov 25, 2024	Robot Manipulation, Scene Understanding	Unverified	0
ROOT: VLM based System for Indoor Scene Understanding and Beyond	Nov 24, 2024	Scene Generation, Scene Understanding	Code Available	1
UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations	Nov 22, 2024	Autonomous Driving, Scene Understanding	Unverified	0
Multimodal 3D Reasoning Segmentation with Complex Scenes	Nov 21, 2024	Reasoning Segmentation, Scene Understanding	Unverified	0


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified