Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1226–1250 of 1723 papers

Title	Date	Tasks	Status
Rethinking Semantic Segmentation Evaluation for Explainability and Model Selection	Jan 21, 2021	Autonomous Navigation, Model Selection	Unverified
VrR-VG: Refocusing Visually-Relevant Relationships	Feb 1, 2019	Image Captioning, Question Answering	Unverified
Review on 6D Object Pose Estimation with the focus on Indoor Scene Understanding	Dec 4, 2022	6D Pose Estimation using RGB, Object	Unverified
Review on Panoramic Imaging and Its Applications in Scene Understanding	May 11, 2022	Autonomous Driving, Depth Estimation	Unverified
3D Scene Understanding at Urban Intersection using Stereo Vision and Digital Map	Dec 10, 2021	Autonomous Vehicles, Navigate	Unverified
Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery	Mar 29, 2025	Action Understanding, Instrument Recognition	Unverified
Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts, Datasets and Metrics	Mar 8, 2023	Autonomous Vehicles, Scene Understanding	Unverified
Camera-Only Bird's Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles	May 9, 2025	Autonomous Navigation, Autonomous Vehicles	Unverified
Camera Control at the Edge with Language Models for Scene Understanding	May 9, 2025	Language Modeling, Language Modelling	Unverified
Right Side Up? Disentangling Orientation Understanding in MLLMs with Fine-grained Multi-axis Perception Tasks	May 27, 2025	3D Scene Reconstruction, Diagnostic	Unverified
Visual Affordance and Function Understanding: A Survey	Jul 18, 2018	Affordance Detection, Scene Understanding	Unverified
Road Rage Reasoning with Vision-language Models (VLMs): Task Definition and Evaluation Dataset	Mar 14, 2025	Scene Understanding	Unverified
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets	May 21, 2025	Dataset Generation, Descriptive	Unverified
Calibrated and Efficient Sampling-Free Confidence Estimation for LiDAR Scene Semantic Segmentation	Nov 18, 2024	Autonomous Driving, LIDAR Semantic Segmentation	Unverified
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics	Nov 25, 2024	Robot Manipulation, Scene Understanding	Unverified
Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion	Nov 16, 2021	3D Semantic Segmentation, Autonomous Driving	Unverified
Robust Category-Level 3D Pose Estimation from Synthetic Data	May 25, 2023	3D Pose Estimation, 3D Reconstruction	Unverified
Robust deep learning-based semantic organ segmentation in hyperspectral images	Nov 9, 2021	Deep Learning, Image Segmentation	Unverified
Robust Multi-Modal Image Stitching for Improved Scene Understanding	Dec 28, 2023	Image Stitching, Scene Understanding	Unverified
CAGS: Open-Vocabulary 3D Scene Understanding with Context-Aware Gaussian Splatting	Apr 16, 2025	3DGS, 3D Instance Segmentation	Unverified
Robust Visual Localization via Semantic-Guided Multi-Scale Transformer	Jun 10, 2025	regression, Scene Understanding	Unverified
Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms	Dec 10, 2021	3D Reconstruction, Autonomous Navigation	Unverified
CaDIS: Cataract Dataset for Image Segmentation	Jun 27, 2019	2D Semantic Segmentation task 1 (8 classes), 2D Semantic Segmentation task 2 (17 classes)	Unverified
Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness	Apr 2, 2025	Scene Understanding	Unverified
3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare	Jun 1, 2018	3D Object Reconstruction, Autonomous Driving	Unverified

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified