Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
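To make the distinction from plain object recognition concrete, the sketch below takes a set of already-detected objects and derives coarse pairwise spatial relations from their bounding boxes. It is a toy illustration only: the `Box` class, the `relation` and `scene_relations` helpers, and the relation vocabulary ("inside", "left of", "right of", "above", "below") are all hypothetical names chosen for this example, not taken from any paper or benchmark listed on this page. It assumes axis-aligned 2D boxes in image coordinates, where y grows downward.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """A detected object (hypothetical structure): label plus an
    axis-aligned box in image coordinates, where y grows downward."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relation(a: Box, b: Box) -> str:
    """Return a coarse relation of `a` with respect to `b`."""
    # Containment takes priority over directional relations.
    if (a.x_min >= b.x_min and a.y_min >= b.y_min
            and a.x_max <= b.x_max and a.y_max <= b.y_max):
        return "inside"
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Pick the dominant axis of the offset between the two centers.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def scene_relations(boxes):
    """List (subject, relation, object) triples for every ordered pair."""
    return [(a.label, relation(a, b), b.label)
            for a, b in permutations(boxes, 2)]


if __name__ == "__main__":
    # Made-up detections, for illustration only.
    detections = [
        Box("table", 100, 200, 400, 350),
        Box("cup", 220, 150, 260, 200),
        Box("chair", 450, 220, 550, 400),
    ]
    for triple in scene_relations(detections):
        print(triple)
```

An object detector alone would stop at the list of labels and boxes. The triples are a first step toward the relational, context-aware description that scene understanding aims for, although real systems typically learn such relations (and work in 3D) rather than using hand-written rules like these.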

Papers

Showing 21–30 of 1723 papers

Title	Date	Tasks	Status	Hype
DIP: Unsupervised Dense In-Context Post-training of Visual Representations	Jun 23, 2025	GPU, Meta-Learning	Code Available	1
Scene-R1: Video-Grounded Large Language Models for 3D Scene Reasoning without 3D Annotations	Jun 21, 2025	Question Answering, Scene Understanding	Unverified	0
Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems	Jun 17, 2025	Autonomous Driving, Image Segmentation	Unverified	0
Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment	Jun 17, 2025	Autonomous Driving, Instance Segmentation	Unverified	0
Unified Representation Space for 3D Visual Grounding	Jun 17, 2025	3D visual grounding, Contrastive Learning	Unverified	0
SceneAware: Scene-Constrained Pedestrian Trajectory Prediction with LLM-Guided Walkability	Jun 17, 2025	Pedestrian Trajectory Prediction, Scene Understanding	Code Available	0
FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding	Jun 16, 2025	Form, Graph Generation	Unverified	0
SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis	Jun 12, 2025	Novel View Synthesis, Scene Understanding	Unverified	0
SemanticSplat: Feed-Forward 3D Scene Understanding with Language-Aware Gaussian Fields	Jun 11, 2025	3D Reconstruction, Scene Understanding	Unverified	0
Robust Visual Localization via Semantic-Guided Multi-Scale Transformer	Jun 10, 2025	regression, Scene Understanding	Unverified	0

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN (ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified