Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
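One common way to make "objects plus their spatial relationships" concrete is a scene graph. The following is a minimal sketch, not a method from any paper listed here: the names `SceneObject`, `box_center`, and `infer_relations` are hypothetical, boxes are assumed to be axis-aligned 2D image boxes with y growing downward, and the two predicates (`left_of`, `above`) are a deliberately tiny illustrative vocabulary.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed image coordinates


@dataclass(frozen=True)
class SceneObject:
    """A detected object: a label plus its 2D bounding box (hypothetical type)."""
    name: str
    box: Box


def box_center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center of an axis-aligned box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def infer_relations(objects: List[SceneObject]) -> List[Tuple[str, str, str]]:
    """Derive (subject, predicate, object) triples by comparing box centers.

    Assumption: y grows downward, so a smaller y means "above".
    """
    relations = []
    for a in objects:
        for b in objects:
            if a is b:
                continue
            ax, ay = box_center(a.box)
            bx, by = box_center(b.box)
            if ax < bx:
                relations.append((a.name, "left_of", b.name))
            if ay < by:
                relations.append((a.name, "above", b.name))
    return relations


lamp = SceneObject("lamp", (10.0, 10.0, 20.0, 30.0))
table = SceneObject("table", (5.0, 40.0, 60.0, 80.0))
relations = infer_relations([lamp, table])
```

Here `relations` contains the triples `("lamp", "left_of", "table")` and `("lamp", "above", "table")`. Practical systems typically learn such relations and work in 3D rather than using center comparisons.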

Papers


Showing 131–140 of 1723 papers

Title	Date	Tasks	Status	Hype
SoccerNet-v3D: Leveraging Sports Broadcast Replays for 3D Scene Understanding	Apr 14, 2025	Camera Calibration, Object Localization	Code Available	1
Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding	Apr 9, 2025	Scene Understanding, Self-Supervised Learning	Code Available	1
CamContextI2V: Context-aware Controllable Video Generation	Apr 8, 2025	Diversity, Scene Understanding	Code Available	1
F-ViTA: Foundation Model Guided Visible to Thermal Translation	Apr 3, 2025	Scene Understanding, Style Transfer	Code Available	1
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision	Apr 3, 2025	3D Object Detection, cross-modal alignment	Code Available	1
WikiVideo: Article Generation from Multiple Videos	Apr 1, 2025	Articles, RAG	Code Available	1
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model	Mar 30, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	1
Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction	Mar 28, 2025	Autonomous Driving, Scene Understanding	Code Available	1
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs	Mar 25, 2025	Benchmarking, Scene Segmentation	Code Available	1
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding	Mar 20, 2025	Scene Understanding	Code Available	1


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified
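
For reference, the Mean IoU metric reported for ADE20K val is conventionally the per-class intersection-over-union averaged over classes. The sketch below shows that standard definition on flat label sequences. It is an illustration only: the function name `mean_iou` is hypothetical, classes absent from both prediction and ground truth are skipped, and the exact protocol behind the 46.3 figure (for example, ignored labels or reporting as a percentage) is not stated in the source.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over classes that appear in pred or target.

    pred and target are equal-length sequences of integer class labels
    in the range [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both sequences
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is 7/12, or about 0.583. Multiplying by 100 gives the percentage form in which such scores are often listed.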