Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 131–140 of 1723 papers

Title	Date	Tasks	Status	Hype
WikiVideo: Article Generation from Multiple Videos	Apr 1, 2025	Articles, RAG	Code Available	1
Zero-Shot 4D Lidar Panoptic Segmentation	Apr 1, 2025	Diversity, Panoptic Segmentation	Unverified	0
Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights	Apr 1, 2025	Activity Prediction, Domain Generalization	Unverified	0
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model	Mar 30, 2025	Autonomous Driving, Decision Making	Code Available	4
PhysPose: Refining 6D Object Poses with Physical Constraints	Mar 30, 2025	6D Pose Estimation using RGB, Pose Estimation	Unverified	0
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model	Mar 30, 2025	Depth Estimation, Monocular Depth Estimation	Code Available	1
Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments	Mar 29, 2025	Navigate, Open Vocabulary Semantic Segmentation	Unverified	0
Empowering Large Language Models with 3D Situation Awareness	Mar 29, 2025	Scene Understanding	Unverified	0
Evaluating Compositional Scene Understanding in Multimodal Generative Models	Mar 29, 2025	Scene Understanding	Code Available	0
Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery	Mar 29, 2025	Action Understanding, Instrument Recognition	Unverified	0

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN (ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified