SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 131140 of 1723 papers

TitleStatusHype
WikiVideo: Article Generation from Multiple VideosCode1
Zero-Shot 4D Lidar Panoptic Segmentation0
Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights0
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action ModelCode4
PhysPose: Refining 6D Object Poses with Physical Constraints0
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation ModelCode1
Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments0
Empowering Large Language Models with 3D Situation Awareness0
Evaluating Compositional Scene Understanding in Multimodal Generative ModelsCode0
Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery0
Show:102550
← PrevPage 14 of 173Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1ACRV BaselineOMQ0.44Unverified
2Team VGAI (TCS Research)OMQ0.37Unverified
3Demo_semantic_SLAMOMQ0.11Unverified
#ModelMetricClaimedVerifiedStatus
1CPN(ResNet-101)Mean IoU46.3Unverified
#ModelMetricClaimedVerifiedStatus
1ACRV BaselineOMQ0.35Unverified