SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151–1175 of 1723 papers

Title | Status | Hype
ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling | | 0
Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos | | 0
Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding | | 0
Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps | | 0
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving | | 0
Prediction of Scene Plausibility | | 0
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network | | 0
CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds | | 0
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning | | 0
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | | 0
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation | | 0
Probabilistic Future Prediction for Video Scene Understanding | | 0
ChatSplat: 3D Conversational Gaussian Splatting | | 0
ChatBEV: A Visual Language Model that Understands BEV Maps | | 0
ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation | | 0
Prospective Role of Foundation Models in Advancing Autonomous Vehicles | | 0
Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation | | 0
PSDR-Room: Single Photo to Scene using Differentiable Rendering | | 0
Pseudo Label-Guided Multi Task Learning for Scene Understanding | | 0
PT-ResNet: Perspective Transformation-Based Residual Network for Semantic Road Image Segmentation | | 0
Challenges for Monocular 6D Object Pose Estimation in Robotics | | 0
Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM | | 0
Quantifying the synthetic and real domain gap in aerial scene understanding | | 0
Vision-Language Embodiment for Monocular Depth Estimation | | 0
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified