SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 13511375 of 1723 papers

TitleStatusHype
Visual Lexicon: Rich Image Features in Language Space0
Visual Semantic Parsing: From Images to Abstract Meaning Representation0
Visual-Semantic Scene Understanding by Sharing Labels in a Context Network0
Visual Traffic Knowledge Graph Generation from Scene Images0
Visual Vibrometry: Estimating MaterialProperties from Small Motions in Video0
Visual Vibrometry: Estimating Material Properties From Small Motion in Video0
Visual Vibrometry: Estimating Material Properties from Small Motions in Video0
Visuomotor Understanding for Representation Learning of Driving Scenes0
VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion0
VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry0
VLP: Vision Language Planning for Autonomous Driving0
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding0
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding0
vS-Graphs: Integrating Visual SLAM and Situational Graphs through Multi-level Scene Understanding0
Waymo Open Dataset: Panoramic Video Panoptic Segmentation0
Weakly Supervised 3D Instance Segmentation without Instance-level Annotations0
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment0
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning0
Weakly Supervised Learning of Affordances0
Weakly Supervised Point Clouds Transformer for 3D Object Detection0
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots0
What Demands Attention in Urban Street Scenes? From Scene Understanding towards Road Safety: A Survey of Vision-driven Datasets and Studies0
What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework?0
When Neural Networks Using Different Sensors Create Similar Features0
When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach0
Show:102550
← PrevPage 55 of 69Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1ACRV BaselineOMQ0.44Unverified
2Team VGAI (TCS Research)OMQ0.37Unverified
3Demo_semantic_SLAMOMQ0.11Unverified
#ModelMetricClaimedVerifiedStatus
1CPN(ResNet-101)Mean IoU46.3Unverified
#ModelMetricClaimedVerifiedStatus
1ACRV BaselineOMQ0.35Unverified