SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1351-1400 of 1723 papers

| Title | Status | Hype |
| Visual Lexicon: Rich Image Features in Language Space |  | 0 |
| Visual Semantic Parsing: From Images to Abstract Meaning Representation |  | 0 |
| Visual-Semantic Scene Understanding by Sharing Labels in a Context Network |  | 0 |
| Visual Traffic Knowledge Graph Generation from Scene Images |  | 0 |
| Visual Vibrometry: Estimating Material Properties from Small Motions in Video |  | 0 |
| Visual Vibrometry: Estimating Material Properties From Small Motion in Video |  | 0 |
| Visual Vibrometry: Estimating Material Properties from Small Motions in Video |  | 0 |
| Visuomotor Understanding for Representation Learning of Driving Scenes |  | 0 |
| VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion |  | 0 |
| VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry |  | 0 |
| VLP: Vision Language Planning for Autonomous Driving |  | 0 |
| VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding |  | 0 |
| VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding |  | 0 |
| vS-Graphs: Integrating Visual SLAM and Situational Graphs through Multi-level Scene Understanding |  | 0 |
| Waymo Open Dataset: Panoramic Video Panoptic Segmentation |  | 0 |
| Weakly Supervised 3D Instance Segmentation without Instance-level Annotations |  | 0 |
| Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment |  | 0 |
| Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning |  | 0 |
| Weakly Supervised Learning of Affordances |  | 0 |
| Weakly Supervised Point Clouds Transformer for 3D Object Detection |  | 0 |
| What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots |  | 0 |
| What Demands Attention in Urban Street Scenes? From Scene Understanding towards Road Safety: A Survey of Vision-driven Datasets and Studies |  | 0 |
| What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework? |  | 0 |
| When Neural Networks Using Different Sensors Create Similar Features |  | 0 |
| When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach |  | 0 |
| Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks |  | 0 |
| Wireless Sensing With Deep Spectrogram Network and Primitive Based Autoregressive Hybrid Channel Model |  | 0 |
| YETI (YET to Intervene) Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks |  | 0 |
| You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects |  | 0 |
| You Only Speak Once to See |  | 0 |
| Zero-Shot 4D Lidar Panoptic Segmentation |  | 0 |
| Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models |  | 0 |
| Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models |  | 0 |
| Zero-Shot Semantic Segmentation via Spatial and Multi-Scale Aware Visual Class Embedding |  | 0 |
| ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding |  | 0 |
| Polarimetric Spatio-Temporal Light Transport Probing |  | 0 |
| Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments |  | 0 |
| PoSeg: Pose-Aware Refinement Network for Human Instance Segmentation |  | 0 |
| PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation |  | 0 |
| Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos |  | 0 |
| Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps |  | 0 |
| Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving |  | 0 |
| Prediction of Scene Plausibility |  | 0 |
| PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network |  | 0 |
| PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario |  | 0 |
| Probabilistic Future Prediction for Video Scene Understanding |  | 0 |
| ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation |  | 0 |
| Prospective Role of Foundation Models in Advancing Autonomous Vehicles |  | 0 |
| PSDR-Room: Single Photo to Scene using Differentiable Rendering |  | 0 |
| Pseudo Label-Guided Multi Task Learning for Scene Understanding |  | 0 |

Benchmark Results

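| # | Model | Metric | Claimed | Verified | Status |
| 1 | ACRV Baseline | OMQ | 0.44 |  | Unverified |
| 2 | Team VGAI (TCS Research) | OMQ | 0.37 |  | Unverified |
| 3 | Demo_semantic_SLAM | OMQ | 0.11 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| 1 | CPN(ResNet-101) | Mean IoU | 46.3 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| 1 | ACRV Baseline | OMQ | 0.35 |  | Unverified |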