SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151–1200 of 1723 papers

Title | Status | Hype
ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling | | 0
Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos | | 0
Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding | | 0
Predicting Reaction Time to Comprehend Scenes with Foveated Scene Understanding Maps | | 0
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving | | 0
Prediction of Scene Plausibility | | 0
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network | | 0
CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds | | 0
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning | | 0
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | | 0
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation | | 0
Probabilistic Future Prediction for Video Scene Understanding | | 0
ChatSplat: 3D Conversational Gaussian Splatting | | 0
ChatBEV: A Visual Language Model that Understands BEV Maps | | 0
ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation | | 0
Prospective Role of Foundation Models in Advancing Autonomous Vehicles | | 0
Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation | | 0
PSDR-Room: Single Photo to Scene using Differentiable Rendering | | 0
Pseudo Label-Guided Multi Task Learning for Scene Understanding | | 0
PT-ResNet: Perspective Transformation-Based Residual Network for Semantic Road Image Segmentation | | 0
Challenges for Monocular 6D Object Pose Estimation in Robotics | | 0
Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM | | 0
Quantifying the synthetic and real domain gap in aerial scene understanding | | 0
Vision-Language Embodiment for Monocular Depth Estimation | | 0
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding | | 0
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding | | 0
Category-Level and Open-Set Object Pose Estimation for Robotics | | 0
Radiation Search Operations using Scene Understanding with Autonomous UAV and UGV | | 0
Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images | | 0
RAFT: Robust Augmentation of FeaTures for Image Segmentation | | 0
RailSem19: A Dataset for Semantic Rail Scene Understanding | | 0
RangeSeg: Range-Aware Real Time Segmentation of 3D LiDAR Point Clouds | | 0
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning | | 0
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry | | 0
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration | | 0
RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation | | 0
Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding | | 0
REACT: Recognize Every Action Everywhere All At Once | | 0
RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation | | 0
Vision-Language Models Struggle to Align Entities across Modalities | | 0
Real time backbone for semantic segmentation | | 0
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL) | | 0
Real-Time Semantic Stereo Matching | | 0
Reasoning About Physical Interactions with Object-Centric Models | | 0
Reasoning About Physical Interactions with Object-Oriented Prediction and Planning | | 0
Reasoning with shapes: profiting cognitive susceptibilities to infer linear mapping transformations between shapes | | 0
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation | | 0
Recognizing Dynamic Scenes with Deep Dual Descriptor based on Key Frames and Key Segments | | 0
Recognizing Material Properties from Images | | 0
Reconstructing Animals and the Wild | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified