SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 851-900 of 1723 papers

Title | Status | Hype
Learning Segmented 3D Gaussians via Efficient Feature Unprojection for Zero-shot Neural Scene Segmentation | | 0
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting | | 0
COUNT Forest: CO-Voting Uncertain Number of Targets Using Random Forest for Crowd Density Estimation | | 0
Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving | | 0
Cross-modal Learning for Multi-modal Video Categorization | | 0
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos | | 0
DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | | 0
DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning | | 0
DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference | | 0
Data-Driven Scene Understanding with Adaptively Retrieved Exemplars | | 0
DAWN: Vehicle Detection in Adverse Weather Nature Dataset | | 0
Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks | | 0
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding | | 0
Deep Contextual Attention for Human-Object Interaction Detection | | 0
Deep cross-domain building extraction for selective depth estimation from oblique aerial imagery | | 0
Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation | | 0
Deep Learned Full-3D Object Completion from Single View | | 0
Deep Learning Advances in Vision-Based Traffic Accident Anticipation: A Comprehensive Review of Methods, Datasets, and Future Directions | | 0
Deep Learning Techniques for Geospatial Data Analysis | | 0
Deeply Learned Attributes for Crowded Scene Understanding | | 0
Deep Optics for Monocular Depth Estimation and 3D Object Detection | | 0
Deep Point Cloud Simplification for High-quality Surface Reconstruction | | 0
Deep Robust Single Image Depth Estimation Neural Network Using Scene Understanding | | 0
Deep Scene Text Detection with Connected Component Proposals | | 0
Deep Semantic Segmentation of Natural and Medical Images: A Review | | 0
Deep Structured Scene Parsing by Learning with Image Descriptions | | 0
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding | | 0
Dense RGB-D semantic mapping with Pixel-Voxel neural network | | 0
Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds | | 0
Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization | | 0
DepthCut: Improved Depth Edge Estimation Using Multiple Unreliable Channels | | 0
Depth Estimation using Weighted-loss and Transfer Learning | | 0
Depth Not Needed - An Evaluation of RGB-D Feature Encodings for Off-Road Scene Understanding by Convolutional Neural Network | | 0
Design and Evaluation of Deep Learning-Based Dual-Spectrum Image Fusion Methods | | 0
Designing Deep Networks for Surface Normal Estimation | | 0
Designing DNNs for a trade-off between robustness and processing performance in embedded devices | | 0
DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction | | 0
Diagnostics in Semantic Segmentation | | 0
Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis | | 0
DiffSDFSim: Differentiable Rigid-Body Dynamics With Implicit Shapes | | 0
Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer | | 0
Diffusion Models in 3D Vision: A Survey | | 0
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | | 0
Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems | | 0
DINeMo: Learning Neural Mesh Models with no 3D Annotations | | 0
Direction-Aware Semi-Dense SLAM | | 0
DirectShape: Direct Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation | | 0
Disaster Anomaly Detector via Deeper FCDDs for Explainable Initial Responses | | 0
Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization | | 0
Discriminative Multi-Modal Feature Fusion for RGBD Indoor Scene Recognition | | 0
Page 18 of 35

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified