SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1051-1100 of 1723 papers

Title | Status | Hype
Robust Category-Level 3D Pose Estimation from Synthetic Data |  | 0
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments |  | 0
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer |  | 0
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamics Audio-Visual Scenarios | Code | 0
Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding |  | 0
Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs | Code | 0
MetaMorphosis: Task-oriented Privacy Cognizant Feature Generation for Multi-task Learning |  | 0
Transavs: End-To-End Audio-Visual Segmentation With Transformer |  | 0
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs |  | 0
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding |  | 0
Living in a Material World: Learning Material Properties from Full-Waveform Flash Lidar Data for Semantic Segmentation |  | 0
Learning-based Relational Object Matching Across Views |  | 0
ArK: Augmented Reality with Knowledge Interactive Emergent Ability |  | 0
Neural Implicit Dense Semantic SLAM |  | 0
Compositional 3D Human-Object Neural Animation |  | 0
ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding |  | 0
Factored Neural Representation for Scene Understanding |  | 0
360° High-Resolution Depth Estimation via Uncertainty-aware Structural Knowledge Transfer |  | 0
Semantic Segmentation with High Inference Speed in Off-Road Environments | Code | 0
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation |  | 0
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding | Code | 0
Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings |  | 0
OVeNet: Offset Vector Network for Semantic Segmentation | Code | 0
Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation |  | 0
Uni-Fusion: Universal Continuous Mapping |  | 0
Semantic segmentation of surgical hyperspectral images under geometric domain shifts |  | 0
Content Adaptive Front End For Audio Classification |  | 0
Efficient Computation Sharing for Multi-Task Visual Scene Understanding | Code | 0
Shifted-Windows Transformers for the Detection of Cerebral Aneurysms in Microsurgery |  | 0
PENet: A Joint Panoptic Edge Detection Network | Code | 0
Generalized 3D Self-supervised Learning Framework via Prompted Foreground-Aware Feature Contrast |  | 0
Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts, Datasets and Metrics |  | 0
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP |  | 0
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning | Code | 0
Unified Perception: Efficient Depth-Aware Video Panoptic Segmentation with Minimal Annotation Costs |  | 0
APARATE: Adaptive Adversarial Patch for CNN-based Monocular Depth Estimation for Autonomous Navigation |  | 0
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning |  | 0
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors |  | 0
RemoteNet: Remote Sensing Image Segmentation Network based on Global-Local Information |  | 0
Open Challenges for Monocular Single-shot 6D Object Pose Estimation |  | 0
Explicit3D: Graph Network with Spatial Inference for Single Image 3D Object Detection |  | 0
Structured Generative Models for Scene Understanding |  | 0
Object-Centric Scene Representations using Active Inference |  | 0
A Flexible Framework for Virtual Omnidirectional Vision to Improve Operator Situation Awareness |  | 0
Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation |  | 0
Model-based inexact graph matching on top of CNNs for semantic scene understanding | Code | 0
Long Range Pooling for 3D Large-Scale Scene Understanding |  | 0
A Comprehensive Review of Modern Object Segmentation Approaches |  | 0
Neural Radiance Field Codebooks | Code | 0
Seeing With Sound: Long-range Acoustic Beamforming for Multimodal Scene Understanding |  | 0
Page 22 of 35

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 |  | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 |  | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 |  | Unverified