SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.
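
As a rough illustration of what "how objects relate to each other" can mean in practice, the sketch below turns a few labelled 2D boxes into (subject, relation, object) triples, a toy form of scene graph. Everything in it is an assumption made for illustration: the `Box` class, the `relation` rules, and the example coordinates are hypothetical and are not taken from any paper or benchmark listed on this page.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D box in image coordinates (y grows downward). Hypothetical type."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def horizontal_overlap(a: Box, b: Box) -> float:
    """Width of the shared x-interval of two boxes (0.0 if they do not overlap)."""
    return max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))


def relation(a: Box, b: Box) -> str:
    """Coarse spatial relation of `a` with respect to `b` (illustrative rules only)."""
    if horizontal_overlap(a, b) > 0:
        if a.y_max <= b.y_min:
            return "above"
        if a.y_min >= b.y_max:
            return "below"
        return "overlaps"
    return "left of" if a.x_max <= b.x_min else "right of"


def build_scene_graph(boxes):
    """Return one (subject, relation, object) triple per unordered pair of boxes."""
    return [(a.label, relation(a, b), b.label) for a, b in combinations(boxes, 2)]


# Made-up detections: recognition alone would stop at the three labels.
boxes = [
    Box("lamp", 40, 10, 60, 50),
    Box("table", 20, 50, 120, 90),
    Box("chair", 130, 40, 170, 95),
]
graph = build_scene_graph(boxes)
for triple in graph:
    print(triple)
```

Real systems typically learn such relations from data and work in 3D; the hand-written rules here only show the kind of output a relational representation adds on top of object labels.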

Papers

Showing 801-825 of 1723 papers

Title | Status | Hype
Transavs: End-To-End Audio-Visual Segmentation With Transformer | - | 0
Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and Beyond | Code | 1
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs | - | 0
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding | - | 0
Living in a Material World: Learning Material Properties from Full-Waveform Flash Lidar Data for Semantic Segmentation | - | 0
Learning-based Relational Object Matching Across Views | - | 0
ArK: Augmented Reality with Knowledge Interactive Emergent Ability | - | 0
TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding | Code | 2
DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization | Code | 1
Neural Implicit Dense Semantic SLAM | - | 0
A Review of Panoptic Segmentation for Mobile Mapping Point Clouds | Code | 1
Compositional 3D Human-Object Neural Animation | - | 0
ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding | - | 0
RGB-D Indiscernible Object Counting in Underwater Scenes | Code | 1
Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation | Code | 1
Advances in Deep Concealed Scene Understanding | Code | 1
Factored Neural Representation for Scene Understanding | - | 0
RS2G: Data-Driven Scene-Graph Extraction and Embedding for Robust Autonomous Perception and Scenario Understanding | Code | 1
360° High-Resolution Depth Estimation via Uncertainty-aware Structural Knowledge Transfer | - | 0
STRAP: Structured Object Affordance Segmentation with Point Supervision | Code | 1
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos | Code | 1
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection | Code | 1
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding | Code | 2
iDisc: Internal Discretization for Monocular Depth Estimation | Code | 3
Graph-based Topology Reasoning for Driving Scenes | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified