SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1451–1475 of 1723 papers

Title | Status | Hype
Assessing the generalization performance of SAM for ureteroscopy scene understanding | - | 0
A Sentence Is Worth a Thousand Pixels | - | 0
A Semantic Communication System for Real-time 3D Reconstruction Tasks | - | 0
Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph | - | 0
Semantic and structural image segmentation for prosthetic vision | - | 0
Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning | - | 0
You Only Speak Once to See | - | 0
Structured agents for physical construction | - | 0
Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization | - | 0
Structured Generative Models for Scene Understanding | - | 0
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning | - | 0
Neural Language of Thought Models | - | 0
A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation | - | 0
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos | - | 0
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing | - | 0
A Robotic 3D Perception System for Operating Room Environment Awareness | - | 0
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite | - | 0
SUPER: A Novel Lane Detection System | - | 0
ArK: Augmented Reality with Knowledge Interactive Emergent Ability | - | 0
SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians | - | 0
Weakly Supervised Learning of Affordances | - | 0
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review | - | 0
SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection | - | 0
SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks | - | 0
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models | - | 0
Page 59 of 69

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified