SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1451–1500 of 1723 papers

Title | Status | Hype
Assessing the generalization performance of SAM for ureteroscopy scene understanding | | 0
A Sentence Is Worth a Thousand Pixels | | 0
A Semantic Communication System for Real-time 3D Reconstruction Tasks | | 0
Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph | | 0
Semantic and structural image segmentation for prosthetic vision | | 0
Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning | | 0
You Only Speak Once to See | | 0
Structured agents for physical construction | | 0
Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization | | 0
Structured Generative Models for Scene Understanding | | 0
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning | | 0
Neural Language of Thought Models | | 0
A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation | | 0
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos | | 0
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing | | 0
A Robotic 3D Perception System for Operating Room Environment Awareness | | 0
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite | | 0
SUPER: A Novel Lane Detection System | | 0
ArK: Augmented Reality with Knowledge Interactive Emergent Ability | | 0
SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians | | 0
Weakly Supervised Learning of Affordances | | 0
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review | | 0
SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection | | 0
SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks | | 0
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models | | 0
A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-based Semantic Scene Understanding | | 0
SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field | | 0
Survey of Action Recognition, Spotting and Spatio-Temporal Localization in Soccer -- Current Trends and Research Perspectives | | 0
A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with Traditional/Learned 3D Descriptors | | 0
A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators | | 0
3D-Aware Instance Segmentation and Tracking in Egocentric Videos | | 0
Symbolic Graph Inference for Compound Scene Understanding | | 0
Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation | | 0
Syn-Mediverse: A Multimodal Synthetic Dataset for Intelligent Scene Understanding of Healthcare Facilities | | 0
A Reinforcement Learning Approach to Target Tracking in a Camera Network | | 0
SynthCam3D: Semantic Understanding With Synthetic Indoor Scenes | | 0
Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery | | 0
A Reflectance Based Method For Shadow Detection and Removal | | 0
Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander | | 0
Tactile MNIST: Benchmarking Active Tactile Perception | | 0
TADFormer : Task-Adaptive Dynamic Transformer for Efficient Multi-Task Learning | | 0
TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning | | 0
Are Cars Just 3D Boxes? - Jointly Estimating the 3D Shape of Multiple Objects | | 0
AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis | | 0
Talk-to-Resolve: Combining scene understanding and spatial dialogue to resolve granular task ambiguity for a collocated robot | | 0
TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | | 0
Weakly Supervised Point Clouds Transformer for 3D Object Detection | | 0
TARS: Traffic-Aware Radar Scene Flow Estimation | | 0
Zero-Shot 4D Lidar Panoptic Segmentation | | 0
A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified