SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1401–1425 of 1723 papers

Title | Status | Hype
Single Image 3D Without a Single 3D Image | | 0
Single Image Depth Estimation: An Overview | | 0
Single-Input Multi-Output Model Merging: Leveraging Foundation Models for Dense Multi-Task Learning | | 0
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning | | 0
AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided Surgical Automation in Laparoscopic Hysterectomy | | 0
You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects | | 0
Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture | | 0
Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation through Hybrid Vision | | 0
Waymo Open Dataset: Panoramic Video Panoptic Segmentation | | 0
SkyScenes: A Synthetic Dataset for Aerial Scene Understanding | | 0
SLGaussian: Fast Language Gaussian Splatting in Sparse Views | | 0
Weakly Supervised 3D Instance Segmentation without Instance-level Annotations | | 0
Small Drone Field Experiment: Data Collection & Processing | | 0
Small-Variance Nonparametric Clustering on the Hypersphere | | 0
Smart Infrastructure: A Research Junction | | 0
Audiovisual Highlight Detection in Videos | | 0
SNeL: A Structured Neuro-Symbolic Language for Entity-Based Multimodal Scene Understanding | | 0
Audio-visual Event Localization on Portrait Mode Short Videos | | 0
Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition | | 0
3D Gated Recurrent Fusion for Semantic Scene Completion | | 0
Software-Defined FPGA Accelerator Design for Mobile Deep Learning Applications | | 0
SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images | | 0
So you think you can track? | | 0
SparseLGS: Sparse View Language Embedded Gaussian Splatting | | 0
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified