SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 651–675 of 1723 papers

Title | Status | Hype
Prospective Role of Foundation Models in Advancing Autonomous Vehicles | | 0
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Code | 1
IGFNet: Illumination-Guided Fusion Network for Semantic Scene Understanding using RGB-Thermal Images | Code | 0
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Code | 4
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM | Code | 4
A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with Traditional/Learned 3D Descriptors | | 0
Segment Any 3D Gaussians | | 0
Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Code | 3
Gaussian Grouping: Segment and Edit Anything in 3D Scenes | Code | 2
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding | Code | 1
SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation | Code | 1
HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group Activity Scene Graph Generation in Videos | | 0
Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames | | 0
Panoptic Video Scene Graph Generation | Code | 1
REACT: Recognize Every Action Everywhere All At Once | | 0
FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding | | 0
Multi-task Planar Reconstruction with Feature Warping Guidance | Code | 0
GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction | | 0
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge | Code | 1
SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles | | 0
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding | | 0
SpectralGPT: Spectral Remote Sensing Foundation Model | Code | 2
Two Stream Scene Understanding on Graph Embedding | | 0
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | Code | 3
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified