SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1551-1575 of 1723 papers

Title | Status | Hype
Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting | Code | 0
Multi-task Planar Reconstruction with Feature Warping Guidance | Code | 0
Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images | Code | 0
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image | Code | 0
Multi-Resolution Multi-Modal Sensor Fusion For Remote Sensing Data With Label Uncertainty | Code | 0
ShelfNet for Fast Semantic Segmentation | Code | 0
Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation | Code | 0
Deep Depth from Defocus: how can defocus blur improve 3D estimation using dense neural networks? | Code | 0
BACS: Background Aware Continual Semantic Segmentation | Code | 0
RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds | Code | 0
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data | Code | 0
Hierarchical Superpixel Segmentation via Structural Information Theory | Code | 0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Code | 0
Veritatem Dies Aperit- Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach | Code | 0
Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach | Code | 0
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding | Code | 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Code | 0
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery | Code | 0
MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification | Code | 0
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking | Code | 0
MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization | Code | 0
Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud | Code | 0
DC-Scene: Data-Centric Learning for 3D Scene Understanding | Code | 0
RIO: 3D Object Instance Re-Localization in Changing Indoor Environments | Code | 0
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 |  | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 |  | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 |  | Unverified