SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1001-1025 of 1723 papers

Title | Status | Hype
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding | | 0
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior | Code | 1
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation | Code | 2
Online panoptic 3D reconstruction as a Linear Assignment Problem | Code | 1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction | Code | 1
Collaborative Transformers for Grounded Situation Recognition | Code | 1
Multi-Task Learning for Visual Scene Understanding | | 0
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Code | 1
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification | | 0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering | | 0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding | Code | 0
Towards 3D Scene Understanding by Referring Synthetic Models | | 0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows | | 0
Deep Point Cloud Simplification for High-quality Surface Reconstruction | | 0
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering | Code | 1
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans | | 0
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection | Code | 1
Deep learning for radar data exploitation of autonomous vehicle | Code | 1
InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding | Code | 2
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry | | 0
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | Code | 2
On Steering Multi-Annotations per Sample for Multi-Task Learning | | 0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks | | 0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection | | 0
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified