SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1001-1050 of 1723 papers

Title | Status | Hype
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding | - | 0
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior | Code | 1
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation | Code | 2
Online panoptic 3D reconstruction as a Linear Assignment Problem | Code | 1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction | Code | 1
Collaborative Transformers for Grounded Situation Recognition | Code | 1
Multi-Task Learning for Visual Scene Understanding | - | 0
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Code | 1
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification | - | 0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering | - | 0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding | Code | 0
Towards 3D Scene Understanding by Referring Synthetic Models | - | 0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows | - | 0
Deep Point Cloud Simplification for High-quality Surface Reconstruction | - | 0
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering | Code | 1
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans | - | 0
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection | Code | 1
Deep learning for radar data exploitation of autonomous vehicle | Code | 1
InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding | Code | 2
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry | - | 0
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | Code | 2
On Steering Multi-Annotations per Sample for Multi-Task Learning | - | 0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks | - | 0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection | - | 0
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation | Code | 1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation | Code | 1
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning | Code | 1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment | Code | 1
GroupViT: Semantic Segmentation Emerges from Text Supervision | Code | 2
ReorientBot: Learning Object Reorientation for Specific-Posed Placement | Code | 1
Movies2Scenes: Using Movie Metadata to Learn Scene Representation | - | 0
3DRM:Pair-wise relation module for 3D object detection | Code | 1
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
HAKE: A Knowledge Engine Foundation for Human Activity Understanding | Code | 2
SafePicking: Learning Safe Object Extraction via Object-Level Mapping | Code | 1
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics | Code | 1
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL) | - | 0
StandardSim: A Synthetic Dataset For Retail Environments | - | 0
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation | Code | 0
Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction | Code | 0
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding | Code | 1
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection | Code | 1
Moving Beyond Navigation with Active Neural SLAM | - | 0
Towards holistic scene understanding: Semantic segmentation and beyond | - | 0
Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety | - | 0
Scene Graph Generation: A Comprehensive Survey | - | 0
Weakly Supervised Segmentation on Outdoor 4D Point Clouds With Temporal Matching and Spatial Graph Propagation | Code | 0
Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation | - | 0
Glass Segmentation Using Intensity and Spectral Polarization Cues | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified