SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 301-350 of 1723 papers

Title | Status | Hype
Uncertainty-aware Panoptic Segmentation | Code | 1
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments | Code | 1
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving | Code | 1
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds | Code | 1
P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior | Code | 1
Online panoptic 3D reconstruction as a Linear Assignment Problem | Code | 1
Point Scene Understanding via Disentangled Instance Mesh Reconstruction | Code | 1
Collaborative Transformers for Grounded Situation Recognition | Code | 1
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Code | 1
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering | Code | 1
WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection | Code | 1
Deep learning for radar data exploitation of autonomous vehicle | Code | 1
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation | Code | 1
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation | Code | 1
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning | Code | 1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment | Code | 1
ReorientBot: Learning Object Reorientation for Specific-Posed Placement | Code | 1
3DRM: Pair-wise relation module for 3D object detection | Code | 1
SafePicking: Learning Safe Object Extraction via Object-Level Mapping | Code | 1
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics | Code | 1
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding | Code | 1
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection | Code | 1
Point Cloud Pre-Training With Natural 3D Structures | Code | 1
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation | Code | 1
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation | Code | 1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation | Code | 1
ScanQA: 3D Question Answering for Spatial Scene Understanding | Code | 1
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation | Code | 1
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation | Code | 1
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection | Code | 1
AirObject: A Temporally Evolving Graph Embedding for Object Identification | Code | 1
Instance-wise Occlusion and Depth Orders in Natural Scenes | Code | 1
Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing | Code | 1
Grounded Situation Recognition with Transformers | Code | 1
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data | Code | 1
Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views | Code | 1
Panoptic 3D Scene Reconstruction From a Single RGB Image | Code | 1
3DP3: 3D Scene Perception via Probabilistic Programming | Code | 1
A Versatile and Efficient Reinforcement Learning Framework for Autonomous Driving | Code | 1
PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image | Code | 1
Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images | Code | 1
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D | Code | 1
Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds | Code | 1
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation | Code | 1
PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds | Code | 1
Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds | Code | 1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified