SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 351–400 of 1723 papers

Title | Status | Hype
ODAM: Object Detection, Association, and Mapping using Posed RGB Video | Code | 1
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks | Code | 1
A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving | Code | 1
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | Code | 1
One-Shot Object Affordance Detection in the Wild | Code | 1
Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images | Code | 1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection | Code | 1
Arabic Scene Text Recognition in the Deep Learning Era: Analysis on A Novel Dataset | Code | 1
ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation | Code | 1
Photon-Starved Scene Inference using Single Photon Cameras | Code | 1
Class-Incremental Domain Adaptation with Smoothing and Calibration for Surgical Report Generation | Code | 1
SynPick: A Dataset for Dynamic Bin Picking Scene Understanding | Code | 1
A Survey on Deep Learning Technique for Video Segmentation | Code | 1
P2T: Pyramid Pooling Transformer for Scene Understanding | Code | 1
EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation | Code | 1
Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation | Code | 1
Part-aware Panoptic Segmentation | Code | 1
Vision Transformers with Hierarchical Attention | Code | 1
Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering | Code | 1
Lane Graph Estimation for Scene Understanding in Urban Driving | Code | 1
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition | Code | 1
SSPC-Net: Semi-supervised Semantic 3D Point Cloud Segmentation Network | Code | 1
Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition | Code | 1
Semantic Scene Completion via Integrating Instances and Scene in-the-Loop | Code | 1
Affordance Transfer Learning for Human-Object Interaction Detection | Code | 1
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos | Code | 1
Multi-View Radar Semantic Segmentation | Code | 1
SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences | Code | 1
Bidirectional Projection Network for Cross Dimension Scene Understanding | Code | 1
Tracking Pedestrian Heads in Dense Crowd | Code | 1
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding | Code | 1
OFFSEG: A Semantic Segmentation Framework For Off-Road Driving | Code | 1
Detecting Human-Object Interaction via Fabricated Compositional Learning | Code | 1
Monte Carlo Scene Search for 3D Scene Understanding | Code | 1
Holistic 3D Scene Understanding from a Single Image with Implicit Representation | Code | 1
Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality | Code | 1
Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis | Code | 1
Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning | Code | 1
FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation | Code | 1
Boundary-induced and scene-aggregated network for monocular depth prediction | Code | 1
4D Panoptic LiDAR Segmentation | Code | 1
RGB-D Railway Platform Monitoring and Scene Understanding for Enhanced Passenger Safety | Code | 1
Weakly Supervised Learning of Rigid 3D Scene Flow | Code | 1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images | Code | 1
Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas | Code | 1
OpenGF: An Ultra-Large-Scale Ground Filtering Dataset Built Upon Open ALS Point Clouds Around the World | Code | 1
Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups | Code | 1
Grounding Consistency: Distilling Spatial Common Sense for Precise Visual Relationship Detection | Code | 1
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts | Code | 1
Event-based Motion Segmentation with Spatio-Temporal Graph Cuts | Code | 1
Page 8 of 35

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified