SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151–1200 of 1723 papers

Title | Status | Hype
A Survey on Deep Learning Technique for Video Segmentation | Code | 1
An Analysis of State-of-the-Art Models for Situated Interactive MultiModal Conversations (SIMMC) | | 0
Egocentric Image Captioning for Privacy-Preserved Passive Dietary Intake Monitoring | | 0
Unsupervised Image Segmentation by Mutual Information Maximization and Adversarial Regularization | | 0
IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement | | 0
False Negative Reduction in Video Instance Segmentation using Uncertainty Estimates | Code | 0
SDOF-Tracker: Fast and Accurate Multiple Human Tracking by Skipped-Detection and Optical-Flow | Code | 0
OffRoadTranSeg: Semi-Supervised Segmentation using Transformers on OffRoad environments | | 0
iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability | | 0
P2T: Pyramid Pooling Transformer for Scene Understanding | Code | 1
EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation | Code | 1
Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation | Code | 1
OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets | | 0
Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion | | 0
Part-aware Panoptic Segmentation | Code | 1
Vision Transformers with Hierarchical Attention | Code | 1
Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering | Code | 1
Towards urban scenes understanding through polarization cues | | 0
Polarimetric Spatio-Temporal Light Transport Probing | | 0
Egocentric Activity Recognition and Localization on a 3D Map | | 0
SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data | | 0
Image interpretation by iterative bottom-up top-down processing | Code | 0
Scene Understanding for Autonomous Driving | | 0
Lane Graph Estimation for Scene Understanding in Urban Driving | Code | 1
ACDC: The Adverse Conditions Dataset with Correspondences for Robust Semantic Driving Scene Perception | | 0
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition | Code | 1
Aerial Scene Understanding in The Wild: Multi-Scene Recognition via Prototype-based Memory Networks | Code | 0
Wireless Sensing With Deep Spectrogram Network and Primitive Based Autoregressive Hybrid Channel Model | | 0
MonoGRNet: A General Framework for Monocular 3D Object Detection | | 0
SSPC-Net: Semi-supervised Semantic 3D Point Cloud Segmentation Network | Code | 1
Single Image Depth Estimation: An Overview | | 0
Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition | Code | 1
Semantic Scene Completion via Integrating Instances and Scene in-the-Loop | Code | 1
Affordance Transfer Learning for Human-Object Interaction Detection | Code | 1
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos | Code | 1
Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation | | 0
Evaluation of Multimodal Semantic Segmentation using RGB-D Data | | 0
Multi-View Radar Semantic Segmentation | Code | 1
PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN | | 0
SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences | Code | 1
Bidirectional Projection Network for Cross Dimension Scene Understanding | Code | 1
Input-Output Balanced Framework for Long-tailed LiDAR Semantic Segmentation | | 0
Tracking Pedestrian Heads in Dense Crowd | Code | 1
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding | Code | 1
OFFSEG: A Semantic Segmentation Framework For Off-Road Driving | Code | 1
Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving | | 0
Knowledge-Guided Object Discovery with Acquired Deep Impressions | Code | 0
A Comprehensive Survey of Scene Graphs: Generation and Application | | 0
Lite-HDSeg: LiDAR Semantic Segmentation Using Lite Harmonic Dense Convolutions | | 0
Detecting Human-Object Interaction via Fabricated Compositional Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified