SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 251–275 of 1723 papers

Title | Status | Hype
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Code | 1
Digging Into Self-Supervised Monocular Depth Estimation | Code | 1
Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups | Code | 1
Detecting Human-Object Interaction via Fabricated Compositional Learning | Code | 1
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Code | 1
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning | Code | 1
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction | Code | 1
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian | Code | 1
Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction | Code | 1
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Code | 1
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation | Code | 1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects | Code | 1
LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data | Code | 1
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos | Code | 1
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion | Code | 1
AirObject: A Temporally Evolving Graph Embedding for Object Identification | Code | 1
Deep learning for radar data exploitation of autonomous vehicle | Code | 1
A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving | Code | 1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization | Code | 1
Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection | Code | 1
Learning Visual Commonsense for Robust Scene Graph Generation | Code | 1
Learning Human-Object Interaction Detection using Interaction Points | Code | 1
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified