SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 351-375 of 1723 papers

Title | Status | Hype
Dual-Hybrid Attention Network for Specular Highlight Removal | Code | 1
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge | Code | 1
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving | Code | 1
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks | Code | 1
AeroRIT: A New Scene for Hyperspectral Image Analysis | Code | 1
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding | Code | 1
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments | Code | 1
PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation | Code | 1
ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Code | 1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization | Code | 1
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection | Code | 1
Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds | Code | 1
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding | Code | 1
Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast | Code | 1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects | Code | 1
Image Masking for Robust Self-Supervised Monocular Depth Estimation | Code | 1
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction | Code | 1
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation | Code | 1
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection | Code | 1
Polysemy Deciphering Network for Robust Human-Object Interaction Detection | Code | 1
Multimodal Dataset for Localization, Mapping and Crop Monitoring in Citrus Tree Farms | Code | 1
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP | Code | 1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation | Code | 1
RGB-D Indiscernible Object Counting in Underwater Scenes | Code | 1
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified