SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 301-325 of 1723 papers

Title | Status | Hype
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
General Geometry-aware Weakly Supervised 3D Object Detection | Code | 1
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding | Code | 1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images | Code | 1
A Survey on Deep Learning Technique for Video Segmentation | Code | 1
4D Panoptic LiDAR Segmentation | Code | 1
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion | Code | 1
Context Prior for Scene Segmentation | Code | 1
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes | Code | 1
Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild | Code | 1
CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery | Code | 1
Human-centric Scene Understanding for 3D Large-scale Scenarios | Code | 1
A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence | Code | 1
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving | Code | 1
CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments | Code | 1
Image Segmentation Using Deep Learning: A Survey | Code | 1
Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding | Code | 1
A Survey of World Models for Autonomous Driving | Code | 1
Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality | Code | 1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Code | 1
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Code | 1
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding | Code | 1
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D | Code | 1
Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation | Code | 1
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified