SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 126–150 of 1723 papers

Title | Status | Hype
Deep learning for radar data exploitation of autonomous vehicle | Code | 1
DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency | Code | 1
F-ViTA: Foundation Model Guided Visible to Thermal Translation | Code | 1
DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion | Code | 1
Arabic Scene Text Recognition in the Deep Learning Era: Analysis on A Novel Dataset | Code | 1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection | Code | 1
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes | Code | 1
FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions | Code | 1
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames | Code | 1
Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding | Code | 1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects | Code | 1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
Global-Reasoned Multi-Task Learning Model for Surgical Scene Understanding | Code | 1
CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments | Code | 1
Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild | Code | 1
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation | Code | 1
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge | Code | 1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images | Code | 1
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding | Code | 1
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts | Code | 1
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1
Context Prior for Scene Segmentation | Code | 1
3DRM:Pair-wise relation module for 3D object detection | Code | 1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified