SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 401–425 of 1723 papers

Title | Status | Hype
RADIATE: A Radar Dataset for Automotive Perception in Bad Weather | Code | 1
Real-Time Semantic Segmentation using Hyperspectral Images for Mapping Unstructured and Unknown Environments | Code | 1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Code | 1
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding | Code | 1
RELLIS-3D Dataset: Data, Benchmarks and Analysis | Code | 1
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition | Code | 1
RescueNet: A High Resolution UAV Semantic Segmentation Benchmark Dataset for Natural Disaster Damage Assessment | Code | 1
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding | Code | 1
RGB-D Railway Platform Monitoring and Scene Understanding for Enhanced Passenger Safety | Code | 1
Distilled Semantics for Comprehensive Scene Understanding from Videos | Code | 1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images | Code | 1
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts | Code | 1
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection | Code | 1
ROOT: VLM based System for Indoor Scene Understanding and Beyond | Code | 1
Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization | Code | 1
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model | Code | 1
Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis | Code | 1
SafePicking: Learning Safe Object Extraction via Object-Level Mapping | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data | Code | 1
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Code | 1
Scene Completeness-Aware Lidar Depth Completion for Driving Scenario | Code | 1
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning | Code | 1
DPF: Learning Dense Prediction Fields with Weak Supervision | Code | 1
Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified