SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 151-175 of 1723 papers

Title | Status | Hype
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding | Code | 1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments | Code | 1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Code | 1
Bidirectional Projection Network for Cross Dimension Scene Understanding | Code | 1
Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images | Code | 1
A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images | Code | 1
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection | Code | 1
GFF: Gated Fully Fusion for Semantic Segmentation | Code | 1
Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation | Code | 1
Estimating Generic 3D Room Structures from 2D Annotations | Code | 1
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation | Code | 1
Event-aided Semantic Scene Completion | Code | 1
Event-based Motion Segmentation with Spatio-Temporal Graph Cuts | Code | 1
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | Code | 1
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge | Code | 1
3DP3: 3D Scene Perception via Probabilistic Programming | Code | 1
Explainable Object-induced Action Decision for Autonomous Vehicles | Code | 1
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering | Code | 1
Dynamic Graph Message Passing Networks for Visual Recognition | Code | 1
ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Code | 1
AVSegFormer: Audio-Visual Segmentation with Transformer | Code | 1
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation | Code | 1
Dynamic Graph Message Passing Networks | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified