SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 101–125 of 1723 papers

Title | Status | Hype
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Code | 2
Scene-Centric Unsupervised Panoptic Segmentation | Code | 2
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Code | 2
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis | Code | 2
SpectralGPT: Spectral Remote Sensing Foundation Model | Code | 2
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction | Code | 2
GroupViT: Semantic Segmentation Emerges from Text Supervision | Code | 2
Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model | Code | 2
Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving | Code | 2
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining | Code | 2
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI | Code | 2
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Code | 2
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | Code | 2
CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition | Code | 2
COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Code | 2
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning | Code | 2
Event-aided Semantic Scene Completion | Code | 1
Estimating Generic 3D Room Structures from 2D Annotations | Code | 1
Event-based Motion Segmentation with Spatio-Temporal Graph Cuts | Code | 1
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding | Code | 1
A Review of Panoptic Segmentation for Mobile Mapping Point Clouds | Code | 1
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge | Code | 1
Context Prior for Scene Segmentation | Code | 1
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation | Code | 1
CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified