SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 876-900 of 1723 papers

Title | Status | Hype
Mining Conditional Part Semantics with Occluded Extrapolation for Human-Object Interaction Detection | | 0
Content Adaptive Front End For Audio Classification | | 0
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models | | 0
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation | | 0
Unified Representation Space for 3D Visual Grounding | | 0
Unified Scene Representation and Reconstruction for 3D Large Language Models | | 0
DriveGuard: Robustification of Automated Driving Systems with Deep Spatio-Temporal Convolutional Autoencoder | | 0
MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements | | 0
MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency | | 0
MNEW: Multi-domain Neighborhood Embedding and Weighting for Sparse Point Clouds Segmentation | | 0
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | | 0
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding | | 0
DreamAnywhere: Object-Centric Panoramic 3D Scene Generation | | 0
Uni-Fusion: Universal Continuous Mapping | | 0
UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations | | 0
Modeling human intuitions about liquid flow with particle-based simulation | | 0
Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting | | 0
DORSal: Diffusion for Object-centric Representations of Scenes et al. | | 0
DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation | | 0
A Comprehensive Review of Modern Object Segmentation Approaches | | 0
Monocular BEV Perception of Road Scenes via Front-to-Top View Projection | | 0
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0
Monocular Depth Estimation with Sharp Boundary | | 0
Does CLIP perceive art the same way we do? | | 0
MonoGRNet: A General Framework for Monocular 3D Object Detection | | 0
Page 36 of 69

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified