SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1576-1600 of 1723 papers

Title | Status | Hype
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations | Code | 0
General-Purpose Deep Point Cloud Feature Extractor | Code | 0
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis | Code | 0
APCoTTA: Continual Test-Time Adaptation for Semantic Segmentation of Airborne LiDAR Point Clouds | Code | 0
Gated Driver Attention Predictor | Code | 0
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio | Code | 0
Model-based inexact graph matching on top of CNNs for semantic scene understanding | Code | 0
Gated2Depth: Real-time Dense Lidar from Gated Images | Code | 0
GaIA: Graphical Information Gain based Attention Network for Weakly Supervised Point Cloud Semantic Segmentation | Code | 0
MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities | Code | 0
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Code | 0
Rotation Invariant Convolutions for 3D Point Clouds Deep Learning | Code | 0
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios | Code | 0
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange | Code | 0
DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks | Code | 0
MGNiceNet: Unified Monocular Geometric Scene Understanding | Code | 0
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation | Code | 0
Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision | Code | 0
Improving Social Awareness Through DANTE: A Deep Affinity Network for Clustering Conversational Interactants | Code | 0
DADA: Driver Attention Prediction in Driving Accident Scenarios | Code | 0
Structure-Aware Residual Pyramid Network for Monocular Depth Estimation | Code | 0
METEOR Guided Divergence for Video Captioning | Code | 0
MC-PanDA: Mask Confidence for Panoptic Domain Adaptation | Code | 0
Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified