SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 110 of 1723 papers

TitleStatusHype
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning0
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models0
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection0
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis0
Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander0
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based AdaptationCode1
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments0
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene UnderstandingCode0
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation0
What Demands Attention in Urban Street Scenes? From Scene Understanding towards Road Safety: A Survey of Vision-driven Datasets and Studies0
Show:102550
← PrevPage 1 of 173Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1ACRV BaselineOMQ0.35Unverified