SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1176–1200 of 1723 papers

Title | Status | Hype
BBBD: Bounding Box Based Detector for Occlusion Detection and Order Recovery | - | 0
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text | - | 0
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection | - | 0
Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation? | - | 0
SELMA: SEmantic Large-scale Multimodal Acquisitions in Variable Weather, Daytime and Viewpoints | - | 0
Attention Mechanism based Cognition-level Scene Understanding | - | 0
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding | - | 0
Multi-Task Learning for Visual Scene Understanding | - | 0
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification | - | 0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering | - | 0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding | Code | 0
Towards 3D Scene Understanding by Referring Synthetic Models | - | 0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows | - | 0
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans | - | 0
Deep Point Cloud Simplification for High-quality Surface Reconstruction | - | 0
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry | - | 0
On Steering Multi-Annotations per Sample for Multi-Task Learning | - | 0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks | - | 0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection | - | 0
Movies2Scenes: Using Movie Metadata to Learn Scene Representation | - | 0
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL) | - | 0
StandardSim: A Synthetic Dataset For Retail Environments | - | 0
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation | Code | 0
Page 48 of 69

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified