SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1676-1700 of 1723 papers

Title | Status | Hype
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding | Code | 0
Efficient Computation Sharing for Multi-Task Visual Scene Understanding | Code | 0
DualMLP: a two-stream fusion model for 3D point cloud classification | Code | 0
Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation | Code | 0
Self-Supervised Partial Cycle-Consistency for Multi-View Matching | Code | 0
Learning Monocular Depth by Distilling Cross-domain Stereo Networks | Code | 0
Boundary-Seeking Generative Adversarial Networks | Code | 0
Dual-Glance Model for Deciphering Social Relationships | Code | 0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding | Code | 0
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors | Code | 0
Self-supervised Vision Transformers for 3D Pose Estimation of Novel Objects | Code | 0
Zoom in on the Plant: Fine-grained Analysis of Leaf, Stem and Vein Instances | Code | 0
Language-based Colorization of Scene Sketches | Code | 0
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning | Code | 0
Adversarial Attacks on Monocular Pose Estimation | Code | 0
Visually Grounded VQA by Lattice-based Retrieval | Code | 0
The ADUULM-360 Dataset -- A Multi-Modal Dataset for Depth Estimation in Adverse Weather | Code | 0
DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection | Code | 0
Doubly Contrastive End-to-End Semantic Segmentation for Autonomous Driving under Adverse Weather | Code | 0
A Review on Deep Learning Techniques Applied to Semantic Segmentation | Code | 0
Semantic Foreground Inpainting from Weak Supervision | Code | 0
BOLD5000: A public fMRI dataset of 5000 images | Code | 0
DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding | Code | 0
UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks | Code | 0
Knowledge-Guided Object Discovery with Acquired Deep Impressions | Code | 0
Page 68 of 69

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified