SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1151-1200 of 1723 papers

Title | Status | Hype
Neural Groundplans: Persistent Neural Scene Representations from a Single Image | | 0
SeasoNet: A Seasonal Scene Classification, segmentation and Retrieval dataset for satellite Imagery over Germany | | 0
Adversarial Attacks on Monocular Pose Estimation | Code | 0
BlindSpotNet: Seeing Where We Cannot See | | 0
Distance Matters in Human-Object Interaction Detection | Code | 0
Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases | | 0
Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation | | 0
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings | Code | 0
SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding | Code | 0
A Dynamic Data Driven Approach for Explainable Scene Understanding | | 0
On Efficient Real-Time Semantic Segmentation: A Survey | | 0
Waymo Open Dataset: Panoramic Video Panoptic Segmentation | | 0
A Multi-purpose Realistic Haze Benchmark with Quantifiable Haze Levels and Ground Truth | | 0
Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion | Code | 0
Extracting Zero-shot Common Sense from Large Language Models for Robot 3D Scene Understanding | | 0
Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields | | 0
Scan2Part: Fine-grained and Hierarchical Part-level Understanding of Real-World 3D Scans | | 0
A Memory System of a Robot Cognitive Architecture and its Implementation in ArmarX | | 0
Towards Improving the Generation Quality of Autoregressive Slot VAEs | Code | 0
SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment | | 0
Facing the Void: Overcoming Missing Data in Multi-View Imagery | Code | 0
Review on Panoramic Imaging and Its Applications in Scene Understanding | | 0
Unsupervised Discovery and Composition of Object Light Fields | | 0
Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects | | 0
RangeSeg: Range-Aware Real Time Segmentation of 3D LiDAR Point Clouds | | 0
BBBD: Bounding Box Based Detector for Occlusion Detection and Order Recovery | | 0
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text | | 0
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection | | 0
Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation? | | 0
SELMA: SEmantic Large-scale Multimodal Acquisitions in Variable Weather, Daytime and Viewpoints | | 0
Attention Mechanism based Cognition-level Scene Understanding | | 0
MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding | | 0
Multi-Task Learning for Visual Scene Understanding | | 0
Semi-supervised and Deep learning Frameworks for Video Classification and Key-frame Identification | | 0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering | | 0
Self-Supervised Road Layout Parsing with Graph Auto-Encoding | Code | 0
Towards 3D Scene Understanding by Referring Synthetic Models | | 0
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows | | 0
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans | | 0
Deep Point Cloud Simplification for High-quality Surface Reconstruction | | 0
RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry | | 0
On Steering Multi-Annotations per Sample for Multi-Task Learning | | 0
Fast Neural Architecture Search for Lightweight Dense Prediction Networks | | 0
Hybrid Optimized Deep Convolution Neural Network based Learning Model for Object Detection | | 0
Movies2Scenes: Using Movie Metadata to Learn Scene Representation | | 0
CARL-D: A vision benchmark suite and large scale dataset for vehicle detection and scene segmentation | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL) | | 0
StandardSim: A Synthetic Dataset For Retail Environments | | 0
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation | Code | 0
Page 24 of 35

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified