SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 401–450 of 1723 papers

Title | Status | Hype
Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments | Code | 1
Understanding Bird's-Eye View of Road Semantics using an Onboard Camera | Code | 1
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding | Code | 1
Towards Part-Based Understanding of RGB-D Scans | Code | 1
Group Contextual Encoding for 3D Point Clouds | Code | 1
RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction | Code | 1
Visual place recognition: A survey from deep learning perspective | Code | 1
RELLIS-3D Dataset: Data, Benchmarks and Analysis | Code | 1
SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments | Code | 1
Towards Efficient Scene Understanding via Squeeze Reasoning | Code | 1
Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation | Code | 1
Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model | Code | 1
RADIATE: A Radar Dataset for Automotive Perception in Bad Weather | Code | 1
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning | Code | 1
MLRSNet: A Multi-label High Spatial Resolution Remote Sensing Dataset for Semantic Scene Understanding | Code | 1
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments | Code | 1
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges | Code | 1
Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene | Code | 1
Polysemy Deciphering Network for Robust Human-Object Interaction Detection | Code | 1
Pose-based Modular Network for Human-Object Interaction Detection | Code | 1
Polysemy Deciphering Network for Human-Object Interaction Detection | Code | 1
Weakly Supervised 3D Object Detection from Point Clouds | Code | 1
Virtual Multi-view Fusion for 3D Semantic Segmentation | Code | 1
Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild | Code | 1
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding | Code | 1
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence | Code | 1
Learning Visual Commonsense for Robust Scene Graph Generation | Code | 1
Benchmarking Unsupervised Object Representations for Video Sequences | Code | 1
0-MMS: Zero-Shot Multi-Motion Segmentation With A Monocular Event Camera | Code | 1
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving | Code | 1
VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments | Code | 1
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding | Code | 1
Self-Supervised Scene De-occlusion | Code | 1
Context Prior for Scene Segmentation | Code | 1
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation | Code | 1
Semantic Segmentation of Underwater Imagery: Dataset and Benchmark | Code | 1
Occlusion-Aware Depth Estimation with Adaptive Normal Constraints | Code | 1
Distilled Semantics for Comprehensive Scene Understanding from Videos | Code | 1
Learning Human-Object Interaction Detection using Interaction Points | Code | 1
LayoutMP3D: Layout Annotation of Matterport3D | Code | 1
Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds | Code | 1
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks | Code | 1
SaccadeNet: A Fast and Accurate Object Detector | Code | 1
Who2com: Collaborative Perception via Learnable Handshake Communication | Code | 1
Explainable Object-induced Action Decision for Autonomous Vehicles | Code | 1
Toronto-3D: A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways | Code | 1
Scene Completeness-Aware Lidar Depth Completion for Driving Scenario | Code | 1
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified