SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 351-400 of 1723 papers

Title | Status | Hype
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments | Code | 1
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving | Code | 1
Lane Graph Estimation for Scene Understanding in Urban Driving | Code | 1
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks | Code | 1
AeroRIT: A New Scene for Hyperspectral Image Analysis | Code | 1
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding | Code | 1
ODAM: Object Detection, Association, and Mapping using Posed RGB Video | Code | 1
ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Code | 1
Image Masking for Robust Self-Supervised Monocular Depth Estimation | Code | 1
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization | Code | 1
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection | Code | 1
Image Segmentation Using Deep Learning: A Survey | Code | 1
Estimating Generic 3D Room Structures from 2D Annotations | Code | 1
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP | Code | 1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects | Code | 1
RGB-D Indiscernible Object Counting in Underwater Scenes | Code | 1
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation | Code | 1
Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks | Code | 1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation | Code | 1
ReorientBot: Learning Object Reorientation for Specific-Posed Placement | Code | 1
Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy | Code | 1
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery | Code | 1
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration | Code | 1
Instance-wise Occlusion and Depth Orders in Natural Scenes | Code | 1
OFFSEG: A Semantic Segmentation Framework For Off-Road Driving | Code | 1
Online 3D reconstruction and dense tracking in endoscopic videos | Code | 1
Class-Incremental Domain Adaptation with Smoothing and Calibration for Surgical Report Generation | Code | 1
Dual-Hybrid Attention Network for Specular Highlight Removal | Code | 1
OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation | Code | 1
Joint 2D-3D-Semantic Data for Indoor Scene Understanding | Code | 1
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D | Code | 1
Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments | Code | 1
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction | Code | 1
Dynamic Graph Message Passing Networks | Code | 1
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models | Code | 1
DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization | Code | 1
Bidirectional Projection Network for Cross Dimension Scene Understanding | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos | Code | 1
Learning Human-Object Interaction Detection using Interaction Points | Code | 1
NeuSyRE: Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment | Code | 1
NODIS: Neural Ordinary Differential Scene Understanding | Code | 1
LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data | Code | 1
Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection | Code | 1
Dynamic Graph Message Passing Networks for Visual Recognition | Code | 1
Digging Into Self-Supervised Monocular Depth Estimation | Code | 1
DPF: Learning Dense Prediction Fields with Weak Supervision | Code | 1
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations | Code | 1
Object Pose Estimation via the Aggregation of Diffusion Features | Code | 1
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified