SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1601–1650 of 1723 papers

Title | Status | Hype
Matterport3D: Learning from RGB-D Data in Indoor Environments | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation | Code | 0
Structured Label Inference for Visual Understanding | Code | 0
AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding | Code | 0
From Feature Importance to Natural Language Explanations Using LLMs with RAG | Code | 0
m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks | Code | 0
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation | Code | 0
Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans | Code | 0
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding | Code | 0
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics | Code | 0
Loss Switching Fusion with Similarity Search for Video Classification | Code | 0
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance | Code | 0
AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning | Code | 0
Continual Learning of Unsupervised Monocular Depth from Videos | Code | 0
FlowGrad: Using Motion for Visual Sound Source Localization | Code | 0
An Information-Theoretic Metric of Transferability for Task Transfer Learning | Code | 0
SceneAware: Scene-Constrained Pedestrian Trajectory Prediction with LLM-Guided Walkability | Code | 0
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | Code | 0
Lightweight integration of 3D features to improve 2D image segmentation | Code | 0
Surgical Scene Segmentation by Transformer With Asymmetric Feature Enhancement | Code | 0
Constructing a Visual Relationship Authenticity Dataset | Code | 0
Confidence-Aware Paced-Curriculum Learning by Label Smoothing for Surgical Scene Understanding | Code | 0
Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations | Code | 0
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding | Code | 0
Flow-based GAN for 3D Point Cloud Generation from a Single Image | Code | 0
Scene Graph Generation from Objects, Phrases and Region Captions | Code | 0
Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation | Code | 0
Auxiliary Tasks in Multi-task Learning | Code | 0
Auto-Embedding Generative Adversarial Networks for High Resolution Image Synthesis | Code | 0
Implicit Background Estimation for Semantic Segmentation | Code | 0
SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth | Code | 0
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings | Code | 0
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data | Code | 0
Fast Scene Understanding for Autonomous Driving | Code | 0
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search | Code | 0
Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory | Code | 0
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning | Code | 0
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models | Code | 0
A New Lightweight Hybrid Graph Convolutional Neural Network -- CNN Scheme for Scene Classification using Object Detection Inference | Code | 0
False Negative Reduction in Video Instance Segmentation using Uncertainty Estimates | Code | 0
3D Semantic Segmentation of Modular Furniture using rjMCMC | Code | 0
Uncertainty-aware LiDAR Panoptic Segmentation | Code | 0
Facing the Void: Overcoming Missing Data in Multi-View Imagery | Code | 0
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | Code | 0
CNN-based Lidar Point Cloud De-Noising in Adverse Weather | Code | 0
AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding | Code | 0
An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions | Code | 0
SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding | Code | 0
Extremely Fine-Grained Visual Classification over Resembling Glyphs in the Wild | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified