SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 676-700 of 1723 papers

Title | Status | Hype
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance | Code | 0
From Feature Importance to Natural Language Explanations Using LLMs with RAG | Code | 0
CNN-based Lidar Point Cloud De-Noising in Adverse Weather | Code | 0
Loss Switching Fusion with Similarity Search for Video Classification | Code | 0
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | Code | 0
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics | Code | 0
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding | Code | 0
Lightweight integration of 3D features to improve 2D image segmentation | Code | 0
Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video | Code | 0
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning | Code | 0
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding | Code | 0
FlowGrad: Using Motion for Visual Sound Source Localization | Code | 0
Flow-based GAN for 3D Point Cloud Generation from a Single Image | Code | 0
Aerial Scene Understanding in The Wild: Multi-Scene Recognition via Prototype-based Memory Networks | Code | 0
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation | Code | 0
Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation | Code | 0
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding | Code | 0
Matterport3D: Learning from RGB-D Data in Indoor Environments | Code | 0
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings | Code | 0
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors | Code | 0
Learning Monocular Depth by Distilling Cross-domain Stereo Networks | Code | 0
Learning Panoptic Segmentation from Instance Contours | Code | 0
Language-based Colorization of Scene Sketches | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified