Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 676–700 of 1723 papers

Title	Date	Tasks	Status	Score
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images	Jan 26, 2016	DiversityGeneral Classification	CodeCode Available	5
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection	Feb 15, 2022	Generalized Zero-Shot Object DetectionScene Understanding	CodeCode Available	5
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance	Sep 10, 2024	Bilevel OptimizationPoint Cloud Completion	CodeCode Available	5
From Feature Importance to Natural Language Explanations Using LLMs with RAG	Jul 30, 2024	counterfactualCounterfactual Reasoning	CodeCode Available	5
CNN-based Lidar Point Cloud De-Noising in Adverse Weather	Dec 9, 2019	Autonomous VehiclesScene Understanding	CodeCode Available	5
Loss Switching Fusion with Similarity Search for Video Classification	Jun 27, 2019	ClassificationClustering	CodeCode Available	5
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	Nov 27, 2024	Action RecognitionGraph Attention	CodeCode Available	5
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics	Apr 16, 2018	NavigateScene Understanding	CodeCode Available	5
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding	Apr 4, 2023	Autonomous DrivingDomain Adaptation	CodeCode Available	5
Lightweight integration of 3D features to improve 2D image segmentation	Dec 16, 2022	Image SegmentationScene Understanding	CodeCode Available	5
Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video	May 27, 2019	Inductive BiasModel Predictive Control	CodeCode Available	5
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning	Aug 1, 2020	Cross-Modal RetrievalRepresentation Learning	CodeCode Available	5
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding	Apr 18, 2025	Deep LearningPoint Cloud Completion	CodeCode Available	5
FlowGrad: Using Motion for Visual Sound Source Localization	Nov 15, 2022	Optical Flow EstimationScene Understanding	CodeCode Available	5
Flow-based GAN for 3D Point Cloud Generation from a Single Image	Oct 8, 2022	Point Cloud GenerationScene Understanding	CodeCode Available	5
Aerial Scene Understanding in The Wild: Multi-Scene Recognition via Prototype-based Memory Networks	Apr 22, 2021	RetrievalScene Recognition	CodeCode Available	5
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation	Apr 12, 2018	Optical Flow EstimationScene Flow Estimation	CodeCode Available	5
Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation	May 30, 2023	Graph GenerationImage Generation	CodeCode Available	5
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding	Jul 28, 2024	Contrastive LearningIntention-oriented Segmentation	CodeCode Available	5
Matterport3D: Learning from RGB-D Data in Indoor Environments	Sep 18, 2017	General ClassificationScene Understanding	CodeCode Available	5
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings	Jun 24, 2022	Scene UnderstandingSemantic Segmentation	CodeCode Available	5
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors	May 30, 2025	3D geometryLarge Language Model	CodeCode Available	5
Learning Monocular Depth by Distilling Cross-domain Stereo Networks	Aug 20, 2018	Autonomous DrivingDepth Estimation	CodeCode Available	5
Learning Panoptic Segmentation from Instance Contours	Oct 16, 2020	ClusteringInstance Segmentation	CodeCode Available	5
Language-based Colorization of Scene Sketches	Nov 17, 2019	ColorizationImage Generation	CodeCode Available	5

Show:10 25 50

← PrevPage 28 of 69Next →

All datasets Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)ADE20K val Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified