Scene Understanding

Scene understanding is the task of interpreting the visual content of a scene: the objects present, their spatial relationships, and the overall layout. It goes beyond object recognition by accounting for context, that is, how objects relate to one another and to their environment.

Papers

Showing 701–725 of 1723 papers

Title	Date	Tasks	Status	Score
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding	Jul 28, 2024	Contrastive Learning, Intention-oriented Segmentation	Code Available	5
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange	Apr 11, 2024	Object, Scene Understanding	Code Available	5
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings	Jun 24, 2022	Scene Understanding, Semantic Segmentation	Code Available	5
Learning Regional Purity for Instance Segmentation on 3D Point Clouds	Nov 3, 2020	3D Instance Segmentation, 3D Semantic Segmentation	Code Available	5
Learning Panoptic Segmentation from Instance Contours	Oct 16, 2020	Clustering, Instance Segmentation	Code Available	5
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation	Apr 12, 2018	Optical Flow Estimation, Scene Flow Estimation	Code Available	5
Learning Monocular Depth by Distilling Cross-domain Stereo Networks	Aug 20, 2018	Autonomous Driving, Depth Estimation	Code Available	5
Fast Scene Understanding for Autonomous Driving	Aug 8, 2017	Autonomous Driving, Decoder	Code Available	5
Artificial Color Constancy via GoogLeNet with Angular Loss Function	Nov 20, 2018	Color Constancy, Object Recognition	Code Available	5
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions	Sep 19, 2024	Audio captioning, Language Modeling	Code Available	5
False Negative Reduction in Video Instance Segmentation using Uncertainty Estimates	Jun 28, 2021	Depth Estimation, Instance Segmentation	Code Available	5
Implicit Background Estimation for Semantic Segmentation	May 23, 2019	Scene Understanding, Segmentation	Code Available	5
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors	May 30, 2025	3D geometry, Large Language Model	Code Available	5
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction	Jul 17, 2024	Scene Understanding, Surface Reconstruction	Code Available	5
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning	Aug 1, 2020	Cross-Modal Retrieval, Representation Learning	Code Available	5
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning	Sep 16, 2021	Decoder, Image Captioning	Code Available	5
Knowledge-Guided Object Discovery with Acquired Deep Impressions	Mar 19, 2021	Object, Object Discovery	Code Available	5
Facing the Void: Overcoming Missing Data in Multi-View Imagery	May 21, 2022	Classification, image-classification	Code Available	5
Joint stereo 3D object detection and implicit surface reconstruction	Nov 25, 2021	3D Object Detection, Hallucination	Code Available	5
JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds with Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields	Apr 1, 2019	3D Instance Segmentation, 3D Semantic Instance Segmentation	Code Available	5
Extremely Fine-Grained Visual Classification over Resembling Glyphs in the Wild	Aug 25, 2024	Contrastive Learning, Fine-Grained Image Classification	Code Available	5
Adversarial Attacks on Monocular Pose Estimation	Jul 14, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	5
Language-based Colorization of Scene Sketches	Nov 17, 2019	Colorization, Image Generation	Code Available	5
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation	Aug 21, 2024	3D Semantic Segmentation, Data Augmentation	Code Available	5
Interpretable Visual Understanding with Cognitive Attention Network	Aug 6, 2021	Scene Understanding, Visual Commonsense Reasoning	Code Available	5

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified