Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 251–300 of 1723 papers

Title	Date	Tasks	Status	Hype
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection	Apr 17, 2023	Human-Object Interaction Detection, Quantization	Code Available	1
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos	Apr 17, 2023	3D Reconstruction, Camera Pose Estimation	Code Available	1
RS2G: Data-Driven Scene-Graph Extraction and Embedding for Robust Autonomous Perception and Scenario Understanding	Apr 17, 2023	Autonomous Vehicles, Graph Learning	Code Available	1
STRAP: Structured Object Affordance Segmentation with Point Supervision	Apr 17, 2023	Object, Scene Understanding	Code Available	1
Complementary Random Masking for RGB-Thermal Semantic Segmentation	Mar 30, 2023	Scene Understanding, Semantic Segmentation	Code Available	1
DPF: Learning Dense Prediction Fields with Weak Supervision	Mar 29, 2023	Intrinsic Image Decomposition, Prediction	Code Available	1
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation	Mar 28, 2023	Panoptic Scene Graph Generation, Scene Graph Generation	Code Available	1
Real-Time Semantic Segmentation using Hyperspectral Images for Mapping Unstructured and Unknown Environments	Mar 27, 2023	Autonomous Navigation, Real-Time Semantic Segmentation	Code Available	1
You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding	Mar 26, 2023	3D Instance Segmentation, Instance Segmentation	Code Available	1
Viewpoint Equivariance for Multi-View 3D Object Detection	Mar 25, 2023	3D Object Detection, Object	Code Available	1
Self-distillation for surgical action recognition	Mar 22, 2023	Action Recognition, Medical Image Analysis	Code Available	1
Constructing Metric-Semantic Maps using Floor Plan Priors for Long-Term Indoor Localization	Mar 20, 2023	3D Object Detection, Indoor Localization	Code Available	1
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection	Mar 14, 2023	3D Object Detection, Decoder	Code Available	1
Traffic Scene Parsing through the TSP6K Dataset	Mar 6, 2023	Autonomous Driving, Decoder	Code Available	1
CEKD: Cross-Modal Edge-Privileged Knowledge Distillation for Semantic Scene Understanding Using Only Thermal Images	Feb 22, 2023	Knowledge Distillation, Scene Understanding	Code Available	1
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks	Feb 17, 2023	Deblurring, Deep Learning	Code Available	1
3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation	Feb 7, 2023	6D Pose Estimation, 6D Pose Estimation using RGB	Code Available	1
OvarNet: Towards Open-vocabulary Object Attribute Recognition	Jan 23, 2023	Attribute, Knowledge Distillation	Code Available	1
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection	Jan 22, 2023	3D Object Detection, Autonomous Vehicles	Code Available	1
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP	Jan 12, 2023	3D Semantic Segmentation, Contrastive Learning	Code Available	1
Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction	Jan 1, 2023	3D Scene Reconstruction, Image Segmentation	Code Available	1
PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation	Jan 1, 2023	Object, Scene Understanding	Code Available	1
PointVST: Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation	Dec 29, 2022	Contrastive Learning, Image Generation	Code Available	1
Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection	Dec 19, 2022	3D Object Detection, Knowledge Distillation	Code Available	1
Towards Holistic Surgical Scene Understanding	Dec 8, 2022	Action Recognition, Atomic action recognition	Code Available	1
LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving	Dec 7, 2022	Autonomous Driving, Instance Segmentation	Code Available	1
Towards Scene Understanding for Autonomous Operations on Airport Aprons	Dec 4, 2022	Autonomous Driving, Benchmarking	Code Available	1
Language-Assisted 3D Feature Learning for Semantic Scene Understanding	Nov 25, 2022	Descriptive, Instance Segmentation	Code Available	1
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection	Nov 17, 2022	3D Object Detection, Depth Estimation	Code Available	1
RGB-T Semantic Segmentation with Location, Activation, and Sharpening	Oct 26, 2022	Decoder, Scene Understanding	Code Available	1
Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data	Oct 25, 2022	Autonomous Driving, GPU	Code Available	1
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models	Oct 18, 2022	image-classification, Image Classification	Code Available	1
SQA3D: Situated Question Answering in 3D Scenes	Oct 14, 2022	Question Answering, Referring Expression	Code Available	1
Image Masking for Robust Self-Supervised Monocular Depth Estimation	Oct 5, 2022	Autonomous Driving, Depth Estimation	Code Available	1
FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions	Oct 4, 2022	Depth Estimation, Monocular Depth Estimation	Code Available	1
Uncertainty-Driven Active Vision for Implicit Scene Reconstruction	Oct 3, 2022	Scene Understanding	Code Available	1
Dynamic Graph Message Passing Networks for Visual Recognition	Sep 20, 2022	image-classification, Image Classification	Code Available	1
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge	Sep 12, 2022	Instance Segmentation, Object Detection	Code Available	1
Leveraging Large (Visual) Language Models for Robot 3D Scene Understanding	Sep 12, 2022	Common Sense Reasoning, Scene Classification	Code Available	1
MassMIND: Massachusetts Maritime INfrared Dataset	Sep 9, 2022	Instance Segmentation, Scene Understanding	Code Available	1
SemSegDepth: A Combined Model for Semantic Segmentation and Depth Completion	Sep 1, 2022	Depth Completion, Scene Understanding	Code Available	1
Semantic Segmentation-Assisted Instance Feature Fusion for Multi-Level 3D Part Instance Segmentation	Aug 9, 2022	3D Instance Segmentation, 3D Part Segmentation	Code Available	1
TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation	Aug 3, 2022	Answer Generation, Question-Answer-Generation	Code Available	1
MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud	Jul 28, 2022	Scene Understanding	Code Available	1
CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving	Jul 26, 2022	3D Semantic Segmentation, Autonomous Driving	Code Available	1
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models	Jul 23, 2022	Scene Understanding	Code Available	1
Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization	Jul 22, 2022	3D Instance Segmentation, 3D Object Detection	Code Available	1
Egocentric Scene Understanding via Multimodal Spatial Rectifier	Jul 14, 2022	Scene Understanding, Surface Normal Estimation	Code Available	1
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments	Jul 10, 2022	Instance Segmentation, Panoptic Segmentation	Code Available	1
MCTS with Refinement for Proposals Selection Games in Scene Understanding	Jul 7, 2022	Scene Understanding	Code Available	1

Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified