Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers


Showing 1451–1475 of 1723 papers

Title	Date	Tasks	Status
Robust Visual Localization via Semantic-Guided Multi-Scale Transformer	Jun 10, 2025	regression, Scene Understanding	Unverified
Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms	Dec 10, 2021	3D Reconstruction, Autonomous Navigation	Unverified
Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness	Apr 2, 2025	Scene Understanding	Unverified
RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model	Apr 7, 2025	Image Captioning, image-classification	Unverified
S^3M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving	Jan 21, 2024	Autonomous Driving, Scene Understanding	Unverified
S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation	Nov 4, 2020	Autonomous Driving, Edge-computing	Unverified
S4C: Self-Supervised Semantic Scene Completion with Neural Fields	Oct 11, 2023	Image Segmentation, Navigate	Unverified
Safety Assessment for Autonomous Systems' Perception Capabilities	Aug 17, 2022	Decision Making, Scene Understanding	Unverified
SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data	May 18, 2021	object-detection, Object Detection	Unverified
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes	Jun 2, 2025	Scene Understanding	Unverified
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation	May 30, 2024	Instruction Following, parameter-efficient fine-tuning	Unverified
SAM-Guided Masked Token Prediction for 3D Scene Understanding	Oct 16, 2024	3D Object Detection, Knowledge Distillation	Unverified
SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment	Jun 1, 2022	Motion Planning, Question Answering	Unverified
Scale-aware Neural Network for Semantic Segmentation of Multi-resolution Remote Sensing Images	Mar 14, 2021	Scene Understanding, Segmentation	Unverified
SANPO: A Scene Understanding, Accessibility and Human Navigation Dataset	Sep 21, 2023	Autonomous Vehicles, Depth Estimation	Unverified
Scan2Part: Fine-grained and Hierarchical Part-level Understanding of Real-World 3D Scans	Jun 6, 2022	Scene Understanding	Unverified
Sce2DriveX: A Generalized MLLM Framework for Scene-to-Drive Learning	Feb 19, 2025	Autonomous Driving, Bench2Drive	Unverified
Scenarios: A New Representation for Complex Scene Understanding	Feb 16, 2018	Image Retrieval, Object Recognition	Unverified
Scene-aware Human Pose Generation using Transformer	Aug 4, 2023	Knowledge Distillation, Scene Understanding	Unverified
Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation	Jul 5, 2022	Dialogue Generation, Dialogue Understanding	Unverified
SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis	Jun 12, 2025	Novel View Synthesis, Scene Understanding	Unverified
Counterfactual Critic Multi-Agent Training for Scene Graph Generation	Dec 6, 2018	counterfactual, Graph Generation	Unverified
Planning Safety Trajectories with Dual-Phase, Physics-Informed, and Transportation Knowledge-Driven Large Language Models	Apr 6, 2025	Computational Efficiency, General Knowledge	Code Available
Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video	May 27, 2019	Inductive Bias, Model Predictive Control	Code Available
PENet: A Joint Panoptic Edge Detection Network	Mar 15, 2023	Edge Detection, Multi-Task Learning	Code Available


Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified