Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 551–575 of 1723 papers

Title	Date	Tasks	Status	Hype
Semantic Is Enough: Only Semantic Information For NeRF Reconstruction	Mar 24, 2024	NeRF, object-detection	Unverified	0
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans	Mar 24, 2024	3D Instance Segmentation, Instance Segmentation	Code Available	1
Multi-Task Learning with Multi-Task Optimization	Mar 24, 2024	Automated Theorem Proving, image-classification	Unverified	0
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting	Mar 22, 2024	Instance Segmentation, Object Localization	Unverified	0
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data	Mar 22, 2024	Denoising, Scene Understanding	Unverified	0
Exosense: A Vision-Based Scene Understanding System For Exoskeletons	Mar 21, 2024	Language Modelling, Motion Planning	Unverified	0
3D Object Detection from Point Cloud via Voting Step Diffusion	Mar 21, 2024	3D Object Detection, Object	Code Available	0
SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field	Mar 21, 2024	3D Scene Reconstruction, Autonomous Driving	Unverified	0
Volumetric Environment Representation for Vision-Language Navigation	Mar 21, 2024	3D geometry, Multi-Task Learning	Code Available	2
What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models	Mar 20, 2024	counterfactual, Hallucination	Code Available	1
Geometric Constraints in Deep Learning Frameworks: A Survey	Mar 19, 2024	Deep Learning, Depth Estimation	Unverified	0
Instance-Warp: Saliency Guided Image Warping for Unsupervised Domain Adaptation	Mar 19, 2024	Domain Adaptation, Object	Code Available	0
HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting	Mar 19, 2024	Novel View Synthesis, Scene Understanding	Unverified	0
M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving	Mar 19, 2024	Autonomous Driving, Autonomous Vehicles	Unverified	0
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding	Mar 18, 2024	Object, Relation Prediction	Unverified	0
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation	Mar 18, 2024	3D Reconstruction, 3D Scene Reconstruction	Code Available	0
Urban Scene Diffusion through Semantic Occupancy Map	Mar 18, 2024	Image Generation, Scene Understanding	Unverified	0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation	Mar 18, 2024	Common Sense Reasoning, Efficient Exploration	Code Available	0
Agent3D-Zero: An Agent for Zero-shot 3D Understanding	Mar 18, 2024	Language Modelling, Scene Understanding	Unverified	0
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields	Mar 17, 2024	3D Reconstruction, NeRF	Code Available	0
Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation	Mar 16, 2024	Instance Segmentation, Object	Unverified	0
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields	Mar 16, 2024	Scene Understanding	Unverified	0
Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning	Mar 15, 2024	Autonomous Driving, Human-Object Interaction Detection	Unverified	0
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding	Mar 14, 2024	Contrastive Learning, Representation Learning	Code Available	1
MoAI: Mixture of All Intelligence for Large Language and Vision Models	Mar 12, 2024	All, Mixture-of-Experts	Code Available	3

Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified