Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 476–500 of 1723 papers

Title	Date	Tasks	Status	Hype
EvSegSNN: Neuromorphic Semantic Segmentation for Event Data	Jun 20, 2024	Autonomous Vehicles, Decoder	Unverified	0
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images	Jun 19, 2024	Object Recognition, Scene Understanding	Code Available	2
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features	Jun 17, 2024	3D geometry, 3D Semantic Occupancy Prediction	Unverified	0
Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding	Jun 17, 2024	3D Object Detection, 3D Semantic Segmentation	Unverified	0
MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report	Jun 14, 2024	Autonomous Driving, Scene Understanding	Unverified	0
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion	Jun 14, 2024	3D Reconstruction, Autonomous Driving	Code Available	1
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding	Jun 13, 2024	Multiple-choice, Scene Understanding	Code Available	1
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment	Jun 12, 2024	3D Reconstruction, Scene Understanding	Code Available	0
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent	Jun 11, 2024	AI Agent, Descriptive	Code Available	2
FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping	Jun 4, 2024	3DGS, Scene Understanding	Unverified	0
EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding	Jun 3, 2024	Domain Adaptation, Open Vocabulary Semantic Segmentation	Unverified	0
Object Aware Egocentric Online Action Detection	Jun 3, 2024	Action Detection, Object	Unverified	0
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos	Jun 3, 2024	Graph Generation, Scene Graph Generation	Unverified	0
Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024	Jun 2, 2024	Scene Parsing, Scene Understanding	Unverified	0
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation	May 30, 2024	Instruction Following, parameter-efficient fine-tuning	Unverified	0
Learning 3D Robotics Perception using Inductive Priors	May 30, 2024	3D Reconstruction, Image Generation	Unverified	0
Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding	May 29, 2024	Scene Understanding, Segmentation	Unverified	0
GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane	May 27, 2024	3DGS, feature selection	Unverified	0
Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding	May 24, 2024	Scene Understanding, Zero Shot Segmentation	Unverified	0
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis	May 23, 2024	Novel View Synthesis, Scene Understanding	Unverified	0
Transformers for Image-Goal Navigation	May 23, 2024	Navigate, Scene Understanding	Unverified	0
CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments	May 23, 2024	Pose Estimation, Scene Understanding	Code Available	1
TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System	May 22, 2024	3D Object Detection, 3D Semantic Segmentation	Unverified	0
GameVLM: A Decision-making Framework for Robotic Task Planning Based on Visual Language Models and Zero-sum Games	May 22, 2024	Code Generation, Decision Making	Unverified	0
Anticipating Object State Changes in Long Procedural Videos	May 21, 2024	Object, Object State Change Classification	Unverified	0


Benchmark Results

Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

ADE20K val

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified