Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 451–475 of 1723 papers

Title	Date	Tasks	Status	Hype
Training-Free Model Merging for Multi-target Domain Adaptation	Jul 18, 2024	Domain Adaptation, Multi-target Domain Adaptation	Unverified	0
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction	Jul 17, 2024	Scene Understanding, Surface Reconstruction	Code Available	0
Dual-Hybrid Attention Network for Specular Highlight Removal	Jul 17, 2024	highlight removal, Object Recognition	Code Available	1
Benchmarking Vision Language Models for Cultural Understanding	Jul 15, 2024	Benchmarking, Question Answering	Unverified	0
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations	Jul 15, 2024	All, Image Retrieval	Code Available	1
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data	Jul 14, 2024	3D Object Detection, 3D Semantic Segmentation	Code Available	0
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding	Jul 13, 2024	Scene Understanding, Zero-Shot Learning	Unverified	0
BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight	Jul 11, 2024	Autonomous Driving, BEV Segmentation	Unverified	0
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences	Jul 10, 2024	Multi-Task Learning, Scene Understanding	Unverified	0
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search	Jul 10, 2024	Few-Shot Learning, GPU	Code Available	0
Joint prototype and coefficient prediction for 3D instance segmentation	Jul 9, 2024	3D Instance Segmentation, Instance Segmentation	Unverified	0
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition	Jul 9, 2024	Instruction Following, Representation Learning	Unverified	0
Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness	Jul 7, 2024	Activity Recognition, Scene Understanding	Unverified	0
Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding	Jul 5, 2024	Scene Understanding	Unverified	0
A Unified Framework for 3D Scene Understanding	Jul 3, 2024	Contrastive Learning, Knowledge Distillation	Code Available	2
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders	Jul 2, 2024	Boundary Detection, Human Parsing	Code Available	1
Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation	Jul 1, 2024	Autonomous Driving, Decoder	Code Available	1
PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction	Jul 1, 2024	3D Panoptic Segmentation, Instance Segmentation	Unverified	0
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes	Jul 1, 2024	Autonomous Vehicles, Image Segmentation	Code Available	1
ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding	Jun 30, 2024	Graph Generation, Graph Neural Network	Unverified	0
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting	Jun 28, 2024	Human-Object Interaction Detection, Object	Unverified	0
PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation	Jun 28, 2024	Decoder, Image Segmentation	Unverified	0
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation	Jun 26, 2024	Decoder, Robot Manipulation	Unverified	0
GPT-4V Explorations: Mining Autonomous Driving	Jun 24, 2024	Autonomous Driving, Decision Making	Unverified	0
AudioBench: A Universal Benchmark for Audio Large Language Models	Jun 23, 2024	Audio Scene Understanding, Instruction Following	Code Available	3


Datasets: Semantic Scene Understanding Challenge (passive actuation & ground-truth localisation); ADE20K val; Semantic Scene Understanding Challenge (active actuation & ground-truth localisation)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.44	—	Unverified
2	Team VGAI (TCS Research)	OMQ	0.37	—	Unverified
3	Demo_semantic_SLAM	OMQ	0.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CPN(ResNet-101)	Mean IoU	46.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ACRV Baseline	OMQ	0.35	—	Unverified