SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 1051–1100 of 1723 papers

Title | Status | Hype
DAWN: Vehicle Detection in Adverse Weather Nature Dataset | | 0
Data-Driven Scene Understanding with Adaptively Retrieved Exemplars | | 0
OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting | | 0
OpenSU3D: Open World 3D Scene Understanding using Foundation Models | | 0
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding | | 0
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation | | 0
Open-Vocabulary Octree-Graph for 3D Scene Understanding | | 0
Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding | | 0
Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments | | 0
OW-Rep: Open World Object Detection with Instance Representation Learning | | 0
Optical flow and scene flow estimation: A survey | | 0
Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction | | 0
DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference | | 0
DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning | | 0
DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | | 0
Using Image Priors to Improve Scene Understanding | | 0
Out of the Room: Generalizing Event-Based Dynamic Motion Segmentation for Complex Scenes | | 0
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos | | 0
V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving | | 0
Overlap-Aware Feature Learning for Robust Unsupervised Domain Adaptation for 3D Semantic Segmentation | | 0
Accelerating deep neural networks for efficient scene understanding in automotive cyber-physical systems | | 0
Cross-modal Learning for Multi-modal Video Categorization | | 0
Panoptic Edge Detection | | 0
Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving | | 0
COUNT Forest: CO-Voting Uncertain Number of Targets Using Random Forest for Crowd Density Estimation | | 0
P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding | | 0
PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing | | 0
PADriver: Towards Personalized Autonomous Driving | | 0
PAg-NeRF: Towards fast and efficient end-to-end panoptic 3D representations for agricultural robotics | | 0
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer | | 0
PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding | | 0
PanoMixSwap Panorama Mixing via Structural Swapping for Indoor Scene Understanding | | 0
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting | | 0
Learning Segmented 3D Gaussians via Efficient Feature Unprojection for Zero-shot Neural Scene Segmentation | | 0
CoPa-SG: Dense Scene Graphs with Parametric and Proto-Relations | | 0
Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding | | 0
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding | | 0
Panoptic Out-of-Distribution Segmentation | | 0
Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation | | 0
PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction | | 0
Context-Dependent Diffusion Network for Visual Relationship Detection | | 0
Panoptic Segmentation Meets Remote Sensing | | 0
PanopticSplatting: End-to-End Panoptic Gaussian Splatting | | 0
Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights | | 0
Wireless Sensing With Deep Spectrogram Network and Primitive Based Autoregressive Hybrid Channel Model | | 0
Content-Aware Preserving Image Generation | | 0
Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars | | 0
Real-time Approximate Bayesian Computation for Scene Understanding | | 0
PAPooling: Graph-based Position Adaptive Aggregation of Local Geometry in Point Clouds | | 0
VideoGameBunny: Towards vision assistants for video games | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | | Unverified