SOTAVerified

Scene Understanding

Scene understanding involves interpreting the visual information of a scene, including objects, their spatial relationships, and the overall layout. It goes beyond simple object recognition by considering the context and how objects relate to each other and the environment.

Papers

Showing 201-225 of 1723 papers

Title | Status | Hype
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts | Code | 1
Human-centric Scene Understanding for 3D Large-scale Scenarios | Code | 1
IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving | Code | 1
CamContextI2V: Context-aware Controllable Video Generation | Code | 1
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments | Code | 1
Image Segmentation Using Deep Learning: A Survey | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1
Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Code | 1
AVSegFormer: Audio-Visual Segmentation with Transformer | Code | 1
Instance-wise Occlusion and Depth Orders in Natural Scenes | Code | 1
Explainable Object-induced Action Decision for Autonomous Vehicles | Code | 1
Boundary-induced and scene-aggregated network for monocular depth prediction | Code | 1
IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation | Code | 1
Joint 2D-3D-Semantic Data for Indoor Scene Understanding | Code | 1
Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis | Code | 1
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Code | 1
Estimating Generic 3D Room Structures from 2D Annotations | Code | 1
Language-Assisted 3D Feature Learning for Semantic Scene Understanding | Code | 1
Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing | Code | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
Learning How To Robustly Estimate Camera Pose in Endoscopic Videos | Code | 1
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation | Code | 1
Learning to Answer Questions in Dynamic Audio-Visual Scenarios | Code | 1
Event-aided Semantic Scene Completion | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.44 | - | Unverified
2 | Team VGAI (TCS Research) | OMQ | 0.37 | - | Unverified
3 | Demo_semantic_SLAM | OMQ | 0.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPN(ResNet-101) | Mean IoU | 46.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ACRV Baseline | OMQ | 0.35 | - | Unverified