SOTAVerified

Scene Recognition

Papers

Showing 1–50 of 207 papers

Title | Status | Hype
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving | Code | 2
An Empirical Study of Remote Sensing Pretraining | Code | 2
Omnivore: A Single Model for Many Visual Modalities | Code | 2
PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning | Code | 1
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations | Code | 1
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval | Code | 1
NarrativeXL: A Large-scale Dataset For Long-Term Memory Models | Code | 1
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets | Code | 1
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research | Code | 1
MovieCLIP: Visual Scene Recognition in Movies | Code | 1
Where in the World is this Image? Transformer-based Geo-localization in the Wild | Code | 1
Object-to-Scene: Learning to Transfer Object Knowledge to Indoor Scene Recognition | Code | 1
BORM: Bayesian Object Relation Model for Indoor Scene Recognition | Code | 1
MultiScene: A Large-scale Dataset and Benchmark for Multi-scene Recognition in Single Aerial Images | Code | 1
Bidirectional Projection Network for Cross Dimension Scene Understanding | Code | 1
A Study of Face Obfuscation in ImageNet | Code | 1
Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-Grained Recognition | Code | 1
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics | Code | 1
Visual Memorability for Robotic Interestingness via Unsupervised Online Learning | Code | 1
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition | Code | 1
When CNNs Meet Random RNNs: Towards Multi-Level Analysis for RGB-D Object and Scene Recognition | Code | 1
Unsupervised Model Personalization while Preserving Privacy and Scalability: An Open Problem | Code | 1
Indoor Scene Recognition in 3D | Code | 1
Deep Attentional Structured Representation Learning for Visual Recognition | Code | 1
Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments | - | 0
Lightweight Multimodal Artificial Intelligence Framework for Maritime Multi-Scene Recognition | - | 0
Contrastive Visual Data Augmentation | - | 0
Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments | - | 0
Advancing ALS Applications with Large-Scale Pre-training: Dataset Development and Downstream Assessment | Code | 0
Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding | - | 0
Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues | Code | 0
SAIST: Segment Any Infrared Small Target Model Guided by Contrastive Language-Image Pretraining | - | 0
Movement Control of Smart Mosque's Domes using CSRNet and Fuzzy Logic Techniques | - | 0
A Retention-Centric Framework for Continual Learning with Guaranteed Model Developmental Safety | Code | 0
Rethinking VLMs and LLMs for Image Classification | - | 0
Less yet robust: crucial region selection for scene recognition | - | 0
CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads | - | 0
Indoor scene recognition from images under visual corruptions | - | 0
Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces | - | 0
Multi-task Prompt Words Learning for Social Media Content Generation | - | 0
Advancing Ubiquitous Wireless Connectivity through Channel Twinning | - | 0
Non-negative Subspace Feature Representation for Few-shot Learning in Medical Imaging | - | 0
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model | - | 0
A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions | - | 0
Leveraging Self-Supervised Learning for Scene Classification in Child Sexual Abuse Imagery | - | 0
Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems | - | 0
Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition | - | 0
Inter-object Discriminative Graph Modeling for Indoor Scene Recognition | - | 0
Counting Manatee Aggregations using Deep Neural Networks and Anisotropic Gaussian Kernel | Code | 0
A Multi-Modal Foundation Model to Assist People with Blindness and Low Vision in Environmental Interaction | - | 0

No leaderboard results yet.