SOTAVerified

Image Description

Papers

Showing 101-150 of 154 papers

Title | Status | Hype
WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization | - | 0
Zero-Resource Neural Machine Translation with Multi-Agent Communication Game | - | 0
Focused Evaluation for Image Description with Binary Forced-Choice Tasks | - | 0
From phonemes to images: levels of representation in a recurrent neural model of visually-grounded language learning | - | 0
Generating Image Captions in Arabic using Root-Word Based Recurrent Neural Networks and Deep Neural Networks | - | 0
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation | - | 0
Im2Text: Describing Images Using 1 Million Captioned Photographs | - | 0
Image Description Dataset for Language Learners | - | 0
Image Description using Visual Dependency Representations | - | 0
Image Pivoting for Learning Multilingual Multimodal Representations | - | 0
Impressions: Understanding Visual Semiotics and Aesthetic Impact | - | 0
Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments | - | 0
InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models | - | 0
Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval | Code | 0
The Treasure beneath Convolutional Layers: Cross-convolutional-layer Pooling for Image Classification | Code | 0
On Architectures for Including Visual Information in Neural Language Models for Image Description | Code | 0
Bridging Languages through Images with Deep Partial Canonical Correlation Analysis | Code | 0
Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze | Code | 0
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Code | 0
Beyond Part Models: Person Retrieval with Refined Part Pooling (and a Strong Convolutional Baseline) | Code | 0
How Do Image Description Systems Describe People? A Targeted Assessment of System Competence in the PEOPLE-domain | Code | 0
IDEA: Image Description Enhanced CLIP-Adapter | Code | 0
Deep Imbalanced Attribute Classification using Visual Attention Aggregation | Code | 0
Skeletal Human Action Recognition using Hybrid Attention based Graph Convolutional Network | Code | 0
Unsupervised Image Captioning | Code | 0
Compositional Obverter Communication Learning From Raw Visual Input | Code | 0
Efficient Decentralized Visual Place Recognition From Full-Image Descriptors | Code | 0
Talking about other people: an endless range of possibilities | Code | 0
Human Attention in Image Captioning: Dataset and Analysis | Code | 0
Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings | Code | 0
CIDEr-R: Robust Consensus-based Image Description Evaluation | Code | 0
Pragmatic factors in image description: the case of negations | Code | 0
Large Language Models can Share Images, Too! | Code | 0
Cross-linguistic differences and similarities in image descriptions | Code | 0
Varying image description tasks: spoken versus written descriptions | Code | 0
Localized Symbolic Knowledge Distillation for Visual Commonsense Models | Code | 0
Long-term Recurrent Convolutional Networks for Visual Recognition and Description | Code | 0
Improving Visual-Semantic Embeddings by Learning Semantically-Enhanced Hard Negatives for Cross-modal Information Retrieval | Code | 0
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets | Code | 0
Measuring the Diversity of Automatic Image Descriptions | Code | 0
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps | Code | 0
What a neural language model tells us about spatial relations | Code | 0
Does Multimodality Help Human and Machine for Translation and Image Captioning? | Code | 0
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs | Code | 0
Describing Videos by Exploiting Temporal Structure | Code | 0
VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models | Code | 0
Multi30K: Multilingual English-German Image Descriptions | Code | 0
Contextualize, Show and Tell: A Neural Visual Storyteller | Code | 0
Multilingual Image Description with Neural Sequence Models | Code | 0
Room for improvement in automatic image description: an error analysis | Code | 0

No leaderboard results yet.