| DiffCap: Exploring Continuous Diffusion on Image Captioning | May 20, 2023 | Caption Generation, Diversity | Unverified | 0 |
| Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval | Jan 1, 2023 | Binarization, Image Description | Code Available | 0 |
| Improving Visual-Semantic Embeddings by Learning Semantically-Enhanced Hard Negatives for Cross-modal Information Retrieval | Oct 10, 2022 | Cross-Modal Information Retrieval, Image Description | Code Available | 0 |
| Facial Expression Recognition and Image Description Generation in Vietnamese | Aug 12, 2022 | Descriptive, Emotion Recognition | Unverified | 0 |
| Skeletal Human Action Recognition using Hybrid Attention based Graph Convolutional Network | Jul 12, 2022 | Action Recognition, Image Description | Code Available | 0 |
| Image Description Dataset for Language Learners | Jun 1, 2022 | Image Description, Sentence | Unverified | 0 |
| Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | Jun 1, 2022 | Caption Generation, image-classification | Unverified | 0 |
| Face2Text revisited: Improved data set and baseline results | May 24, 2022 | Image Description, Transfer Learning | Unverified | 0 |
| Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation | May 2, 2022 | Image Description, Machine Translation | Unverified | 0 |
| Multimodal fusion via cortical network inspired losses | May 1, 2022 | Emotion Recognition, Image Description | Unverified | 0 |