| Towards a text-based quantitative and explainable histopathology image analysis | Jul 10, 2024 | image-classification, Image Classification | Code Available | 0 |
| A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation | Oct 8, 2016 | Handwriting Recognition, Image to text | Code Available | 0 |
| GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models | Jul 30, 2024 | Image to text, Image-to-Text Retrieval | Code Available | 0 |
| RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 0 |
| Pragmatic Radiology Report Generation | Nov 28, 2023 | Image to text | Code Available | 0 |
| MultiQG-TI: Towards Question Generation from Multi-modal Sources | Jul 7, 2023 | Image to text, Optical Character Recognition | Code Available | 0 |
| Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation | May 23, 2024 | Image to text, Sentence | Code Available | 0 |
| Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Mar 10, 2018 | Image Description, Image to text | Code Available | 0 |
| Self-Supervised Image-to-Text and Text-to-Image Synthesis | Dec 9, 2021 | Image Generation, Image to text | Code Available | 0 |
| Multi-LLM Collaborative Caption Generation in Scientific Documents | Jan 5, 2025 | Caption Generation, Image to text | Code Available | 0 |
| BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Jun 14, 2024 | Image Retrieval, Image to text | Code Available | 0 |
| Exploration into Translation-Equivariant Image Quantization | Dec 1, 2021 | Image Generation, Image to text | Code Available | 0 |
| Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Jun 30, 2023 | Image to text, object-detection | Code Available | 0 |
| VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations | Apr 25, 2024 | Image to text, Sensitivity | Code Available | 0 |
| A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning | Jun 20, 2024 | Diagnostic, Image to text | Code Available | 0 |
| SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects | Nov 1, 2018 | Image to text, Object | Code Available | 0 |
| Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning | Jun 11, 2024 | Benchmarking, Contrastive Learning | Code Available | 0 |
| Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task | Oct 8, 2019 | Cross-Modal Retrieval, Image to text | Code Available | 0 |
| Effective Use of Word Order for Text Categorization with Convolutional Neural Networks | Dec 1, 2014 | General Classification, Image to text | Code Available | 0 |
| Survey on Abstractive Text Summarization: Dataset, Models, and Metrics | Dec 22, 2024 | Abstractive Text Summarization, General Knowledge | Code Available | 0 |
| CLIP-FSAC++: Few-Shot Anomaly Classification with Anomaly Descriptor Based on CLIP | Dec 5, 2024 | Anomaly Classification, Anomaly Detection | Code Available | 0 |