| BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Jun 14, 2024 | Image Retrieval, Image to text | Code Available | 0 |
| CMC-Bench: Towards a New Paradigm of Visual Signal Compression | Jun 13, 2024 | Image Compression, Image to text | Code Available | 1 |
| Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning | Jun 11, 2024 | Benchmarking, Contrastive Learning | Code Available | 0 |
| Fetch-A-Set: A Large-Scale OCR-Free Benchmark for Historical Document Retrieval | Jun 11, 2024 | Image Retrieval, Image to text | Unverified | 0 |
| AICoderEval: Improving AI Domain Code Generation of Large Language Models | Jun 7, 2024 | Code Generation, Image to text | Unverified | 0 |
| Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design | May 29, 2024 | Dataset Generation, Image to text | Code Available | 1 |
| Faithful Chart Summarization with ChaTS-Pi | May 29, 2024 | Image to text, Sentence | Unverified | 0 |
| Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning | May 26, 2024 | Image to text, Image-to-Text Retrieval | Unverified | 0 |
| Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation | May 23, 2024 | Image to text, Sentence | Code Available | 0 |
| Language-Oriented Semantic Latent Representation for Image Transmission | May 16, 2024 | Image to text, Semantic Communication | Code Available | 1 |