MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

2024-05-20Code Available2· sign in to hype

Ruiqi Wu, Chenran Zhang, Jianle Zhang, Yi Zhou, Tao Zhou, Huazhu Fu

Code Available — Be the first to reproduce this paper.

Code

github.com/lxirich/mm-retinal
OfficialIn paperpytorch★ 98

Abstract

Current fundus image analysis models are predominantly built for specific tasks relying on individual datasets. The learning process is usually based on data-driven paradigm without prior knowledge, resulting in poor transferability and generalizability. To address this issue, we propose MM-Retinal, a multi-modal dataset that encompasses high-quality image-text pairs collected from professional fundus diagram books. Moreover, enabled by MM-Retinal, we present a novel Knowledge-enhanced foundational pretraining model which incorporates Fundus Image-Text expertise, called KeepFIT. It is designed with image similarity-guided text revision and mixed training strategy to infuse expert knowledge. Our proposed fundus foundation model achieves state-of-the-art performance across six unseen downstream tasks and holds excellent generalization ability in zero-shot and few-shot scenarios. MM-Retinal and KeepFIT are available at https://github.com/lxirich/MM-Retinal.

MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

Code

Abstract

Reproductions