
Synergizing Deep Learning and Biological Heuristics for Extreme Long-Tail White Blood Cell Classification

2026-03-19

Duc T. Nguyen, Hoang-Long Nguyen, Huy-Hieu Pham

Abstract

Automated white blood cell (WBC) classification is essential for leukemia screening yet remains challenging under extreme class imbalance and domain shift. These conditions often cause deep models to overfit dominant classes while failing to generalize to rare pathological subtypes. To address this issue, we propose a three-stage hybrid framework. First, a self-supervised Pix2Pix restoration module mitigates synthetic noise and restores high-frequency cytoplasmic details. Second, we integrate a Swin Transformer ensemble with MedSigLIP contrastive embeddings to enhance rare-class semantic representation. Finally, we introduce a biologically inspired refinement strategy combining geometric spikiness analysis and Mahalanobis-based morphological constraints to explicitly rescue suppressed minority predictions. Our hybrid framework achieves a Macro-F1 score of 0.77139 on the private leaderboard, demonstrating strong robustness under extreme long-tail distributions. The code is available at https://github.com/trongduc-nguyen/WBCBench2026.
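
The abstract does not describe how the Mahalanobis-based constraint is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each cell is summarized by a small morphological feature vector (the two features here are hypothetical), fits a mean and covariance per rare class, and reassigns a prediction to a rare class when the cell lies within an assumed distance threshold of that class. The function names, the ridge term, and the threshold value are all illustrative assumptions.

```python
import numpy as np


def fit_class_stats(features, ridge=1e-6):
    """Estimate the mean and inverse covariance of one class's feature vectors.

    `ridge` is an assumed small regularizer that keeps the covariance invertible.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + ridge * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis(x, mean, inv_cov):
    """Mahalanobis distance of x from a class with the given statistics."""
    diff = x - mean
    return float(np.sqrt(diff @ inv_cov @ diff))


def rescue_prediction(pred, x, rare_stats, threshold):
    """Return the closest rare class within `threshold`, else keep `pred`.

    `rare_stats` maps a rare-class label to its (mean, inverse covariance).
    """
    best_label, best_dist = pred, threshold
    for label, (mean, inv_cov) in rare_stats.items():
        dist = mahalanobis(x, mean, inv_cov)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical 2-D morphological features for one rare class.
rng = np.random.default_rng(0)
rare_features = rng.normal(loc=[5.0, 1.0], scale=[0.5, 0.1], size=(50, 2))
rare_stats = {"rare_class": fit_class_stats(rare_features)}

# A cell at the rare-class mean is reassigned; a distant cell keeps its label.
near = rare_stats["rare_class"][0]
far = np.array([0.0, 0.0])
near_label = rescue_prediction("common_class", near, rare_stats, threshold=3.0)
far_label = rescue_prediction("common_class", far, rare_stats, threshold=3.0)
```

Using the Mahalanobis distance rather than the Euclidean distance scales each feature by the class's own spread, so a threshold of a few units has a comparable meaning across features with very different ranges.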
