RARE disease detection from Capsule Endoscopic Videos based on Vision Transformers

2026-03-16

X. Gao, C. Chien, G. Liu, A. Manullang


Abstract

This work addresses the Gastro Competition task of multi-label classification from capsule endoscopic videos (CEV). Deep learning networks based on Transformers are fine-tuned for this task. The baseline model is Google's Vision Transformer (ViT) with 16 x 16 patches (patch16) and 224 x 224 input resolution. In total, 17 labels are classified: mouth, esophagus, stomach, small intestine, colon, z-line, pylorus, ileocecal valve, active bleeding, angiectasia, blood, erosion, erythema, hematin, lymphangioectasis, polyp, and ulcer. On the test dataset of three videos, the overall mAP@0.5 is 0.0205 and the overall mAP@0.95 is 0.0196.
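To make the multi-label setting concrete, the following is a minimal sketch of how per-frame model outputs could be decoded into label names. It is not the authors' code. The 17 label names come from the abstract; the independent sigmoid per label, the 0.5 threshold, and the names `decode_frame` and `example_logits` are illustrative assumptions.

```python
import math

# The 17 labels listed in the abstract, in the order given there.
LABELS = [
    "mouth", "esophagus", "stomach", "small intestine", "colon",
    "z-line", "pylorus", "ileocecal valve", "active bleeding",
    "angiectasia", "blood", "erosion", "erythema", "hematin",
    "lymphangioectasis", "polyp", "ulcer",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_frame(logits, threshold=0.5):
    """Map one frame's 17 logits to the list of predicted label names.

    Assumption: each label gets an independent sigmoid score, so a frame
    can carry several labels at once (e.g. an anatomical site plus a
    finding). The 0.5 threshold is illustrative, not taken from the paper.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for a single frame: positive for "small intestine"
# and "blood", strongly negative for every other label.
example_logits = [-4.0] * len(LABELS)
example_logits[LABELS.index("small intestine")] = 3.0
example_logits[LABELS.index("blood")] = 1.5
predicted = decode_frame(example_logits)
```

Unlike a softmax over classes, this decoding lets one frame receive both an anatomical label and a pathology label, which matches the mixed label set described in the abstract.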
