SOTAVerified

Efficient Self-Ensemble for Semantic Segmentation

2021-11-26 · arXiv 2021 · Code Available

Walid Bousselham, Guillaume Thibault, Lucas Pagano, Archana Machireddy, Joe Gray, Young Hwan Chang, Xubo Song


Abstract

Ensembles of predictions are known to perform better than individual predictions taken separately. However, for tasks that require heavy computational resources, e.g. semantic segmentation, creating an ensemble of learners that must be trained separately is hardly tractable. In this work, we propose to leverage the performance boost offered by ensemble methods to enhance semantic segmentation, while avoiding the traditionally heavy training cost of an ensemble. Our self-ensemble approach takes advantage of the multi-scale feature set produced by feature pyramid network methods to feed independent decoders, thus creating an ensemble within a single model. As in a traditional ensemble, the final prediction is the aggregation of the predictions made by each learner. In contrast to previous works, our model can be trained end-to-end, alleviating the traditionally cumbersome multi-stage training of ensembles. Our self-ensemble approach outperforms the current state-of-the-art on the benchmark datasets Pascal Context and COCO-Stuff-10K for semantic segmentation and is competitive on ADE20K and Cityscapes. Code is publicly available at github.com/WalBouss/SenFormer.
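The self-ensemble idea described above can be sketched as a segmentation head in which each pyramid level feeds its own independent decoder ("learner") and the final prediction averages the learners' logits. This is a minimal illustrative sketch in PyTorch, not the authors' actual SenFormer implementation: the class name `SelfEnsembleHead`, the decoder layout, and the averaging aggregation are assumptions made for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfEnsembleHead(nn.Module):
    """Hypothetical sketch: one independent decoder per FPN level,
    with the final prediction aggregated across learners."""

    def __init__(self, in_channels: int, num_classes: int, num_scales: int):
        super().__init__()
        # One independent decoder ("learner") per pyramid level.
        self.decoders = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, num_classes, 1),
            )
            for _ in range(num_scales)
        )

    def forward(self, features: list[torch.Tensor]) -> torch.Tensor:
        # features: multi-scale FPN maps, highest resolution first.
        target_size = features[0].shape[-2:]
        logits = [
            F.interpolate(dec(f), size=target_size,
                          mode="bilinear", align_corners=False)
            for dec, f in zip(self.decoders, features)
        ]
        # Ensemble prediction: average the per-learner logits.
        return torch.stack(logits).mean(dim=0)

# Toy pyramid: 4 levels with halving resolution, batch of 2.
feats = [torch.randn(2, 256, 32 // 2 ** i, 32 // 2 ** i) for i in range(4)]
head = SelfEnsembleHead(in_channels=256, num_classes=150, num_scales=4)
out = head(feats)
print(out.shape)  # logits at the finest pyramid resolution
```

Because all learners live inside one model and share the backbone, the whole ensemble trains end-to-end with a single loss, which is the point of the paper's approach.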

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| ADE20K val | SenFormer (BEiT-L) | mIoU | 57.1 | — | Unverified |
| ADE20K val | SenFormer (Swin-L) | mIoU | 54.2 | — | Unverified |
| COCO-Stuff test | SenFormer (Swin-L) | mIoU | 50.1 | — | Unverified |
| PASCAL Context | SenFormer (ResNet-101) | mIoU | 56.6 | — | Unverified |
| PASCAL Context | SenFormer (Swin-L) | mIoU | 64.0 | — | Unverified |

Reproductions