Open-source framework for detecting bias and overfitting for large pathology images
Anders Sildnes, Nikita Shvetsov, Masoud Tafavvoghi, Vi Ngoc-Nha Tran, Kajsa Møllersen, Lill-Tove Rasmussen Busund, Thomas K. Kilvær, Lars Ailo Bongo
Code
- github.com/uit-hdl/feature-inspect (official, in paper, PyTorch)
- github.com/uit-hdl/code-overfit-detection-framework (official, in paper, PyTorch)
Abstract
Even foundation models trained on datasets with billions of samples may develop shortcuts that lead to overfitting and bias. Shortcuts are patterns in the data that are not relevant to the task, such as background color or color intensity. To ensure the robustness of deep learning applications, methods are therefore needed to detect and remove such shortcuts. Today's model debugging methods are time consuming because they often require customization to fit a given model architecture in a specific domain. We propose a generalized, model-agnostic framework for debugging deep learning models. We focus on the domain of histopathology, whose very large images require large models and therefore substantial computational resources; our framework, however, can be run on a workstation with a commodity GPU. We demonstrate that our framework replicates non-image shortcuts that have been found in previous work on self-supervised learning models, and we also identify possible shortcuts in a foundation model. Our easy-to-use tests contribute to the development of more reliable, accurate, and generalizable models for whole-slide image (WSI) analysis. Our framework is available as an open-source tool on GitHub.
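One common way to test whether an embedding model has picked up a shortcut of the kind described above is to train a simple linear probe on frozen embeddings to predict a task-irrelevant attribute (e.g., scanner site or background intensity); probe accuracy well above chance suggests the attribute leaks into the features. The sketch below is illustrative only and is not taken from the paper's codebase; the attribute name, dimensions, and leakage strength are all hypothetical, with synthetic data standing in for real tile embeddings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen model embeddings: 512-dim features for 1000 tiles.
n, d = 1000, 512
site = rng.integers(0, 2, size=n)  # hypothetical non-relevant attribute (e.g., scanner site)

# Simulate embeddings that "leak" the attribute via a small site-dependent
# offset on the first few dimensions.
emb = rng.normal(size=(n, d))
emb[:, :8] += site[:, None] * 0.8

# Linear probe: predict the non-relevant attribute from the embeddings.
X_tr, X_te, y_tr, y_te = train_test_split(emb, site, test_size=0.3, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = probe.score(X_te, y_te)

# Accuracy well above chance (0.5 for two balanced classes) indicates the
# embedding encodes the non-relevant attribute, i.e., a potential shortcut.
print(f"probe accuracy: {acc:.2f}")
```

In practice the binary `site` label would come from slide metadata, and `emb` from the model under inspection; the probe itself stays unchanged regardless of the model architecture, which is what makes this style of test model-agnostic.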