
Interpreting single-cell and spatial omics data using deep neural network training dynamics

2024-12-04 · Nature Computational Science 2024 · Code Available

Jonathan Karin, Reshef Mintz, Barak Raveh, Mor Nitzan


Abstract

Single-cell and spatial omics datasets can be organized and interpreted by annotating single cells to distinct types, states, locations or phenotypes. However, cell annotations are inherently ambiguous, as discrete labels with subjective interpretations are assigned to heterogeneous cell populations on the basis of noisy, sparse and high-dimensional data. Here we developed Annotatability, a framework for identifying annotation mismatches and characterizing biological data structure by monitoring the dynamics and difficulty of training a deep neural network over such annotated data. Following this, we developed a signal-aware graph embedding method that enables downstream analysis of biological signals. This embedding captures cellular communities associated with target signals. Using Annotatability, we address key challenges in the interpretation of genomic data, demonstrated over eight single-cell RNA sequencing and spatial omics datasets, including identifying erroneous annotations and intermediate cell states, delineating developmental or disease trajectories, and capturing cellular heterogeneity. These results underscore the broad applicability of annotation-trainability analysis via Annotatability for unraveling cellular diversity and interpreting collective cell behaviors in health and disease.
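The idea of monitoring training dynamics can be illustrated with a small sketch. The abstract does not state which statistics Annotatability tracks, so the following is an assumption for illustration only: for each cell, it records the probability the network assigns to the annotated label at every epoch, then summarizes it by its mean (confidence) and its standard deviation (variability). The function names, the category names, and the thresholds `conf_cut` and `var_cut` are hypothetical and are not taken from the paper or its code.

```python
from statistics import mean, pstdev


def training_dynamics_scores(prob_history, labels):
    """Summarize per-cell training dynamics.

    prob_history[e][i][c] is the predicted probability of class c
    for cell i at epoch e; labels[i] is the annotated class of cell i.
    Returns a list of (confidence, variability) tuples, one per cell.
    """
    scores = []
    for i, label in enumerate(labels):
        # Probability given to the annotated label at each epoch.
        p_label = [epoch_probs[i][label] for epoch_probs in prob_history]
        scores.append((mean(p_label), pstdev(p_label)))
    return scores


def categorize(confidence, variability, conf_cut=0.5, var_cut=0.2):
    """Assign a coarse category; both thresholds are illustrative assumptions."""
    if variability >= var_cut:
        return "ambiguous"
    return "easy-to-learn" if confidence >= conf_cut else "hard-to-learn"


# Toy data: 3 epochs, 3 cells, 2 classes (made-up numbers).
prob_history = [
    [[0.6, 0.4], [0.5, 0.5], [0.9, 0.1]],
    [[0.8, 0.2], [0.2, 0.8], [0.9, 0.1]],
    [[0.9, 0.1], [0.9, 0.1], [0.95, 0.05]],
]
labels = [0, 0, 1]

scores = training_dynamics_scores(prob_history, labels)
categories = [categorize(conf, var) for conf, var in scores]
print(categories)
```

In this toy setting, a cell whose annotated label is consistently given low probability (the third cell) is a candidate annotation mismatch, while a cell whose probability fluctuates across epochs (the second cell) might correspond to an intermediate state. Real use would take the per-epoch probabilities from an actual classifier trained on the annotated expression data.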
