scCorrect: Cross-modality label transfer from scRNA-seq to scATAC-seq using domain adaptation.

Journal: Analytical Biochemistry
Published:
Abstract

Cell type annotation in single-cell chromatin accessibility sequencing (scATAC-seq) is crucial for enabling researchers to identify subpopulations of cells associated with specific diseases, elucidate gene regulatory networks, and discover markers indicative of disease states. The prevailing approach to cell type annotation in single-cell research transfers well-delineated cell types from single-cell RNA sequencing (scRNA-seq) data to scATAC-seq data using a label propagation algorithm. However, the inherent discrepancies between the two modalities (i.e., differences in biological interpretation), coupled with the intrinsic sparsity and high dimensionality of scATAC-seq data, pose significant challenges to the efficacy of this strategy. To address these challenges, we introduce a novel neural network framework, scCorrect, which operates in two distinct phases. In the first phase, scCorrect aligns the scRNA-seq and scATAC-seq datasets and generates initial annotation results. In the second phase, it trains a corrective network specifically designed to amend erroneous annotations produced during the first phase. Empirical tests across multiple datasets demonstrate that scCorrect consistently achieves superior recognition accuracy, underscoring its significant potential to enhance disease-related research in humans.

Authors
Yan Liu, Wenyi Pei, Li Chen, Yu Xia, He Yan, Xiaohua Hu