scAnnotate: An Automated Cell Type Annotation Tool for Single-Cell RNA-Sequencing Data




Tsao, Danielle




Single-cell RNA-sequencing (scRNA-seq) technology enables researchers to investigate a genome at the single-cell level with unprecedented resolution. An organism consists of a heterogeneous collection of cell types, each of which plays a distinct role in various biological processes; thus, the first step of scRNA-seq data analysis is often to distinguish cell types so that they can be investigated separately. Dropout is a crucial characteristic of scRNA-seq data that, although widely used in differential expression analysis, is not explicitly used by existing supervised learning methods for cell annotation. We present scAnnotate, an automated cell annotation tool that utilizes dropout information via a mixture-model-based ensemble learning approach. We demonstrate through real scRNA-seq data that scAnnotate is competitive with other supervised machine-learning methods and accurately annotates cells when training and test data are similar, cross-platform, or cross-species.
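To make the idea of using dropout information in a mixture model concrete, the following Python sketch may help. It is an illustrative toy, not scAnnotate's actual implementation: the abstract does not specify the model, so everything here is an assumption, including the function names `fit_gene_mixtures` and `predict`, the choice of a normal distribution for nonzero expression, the `min_sigma` floor, and the simple summation of per-gene log-likelihoods as the ensemble step. Each gene is modeled, within each cell type, as a point mass at zero (dropout) mixed with a continuous component for expressed values.

```python
import numpy as np

def fit_gene_mixtures(X, labels, min_sigma=0.5):
    """Fit, per cell type and gene, a dropout rate plus a normal on nonzero values.

    X is a cells-by-genes matrix of (e.g. log-normalized) expression;
    labels holds one cell type per row. All modeling choices are assumptions.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    params = {}
    for cell_type in np.unique(labels):
        Xc = X[labels == cell_type]
        nonzero = Xc > 0
        n_nonzero = nonzero.sum(axis=0)
        # Smoothed dropout probability, kept strictly between 0 and 1.
        pi = (Xc.shape[0] - n_nonzero + 1.0) / (Xc.shape[0] + 2.0)
        # Mean and standard deviation of the nonzero values only.
        denom = np.maximum(n_nonzero, 1)
        mu = np.where(nonzero, Xc, 0.0).sum(axis=0) / denom
        sq = np.where(nonzero, (Xc - mu) ** 2, 0.0).sum(axis=0)
        sigma = np.maximum(np.sqrt(sq / denom), min_sigma)
        params[cell_type] = (pi, mu, sigma)
    return params

def predict(params, X):
    """Assign each cell the type with the highest summed per-gene log-likelihood."""
    X = np.asarray(X, dtype=float)
    types = list(params)
    scores = np.empty((X.shape[0], len(types)))
    for j, cell_type in enumerate(types):
        pi, mu, sigma = params[cell_type]
        log_zero = np.log(pi)  # likelihood of observing a dropout
        log_pos = (
            np.log1p(-pi)
            - 0.5 * np.log(2 * np.pi * sigma ** 2)
            - (X - mu) ** 2 / (2 * sigma ** 2)
        )  # likelihood of observing a nonzero value
        # Each gene acts as a weak learner; summing log-likelihoods combines them.
        scores[:, j] = np.where(X > 0, log_pos, log_zero).sum(axis=1)
    return np.array(types)[scores.argmax(axis=1)]
```

In this sketch a gene that is almost always zero in one cell type but expressed in another contributes evidence through its dropout rate alone, which is the kind of signal a classifier that ignores zeros would discard.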



single-cell RNA-sequencing, statistical learning, cell type annotation