Date of Award
Master of Science in Computer Science - (M.S.)
Usman W. Roshan
In biology, the single cell is the smallest unit of most organisms, and classifying cells is a long-standing problem. A central task in the analysis of single-cell RNA-seq data is to identify and characterize novel cell types. Several classical methods, such as the K-means algorithm, spectral clustering, and Gaussian Mixture Models (GMMs), are widely used to cluster cells. Furthermore, dimensionality reduction methods such as PCA, t-SNE, and ZIFA have been introduced to overcome “the curse of dimensionality”. A more recent method, scDeepCluster, has demonstrated improved and promising performance in clustering single-cell data. In this study, a clustering method is proposed that extends scDeepCluster with Siamese networks in order to learn more reliable functions for mapping inputs to the latent space. In addition, spectral clustering based on the SpectralNet algorithm is employed to improve clustering performance. Extensive experiments are conducted to demonstrate the superior performance of the proposed method in comparison with current state-of-the-art methods.
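For readers unfamiliar with the classical baseline mentioned in the abstract, the following is a minimal sketch of dimensionality reduction with PCA followed by K-means clustering. It is not the method proposed in the thesis. The count matrix is synthetic (Poisson-distributed, two artificial cell groups), and the helper names `pca` and `kmeans`, the log transform, and all sizes and parameters are assumptions made purely for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each gene (column)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic stand-in for a cells x genes count matrix (assumption, not real data).
rng = np.random.default_rng(42)
n_cells, n_genes = 100, 50
rates_a = rng.uniform(1, 5, n_genes)
rates_b = rates_a.copy()
rates_b[:10] += 20  # the second group over-expresses the first 10 genes
counts = np.vstack([
    rng.poisson(rates_a, (n_cells, n_genes)),
    rng.poisson(rates_b, (n_cells, n_genes)),
])

log_counts = np.log1p(counts)        # common variance-stabilizing transform
embedding = pca(log_counts, 2)       # reduce 50 genes to 2 components
labels, centroids = kmeans(embedding, k=2)
```

In this sketch the clustering runs on the low-dimensional embedding rather than on the raw counts, which is the usual motivation for pairing a dimensionality reduction step with a classical clustering algorithm.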
Meng, Zixia, "Model-based deep Siamese autoencoder for clustering single cell RNA-seq data" (2020). Theses. 1794.