A convolutional neural network model for survival prediction based on prognosis-related cascaded Wx feature selection
Document Type
Article
Publication Date
10-1-2022
Abstract
Great advances in deep learning have provided effective solutions for prediction tasks in the biomedical field. However, accurate prognosis prediction from cancer genomics data remains challenging because of the severe overfitting caused by the curse of dimensionality inherent in high-throughput sequencing data. Survival analysis poses an additional challenge: it is difficult to make use of censored samples, whose events of interest are not observed. Convolutional neural network (CNN) models offer the opportunity to extract meaningful hierarchical features that characterize cancer subtypes and prognostic outcomes. Feature selection, in turn, can mitigate overfitting and reduce the computational burden of subsequent model training by separating significant genes from redundant ones. To simplify the model, we developed a concise and efficient survival analysis model, named the CNN-Cox model, which combines a special CNN framework with the prognosis-related feature selection method cascaded Wx and requires less computation because it uses few trainable parameters. Experimental results show that the CNN-Cox model achieved consistently higher C-index values and better survival prediction performance than existing state-of-the-art survival analysis methods across seven cancer type datasets in The Cancer Genome Atlas cohort: bladder carcinoma, head and neck squamous cell carcinoma, kidney renal cell carcinoma, brain low-grade glioma, lung adenocarcinoma (LUAD), lung squamous cell carcinoma, and skin cutaneous melanoma. As an illustration of model interpretation, we examined potential prognostic gene signatures in the LUAD dataset using the proposed CNN-Cox model.
We conducted protein–protein interaction network analysis to identify potential prognostic genes and further analyzed the biological function of 13 hub genes (ANLN, RACGAP1, KIF4A, KIF20A, KIF14, ASPM, CDK1, SPC25, NCAPG, MKI67, HJURP, EXO1, and HMMR), whose high expression is significantly associated with poor survival in LUAD patients. These findings confirmed that the CNN-Cox model is effective in extracting not only prognostic factors but also biologically meaningful gene features. The code is available on GitHub: https://github.com/wangwangCCChen/CNN-Cox.
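The abstract reports performance as C-index values on data that include censored samples. As a rough illustration of how that metric treats censoring, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not taken from the authors' repository: the function name, argument names, and toy data are hypothetical, and pairs with tied observed times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Sketch of Harrell's C-index for right-censored survival data.

    times  : observed follow-up time for each sample
    events : 1 if the event was observed, 0 if the sample is censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that sample a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the earlier time is an observed
        # event; tied times are skipped here (simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the second sample is censored at time 8.
toy_times = [5, 8, 12, 3]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.2, 0.7]
print(concordance_index(toy_times, toy_events, toy_risks))
```

In this toy case the censored sample still contributes to pairs in which the other sample failed earlier, but it cannot anchor a pair itself, which is how the metric makes partial use of censored observations. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.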
Identifier
85133655294 (Scopus)
Publication Title
Laboratory Investigation
External Full Text Location
https://doi.org/10.1038/s41374-022-00801-y
e-ISSN
15300307
ISSN
00236837
PubMed ID
35810236
First Page
1064
Last Page
1074
Issue
10
Volume
102
Grant
12001418
Fund Ref
National Natural Science Foundation of China
Recommended Citation
Yin, Qingyan; Chen, Wangwang; Zhang, Chunxia; and Wei, Zhi, "A convolutional neural network model for survival prediction based on prognosis-related cascaded Wx feature selection" (2022). Faculty Publications. 2623.
https://digitalcommons.njit.edu/fac_pubs/2623