Prediction of continuous phenotypes in mouse, fly, and rice genome wide association studies with support vector regression SNPs and ridge regression classifier
Document Type
Conference Proceeding
Publication Date
3-2-2016
Abstract
Ranking SNPs and predicting phenotypes in continuous genome-wide association studies is a subject of increasing interest, with applications in personalized medicine and in animal and plant breeding. SNP ranking in case-control (discrete-label) genome-wide association studies has been examined with machine learning techniques in several previous studies, but it remains poorly explored for studies with quantitative labels. Here we study the ranking of SNPs in mouse, fly, and rice continuous genome-wide association studies given by the popular univariate Pearson correlation coefficient and by the multivariate support vector regression and ridge regression. We perform cross-validation with the support vector regression and ridge regression models on top-ranked SNPs and compute correlation coefficients between true and predicted phenotypes. Our results show that ridge regression prediction with top-ranked support vector regression SNPs gives the highest accuracy. On all datasets we achieve accuracies comparable to previously published values, but with fewer SNPs. Our work shows that we can learn parsimonious SNP models for predicting continuous labels in genome-wide studies.
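The evaluation loop described in the abstract (rank SNPs, fit a model on the top-ranked ones, cross-validate, correlate true with predicted phenotypes) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It ranks SNPs with the univariate Pearson coefficient only, as a stand-in for the support vector regression ranking that the abstract reports as most accurate, because a standard-library sketch has no SVR solver. All function names, the synthetic genotype data, the penalty `alpha`, the fold count, `top_k`, and the choice to rank inside each training fold are assumptions made for the example.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def rank_snps(X, y):
    """SNP column indices sorted by decreasing |Pearson r| with the phenotype."""
    n_snps = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: -scores[j])

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][k] * w[k] for k in range(i + 1, n))) / M[i][i]
    return w

def ridge_fit(X, y, alpha):
    """Ridge regression on centered data: w = (Xc'Xc + alpha*I)^-1 Xc'yc."""
    d = len(X[0])
    mu = [mean(row[j] for row in X) for j in range(d)]
    my = mean(y)
    Xc = [[row[j] - mu[j] for j in range(d)] for row in X]
    yc = [v - my for v in y]
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(d)]
    w = solve(A, b)
    intercept = my - sum(wi * m for wi, m in zip(w, mu))
    return w, intercept

def cv_correlation(X, y, top_k, alpha=1.0, n_folds=5, seed=0):
    """Correlation between true and cross-validated predicted phenotypes."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    truth, preds = [], []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        # Rank on the training fold only, so the held-out samples
        # never influence which SNPs are selected.
        top = rank_snps(X_train, y_train)[:top_k]
        w, b0 = ridge_fit([[row[j] for j in top] for row in X_train], y_train, alpha)
        for i in test:
            preds.append(b0 + sum(wj * X[i][j] for wj, j in zip(w, top)))
            truth.append(y[i])
    return pearson(truth, preds)

# Synthetic data (an assumption for illustration): 120 samples, 30 SNPs coded
# 0/1/2, and a phenotype driven by SNP 0 and SNP 1 plus Gaussian noise.
rng = random.Random(1)
X = [[rng.choice([0, 1, 2]) for _ in range(30)] for _ in range(120)]
y = [3.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 0.5) for row in X]

r = cv_correlation(X, y, top_k=5)
print("cross-validated correlation:", round(r, 3))
```

To approximate the combination the abstract highlights, `rank_snps` could instead order SNPs by the absolute weights of a linear support vector regression model fitted on the training fold, for example with a library such as scikit-learn, while keeping the ridge prediction step unchanged.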
Identifier
84969641304 (Scopus)
ISBN
9781509002870
Publication Title
Proceedings 2015 IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015
External Full Text Location
https://doi.org/10.1109/ICMLA.2015.224
First Page
1246
Last Page
1250
Recommended Citation
Aljouie, Abdulrhman and Roshan, Usman, "Prediction of continuous phenotypes in mouse, fly, and rice genome wide association studies with support vector regression SNPs and ridge regression classifier" (2016). Faculty Publications. 10642.
https://digitalcommons.njit.edu/fac_pubs/10642
