Document Type

Thesis

Date of Award

Fall 1-31-2010

Degree Name

Master of Science in Bioinformatics - (M.S.)

Department

Computer Science

First Advisor

Jason T. L. Wang

Second Advisor

Chengjun Liu

Third Advisor

David Nassimi

Abstract

This thesis presents a novel method, RNAMultifold, for developing a non-coding RNA (ncRNA) classification model based on features derived from folding the consensus sequence of multiple sequence alignments with three different folding programs: RNAalifold, CentroidFold, and RSpredict. The method ranks these folding features according to a Class Separation Measure (CSM) that quantifies the ability of each feature to differentiate between samples from positive and negative test sets. The set of top-ranked features is then used to construct three classification models: Naive Bayes, Fisher Linear Discriminant, and Support Vector Machine (SVM). The performance of these models is compared with that of the same models built on a baseline feature set and with that of an existing classification tool, RNAz.
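
The ranking step described above can be illustrated with a minimal sketch. The abstract does not give the formula for the CSM, so the score used here (absolute difference of class means divided by the sum of class standard deviations) is only an assumed stand-in, and the names `separation_score`, `rank_features`, `feat_a`, and `feat_b` are hypothetical, as is the toy data.

```python
from statistics import mean, pstdev


def separation_score(pos, neg):
    """Assumed stand-in for a class separation measure (not the thesis's CSM):
    |mean(pos) - mean(neg)| / (std(pos) + std(neg))."""
    spread = pstdev(pos) + pstdev(neg)
    gap = abs(mean(pos) - mean(neg))
    if spread == 0:
        # Both classes are constant: perfectly separated or identical.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(pos_rows, neg_rows, names):
    """Score every feature column, then return names sorted best-first."""
    scores = {}
    for j, name in enumerate(names):
        pos = [row[j] for row in pos_rows]
        neg = [row[j] for row in neg_rows]
        scores[name] = separation_score(pos, neg)
    ranked = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranked, scores


# Toy data: each row is one sample, each column one hypothetical folding feature.
names = ["feat_a", "feat_b"]
pos_rows = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0]]
neg_rows = [[3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]

ranked, scores = rank_features(pos_rows, neg_rows, names)
top_k = ranked[:1]  # keep only the top-ranked features for the classifier
```

In this toy case `feat_a` separates the two classes well, while `feat_b` has identical class means and scores zero, so only `feat_a` would be passed on to a classifier.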

The SVM classification model with a radial basis function kernel, using the top 11 ranked features, is shown to be more sensitive than the other models, including RNAz, across all specificity values for the RNA families under study. In addition, the target feature set outperforms the baseline feature set of z score and structure conservation index for all classification methods except Fisher Linear Discriminant. The RNAMultifold method is then used to search the genome of a trypanosome species, Trypanosoma brucei, for novel ncRNAs. The results of this search are compared with known ncRNAs and with results from RNAz.

Recommended Citation

Griesmer, Stephen J., "In silico prediction of non-coding RNAs using supervised learning and feature ranking methods" (2010). Theses. 47.
https://digitalcommons.njit.edu/theses/47

Included in

Bioinformatics Commons, Computer Sciences Commons
