Date of Award

Fall 2013

Document Type

Thesis

Degree Name

Master of Science in Bioinformatics - (M.S.)

Department

Computer Science

First Advisor

Zhi Wei

Second Advisor

Usman W. Roshan

Third Advisor

Dimitri Theodoratos

Abstract

In molecular biology research, RNA-seq is a relatively new method for transcriptome profiling. It uses next-generation sequencing technology to provide a large amount of information about the variety and abundance of RNA present in an organism of interest in a specific state and at a given time. One of the most important tasks in RNA-seq analysis is finding genes that are expressed differently between subject groups. Many differential expression analysis tools for RNA-seq have been developed, but there is no gold standard in this field. In this research, four commonly used tools (DESeq, edgeR, limma, and cuffdiff) are studied by comparing their performance in normalizing data from different subject groups, and in the sensitivity and specificity with which they select differentially expressed genes. In addition, their performance on genes that are expressed in only one condition is compared. The data sets used are SEQC and melanoma. The results show that, in differential expression analysis, DESeq is slightly better than the other tools in normalization, while DESeq, edgeR, and limma in general display good sensitivity and specificity, and limma outputs fewer false-positive predictions. In cases where genes of interest are absent in one of the conditions, limma has the best performance.
