Date of Award

Spring 2015

Document Type

Thesis

Degree Name

Master of Science in Bioinformatics - (M.S.)

Department

Computer Science

First Advisor

Usman W. Roshan

Second Advisor

Jason T. L. Wang

Third Advisor

Zhi Wei

Abstract

The falling cost of whole genome sequencing has increased the volume of genomic data and opened up new avenues of research in bioinformatics, such as comparative genomics and evolutionary dynamics. A fundamental task in these studies is to align genome sequences accurately. Sequence alignment identifies regions of similarity between sequences in order to establish their functional, evolutionary, and structural relationships. This thesis investigates the performance of two sequence alignment programs on whole genome sequences from primates and mammals: LASTZ, a faster hash-table-based method, and SSEARCH, a slower but more rigorous Smith-Waterman-based approach. An exact genome alignment technique is used in which the entire genome is broken into fragments and these fragments are aligned to the reference genome with the Smith-Waterman-based method. A comparison of the two methods reveals that the second, Smith-Waterman-based approach performs better for genomes from closely related species.
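
The fragment-and-align idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation and not how SSEARCH works internally: the function names (fragment, smith_waterman, align_fragments), the fixed fragment size, and the scoring values (match +2, mismatch -1, linear gap -2) are all assumptions chosen for illustration, and only the best local score and its end positions are reported, with no traceback.

```python
def fragment(seq, size):
    """Split seq into consecutive, non-overlapping pieces of at most `size` characters.

    Hypothetical helper: the fragmenting scheme used in the thesis is not
    described in the abstract.
    """
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def smith_waterman(query, ref, match=2, mismatch=-1, gap=-2):
    """Return (best_score, query_end, ref_end) for the best local alignment.

    Uses a linear gap penalty (an assumption made for brevity). End positions
    are 1-based indices of the last aligned character in each sequence;
    (0, 0, 0) means no positive-scoring alignment exists.
    """
    prev = [0] * (len(ref) + 1)   # previous row of the dynamic-programming matrix
    best = (0, 0, 0)
    for i in range(1, len(query) + 1):
        curr = [0] * (len(ref) + 1)
        for j in range(1, len(ref) + 1):
            sub = match if query[i - 1] == ref[j - 1] else mismatch
            score = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align query[i-1] with ref[j-1]
                prev[j] + gap,      # gap in the reference
                curr[j - 1] + gap,  # gap in the query
            )
            curr[j] = score
            if score > best[0]:
                best = (score, i, j)
        prev = curr
    return best


def align_fragments(genome, ref, size):
    """Align every fragment of `genome` to `ref` independently."""
    return [smith_waterman(frag, ref) for frag in fragment(genome, size)]
```

Each fragment costs time proportional to its length times the reference length, which is why an exhaustive method of this kind is slower than a seed-based heuristic, and why working fragment by fragment keeps memory use bounded.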
