Document Type

Thesis

Date of Award

Spring 5-31-2006

Degree Name

Master of Science in Computational Biology - (M.S.)

Department

Computer Science

First Advisor

Jason T. L. Wang

Second Advisor

Qun Ma

Third Advisor

Vincent Oria

Abstract

SYSTERS is a biological information integration system containing protein sequences from protein databases such as Swiss-Prot and TrEMBL, as well as protein sequences from complete genomes available at Ensembl, The Arabidopsis Information Resource, SGD and GeneDB. For some of these protein sequences, the encoding nucleotide sequences can be found on the corresponding websites; for others, the encoding nucleotide sequences are missing.

The goal of this thesis is to collect the nucleotide sequences for all protein sequences in SYSTERS and store them in a common database. There are two cases. In the first case, the nucleotide sequence can be found, so we collect it and store it in our database. In the second case, the nucleotide sequence is missing, so we use back-translation and TBLASTN to search for the nucleotide sequence and store the result in our database.
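The two cases can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the in-memory `known` dictionary standing in for the source websites, and the choice of one fixed codon per amino acid are all assumptions made for illustration. Back-translation is ambiguous because most amino acids have several codons, so the result is only one possible encoding, and the TBLASTN search step is not shown.

```python
# Hypothetical one-codon-per-residue table based on the standard genetic code.
# The abstract does not say which codons the back-translation step chooses.
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTT", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
    "*": "TAA",
}

# Reverse lookup, used only to check that a back-translation round-trips.
AMINO_FOR = {codon: aa for aa, codon in CODON_FOR.items()}


def back_translate(protein):
    """Return one candidate nucleotide sequence encoding `protein`."""
    return "".join(CODON_FOR[aa] for aa in protein.upper())


def translate(nucleotides):
    """Translate a sequence built from the codons in CODON_FOR."""
    codons = (nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3))
    return "".join(AMINO_FOR[codon] for codon in codons)


def collect_nucleotide(protein_id, protein_seq, known):
    """Case 1: use the published sequence. Case 2: fall back to back-translation.

    `known` is a hypothetical dict mapping protein IDs to nucleotide sequences
    that were found at the source databases.
    """
    if protein_id in known:
        return known[protein_id], "collected"
    return back_translate(protein_seq), "back-translated"


if __name__ == "__main__":
    known = {"P1": "ATGGCT"}  # made-up identifiers and sequences
    print(collect_nucleotide("P1", "MA", known))
    print(collect_nucleotide("P2", "MKW", known))
```

Returning a label alongside each sequence keeps the two cases distinguishable, so a stored record could note whether a sequence was collected from a source database or inferred.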
