Document Type


Date of Award

Spring 5-31-2016

Degree Name

Master of Science in Computer Science - (M.S.)


Computer Science

First Advisor

Jason T. L. Wang

Second Advisor

Xiaoning Ding

Third Advisor

Chase Qishi Wu


MapReduce derives from a traditional problem-solving method: dividing a big problem into small parts and solving each part. With the goal of processing larger datasets more efficiently and cheaply, this method is implemented as a programming model for handling massive quantities of data. The user supplies a map function that abstracts the dataset into logical key/value pairs and a reduce function that combines all values sharing the same key. Under this model, a job is automatically distributed across a cluster built from many ordinary computers. MapReduce programs are easy to implement and can be much more efficient than traditional computing programs. This paper studies the model through several sample programs and one GRN detection program.
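The map and reduce roles described above can be illustrated with a minimal single-process sketch. It is not taken from the thesis: the word-count task and the function names `map_words`, `shuffle`, `reduce_counts`, and `word_count` are assumptions chosen for illustration, and the grouping step that a real MapReduce framework performs across a cluster is simulated here in memory.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine all values that share one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical local driver)."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

Because each call to `map_words` depends only on its own line and each call to `reduce_counts` depends only on its own key, both steps could in principle run on separate machines, which is the property the model relies on.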

Detecting gene regulatory networks (GRNs), the connections among genes formed by regulatory molecules, is one of the main subjects in understanding gene biology. Although algorithms have been developed for this purpose, the growing number of genes and their complexity make processing increasingly difficult and slow. The MapReduce model, with its parallel computation, is one way to overcome these problems. In this paper, a well-defined framework for parallelizing a mutual information algorithm is presented. The experimental results show the performance improvement obtained with the parallel MapReduce model.
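As a rough illustration of how a mutual information computation might be arranged into map and reduce steps, consider the sketch below. It is not the framework presented in the thesis. It assumes expression levels have already been discretized, it treats each gene pair as one map task, and the names `map_pair`, `reduce_gene`, `infer_network`, and the `threshold` parameter are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def map_pair(pair, expression):
    """Map step (assumed design): score one gene pair, emit it under both genes."""
    g1, g2 = pair
    mi = mutual_information(expression[g1], expression[g2])
    return [(g1, (g2, mi)), (g2, (g1, mi))]


def reduce_gene(gene, scored, threshold):
    """Reduce step (assumed design): keep partners whose score meets the threshold."""
    return gene, sorted(partner for partner, mi in scored if mi >= threshold)


def infer_network(expression, threshold):
    """Hypothetical local driver that simulates the shuffle between map and reduce."""
    groups = defaultdict(list)
    for pair in combinations(sorted(expression), 2):
        for key, value in map_pair(pair, expression):
            groups[key].append(value)
    return dict(reduce_gene(g, v, threshold) for g, v in groups.items())


if __name__ == "__main__":
    # Toy discretized expression profiles, invented for illustration only.
    toy = {"g1": [0, 0, 1, 1], "g2": [0, 0, 1, 1], "g3": [0, 1, 0, 1]}
    print(infer_network(toy, threshold=0.5))
```

The number of gene pairs grows quadratically with the number of genes, so the pairwise scoring in `map_pair` is the natural part to distribute; each pair can be scored independently of every other pair.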


