Date of Award

Spring 2014

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Mathematical Sciences - (Ph.D.)

Department

Mathematical Sciences

First Advisor

Wenge Guo

Second Advisor

Sunil Kumar Dhar

Third Advisor

Ji Meng Loh

Fourth Advisor

Sundarraman Subramanian

Fifth Advisor

Xiaodong Lin

Abstract

The hypotheses in many multiple testing problems have some inherent structure based on prior information, such as Gene Ontology in gene expression data. However, few false discovery rate (FDR) controlling procedures take advantage of this structure. In this dissertation, we develop FDR controlling methods which exploit the structural information of the hypotheses.

First, we study the fixed sequence structure where the testing order of the hypotheses has been pre-specified. We are motivated to study this structure since it is the most basic of structures, yet it has been largely ignored in the literature on large scale multiple testing. We first develop procedures using the conventional fixed sequence method, where the procedures stop testing after the first hypothesis is accepted. Then, we extend the method and develop procedures which stop after a pre-specified number of acceptances. A simulation study and real data analysis show that these procedures can be a powerful alternative to the standard Benjamini-Hochberg and Benjamini-Yekutieli procedures.
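
The stopping rule described above can be illustrated with a minimal sketch. It is not the dissertation's procedure: the abstract does not state the critical values, so the constant threshold `alpha` used here is a placeholder assumption, and the function name `fixed_sequence_test` and the parameter `max_acceptances` are hypothetical. The sketch only shows the mechanics of testing in a pre-specified order and stopping after a given number of acceptances; with a constant threshold it should not be read as guaranteeing FDR control.

```python
from typing import List, Sequence


def fixed_sequence_test(
    p_values: Sequence[float],
    alpha: float = 0.05,
    max_acceptances: int = 1,
) -> List[bool]:
    """Test hypotheses in their pre-specified order.

    p_values must already be arranged in the testing order.
    Testing stops once `max_acceptances` hypotheses have been accepted;
    hypotheses that are never reached are left unrejected.

    ASSUMPTION: every hypothesis is compared against the same threshold
    `alpha`. The critical values of the actual FDR controlling procedures
    are not given in the abstract, so this is a placeholder.
    """
    if max_acceptances < 1:
        raise ValueError("max_acceptances must be at least 1")

    rejected = [False] * len(p_values)
    acceptances = 0
    for i, p in enumerate(p_values):
        if p <= alpha:
            rejected[i] = True          # reject and move to the next hypothesis
        else:
            acceptances += 1            # accept this hypothesis
            if acceptances >= max_acceptances:
                break                   # stop testing the remaining hypotheses
    return rejected


# Hypothetical p-values, listed in the pre-specified testing order.
example_p = [0.001, 0.01, 0.2, 0.003, 0.6, 0.0001]
conventional = fixed_sequence_test(example_p, max_acceptances=1)
extended = fixed_sequence_test(example_p, max_acceptances=2)
```

With `max_acceptances=1` the sketch reproduces the conventional rule and never reaches the small p-value in fourth position; with `max_acceptances=2` it continues past the first acceptance and can reject that hypothesis.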

Next, we consider the testing of hierarchically ordered hypotheses where hypotheses are arranged in a tree-like structure. First, we introduce a new multiple testing method called the generalized stepwise procedure and use it to create a general approach for testing hierarchically ordered hypotheses. Then, we develop several hierarchical testing procedures which control the FDR under various forms of dependence. Our simulation studies and real data analysis show that these proposed methods can be more powerful than alternative hierarchical testing methods, such as the method by Yekutieli (2008b).

Finally, we focus on testing hypotheses along a directed acyclic graph (DAG). First, we introduce a novel approach to develop procedures for controlling error rates appropriate for large scale multiple testing. Then, we use this approach to develop an FDR controlling procedure which tests hypotheses along the DAG. To our knowledge, no other FDR controlling procedure exists to test hypotheses with this structure. The procedure is illustrated through a real microarray data analysis where Gene Ontology terms forming a DAG are tested for significance.

In summary, this dissertation offers new FDR controlling methods which utilize the inherent structural information among the tested hypotheses.

Included in

Mathematics Commons
