Date of Award
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Ji Meng Loh
Jason T. L. Wang
Program mutation is the process of generating versions of a base program by applying elementary syntactic modifications; this technique has been used in program testing in a variety of applications, most notably to assess the quality of a test data set. A good test set will discover the difference between the original program and mutant except if the mutant is semantically equivalent to the original program, despite being syntactically distinct.
Equivalent mutants are a major nuisance in the practice of mutation testing, because they introduce a significant amount of bias and uncertainty in the analysis of test results; indeed, mutants are useful only to the extent that they define distinct functions from the base program. Yet, despite several decades of research, the identification of equivalent mutants remains a tedious, inefficient, ineffective and error prone process.
The approach that is adopted in this dissertation is to turn away from the goal of identifying individual mutants which are semantically equivalent to the base program, in favor of an approach that merely focuses on estimating their number. To this effect, the following question is considered: what makes a base program P prone to produce equivalent mutants? The position taken in this work is that what makes a program prone to generate equivalent mutants is the same property that makes a program fault tolerant, since fault tolerance is by definition the ability to maintain correct behavior despite the presence and sensitization of faults; whether these faults stem from poor design or from mutation operators does not matter. Hence if we could only quantify the redundancy of a program, we should be able to use the redundancy metrics to estimate the ratio of equivalent mutants (REM for short) of a program.
Using redundancy metrics that were previously defined to reflect the state redundancy of a program, its functional redundancy, its non injectivity and its non-determinacy, this dissertation makes the following contributions:
- The design and implementation of a Java compiler, using compiler generation technology, to analyze Java code and compute its redundancy metrics.
- An empirical study on standard mutation testing benchmarks to analyze the statistical relationships between the REM of a program and its redundancy metrics.
- The derivation of regression models to estimate the REM of a program from its compiler generated redundancy metrics, for a variety of mutation policies.
- The use of the REM to address a number of mutation related issues, including: estimating the level of redundancy between non-equivalent mutants; redefining the mutation score of a test data set to take into account the possibility that mutants may be semantically equivalent to each other; using the REM to derive a minimal set of mutants without having to analyze all the pairs of mutants for equivalence.
The main conclusions of this work are the following:
- The REM plays a very important role in the mutation analysis of a program, as it gives many useful insights into the properties of its mutants.
- All the attributes that can be computed from the REM of a program are very sensitive to the exact value of the REM; Hence the REM must be estimated with great precision.
Consequently, the focus of future research is to revisit the Java compiler and enhance the precision of its estimation of redundancy metrics, and to revisit the regression models accordingly.
Ayad, Amani M., "Quantitative metrics for mutation testing" (2019). Dissertations. 1429.