Auditing National Cancer Institute thesaurus neoplasm concepts in groups of high error concentration
Document Type
Article
Publication Date
1-1-2017
Abstract
The National Cancer Institute thesaurus is an important knowledge resource that should ideally be error-free. We investigated the occurrence of errors in the Neoplasm subhierarchy, which is part of the National Cancer Institute thesaurus Disease, Disorder or Finding hierarchy. There are five key findings in this study. (1) Errors in the Neoplasm subhierarchy are not uniformly distributed. (2) A partial-area taxonomy, which is a compact network for summarizing the structure and content of an ontology, helped uncover groups of concepts, called "small partial-areas," in the Neoplasm subhierarchy. (3) The error rate in "small partial-areas" is twice that in "large partial-areas" (44% versus 22%), a statistically significant difference. Thus, we conclude that higher error concentrations exist in small partial-areas. (4) Group-based auditing can be used successfully to identify additional suspicious concepts in a small group once a few members of the group are known to be erroneous. (5) Error correction propagation can be used successfully and with minimal effort to correct additional errors in the Neoplasm subhierarchy that occur outside of an initial small group of erroneous concepts. We present examples of errors and examples of how corrections transform and simplify the partial-area taxonomy.
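The comparison of error rates in finding (3) can be illustrated with a short sketch. The abstract reports only the two rates (44% and 22%), not the number of concepts audited or the statistical test the authors used, so the counts below are hypothetical values chosen only to reproduce those rates, and the pooled two-proportion z-test is one common way to compare two rates, not necessarily the authors' method. The function name `two_proportion_z_test` is likewise an illustrative choice.

```python
import math


def two_proportion_z_test(err_a, n_a, err_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    err_a / n_a and err_b / n_b are the erroneous and audited concept
    counts for the two groups being compared.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    rate_a = err_a / n_a
    rate_b = err_b / n_b
    # Pooled error rate under the null hypothesis of equal rates.
    pooled = (err_a + err_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("pooled rate of 0 or 1 gives no variance to test")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL counts, not taken from the study: 44 of 100 concepts in
# small partial-areas and 22 of 100 in large partial-areas found erroneous.
z, p = two_proportion_z_test(44, 100, 22, 100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With real audit data, the actual counts per group would replace the placeholder values; the outcome of the test depends on those sample sizes as well as on the rates.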
Identifier
85022187752 (Scopus)
Publication Title
Applied Ontology
External Full Text Location
https://doi.org/10.3233/AO-170179
e-ISSN
1875-8533
ISSN
1570-5838
First Page
113
Last Page
130
Issue
2
Volume
12
Grant
R01CA190779
Fund Ref
National Institutes of Health
Recommended Citation
Zheng, Ling; Min, Hua; Chen, Yan; Xu, Julia; Geller, James; and Perl, Yehoshua, "Auditing National Cancer Institute thesaurus neoplasm concepts in groups of high error concentration" (2017). Faculty Publications. 10051.
https://digitalcommons.njit.edu/fac_pubs/10051
