Generating Training Data for Concept-Mining for an 'Interface Terminology' Annotating Cardiology EHRs
Document Type
Conference Proceeding
Publication Date
12-16-2020
Abstract
Clinical data stored in EHRs could provide valuable knowledge for research if it were annotated properly. However, almost no EHR notes are currently annotated, because the performance of off-the-shelf annotation tools is unsatisfactory. Concentrating on the cardiology specialty, we propose to design a Cardiology Interface Terminology dedicated to the annotation of EHR notes in cardiology. This interface terminology will be developed by adding high-granularity concepts, mined from cardiology EHR notes, to an initial version that reuses SNOMED CT cardiology subhierarchies. Extending this interface terminology with text-mining NLP tools based on machine learning requires proper training data. In this paper, we discuss concept-mining of EHR notes, using concatenation and anchoring operations iteratively to create such training data. This approach can be applied to other medical specialties.
Identifier
85100342933 (Scopus)
ISBN
9781728162157
Publication Title
Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020
External Full Text Location
https://doi.org/10.1109/BIBM49941.2020.9313435
First Page
1728
Last Page
1735
Recommended Citation
Keloth, Vipina K.; Zhou, Shuxin; Einstein, Andrew J.; Elhanan, Gai; Chen, Yan; Geller, James; and Perl, Yehoshua, "Generating Training Data for Concept-Mining for an 'Interface Terminology' Annotating Cardiology EHRs" (2020). Faculty Publications. 4736.
https://digitalcommons.njit.edu/fac_pubs/4736
