Generating Training Data for Concept-Mining for an 'Interface Terminology' Annotating Cardiology EHRs

Document Type

Conference Proceeding

Publication Date

12-16-2020

Abstract

Clinical data stored in EHRs could provide valuable knowledge for research if it were annotated properly. However, almost no EHR notes are currently annotated, because the performance of off-the-shelf annotation tools is unsatisfactory. Concentrating on the cardiology specialty, we propose to design a Cardiology Interface Terminology dedicated to the annotation of EHR notes in cardiology. This interface terminology will be developed by adding high-granularity concepts, mined from cardiology EHR notes, to an initial version that reuses SNOMED CT cardiology subhierarchies. Extending this interface terminology with text-mining NLP tools based on machine learning requires proper training data. In this paper, we discuss concept mining of EHR notes, iteratively applying concatenation and anchoring operations to create such training data. This approach can be applied to other medical specialties.

Identifier

85100342933 (Scopus)

ISBN

9781728162157

Publication Title

Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020

External Full Text Location

https://doi.org/10.1109/BIBM49941.2020.9313435

First Page

1728

Last Page

1735
