Mining Concepts for a COVID Interface Terminology for Annotation of EHRs

Document Type

Conference Proceeding

Publication Date

12-10-2020

Abstract

The COVID-19 pandemic has overwhelmed the healthcare services of many countries with an increased number of patients and a deluge of medical data. Furthermore, the emergence and global spread of new infectious diseases are highly likely to continue in the future. Incomplete data about the presentations, signs, and symptoms of COVID-19 has had adverse effects on healthcare delivery. The EHRs of US hospitals have ingested huge volumes of relevant, up-to-date data about patients, but the lack of a proper system to annotate this data has greatly reduced its usefulness. We propose to design a COVID interface terminology for the annotation of EHR notes of COVID-19 patients. The initial version of this interface terminology was created by integrating COVID concepts from existing ontologies. The interface terminology is further enriched by mining high-granularity concepts from EHRs, because such concepts are usually not present in the existing reference terminologies. We iteratively apply the techniques of concatenation and anchoring to extract high-granularity phrases from the clinical text. In addition to increasing the conceptual base of the COVID interface terminology, this will also help in generating training data for large-scale concept mining using machine learning techniques. Having the annotated clinical notes of COVID-19 patients available will help speed up research in this field.
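
The abstract names concatenation and anchoring but does not define them, so the following Python sketch rests on assumptions rather than on the authors' implementation. It assumes that concatenation joins two directly adjacent annotated concepts into one longer phrase, and that anchoring extends an annotated concept (the anchor) by one neighboring unannotated, non-stopword token. The function names, the greedy longest-match annotator, the stopword list, and the sample sentence are all hypothetical. Every candidate is accepted automatically here, which is a simplification; a real workflow would likely need candidates to be reviewed before they enter the terminology.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def annotate(tokens: List[str], terminology: Set[str], max_len: int = 6) -> List[Span]:
    """Greedy longest-match annotation of tokens against the terminology (assumed annotator)."""
    spans: List[Span] = []
    i = 0
    while i < len(tokens):
        match_end = None
        # Try the longest window first, then shrink.
        for j in range(min(len(tokens), i + max_len), i, -1):
            if " ".join(tokens[i:j]) in terminology:
                match_end = j
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def concatenation_candidates(tokens: List[str], spans: List[Span]) -> Set[str]:
    """Assumed reading of 'concatenation': join two directly adjacent annotated spans."""
    out: Set[str] = set()
    for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
        if e1 == s2:
            out.add(" ".join(tokens[s1:e2]))
    return out


def anchoring_candidates(tokens: List[str], spans: List[Span], stopwords: Set[str]) -> Set[str]:
    """Assumed reading of 'anchoring': grow a span by one unannotated, non-stopword neighbor."""
    covered = {k for s, e in spans for k in range(s, e)}
    out: Set[str] = set()
    for s, e in spans:
        if s > 0 and (s - 1) not in covered and tokens[s - 1] not in stopwords:
            out.add(" ".join(tokens[s - 1:e]))
        if e < len(tokens) and e not in covered and tokens[e] not in stopwords:
            out.add(" ".join(tokens[s:e + 1]))
    return out


def mine(tokens: List[str], terminology: Set[str], stopwords: Set[str], rounds: int = 3) -> Set[str]:
    """Iterate annotate -> propose -> enrich until no new phrase appears or rounds run out."""
    enriched = set(terminology)
    for _ in range(rounds):
        spans = annotate(tokens, enriched)
        new = concatenation_candidates(tokens, spans) | anchoring_candidates(tokens, spans, stopwords)
        new -= enriched
        if not new:
            break
        enriched |= new  # simplification: no review step
    return enriched


# Hypothetical example sentence and seed terminology.
tokens = "patient has severe dry cough and persistent shortness of breath".split()
seed = {"severe", "dry cough", "shortness of breath"}
stopwords = {"and", "has", "of", "the"}
result = mine(tokens, seed, stopwords)
# Mined phrases: "severe dry cough" (concatenation) and
# "persistent shortness of breath" (anchoring).
```

Under these assumptions, each pass re-annotates the text with the enriched terminology, so phrases found in one round can serve as anchors or concatenation partners in the next, which is one way to read the word "iteratively" in the abstract.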

Identifier

85103840997 (Scopus)

ISBN

9781728162515

Publication Title

Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020

External Full Text Location

https://doi.org/10.1109/BigData50022.2020.9377981

First Page

3753

Last Page

3760

Grant

UL1TR003017

Fund Ref

National Institutes of Health
