Piecewise synonyms for enhanced UMLS source terminology integration.
Document Type
Article
Publication Date
1-1-2007
Abstract
The UMLS contains more than 100 source vocabularies and is growing via the integration of others. When integrating a new source, the source terms already in the UMLS must first be found. The easiest approach to this is simple string matching. However, string matching usually does not find all concepts that should be found. A new methodology, based on the notion of piecewise synonyms, for enhancing the process of concept discovery in the UMLS is presented. This methodology is supported by first creating a general synonym dictionary based on the UMLS. Each multi-word source term is decomposed into its component words, allowing for the generation of separate synonyms for each word from the general synonym dictionary. The recombination of these synonyms into new terms creates an expanded pool of matching candidates for terms from the source. The methodology is demonstrated with respect to an existing UMLS source. It shows a 34% improvement over simple string matching.
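The decompose, substitute, and recombine steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `piecewise_candidates` and `find_matches`, the toy synonym dictionary, and the list of existing terms are all hypothetical, and real UMLS processing would involve normalization and filtering that the abstract does not detail.

```python
from itertools import product


def piecewise_candidates(term, synonyms):
    """Build candidate terms by swapping each word for its synonyms.

    `synonyms` is an assumed word-level dictionary: word -> list of synonyms.
    """
    words = term.lower().split()
    # Each position may keep the original word or use any listed synonym.
    options = [sorted({w} | set(synonyms.get(w, ()))) for w in words]
    # Recombine one choice per position into full candidate terms.
    return {" ".join(combo) for combo in product(*options)}


def find_matches(term, synonyms, existing_terms):
    """Return existing terms that equal some candidate (case-insensitive)."""
    existing = {t.lower() for t in existing_terms}
    return piecewise_candidates(term, synonyms) & existing


# Toy data for illustration only; not taken from the UMLS.
synonyms = {"kidney": ["renal"], "cancer": ["carcinoma", "neoplasm"]}
existing_terms = ["Renal carcinoma", "Heart failure"]

matches = find_matches("Kidney cancer", synonyms, existing_terms)
print(matches)  # {'renal carcinoma'}
```

In this toy case, plain string matching on "Kidney cancer" finds nothing among the existing terms, while the expanded candidate pool contains "renal carcinoma". The number of candidates grows as the product of the per-word synonym counts, so a practical version would likely need to cap or prune the pool.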
Identifier
56149088205 (Scopus)
Publication Title
AMIA Annual Symposium Proceedings
e-ISSN
15594076
PubMed ID
18693854
First Page
339
Last Page
343
Grant
R01LM008445
Fund Ref
U.S. National Library of Medicine
Recommended Citation
Huang, Kuo Chuan; Geller, James; Halper, Michael; and Cimino, James J., "Piecewise synonyms for enhanced UMLS source terminology integration." (2007). Faculty Publications. 13598.
https://digitalcommons.njit.edu/fac_pubs/13598
