Faculty Publications

Automatic extraction for creating a lexical repository of abbreviations in the biomedical literature

Min Song, New Jersey Institute of Technology
Il Yeol Song, College of Computing & Informatics
Ki Jung Lee, College of Computing & Informatics

Document Type

Conference Proceeding

Publication Date

1-1-2006

Abstract

The sheer volume of biomedical text is growing at an exponential rate. This growth creates challenges for both human readers and automatic text processing algorithms. One such challenge arises from the common and uncontrolled use of abbreviations in the biomedical literature. This, in turn, requires that biomedical lexical ontologies be continuously updated. In this paper, we propose a hybrid approach combining lexical analysis techniques and the Support Vector Machine (SVM) to create an automatically generated and maintained lexicon of abbreviations. The proposed technique differs from others in the following respects: 1) It incorporates lexical analysis techniques into supervised learning for extracting abbreviations. 2) It makes use of text chunking techniques to identify long forms of abbreviations. 3) It significantly improves Recall compared to other techniques. The experimental results show that our approach outperforms the leading abbreviation algorithms, ExtractAbbrev and ALICE, by at least 6% and 13.9%, respectively, in both Precision and Recall on the Gold Standard Development corpus. © Springer-Verlag Berlin Heidelberg 2006.
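To make the lexical-analysis step concrete, the following is a minimal sketch of how candidate (short form, long form) pairs might be collected from text before any classifier sees them. It is not the authors' implementation: the parenthesis pattern, the word-window size, and the names `SHORT_FORM`, `is_subsequence`, and `extract_candidates` are all assumptions chosen for illustration, and the SVM classification and text chunking steps named in the abstract are not shown.

```python
import re

# Assumed pattern: a parenthesized token of 2-10 characters starting with a letter.
SHORT_FORM = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def is_subsequence(short_form, text):
    """True if the letters/digits of short_form occur in text in order (case-insensitive)."""
    chars = iter(text.lower())
    # `ch in chars` consumes the iterator, which enforces left-to-right order.
    return all(ch in chars for ch in short_form.lower() if ch.isalnum())


def extract_candidates(sentence):
    """Collect (short form, candidate long form) pairs using simple lexical rules."""
    pairs = []
    for match in SHORT_FORM.finditer(sentence):
        short_form = match.group(1)
        words = sentence[:match.start()].split()
        # Assumed heuristic: only look at a few words before the parenthesis.
        max_words = min(len(short_form) + 5, 2 * len(short_form))
        window = words[-max_words:]
        # Try the shortest span first, growing leftward one word at a time.
        for start in range(len(window) - 1, -1, -1):
            candidate = " ".join(window[start:])
            starts_alike = window[start][0].lower() == short_form[0].lower()
            if starts_alike and is_subsequence(short_form, candidate):
                pairs.append((short_form, candidate))
                break
    return pairs


if __name__ == "__main__":
    text = "We combine lexical analysis with the Support Vector Machine (SVM) to build a lexicon."
    print(extract_candidates(text))
```

In a hybrid pipeline of the kind the abstract describes, pairs like these would plausibly serve as input to a supervised classifier that accepts or rejects each candidate. That connection is an assumption here, not a detail taken from the paper.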

Identifier

33751383258 (Scopus)

ISBN

[3540377360, 9783540377368]

Publication Title

Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics

External Full Text Location

https://doi.org/10.1007/11823728_37

e-ISSN

16113349

ISSN

03029743

First Page

384

Last Page

393

Volume

4081 LNCS

Recommended Citation

Song, Min; Song, Il Yeol; and Lee, Ki Jung, "Automatic extraction for creating a lexical repository of abbreviations in the biomedical literature" (2006). Faculty Publications. 19210.
https://digitalcommons.njit.edu/fac_pubs/19210

DOI

10.1007/11823728_37