Faculty Publications

Histogram difference string distance for enhancing ontology integration in bioinformatics

Alex Rudniy, New Jersey Institute of Technology
James Geller, New Jersey Institute of Technology
Min Song, New Jersey Institute of Technology

Document Type

Conference Proceeding

Publication Date

12-1-2012

Abstract

Integration of bioinformatics ontologies is an important research task. This paper presents a family of new methods of string distance computation for improving existing ontology integration and alignment techniques. A histogram, the main tool of the introduced methods, is an associative array for storing the number of occurrences of each character in a string. We use histogram difference in combination with Longest Common Prefix, TFIDF, Smith-Waterman, and Jaccard re-scorers to define the four members of our family of string matching methods. We compare the performance of our methods with several well-known string matching algorithms using five Gene Ontology datasets as test beds. Our methods outperformed those algorithms in terms of average precision on four datasets and for maximum F1 measure on three datasets. On the remaining datasets our results were among the best, compared to these well-known methods.
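As a rough illustration of the histogram described in the abstract, the following Python sketch builds the per-character count table and compares two strings. The abstract does not give the exact distance formula, so the sum of absolute count differences used here is an assumption, and the function names `char_histogram` and `histogram_difference` are hypothetical. The re-scorer combinations (Longest Common Prefix, TFIDF, Smith-Waterman, Jaccard) are not modeled.

```python
from collections import Counter


def char_histogram(s: str) -> Counter:
    """Associative array mapping each character to its number of occurrences."""
    return Counter(s)


def histogram_difference(a: str, b: str) -> int:
    """Assumed definition (not taken from the paper): the sum, over all
    characters seen in either string, of the absolute difference in counts."""
    ha, hb = char_histogram(a), char_histogram(b)
    # Counter returns 0 for characters missing from one of the strings.
    return sum(abs(ha[ch] - hb[ch]) for ch in ha.keys() | hb.keys())


if __name__ == "__main__":
    print(char_histogram("aab"))
    print(histogram_difference("abc", "abd"))
    print(histogram_difference("listen", "silent"))
```

Under this assumed definition, character order is ignored, so two strings that are anagrams of each other have a difference of zero.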

Identifier

84883638335 (Scopus)

ISBN

9781618397461

Publication Title

4th International Conference on Bioinformatics and Computational Biology 2012, BICoB 2012

First Page

108

Last Page

113

Recommended Citation

Rudniy, Alex; Geller, James; and Song, Min, "Histogram difference string distance for enhancing ontology integration in bioinformatics" (2012). Faculty Publications. 17943.
https://digitalcommons.njit.edu/fac_pubs/17943

