Biomedical text categorization with concept graph representations using a controlled vocabulary

Document Type

Conference Proceeding

Publication Date

9-28-2012

Abstract

Recent work using graph representations for text categorization has shown promising performance compared with the conventional bag-of-words representation of text documents. In this paper we investigate a graph representation of texts for the task of text categorization. In our representation we identify high-level concepts extracted from a database of controlled biomedical terms and build a rich graph structure that contains important concepts and relationships. This procedure ensures that graphs are described with a regular vocabulary, which makes them easier to compare. We then classify document graphs by applying a set-based graph kernel that is intuitively sensible and able to deal with the disconnectedness of the constructed concept graphs. We compare this approach to standard approaches using non-graph, text-based features. We also compare the different kernels that can be used to see which performs better. Copyright 2012 ACM.
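
The abstract does not define the set-based graph kernel, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each document is reduced to a set of concept nodes and a set of undirected relation edges, and it scores two documents by counting shared concepts and shared relations. Because it compares sets rather than walks or paths, isolated nodes and disconnected components need no special handling. The names `ConceptGraph`, `make_graph`, `set_kernel`, and `normalized_set_kernel`, and the example concept labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptGraph:
    """A document as a set of concept nodes and a set of undirected edges."""
    nodes: frozenset
    edges: frozenset  # each edge is a frozenset holding two concept names


def make_graph(concepts, relations):
    """Build a ConceptGraph from concept names and (a, b) relation pairs."""
    return ConceptGraph(
        nodes=frozenset(concepts),
        edges=frozenset(frozenset(pair) for pair in relations),
    )


def set_kernel(g1, g2):
    """Assumed kernel: number of shared concepts plus shared relations.

    This equals a dot product of binary indicator vectors over the
    controlled vocabulary, so it is positive semidefinite.
    """
    return len(g1.nodes & g2.nodes) + len(g1.edges & g2.edges)


def normalized_set_kernel(g1, g2):
    """Cosine-style normalization so that large documents do not dominate."""
    denom = (set_kernel(g1, g1) * set_kernel(g2, g2)) ** 0.5
    return set_kernel(g1, g2) / denom if denom else 0.0


# Two toy documents; "Smoking" and "Asthma" are isolated nodes,
# so both graphs are disconnected.
doc_a = make_graph(["Neoplasms", "Lung", "Smoking"], [("Neoplasms", "Lung")])
doc_b = make_graph(["Neoplasms", "Lung", "Asthma"], [("Lung", "Neoplasms")])

print(set_kernel(doc_a, doc_b))             # 2 shared concepts + 1 shared edge = 3
print(normalized_set_kernel(doc_a, doc_b))  # 3 / sqrt(4 * 4) = 0.75
```

A kernel of this form can be evaluated for every pair of documents to build a Gram matrix, which a kernel classifier such as a support vector machine can then consume directly.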

Identifier

84866635017 (Scopus)

ISBN

9781450315524

Publication Title

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

External Full Text Location

https://doi.org/10.1145/2350176.2350181

First Page

26

Last Page

32
