Generating better concept hierarchies using automatic document classification

Document Type

Conference Proceeding

Publication Date

12-1-2005

Abstract

This paper presents a hybrid technique for building concept hierarchies from web documents retrieved by a meta-search engine. The technique first separates the retrieved documents into topically oriented categories, each corresponding to a different semantic aspect of the query, before any concept hierarchy is generated. This separation is done by applying 1-of-n automatic document classification to the initial set of returned documents. A separate topical concept hierarchy is then generated automatically inside each resulting category. Both steps are executed on the fly at retrieval time. Because of the efficiency constraints imposed by the web retrieval context, the algorithm uses only document snippets (rather than full web pages) for both document classification and concept hierarchy generation. Experimental results show that the algorithm can improve the quality of the concept hierarchy presented to the searcher while keeping the efficiency parameters within reasonable bounds.
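
The two-step pipeline described in the abstract (classify snippets into exactly one category, then build a hierarchy inside each category) might be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not specify the classifier or the hierarchy algorithm, so the keyword profiles in `CATEGORY_KEYWORDS`, the overlap-based `classify` function, and the term-subsumption rule in `build_hierarchy` are all hypothetical stand-ins.

```python
from collections import defaultdict

# Hypothetical category profiles; the abstract does not describe the real classifier.
CATEGORY_KEYWORDS = {
    "animal": {"cat", "species", "habitat", "wildlife"},
    "automobile": {"car", "dealer", "engine", "model"},
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in words if w]


def classify(snippet):
    """1-of-n classification: return exactly one label per snippet.

    Assumption: the label is the category with the largest keyword overlap,
    or "other" when no keyword matches.
    """
    tokens = set(tokenize(snippet))
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(sorted(scores), key=lambda cat: scores[cat])
    return best if scores[best] > 0 else "other"


def build_hierarchy(snippets, threshold=0.8):
    """Map each parent term to its child terms using a subsumption heuristic.

    Assumption: term x is a parent of term y when x occurs in at least
    `threshold` of the snippets containing y, and y does not occur in every
    snippet containing x.
    """
    docs_with = defaultdict(set)
    for i, snippet in enumerate(snippets):
        for term in set(tokenize(snippet)):
            docs_with[term].add(i)

    children = defaultdict(list)
    for x, dx in docs_with.items():
        for y, dy in docs_with.items():
            if x == y:
                continue
            both = len(dx & dy)
            if both / len(dy) >= threshold and both / len(dx) < 1:
                children[x].append(y)
    return {parent: sorted(kids) for parent, kids in children.items()}


def categorize_then_build(snippets):
    """Step 1: split snippets by category. Step 2: one hierarchy per category."""
    groups = defaultdict(list)
    for snippet in snippets:
        groups[classify(snippet)].append(snippet)
    return {cat: build_hierarchy(group) for cat, group in groups.items()}


# Toy snippets for the ambiguous query "jaguar".
SNIPPETS = [
    "Jaguar species habitat",
    "Jaguar habitat",
    "Jaguar car dealer",
    "Jaguar car",
]
RESULT = categorize_then_build(SNIPPETS)
```

On the toy input, the snippets about the animal and those about the car end up in separate groups, so terms from one sense of the query cannot appear in the hierarchy built for the other. Working on short snippets rather than full pages keeps both steps cheap, which is the efficiency trade-off the abstract mentions.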

Identifier

33745799488 (Scopus)

ISBN

1595931406, 9781595931405

Publication Title

International Conference on Information and Knowledge Management Proceedings

External Full Text Location

https://doi.org/10.1145/1099554.1099627

First Page

281

Last Page

282
