Generating better concept hierarchies using automatic document classification
Document Type
Conference Proceeding
Publication Date
12-1-2005
Abstract
This paper presents a hybrid technique for building concept hierarchies over web documents returned by a meta-search engine. The technique separates the initially retrieved documents into topically oriented categories before any concept hierarchy is generated. The topical categories correspond to different semantic aspects of the query. The separation is performed by a 1-of-n automatic document classification of the initial set of returned documents. An individual topical concept hierarchy is then generated automatically inside each of the resulting categories. Both steps are executed on the fly at retrieval time. Because of the efficiency constraints imposed by the web retrieval context, the algorithm uses only document snippets (rather than full web pages) for both document classification and concept hierarchy generation. Experimental results show that the algorithm can improve the quality of the concept hierarchy presented to the searcher while keeping the efficiency parameters within reasonable intervals.
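The two-step flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifier or which hierarchy-generation method the authors use, so everything below is an assumption made for illustration: the classifier is a simple keyword-overlap scorer, the hierarchy step is a term-subsumption heuristic, and the names tokenize, classify, build_hierarchy, and group_and_build, the threshold value, and the category keyword sets are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase a snippet and return its set of punctuation-stripped words."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}


def classify(snippet, categories):
    """1-of-n step: pick the single category with the largest keyword overlap.

    Assumption: categories maps a label to a set of keywords. The paper's
    real classifier is not described in the abstract.
    """
    tokens = tokenize(snippet)
    return max(categories, key=lambda c: len(tokens & categories[c]))


def build_hierarchy(snippets, threshold=0.8):
    """Map each term to its candidate parent terms within one category.

    Assumed heuristic (term subsumption): x is a parent of y when x occurs
    in at least `threshold` of the snippets containing y, while y does not
    occur in every snippet containing x.
    """
    docs = defaultdict(set)  # term -> indices of snippets containing it
    for i, snippet in enumerate(snippets):
        for term in tokenize(snippet):
            docs[term].add(i)

    parents = {}
    for y in docs:
        for x in docs:
            if x == y:
                continue
            both = len(docs[x] & docs[y])
            if both / len(docs[y]) >= threshold and both / len(docs[x]) < 1:
                parents.setdefault(y, []).append(x)
    return parents


def group_and_build(snippets, categories, threshold=0.8):
    """Classify every snippet first, then build one hierarchy per category."""
    groups = defaultdict(list)
    for snippet in snippets:
        groups[classify(snippet, categories)].append(snippet)
    return {c: build_hierarchy(g, threshold) for c, g in groups.items()}


if __name__ == "__main__":
    # Hypothetical categories and snippets for the ambiguous query "jaguar".
    categories = {
        "animal": {"jaguar", "cat", "habitat"},
        "car": {"jaguar", "engine", "dealer"},
    }
    snippets = [
        "Jaguar cat habitat in the rainforest",
        "Jaguar dealer offers new engine",
        "Jaguar engine recall",
    ]
    for label, hierarchy in group_and_build(snippets, categories).items():
        print(label, hierarchy)
```

Classifying before building hierarchies keeps terms from different senses of the query out of each other's term statistics, which is the motivation the abstract gives for the separation step. Working on snippets only keeps both steps cheap enough to run at retrieval time.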
Identifier
33745799488 (Scopus)
ISBN
1595931406, 9781595931405
Publication Title
International Conference on Information and Knowledge Management Proceedings
External Full Text Location
https://doi.org/10.1145/1099554.1099627
First Page
281
Last Page
282
Recommended Citation
Bot, Razvan Stefan; Wu, Yi Fang Brook; Chen, Xin; and Li, Quanzhi, "Generating better concept hierarchies using automatic document classification" (2005). Faculty Publications. 19327.
https://digitalcommons.njit.edu/fac_pubs/19327
