Diversification of keyword query result patterns

Document Type

Conference Proceeding

Publication Date

1-1-2016

Abstract

Keyword search allows users to search for information on tree data without using a complex query language and without knowing the schema of the data sources. However, keyword queries are usually ambiguous in expressing the user's intent. Most current keyword search approaches either filter the candidate result set or rank it with a scoring function. These techniques do not differentiate the results and may return a result set that does not match what the user intended. To address this problem, we introduce an original approach for diversifying keyword search results on tree data, which aims to return a subset of the candidate result set that trades off relevance for diversity. We formally define the diversification of patterns of keyword search results on tree data as an optimization problem. We introduce relevance and diversity measures on result pattern sets. We design a greedy heuristic algorithm that chooses the top-k most relevant and diverse result patterns for a given keyword query. Our experimental results show that the introduced relevance and diversity measures can be used effectively and that our algorithm can efficiently compute a set of result patterns for keyword queries that is both relevant and diverse.
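
The abstract does not spell out the relevance and diversity measures or the greedy heuristic itself. As a rough illustration of the general idea only, the sketch below uses a generic greedy selection in the style of maximal marginal relevance, which is a swapped-in technique and not the paper's algorithm. Every name here (`greedy_diversify`, `jaccard_distance`, `trade_off`), the representation of a result pattern as a set of node labels, and the relevance scores are assumptions made for the example.

```python
from typing import Callable, FrozenSet, List, Sequence

Pattern = FrozenSet[str]  # assumption: a result pattern reduced to its set of node labels


def jaccard_distance(a: Pattern, b: Pattern) -> float:
    """Hypothetical diversity measure: 1 minus the Jaccard similarity of two label sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def greedy_diversify(
    candidates: Sequence[Pattern],
    relevance: Callable[[Pattern], float],
    distance: Callable[[Pattern, Pattern], float],
    k: int,
    trade_off: float = 0.5,
) -> List[Pattern]:
    """Pick up to k patterns, one at a time, each maximizing a weighted sum of
    its relevance and its minimum distance to the patterns already selected."""
    selected: List[Pattern] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def gain(p: Pattern) -> float:
            # With nothing selected yet, the diversity term is 0 and the
            # first pick is decided by relevance alone.
            diversity = min((distance(p, s) for s in selected), default=0.0)
            return (1.0 - trade_off) * relevance(p) + trade_off * diversity

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Illustrative data only: three made-up patterns with made-up relevance scores.
paper_title = frozenset({"paper", "author", "title"})
paper_year = frozenset({"paper", "author", "year"})
book_editor = frozenset({"book", "editor"})
scores = {paper_title: 0.9, paper_year: 0.85, book_editor: 0.6}

diverse = greedy_diversify(list(scores), scores.get, jaccard_distance, k=2)
relevance_only = greedy_diversify(list(scores), scores.get, jaccard_distance, k=2, trade_off=0.0)
```

With `trade_off=0.0` the selection degenerates to plain top-k ranking and keeps the two near-duplicate paper patterns; with the default weight, the second slot goes to the less relevant but structurally different book pattern. This is the kind of relevance-for-diversity trade the abstract describes, under the stated assumptions.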

Identifier

84976615700 (Scopus)

ISBN

9783319399577

Publication Title

Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics

External Full Text Location

https://doi.org/10.1007/978-3-319-39958-4_14

e-ISSN

1611-3349

ISSN

0302-9743

First Page

171

Last Page

183

Volume

9659

Grant

61202035

Fund Ref

National Natural Science Foundation of China
