A prediction model for web search hit counts using word frequencies
Document Type
Article
Publication Date
10-1-2011
Abstract
A search engine user with a well-defined information need is not interested in getting thousands of hits, but a few hits that are all highly relevant to their search. Often search words need to be refined and augmented to narrow results to more relevant pages. However, an overly specific query may lead to no hits at all, while most typical queries lead to thousands or even millions of them, both undesirable outcomes. This paper suggests a query rewriting method for generating alternative query strings and proposes a hit count prediction model for predicting the number of search engine hits for each alternative query string, based on the English language frequencies of the words in the search terms. Using the hit count prediction model, different types of search strategies, such as a lowest hit count query preference, can be utilized to improve users' search experience. We present an evaluation experiment of the hit count prediction model for three major search engines. We also discuss and quantify how far the Google, Yahoo! and Bing search engines diverge from monotonic behaviour, considering negative and positive search terms separately. © Chartered Institute of Library and Information Professionals 2011.
Identifier
80054080083 (Scopus)
Publication Title
Journal of Information Science
External Full Text Location
https://doi.org/10.1177/0165551511415183
e-ISSN
17416485
ISSN
01655515
First Page
462
Last Page
475
Issue
5
Volume
37
Recommended Citation
Tian, Tian; Chun, Soon Ae; and Geller, James, "A prediction model for web search hit counts using word frequencies" (2011). Faculty Publications. 11150.
https://digitalcommons.njit.edu/fac_pubs/11150