A prediction model for web search hit counts using word frequencies

Document Type

Article

Publication Date

10-1-2011

Abstract

A search engine user with a well-defined information need is not interested in getting thousands of hits, but a few hits that are all highly relevant to their search. Often search words need to be refined and augmented to narrow results to more relevant pages. However, an overly specific query may lead to no hits at all, while most typical queries lead to thousands or even millions of them, both undesirable outcomes. This paper suggests a query rewriting method for generating alternative query strings and proposes a hit count prediction model for predicting the number of search engine hits for each alternative query string, based on the English language frequencies of the words in the search terms. Using the hit count prediction model, different types of search strategies, such as a lowest hit count query preference, can be utilized to improve users' search experience. We present an evaluation experiment of the hit count prediction model for three major search engines. We also discuss and quantify how far the Google, Yahoo! and Bing search engines diverge from monotonic behaviour, considering negative and positive search terms separately. © Chartered Institute of Library and Information Professionals 2011.
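The idea of ranking alternative query strings by predicted hit count can be illustrated with a minimal sketch. The abstract does not specify the paper's actual model, so the code below assumes a naive independence estimate (index size multiplied by the product of per-word frequencies); the function names, the `min_hits` threshold, and all frequency values are hypothetical and chosen only for illustration.

```python
from math import prod

def predict_hits(query_words, word_freq, index_size):
    """Estimate hits as index_size times the product of word frequencies.

    Assumption: words occur independently across pages. This is a stand-in
    for illustration, not the model proposed in the paper.
    """
    return index_size * prod(word_freq[w] for w in query_words)

def pick_lowest_hit_query(candidates, word_freq, index_size, min_hits=1.0):
    """Return the candidate with the lowest predicted hit count that is
    still at least min_hits, or None if every candidate falls below it."""
    scored = [(predict_hits(q, word_freq, index_size), q) for q in candidates]
    viable = [(hits, q) for hits, q in scored if hits >= min_hits]
    if not viable:
        return None
    return min(viable, key=lambda pair: pair[0])[1]

# Hypothetical per-word frequencies and index size, for illustration only.
WORD_FREQ = {"jaguar": 1e-5, "car": 1e-3, "speed": 1e-4}
INDEX_SIZE = 1e9

CANDIDATES = [
    ("jaguar",),
    ("jaguar", "car"),
    ("jaguar", "car", "speed"),
]

if __name__ == "__main__":
    for q in CANDIDATES:
        print(q, predict_hits(q, WORD_FREQ, INDEX_SIZE))
    print("chosen:", pick_lowest_hit_query(CANDIDATES, WORD_FREQ, INDEX_SIZE))
```

In this sketch the most specific candidate is rejected because its predicted count falls below one hit, which mirrors the abstract's point that an overly specific query may return nothing at all.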

Identifier

80054080083 (Scopus)

Publication Title

Journal of Information Science

External Full Text Location

https://doi.org/10.1177/0165551511415183

e-ISSN

17416485

ISSN

01655515

First Page

462

Last Page

475

Issue

5

Volume

37

This document is currently not available here.
