Date of Award

Summer 2014

Document Type


Degree Name

Doctor of Philosophy in Information Systems - (Ph.D.)


Information Systems

First Advisor

Yi-Fang Brook Wu

Second Advisor

Michael Recce

Third Advisor

Lian Duan

Fourth Advisor

Songhua Xu

Fifth Advisor

William Kennedy Browne


With the rapid advancement of the internet, accurately predicting the intent underlying users' search queries has received increasing attention from the online advertising community. As a rich source of information on web users' behavior, query logs have been leveraged by advertising companies to deliver personalized advertisements. However, a typical query contains very few terms, which carry only a small amount of information about a user's interests. Because users tend to issue short and ambiguous queries, it is difficult to fully describe and distinguish a user's intent. In addition, the query feature space is sparse: only a small number of queries appear frequently, while most queries appear only a few times. Users may also use different search terms even when they share the same interest. For example, "camera", "digital camera", "Sony" and "RX100" all relate to cameras. This study addresses these challenges in the context of behavioral targeting advertising by proposing a query enhancement mechanism that augments users' queries by leveraging a user query log.
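To make the idea of augmenting a short query from a query log concrete, the following is a minimal sketch of one plausible form of enhancement. It is an illustration under stated assumptions, not the mechanism developed in the dissertation: the toy `query_log`, the helper names `build_cooccurrence` and `enhance_query`, and the choice of per-user term co-occurrence as the expansion signal are all hypothetical.

```python
from collections import Counter, defaultdict


def build_cooccurrence(query_log):
    """Count, for each term, how many users also issued each other term.

    query_log maps a user id to the list of queries that user issued
    (an assumed, simplified log format).
    """
    cooc = defaultdict(Counter)
    for queries in query_log.values():
        # All distinct lowercase terms this user ever searched for.
        terms = {t for q in queries for t in q.lower().split()}
        for term in terms:
            for other in terms:
                if other != term:
                    cooc[term][other] += 1
    return cooc


def enhance_query(query, cooc, top_k=2):
    """Append the top_k terms that most often co-occur with the query terms."""
    terms = query.lower().split()
    scores = Counter()
    for term in terms:
        scores.update(cooc.get(term, {}))
    # Never re-add a term that is already part of the query.
    for term in terms:
        scores.pop(term, None)
    # Sort by descending count, then alphabetically, for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [t for t, _ in ranked[:top_k]]


# Toy log (hypothetical data, built from the abstract's camera example).
query_log = {
    "u1": ["digital camera", "Sony RX100"],
    "u2": ["camera", "Sony RX100 price"],
    "u3": ["RX100 review", "camera"],
}
cooc = build_cooccurrence(query_log)
print(enhance_query("camera", cooc))  # ['camera', 'rx100', 'sony']
```

In this sketch the one-term query "camera" gains the related terms "rx100" and "sony", which is one way a sparse, ambiguous query could be linked to other queries expressing the same interest.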

Unlike traditional user segmentation methods, which take little account of the semantics of user behavior, this study proposes a user segmentation strategy that combines the query enhancement mechanism with a topic model to explore the relationships between users and their behaviors, so that users are segmented in a semantic manner. For cases in which the dataset is sanitized, this research also proposes an alternative way to define users' search intent for evaluation purposes. This approach automatically labels users in a click graph, and the labeled users are then used to train an intent-based user classifier. The empirical evaluation demonstrates that the proposed query enhancement (QE) methodology achieves greater improvement than the baseline models in both intent-based user classification and user segmentation. The experimental results indicate that, compared with the classical clustering algorithm K-means, the proposed user segmentation strategy significantly improves behavioral targeting effectiveness. In particular, the average PUR (Positive User Rate) improvement rates under the "K-means + QE" strategy are significantly higher than those under the simple K-means strategy for different numbers of segments across all six domains. The PUR improvement rate can be as high as 136.6% when the proposed user intent representation technique is used with the query enhancement mechanism under the LDA model. Further analysis shows that the proposed "LDA + QE" strategy significantly outperforms both K-means and "K-means + QE".