Exploiting Underrepresented Query Aspects for Automatic Query Expansion
Title: Exploiting Underrepresented Query Aspects for Automatic Query Expansion
Publication: The 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007)
Authors: Daniel Crabtree, Peter Andreae, Xiaoying Gao
Date: 12 – 15 August 2007
Venue: San Jose, California, United States
Pages: 191 – 200
Files: PDF (192 kB), Poster (1.58 MB)
Abstract
Users attempt to express their search goals through web search queries. When a search goal has multiple components or aspects, documents that represent all the aspects are likely to be more relevant than those that represent only some aspects. Current web search engines often produce result sets whose top-ranking documents represent only a subset of the query aspects. By expanding the query using the right keywords, the search engine can find documents that represent more query aspects, and performance improves. This paper describes AbraQ, an approach for automatically finding the right keywords to expand the query. AbraQ identifies the aspects in the query, identifies which aspects are underrepresented in the result set of the original query, and finally, for any particularly underrepresented aspect, identifies keywords that would enhance that aspect's representation and automatically expands the query using the best one. The paper presents experiments that show AbraQ significantly increases the precision of hard queries, whereas traditional automatic query expansion techniques have not improved precision. AbraQ also compared favourably against a range of interactive query expansion techniques that require user involvement, including clustering, web-log analysis, relevance feedback, and pseudo relevance feedback.
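The three steps named in the abstract (identify aspects, find the underrepresented ones, expand with the best keyword) can be illustrated with a toy sketch. This is not the AbraQ implementation: the abstract does not say how aspects are detected, how representation is measured, or how keywords are scored. Here the aspects and the scored candidate keywords are simply passed in by the caller, representation is assumed to be the fraction of result documents containing every term of an aspect, and the names `aspect_coverage`, `expand_query`, and `threshold` are hypothetical.

```python
def aspect_coverage(aspects, documents):
    """Return, per aspect, the fraction of documents containing all of its terms.

    Assumption: whole-word containment stands in for "represents the aspect".
    """
    token_sets = [set(doc.lower().split()) for doc in documents]
    coverage = {}
    for aspect in aspects:
        terms = set(aspect.lower().split())
        hits = sum(1 for tokens in token_sets if terms <= tokens)
        coverage[aspect] = hits / len(token_sets) if token_sets else 0.0
    return coverage


def expand_query(query, aspects, documents, candidate_keywords, threshold=0.5):
    """Append the best-scoring keyword for each aspect whose coverage is below threshold.

    candidate_keywords maps an aspect to {keyword: score}; how such scores
    would be produced is outside this sketch (hypothetical input).
    """
    coverage = aspect_coverage(aspects, documents)
    expanded = query
    for aspect in aspects:
        scored = candidate_keywords.get(aspect, {})
        if coverage[aspect] < threshold and scored:
            best = max(scored, key=scored.get)
            expanded += " " + best
    return expanded


if __name__ == "__main__":
    results = [
        "black bear habitat and diet",
        "the black bear in north america",
        "black bear attacks on hikers",
    ]
    candidates = {"attacks": {"mauling": 0.8, "injury": 0.4}}
    print(expand_query("black bear attacks", ["black bear", "attacks"],
                       results, candidates))
```

In this made-up example the aspect "attacks" appears in only one of three results, so it falls below the assumed threshold and the query gains the highest-scoring candidate keyword for that aspect, while the well-represented aspect "black bear" is left alone.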
BibTeX
@INPROCEEDINGS{EQA07,
    AUTHOR =    {Daniel Crabtree and Peter Andreae and Xiaoying Gao},
    TITLE =     {Exploiting Underrepresented Query Aspects for 
                 Automatic Query Expansion},
    BOOKTITLE = {The 13th ACM SIGKDD International Conference on 
                 Knowledge Discovery and Data Mining (KDD 2007)},
    YEAR =      {2007},
    PAGES =     {191--200}
}