Using Web Search Logs to Identify Query Classification Terms
Las Vegas, Nevada, USA April 02-April 04
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ITNG.2007.202
International Conference on Informati ...
Isak Taksa, City University of New York
Sarah Zelikovitz, City University of New York
Amanda Spink, Queensland University of Technology
Classification of search queries is a complex and computationally challenging task. Typically, search queries are short, reveal very few features per query, and are therefore a weak source for traditional machine learning. In this paper, we present a method that combines limited manual labeling, computational linguistics and information retrieval to classify a large collection of web search queries. A short set of manually chosen terms that are known a priori to be of interest to a particular class is used to cull a small number of actual queries from a commercial search engine log. These queries are then submitted to a commercial search engine, and the returned search results are used to find more class-related terms. We examine the classification proficiency of the proposed method on a large web search engine query log and show that up to 48% of the unlabeled set could be classified using this method. We discuss the results of this research and their implications for the advancement of short text classification.
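The pipeline outlined in the abstract (seed terms, culled queries, search results, expanded terms) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the stopword list, and the choice to rank candidate terms by how many returned snippets contain them are all assumptions made for illustration, and `search` is a hypothetical callable standing in for a commercial search engine.

```python
from collections import Counter
import re

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "the", "to"})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cull_queries(query_log, terms):
    """Return the queries that contain at least one of the given terms."""
    term_set = set(terms)
    return [q for q in query_log if term_set & set(tokenize(q))]


def expand_terms(seed_queries, search, seed_terms, top_k=5):
    """Suggest new class-related terms from search results.

    `search` is a hypothetical callable mapping a query string to a list
    of result snippets. Candidates are ranked by the number of snippets
    they appear in (an assumed heuristic).
    """
    counts = Counter()
    for query in seed_queries:
        for snippet in search(query):
            counts.update(
                tok
                for tok in set(tokenize(snippet))
                if tok not in STOPWORDS and tok not in seed_terms
            )
    return [term for term, _ in counts.most_common(top_k)]


def coverage(query_log, terms):
    """Fraction of the log that matches at least one class term."""
    if not query_log:
        return 0.0
    return len(cull_queries(query_log, terms)) / len(query_log)
```

Comparing `coverage` on the seed terms alone against `coverage` on the seed terms plus the expanded terms gives a rough sense of how much of an unlabeled log the expansion step reaches, which is the kind of quantity the abstract reports.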
Index Terms:
web search logs, machine learning, short text classification, labeled sets
Citation:
Isak Taksa, Sarah Zelikovitz, Amanda Spink, "Using Web Search Logs to Identify Query Classification Terms," itng, pp.469-474, International Conference on Information Technology (ITNG'07), 2007