Cluster-Based FAQ Retrieval Using Latent Term Weights
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2008.23
March/April 2008 (vol. 23 no. 2) pp. 58-65
Harksoo Kim, Kangwon National University
Jungyun Seo, Sogang University
To resolve lexical disagreement problems in FAQ retrieval, we propose a high-performance FAQ retrieval system that uses query-log clustering. The system is divided into two subsystems: a query-log clustering subsystem and a cluster-based retrieval subsystem. During indexing, the query-log clustering subsystem classifies the logs of users' queries into predefined FAQ categories using a dimensionality reduction technique called latent semantic analysis, then groups the query logs according to the classification results. During retrieval, the cluster-based retrieval subsystem smooths the FAQs using the query-log clusters, then calculates the similarities between users' queries and the smoothed FAQs. With this cluster-based retrieval technique, the proposed system can partially bridge lexical chasms between users' queries and FAQs. In addition, the proposed system outperforms traditional information retrieval systems in FAQ retrieval.
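The retrieval stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting scheme, so the sketch assumes unigram language models, linear interpolation between each FAQ and its query-log cluster, and a query-likelihood score. The function names, the parameters `lam` and `mu`, and the sample data are all hypothetical, and the clusters are assumed to have already been assigned to FAQs (the latent semantic analysis step is not shown).

```python
import math
from collections import Counter


def term_probs(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def score(query, faq_text, cluster_queries, collection, lam=0.5, mu=0.1):
    """Log-likelihood of the query under a cluster-smoothed FAQ model.

    Assumed smoothing (not taken from the paper):
      p(w) = (1 - mu) * [lam * p_faq(w) + (1 - lam) * p_cluster(w)] + mu * p_coll(w)
    """
    p_faq = term_probs([faq_text])
    p_cluster = term_probs(cluster_queries) if cluster_queries else {}
    p_coll = term_probs(collection)
    # With no cluster available, fall back to the FAQ text alone.
    lam_eff = lam if cluster_queries else 1.0
    total = 0.0
    for tok in query.lower().split():
        mixed = lam_eff * p_faq.get(tok, 0.0) + (1 - lam_eff) * p_cluster.get(tok, 0.0)
        p = (1 - mu) * mixed + mu * p_coll.get(tok, 0.0)
        total += math.log(max(p, 1e-12))  # floor avoids log(0) for unseen terms
    return total


def rank(query, faqs, clusters):
    """Return (faq_id, score) pairs sorted from best to worst."""
    collection = list(faqs.values()) + [q for qs in clusters.values() for q in qs]
    scored = [
        (faq_id, score(query, text, clusters.get(faq_id, []), collection))
        for faq_id, text in faqs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical FAQs and query-log clusters, invented for illustration only.
faqs = {
    "reset": "how do I reset my password",
    "refund": "how can I get a refund for my order",
}
clusters = {
    "reset": ["forgot login credentials", "cannot sign in forgot password"],
    "refund": ["money back for returned item", "want my money back"],
}

ranking = rank("forgot credentials", faqs, clusters)
```

In this toy data the query "forgot credentials" shares no words with either FAQ, so unsmoothed term matching cannot separate them. The query-log cluster attached to the "reset" FAQ supplies the missing vocabulary, which is the kind of lexical gap the abstract says cluster-based smoothing is meant to bridge.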


Index Terms:
lexical disagreement problem, latent semantic analysis, query log clustering, FAQ smoothing, cluster-based FAQ retrieval
Citation:
Harksoo Kim, Jungyun Seo, "Cluster-Based FAQ Retrieval Using Latent Term Weights," IEEE Intelligent Systems, vol. 23, no. 2, pp. 58-65, Mar./Apr. 2008, doi:10.1109/MIS.2008.23