Sparse Word Graphs: A Scalable Algorithm for Capturing Word Correlations in Topic Models
Omaha, Nebraska, USA October 28-October 31
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2007.39
Seventh IEEE International Conference ...
Statistical topic models such as Latent Dirichlet Allocation (LDA) have emerged as an attractive framework to model, visualize and summarize large document collections in a completely unsupervised fashion. One of the limitations of this family of models is their assumption of exchangeability of words within documents, which results in a 'bag-of-words' representation for documents as well as topics. As a consequence, precious information that exists in the form of correlations between words is lost in these models. In this work, we adapt recent advances in sparse modeling techniques to the problem of modeling word correlations within topics and present a new algorithm called Sparse Word Graphs. Our experiments on the AP corpus reveal both long-distance and short-distance word correlations within topics that are semantically very meaningful. In addition, the new algorithm is highly scalable to large collections as it captures only the most important correlations in a sparse manner.
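To make the exchangeability limitation concrete, here is a minimal Python sketch. It first shows that two texts with different word order collapse to the same bag-of-words representation, then builds a toy sparse word graph by keeping only the most frequent within-document word pairs. The function names, the toy documents, and the `top_k` cutoff are all assumptions made for illustration, and the plain co-occurrence count is a stand-in technique, not the Sparse Word Graphs algorithm, whose details the abstract does not give.

```python
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Return word counts for a text, discarding word order."""
    return Counter(text.lower().split())


def sparse_cooccurrence_graph(docs, top_k):
    """Keep only the top_k most frequent within-document word pairs.

    Hypothetical helper: a raw co-occurrence count with a hard cutoff,
    used only to illustrate what a sparse word graph looks like.
    It is NOT the algorithm from the paper.
    """
    pair_counts = Counter()
    for doc in docs:
        # Count each unordered word pair at most once per document.
        vocab = sorted(set(doc.lower().split()))
        pair_counts.update(combinations(vocab, 2))
    return dict(pair_counts.most_common(top_k))


# Two toy texts with the same words in a different order.
a = "stock market prices fell"
b = "prices fell stock market"
same_bag = bag_of_words(a) == bag_of_words(b)  # word order is lost

# A toy collection; only the strongest pair survives the cutoff.
docs = [a, "stock market rally", "stock market crash"]
graph = sparse_cooccurrence_graph(docs, top_k=1)
```

In this toy collection, `same_bag` is true because both texts yield identical counts, and `graph` retains only the pair `("market", "stock")`, which co-occurs in all three documents. Keeping a small number of edges is what keeps such a graph cheap to store as the vocabulary grows.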
Citation:
Ramesh Nallapati, Amr Ahmed, William Cohen, Eric Xing, "Sparse Word Graphs: A Scalable Algorithm for Capturing Word Correlations in Topic Models," Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pp. 343-348, 2007.