Hierarchy-Regularized Latent Semantic Indexing
Houston, Texas November 27-November 30
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.76
Fifth IEEE International Conference o ...
Yi Huang, University of Munich
Kai Yu, Siemens Corporate Technology
Matthias Schubert, University of Munich
Shipeng Yu, University of Munich
Volker Tresp, Siemens Corporate Technology
Hans-Peter Kriegel, University of Munich
Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Besides textual features, the hierarchical structure of directories reflects additional and important knowledge annotated by experts. It is generally desirable to incorporate this information into text mining processes. In this paper, we propose hierarchy-regularized latent semantic indexing, which encodes the hierarchy into a similarity graph of documents and then formulates an optimization problem that maps each document into a low-dimensional vector space. The new feature space preserves the intrinsic structure of the original taxonomy and thus provides a meaningful basis for various learning tasks such as visualization and classification. Our approach employs information about class proximity and class specificity, and can naturally cope with multi-labeled documents. Our empirical studies show very encouraging results on two real-world data sets, the new Reuters (RCV1) benchmark and the Swissprot protein database.
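The abstract does not give the exact objective function, so the following is only a minimal sketch of the general idea (turn a taxonomy into a document similarity graph, then derive a low-dimensional embedding), not the authors' formulation. Everything in it is an assumption made for illustration: the toy taxonomy `PARENT`, the similarity measure in `class_similarity` (depth of the lowest common ancestor divided by the depth of the deeper class), the use of the maximum over label pairs for multi-labeled documents, and the swapped-in embedding step, which simply takes the top eigenvectors of a blend of text cosine similarity and hierarchy similarity weighted by a hypothetical parameter `beta`.

```python
import numpy as np

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {"root": None, "sports": "root", "science": "root",
          "soccer": "sports", "tennis": "sports", "physics": "science"}

def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def class_similarity(a, b):
    """Assumed measure: depth of the lowest common ancestor (root = 0),
    divided by the depth of the deeper of the two classes."""
    pa, pb = path_to_root(a), path_to_root(b)
    common_depth = len(set(pa) & set(pb)) - 1
    deepest = max(len(pa), len(pb)) - 1
    return common_depth / deepest if deepest else 1.0

def hierarchy_graph(doc_labels):
    """Document similarity graph; a multi-labeled document takes the
    best match over all pairs of labels (an assumption)."""
    n = len(doc_labels)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            W[i, j] = max(class_similarity(a, b)
                          for a in doc_labels[i] for b in doc_labels[j])
    return W

def embed(X, W, k=2, beta=1.0):
    """Illustrative embedding: top-k eigenvectors of
    (text cosine similarity + beta * hierarchy similarity)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T + beta * W
    vals, vecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]         # indices of the k largest
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Made-up term counts for four documents over five terms.
X = np.array([[2., 1., 0., 0., 1.],
              [1., 2., 0., 1., 0.],
              [0., 0., 3., 1., 1.],
              [1., 0., 2., 0., 1.]])
doc_labels = [["soccer"], ["tennis"], ["physics"], ["soccer", "physics"]]

W = hierarchy_graph(doc_labels)
Z = embed(X, W, k=2, beta=1.0)   # one 2-dimensional vector per document
```

In this sketch, sibling classes such as `soccer` and `tennis` get similarity 0.5, classes that share only the root get 0, and the fourth document is fully similar to both the `soccer` and the `physics` documents because it carries both labels. Raising `beta` makes the embedding follow the taxonomy more closely than the text.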
Citation:
Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, Volker Tresp, Hans-Peter Kriegel, "Hierarchy-Regularized Latent Semantic Indexing," icdm, pp.178-185, Fifth IEEE International Conference on Data Mining (ICDM'05), 2005