loading...
Incorporating User Provided Constraints into Document Clustering
Omaha, Nebraska, USA October 28-October 31
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2007.672007 Seventh IEEE International Confe ...
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Document clustering without any prior knowledge or background information is a challenging problem. In this paper, we propose SS-NMF: a semi-supervised nonnegative matrix factorization framework for document clustering. In SS-NMF, users are able to provide supervision for document clustering in terms of pairwise constraints on a few documents specifying whether they "must" or "cannot" be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the documentdocument similarity matrix to infer the document clusters. Theoretically, we show that SS-NMF provides a general framework for semi-supervised clustering and that existing approaches can be considered as special cases of SS-NMF. Through extensive experiments conducted on publicly available data sets, we demonstrate the superior performance of SS-NMF for clustering documents.
Citation:
Yanhua Chen, Manjeet Rege, Ming Dong, Jing Hua, "Incorporating User Provided Constraints into Document Clustering," icdm, pp.103-112, 2007 Seventh IEEE International Conference on Data Mining, 2007
Usage of this product signifies your acceptance of the Terms of Use.