Cluster Ranking with an Application to Mining Mailbox Networks
Hong Kong December 18-December 22
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.35
Sixth IEEE International Conference o ...
Ziv Bar-Yossef, Technion and Google Inc., Israel
Ido Guy, Technion and IBM Research Lab, Israel
Ronny Lempel, IBM Research Lab, Israel
Yoelle S. Maarek, Google Inc., Israel
Vladimir Soroka, IBM Research Lab, Israel
We initiate the study of a new clustering framework, called cluster ranking. Rather than simply partitioning a network into clusters, a cluster ranking algorithm also orders the clusters by their strength. To this end, we introduce a novel strength measure for clusters, the integrated cohesion, which is applicable to arbitrary weighted networks.

We then present C-Rank: a new cluster ranking algorithm. Given a network with arbitrary pairwise similarity weights, C-Rank creates a list of overlapping clusters and ranks them by their integrated cohesion. We provide extensive theoretical and empirical analysis of C-Rank and show that it is likely to have high precision and recall.

Our experiments focus on mining mailbox networks. A mailbox network is an egocentric social network, consisting of the contacts with whom an individual exchanges email. Ties among contacts are represented by the frequency of their co-occurrence in message headers. C-Rank is well suited to mining such networks, since they abound in overlapping communities of highly variable strengths. We demonstrate the effectiveness of C-Rank on the Enron data set, consisting of 130 mailbox networks.
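
As a rough illustration of how a mailbox network of this kind might be assembled, the sketch below counts how often each pair of contacts appears together on a message header. It is not the authors' construction: the function name `build_mailbox_network`, the input format (each message given as a list of addresses taken from its header), the use of raw co-occurrence counts as tie weights, and the exclusion of the mailbox owner are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def build_mailbox_network(messages, owner):
    """Count header co-occurrences between pairs of contacts.

    messages: iterable of address lists, one list per message header
              (hypothetical input format: sender plus all recipients).
    owner:    address of the mailbox owner, left out of the network
              (an assumption; the network here contains contacts only).
    Returns a Counter mapping (addr_a, addr_b) with addr_a < addr_b
    to the number of headers on which both addresses appear.
    """
    owner = owner.lower()
    weights = Counter()
    for header in messages:
        # Deduplicate addresses within one header and drop the owner.
        contacts = {addr.lower() for addr in header} - {owner}
        # Every unordered pair of contacts on this header co-occurs once.
        for a, b in combinations(sorted(contacts), 2):
            weights[(a, b)] += 1
    return weights


# Toy data, invented for illustration only.
messages = [
    ["me@example.com", "alice@example.com", "bob@example.com"],
    ["alice@example.com", "me@example.com", "bob@example.com", "carol@example.com"],
    ["carol@example.com", "me@example.com"],
]
network = build_mailbox_network(messages, "me@example.com")
for (a, b), count in sorted(network.items()):
    print(a, b, count)
```

The resulting weighted edge list is the sort of input a cluster ranking algorithm would take; any normalization of the counts is left out here because the abstract does not specify one.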

Citation:
Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoelle S. Maarek, Vladimir Soroka, "Cluster Ranking with an Application to Mining Mailbox Networks," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 63-74, 2006