loading...
The Effectiveness of Lloyd-Type Methods for the k-Means Problem
Berkeley, California October 21-October 24
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FOCS.2006.7547th Annual IEEE Symposium on Foundat ...
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Rafail Ostrovsky, UCLA, USA
Yuval Rabani, Israel Institute of Technology, Israel
Leonard J. Schulman, Caltech, USA
Chaitanya Swamy, Caltech, USA
We investigate variants of Lloyd's heuristic for clustering high dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd's heuristic that quickly lead to provably near-optimal clustering solutions when applied to well-clusterable instances. This is the first performance guarantee for a variant of Lloyd's heuristic. The provision of a guarantee on output quality does not come at the expense of speed: some of our algorithms are candidates for being faster in practice than currently used variants of Lloyd's method. In addition, our other algorithms are faster on well-clusterable instances than recently proposed approximation algorithms, while maintaining similar guarantees on clustering quality. Our main algorithmic contribution is a novel probabilistic seeding process for the starting configuration of a Lloyd-type iteration.
Citation:
Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, Chaitanya Swamy, "The Effectiveness of Lloyd-Type Methods for the k-Means Problem," focs, pp.165-176, 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.


Suggestions