We derive the clustering problem from first principles showing that the goal of achieving a probabilistic, or "hard", multi class clustering result is equivalent to the algebraic problem of a completely positive factorization under a doubly stochastic constraint. We show that spectral clustering, normalized cuts, kernel K-means and the various normalizations of the associated affinity matrix are particular instances and approximations of this general principle. We propose an efficient algorithm for achieving a completely positive factorization and extend the basic clustering scheme to situations where partial label information is available.
Citation:
Ron Zass, Amnon Shashua, "A Unifying Approach to Hard and Probabilistic Clustering," iccv, vol. 1, pp.294-301, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, 2005