General C-Means Clustering Model
|
Partitional clustering is an important part of cluster analysis. Based on various theories, numerous clustering algorithms have been developed, and new clustering algorithms continue to appear in the literature. It is known that Occam's razor plays a pivotal role in data-based models, and partitional clustering is categorized as a data-based model. However, no relation had previously been discovered between Occam's razor and partitional clustering, as we discuss in this paper. The three main contributions of this paper can be summarized as follows: 1) According to a novel definition of the mean, a unifying generative framework for partitional clustering algorithms, called a general c-means clustering model (GCM), is presented and studied. 2) Based on the local optimality test of the GCM, the connection between Occam's razor and partitional clustering is established for the first time. As its application, a comprehensive review of the existing objective function-based clustering algorithms is presented based on GCM. 3) Under a common assumption about partitional clustering, a theoretical guide for devising and implementing clustering algorithm is discovered. These conclusions are verified by numerical experimental results.
[1] 1197 J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum Press, 1981.
[2] A. Baraldi and P. Blonda, “A Survey of Fuzzy Clustering Algorithms for Pattern Recognition-Part I and II,” IEEE Trans. Systems, Man, and Cybernetics, Part B, vol. 29, no. 6, pp. 778-801, 1999.
[3] A.K. Jain, M.N. Murty, and P.J. Flynn, “Data Clustering: A Review,” ACM Computing Surveys, vol. 31, no. 3, pp. 264-323, 1999.
[4] R.N. Dave and R. Krishnapuram, “Robust Clustering Methods: A Unified View,” IEEE Trans. Fuzzy Systems, vol. 5, no. 2, pp. 270-293, 1997.
[5] A.K. Jain, R.P.W. Duin, and J. Mao, “Statistical Pattern Recognition: A Review,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-36, Jan. 2000.
[6] Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data, J.C. Bezdek and S.K. Pal, eds. IEEE Computer Soc. Press, 1992.
[7] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[8] K. Fukunaga, Introduction to Statistical Pattern Recognition, second ed., San Diego: Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, 1990.
[9] P. Berkhin, Survey Of Clustering Data Mining Techniques, 2002, http://citeseer.nj.nec.comberkhin02survey.html.
[10] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.
[11] L. Kaufman and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley & Sons, Inc., 1990.
[12] S.P. Lloyd, “Least Squares Quantization in PCM,” IEEE Trans. Information Theory, vol. 28, pp. 127-135, Mar. 1982.
[13] Y. Linde, A. Buzo, and R.M. Gray, “An Algorithm for Vector Quantizer Design,” IEEE Trans. Comm., vol. 28, no. 1, pp. 84-95, 1980.
[14] R.M. Gray, “Vector Quantization,” IEEE Acoustics, Speech, and Signal Processing Magazine, vol. 1, pp. 4-29, 1994.
[15] D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach toward Feature Space Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, May 2002.
[16] K. Fukunaga, “The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition,” IEEE Trans. Information Theory, vol. 21, pp. 32-40, 1975.
[17] Y. Cheng, “Mean Shift, Mode Seeking, and Clustering,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 790-799, Aug. 1995.
[18] D. Comaniciu, “An Algorithm for Data-Driven Bandwidth Selection,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 2, pp. 281-288, Feb. 2003.
[19] A. Baraldi, P. Blonda, F. Parmiggiani, G. Pasquariello, and G. Satalino, “Model Transitions in Descending FLVQ,” IEEE Trans. Neural Networks, vol. 9, no. 5, pp. 724-738, 1998.
[20] R.P. Li and M. Mukaidono, “Gaussian Clustering Method Based on Maximum-Fuzzy-Entropy Interpretation,” Fuzzy Sets and Systems, vol. 102, no. 2, pp. 253-258, 1999.
[21] R.J. Hathaway and J.C. Bezdek, “Optimization of Clustering Criteria by Reformulation,” IEEE Trans. Fuzzy Systems, vol. 3, no. 2, pp. 241-245, 1995.
[22] R.J. Hathaway, J.C. Bezdek, and Y. Hu, “Generalized Fuzzy C-Means Clustering Strategies Using ${\rm{L^p}}$ Norm Distances,” IEEE Trans. Fuzzy Systems, vol. 8, no. 5, pp. 576-582, 2000.
[23] L. Bobrowski and J.C. Bezdek, “C-Means Clustering with the $l_l$ and $l_\infty$ Norms,” IEEE Trans. Systems, Man and Cybernetics, vol. 21, no. 3, pp. 545-554, 1991.
[24] S.Z. Selim and M.A. Ismail, “K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 1, pp. 81-86, Jan. 1999.
[25] J.C. Bezdek, R.J. Hathaway, M.J. Sabin, and W. Tucker, “Convergence Theory for Fuzzy C-Means: Counter-Examples and Repairs,” IEEE Trans. Systems, Man, and Cybernetics, vol. 17, no. 5, pp. 873-877, 1987.
[26] J.C. Bezdek, “A Convergence Theorem for the Fuzzy ISODATA Clustering Algorithms,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 2, pp. 1-8, 1980.
[27] J. Yu et al., “Counterexamples to Convergence Theorem of Maximum Entropy Clustering Algorithm,” Science in China, series F, vol. 46, no. 5, pp. 321-326, 2003.
[28] Y. Mou and J. Yu, “On Convergence of the Maximum Entropy Clustering Algorithm,” J. Northern Jiaotong Univ., vol. 27, no. 5, pp. 26-29, 2003, in Chinese.
[29] J. Buhmann and H. Kuhnel, “Vector Quantization with Complexity Costs,” IEEE Trans. Information Theory, vol. 39, no. 4, pp. 1133-1145, 1993.
[30] X.L. Xie and G. Beni, “A Validity Measure for Fuzzy Clustering,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841-847, Aug. 1991.
[31] N.B. Karayiannis and J.C. Bezdek, “An Integrated Approach to Fuzzy Learning Vector Quantization and Fuzzy C-Means Clustering,” IEEE Trans. Fuzzy Systems, vol. 5, no. 4, pp. 622-628, 1997.
[32] N.B. Karayiannis, “An Axiomatic Approach to Soft Learning Vector Quantization and Clustering,” IEEE Trans. Neural Networks, vol. 10, no. 5, pp. 1153-1165, 1999.
[33] M. Menard, V. Courboulay, and P. Dardignac, “Possibistic and Probabilistic Fuzzy Clustering: Unification within the Framework of the Non-Extensive Thermostatistics,” Pattern Recognition, vol. 36, no. 6, pp. 1325-1342, 2003.
[34] N.B. Karayiannis, “MECA: Maximum Entropy Clustering Algorithm,” IEEE World Congress on Computational Intelligence, Proc. Third IEEE Conf. Fuzzy Systems, vol. 1, pp. 630-635, 1994.
[35] R.P. Li and M. Mukaidono, “A Maximum Entropy Approach to Fuzzy Clustering,” Proc. Fourth IEEE Int'l Conf. Fuzzy Systems, pp. 2227-2232, Mar. 1995.
[36] T.A. Runkler and J.C. Bezdek, “Alternating Cluster Estimation: A New Tool for Clustering and Function Approximation,” IEEE Trans. Fuzzy Systems, vol. 7, no. 4, pp. 377-393, 1999.
[37] T.A. Runkler and J.C. Bezdek, “Function Approximation with Polynomial Membership Functions and Alternating Cluster Estimation,” Fuzzy Sets and Systems, vol. 101, pp. 207-218, 1999.
[38] I. Gath and A.B. Geva, “Unsupervised Optimal Fuzzy Clustering,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 773-780, July 1989.
[39] J. Abonyi, R. Babuska, and F. Szeifert, “Modified Gath-Geva Fuzzy Clustering for Identification of Takagi-Sugeno Fuzzy Models,” IEEE Trans. Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 32, no. 5, pp. 612-621, 2002.
[40] M. Murph and M.J. Pazzani, “Exploring the Decision Forest: An Empirical Investigation of Occam's Razor in Decision Tree Induction,” J. Artificial Intelligence Research, vol. 1, pp. 257-275, 1994.
[41] J. Yu, H. Huang, and S. Sheng, “An Efficient Optimality Test of the FCM,” Proc. FUZZ-IEEE 2002, vol. 1, pp. 98-103, 2002.
[42] W. Wei and J.M. Mendel, “Optimality Tests for the Fuzzy C-Means Algorithm,” Pattern Recognition, vol. 27, no. 11, pp. 1567-1573, 1994.
[43] M.S. Yang and N.Y. Yu, “Estimation of Parameters in Latent Class Models Using Fuzzy Clustering Algorithms,” European J. Operational Research, vol. 160, pp. 515-531, 2005.
[44] J. Yu, “General C-Means Clustering Model: A Unified Approach to the Theories and Methods of Partitional Clustering Algorithms,” technical report, AI Lab, Northern Jiaotong Univ., Beijing, May 2003.
[45] M. Zhang and J. Yu, “Fuzzy Partitional Clustering Algorithms,” J. Software, in press, in Chinese.
[46] E. Anderson, “The IRISes of the Gaspe Peninsula,” Bull. Am. IRIS Soc., vol. 59, pp. 2-5, 1935.
[47] S. Wang, Y. Liang, S. Xia, and Z. Tang, “Semi-Fuzzy and Robust Semi-Fuzzy Clustering,” NEC Cite Seer search, 2001.
[48] J.C. Bezdek, “Cluster Validity with Fuzzy Sets,” J. Cybernetics, vol. 3, no. 3, pp. 58-72, 1974.
[49] E.U. Ruspini, “A New Approach to Clustering,” Information and Control, vol. 15, pp. 22-32, 1969.
[50] H. Frigui and R. Krishnapuram, “Clustering by Competitive Agglomeration,” Pattern Recognition, vol. 33, no. 7, pp. 1109-1119, 1997.
[51] H. Frigui and R. Krishnapuram, “A Robust Competitive Clustering Algorithm with Applications in Computer Vision,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 5, pp. 450-465, May 1999.
[52] N.R. Pal, S.K. Pal, and J.C. Bezdek, “A Mixed C-Means Clustering Model,” Proc. IEEE Int'l Conf. Fuzzy Systems, vol. 1, pp. 11-21, 1997.
[53] M. Yasuda, T. Furuhashi, M. Matsuzaki, and S. Okuma, “A Study on Statistical Mechanical Characteristics of Fuzzy Clustering,” Proc. 2001 IEEE Int'l Conf. Systems, Man, and Cybernetics, vol. 4, pp. 2415-2420, 2001.
[54] M. Yasuda, T. Furuhashi, M. Matsuzaki, and S. Okuma, “Fuzzy Clustering Using Deterministic Annealing Method and Its Statistical Mechanical Characteristics,” Proc. FUZZ-IEEE '2001, vol. 2, pp. 797-800, 2001.
[55] C. Wei and C. Fahn, “The Multisynapse Neural Network and Its Application to Fuzzy Clustering,” IEEE Trans. Neural Networks, pp. 600-618, 2002.
[56] R. Krishnapuram and J.M. Keller, “A Possibilistic Approach to Clustering,” IEEE Trans. Fuzzy Systems, vol. 1, pp. 98-110, 1993.
[57] R. Krishnapuram and J.M. Keller, “The Possibilistic C-Means Algorithm: Insights and Recommendations,” IEEE Trans. Fuzzy Systems, pp. 4385-4393, 1996.
[58] Y.J. Lee, “An Automated Knowledge Extraction System,” PhD thesis, Computer Science Dept., Univ. of Minnesota, Minneapolis, 1994.
[59] A. Schneider, “Weighted Possibilistic C-Means Clustering Algorithms,” Proc. FUZZ-IEEE 2000, pp. 176-180, 2000.
[60] J.S. Zhang and Y.W. Yeung, “Robust Clustering by Pruning Outliers,” IEEE Trans. Systems, Man, and Cybernetics-Part B, vol. 33, no. 6, pp. 983-998, 2003.
[61] H. Timm, C. Borgelt, C. Dorring, and R. Kruse, “Fuzzy Cluster Analysis with Cluster Repulsion,” Proc. European Symp. Intelligent Technologies (EUNITE), 2001.
[62] K.L. Wu and M.S. Yang, “Alternative C-Means Clustering Algorithms,” Pattern Recognition, vol. 35, pp. 2267-2278, 2002.
[63] R. Krishnapuram, O. Nasraoui, and H. Frigui, “The Fuzzy C Spherical Shells Algorithm: A New Approach,” IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 663-671, 1992.
[64] P.A. Chou, T. Lookabaugh, and R.M. Gray, “Entropy-Constrained Vector Quantization,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, no. 1, pp. 31-42, 1989.
[65] E.E. Gustafson and W.C. Kessel, “Fuzzy Clustering with a Fuzzy Covariance Matrix,” Proc. IEEE Conf. Decision and Control, pp. 761-766, 1979.
[66] R.N. Dave, “Fuzzy-Shell Clustering and Applications to Circle Detection in Digital Images,” Int'l J. General Systems, vol. 16, pp. 343-355, 1990.
[67] U. Kaymak and M. Setnes, “Fuzzy Clustering with Volume Prototypes and Adaptive Cluster Merging,” IEEE Trans. Fuzzy Systems, vol. 10, no. 6, pp. 706-712, 2002.
[68] M.S. Yang, K.L. Wu, and J. Yu, “A Novel Fuzzy Clustering Algorithm,” Proc. 2003 IEEE Int'l Symp. Computational Intelligence in Robotics and Automation, CIRA 2003, pp. 647-652, July 2003.
[69] M.S. Yang, “On a Class of Fuzzy Classification Maximum Likelihood Procedures,” Fuzzy Sets and Systems, vol. 57, pp. 365-375, 1993.
[70] J.S. Lin, “Fuzzy Clustering Using a Compensated Fuzzy Hopfield Network,” Neural Processing Letters, vol. 10, no. 1, pp. 35-48, 1999.
[71] D. Özdemir and L. Akarun, “A Fuzzy Algorithm for Color Quantization of Images,” Pattern Recognition, vol. 35, pp. 1785-1791, 2002.
[72] W. Pedrycz, “Conditional Fuzzy C-Means,” Pattern Recognition Letters, vol. 17, pp. 625-632, 1996.
[73] N.B. Karayiannis, “Weighted Fuzzy Learning Vector Quantization and Weighted Generalized Fuzzy C-Means Algorithms,” Proc. FUZZ-IEEE 1996, vol. 2, pp. 8-11, 1996.
[74] Y. Ohashi, “Fuzzy Clustering and Robust Estimation,” Ninth Meeting of SAS Users Grp. Int'l, 1984.
[75] R. Dave, “Characterization and Detection of Noise in Clustering,” Pattern Recognition Letters, vol. 12, no. 11, pp. 657-664, May 1991.
[76] D. Özdemir and L. Akarun, “Fuzzy Algorithms for Combined Quantization and Dithering,” IEEE Trans. Image Processing, vol. 10, no. 6, pp. 923-931, 2001.
[77] J.A. Bilmes, “A Gentle Tutorial of the EM Algorithm and Its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models,” http://www.icsi.berkeley.edu/ftp/pub/techreports/ 1997tr-97-021.pdf, 1997.
[78] H. Ichihashi, K. Honda, and N. Tani, “Gaussian Mixture PDF Approximation and Fuzzy C-Means Clustering with Entropy Regularization,” http://citeseer.nj.nec.com404861.html, 2000.
[79] K. Rose, E. Gurewitz, and G.C. Fox, “A Deterministic Annealing to Clustering,” Pattern Recognition Letters, vol. 11, no. 11, pp. 589-594, 1990.
[80] K. Rose, “Deterministic Annealing for Clustering, Compression, Classification, Regression, and Related Optimization Problems” Proc. IEEE, vol. 86, no. 11, pp. 2210-2239, 1998.
[81] D. Tran and M. Wagner, “Fuzzy Entropy Clustering,” Proc. FUZZ IEEE 2000, vol. 1, pp. 152-157, 2000.
[82] K. Rose, E. Gurewitz, and G.C. Fox, “Constrained Clustering as an Optimization Method,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 8, pp. 785-794, Aug. 1993.
[83] T.M. Mitchell, Machine Learning. McGraw-Hill Companies, Inc., 1997.
Index Terms:
Index Terms- Partitional clustering, mean, fixed point, optimality test, cluster validity, density estimator, Occam's razor, Hessian matrix.
Citation:
Jian Yu, "General C-Means Clustering Model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1197-1211, Aug. 2005, doi:10.1109/TPAMI.2005.160