Lin Cheung, The University of Hong Kong, Hong Kong
Ben Kao, The University of Hong Kong, Hong Kong
We study the problem of pattern-based subspace clustering. Unlike traditional clustering methods, which group objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rises and falls in subspaces. Applications of pattern-based subspace clustering include DNA micro-array data analysis, automatic recommendation systems and target marketing systems. Our goal is to devise pattern-based clustering methods that are capable of (1) discovering useful patterns of various shapes, and (2) discovering all significant patterns. We argue that previous solutions in pattern-based subspace clustering do not satisfy both requirements. Our approach is to extend the idea of the Order-Preserving Submatrix (OPSM). We devise a novel algorithm for mining OPSMs, show that the OPSM model can be generalized to cover most existing pattern-based clustering models, and propose a number of extensions to the original OPSM model.
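To make the notion of a coherent pattern of rises and falls concrete, the sketch below checks whether a chosen set of rows (e.g. genes) and columns (e.g. experimental conditions) forms an order-preserving submatrix in the usual sense: every selected row ranks the selected columns in the same ascending order. It is only an illustration of the definition, not the mining algorithm of the paper. The function names and the toy matrix are hypothetical, and the check assumes there are no tied values within a row.

```python
from typing import Sequence, Tuple


def column_order(row: Sequence[float], cols: Sequence[int]) -> Tuple[int, ...]:
    """Return the selected columns sorted by this row's values, ascending."""
    return tuple(sorted(cols, key=lambda c: row[c]))


def is_order_preserving(
    matrix: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> bool:
    """True if all selected rows induce the same ordering of the selected columns.

    Assumption (not from the source): values within a row are distinct,
    so the ordering of the columns is unambiguous.
    """
    if not rows:
        return True
    reference = column_order(matrix[rows[0]], cols)
    return all(column_order(matrix[r], cols) == reference for r in rows[1:])


# Hypothetical toy data: 3 genes (rows) measured under 4 conditions (columns).
expression = [
    [10, 40, 20, 30],
    [1, 9, 3, 5],
    [50, 10, 20, 70],
]

# Rows 0 and 1 differ greatly in magnitude but rise and fall together.
same_shape = is_order_preserving(expression, [0, 1], [0, 1, 2, 3])

# Row 2 follows a different pattern over all four columns ...
all_rows_full = is_order_preserving(expression, [0, 1, 2], [0, 1, 2, 3])

# ... yet agrees with the others when restricted to the subspace {0, 3}.
all_rows_subspace = is_order_preserving(expression, [0, 1, 2], [0, 3])
```

Rows 0 and 1 would not be grouped by a value-based distance, since their magnitudes differ by an order of magnitude, yet they share one column ordering. Row 2 joins them only in a subset of the columns, which is why the search has to range over subspaces as well as over sets of rows.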
Index Terms:
Gene Expression, Data mining, Pattern-based clustering
Citation:
Lin Cheung, Kevin Y. Yip, David W. Cheung, Ben Kao, Michael K. Ng, "On Mining Micro-array data by Order-Preserving Submatrix," icdew, pp.1153, 21st International Conference on Data Engineering Workshops (ICDEW'05), 2005