This paper proposes an optimisation to the UK-means algorithm, which generalises the k-means algorithm to handle objects whose locations are uncertain. The location of each object is described by a probability density function (pdf). The UK-means algorithm needs to compute expected distances (EDs) between each object and the cluster representatives. Evaluating an ED from first principles is a very costly operation, because the pdfs are different and arbitrary. Yet UK-means needs to evaluate a large number of EDs, which is a major performance burden of the algorithm. In this paper, we derive a formula for evaluating EDs efficiently. This tremendously reduces the execution time of UK-means, as demonstrated by our preliminary experiments. We also illustrate that this optimised formula effectively reduces the UK-means problem to the traditional clustering problem addressed by the k-means algorithm.
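The abstract does not state the formula itself. As an illustrative sketch only, the following assumes that the distance is the squared Euclidean distance and that each pdf is approximated by weighted sample points; both are assumptions, not details taken from the abstract. Under them, the standard identity E[||X - c||^2] = ||E[X] - c||^2 + E[||X - E[X]||^2] splits the ED into a squared distance from the object's centroid to the representative c plus a spread term that does not depend on c. The function names below are hypothetical.

```python
import random

def expected_sq_distance_brute(samples, weights, c):
    """ED from first principles: weighted average of squared distances
    from every pdf sample point to the representative c."""
    return sum(w * sum((x - ci) ** 2 for x, ci in zip(s, c))
               for s, w in zip(samples, weights))

def centroid(samples, weights):
    """Weighted mean of the sample points (the object's expected location)."""
    dim = len(samples[0])
    return [sum(w * s[d] for s, w in zip(samples, weights)) for d in range(dim)]

def expected_sq_distance_fast(mean, spread, c):
    """ED via the decomposition: squared centroid-to-c distance plus a
    per-object spread term that is independent of c."""
    return sum((m - ci) ** 2 for m, ci in zip(mean, c)) + spread

# Hypothetical uncertain object: 1000 equally weighted sample points in 2-D.
random.seed(0)
samples = [[random.gauss(2.0, 1.0), random.gauss(-1.0, 0.5)] for _ in range(1000)]
weights = [1.0 / len(samples)] * len(samples)

# Computed once per object, before clustering starts.
mean = centroid(samples, weights)
spread = expected_sq_distance_brute(samples, weights, mean)

# Hypothetical cluster representatives.
representatives = [[0.5, 3.0], [2.5, -0.5], [-4.0, 1.0]]

brute = [expected_sq_distance_brute(samples, weights, c) for c in representatives]
fast = [expected_sq_distance_fast(mean, spread, c) for c in representatives]

# The nearest representative by ED matches the nearest one to the centroid.
nearest_by_ed = min(range(len(representatives)), key=lambda i: brute[i])
nearest_by_centroid = min(
    range(len(representatives)),
    key=lambda i: sum((m - ci) ** 2 for m, ci in zip(mean, representatives[i])),
)
```

Because the spread term is the same for every representative, it cannot change which representative is closest. Under the stated assumptions, assigning objects by ED is therefore equivalent to running ordinary k-means on the centroids, which is consistent with the reduction the abstract describes.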
Citation:
S. D. Lee, Ben Kao, Reynold Cheng, "Reducing UK-Means to K-Means," Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pp. 483-488, 2007.