We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speedup in the number of computing nodes. It is the first D-ARM algorithm to perform a single scan over the database, and as such its performance is unmatched by any previous algorithm. Scale-up experiments over standard synthetic benchmarks demonstrate stable run time regardless of the number of computers. Theoretical analysis reveals a tighter bound on the error probability than the one shown for the corresponding sequential algorithm.
Citation:
Assaf Schuster, Ran Wolff, Dan Trock, "A High-Performance Distributed Algorithm for Mining Association Rules," Third IEEE International Conference on Data Mining (ICDM'03), p. 291, 2003.