LAPIN-SPAM: An Improved Algorithm for Mining Sequential Pattern
Tokyo, Japan April 05-April 08
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2005.235
21st International Conference on Data ...
Zhenglu Yang, University of Tokyo
Masaru Kitsuregawa, University of Tokyo
Sequential pattern mining is an important research problem because it is the basis of many other applications. Yet implementing the mining efficiently is difficult because of an inherent characteristic of the problem: the large size of the data set. In this paper, building on SPAM, we propose a new algorithm called LAst Position INduction Sequential PAttern Mining (abbreviated as LAPIN-SPAM), which can efficiently find all the frequent sequential patterns in a large database. The main difference between our strategy and previous work lies in how to judge whether a sequence is a pattern. Previous approaches use an S-Matrix built by scanning the projected database (PrefixSpan), count support by joining (SPADE), or AND with the candidate item (SPAM). In contrast, LAPIN-SPAM makes this judgment based on the following fact: if an item's last position is smaller than the current prefix position, the item cannot appear behind the current prefix in the same customer sequence. LAPIN-SPAM can largely reduce the search space during the mining process and is considerably effective in mining sequential patterns. Our experimental results show that LAPIN-SPAM outperforms SPAM by up to a factor of three on all kinds of datasets.
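The last-position fact stated in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes each element of a customer sequence is a single item (no itemsets), uses plain dictionaries rather than SPAM's bitmaps, and treats an item as a possible extension only when its last position is strictly greater than the prefix position. The names `last_positions`, `extension_support`, and `prefix_positions` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def last_positions(sequence):
    """Map each item to the index of its last occurrence in one customer sequence."""
    last = {}
    for pos, item in enumerate(sequence):
        last[item] = pos  # later occurrences overwrite earlier ones
    return last


def extension_support(database, prefix_positions):
    """Count, per item, the sequences in which the item can follow the prefix.

    prefix_positions[i] is the index at which the earliest match of the
    current prefix ends in sequence i, or None if sequence i does not
    contain the prefix (assumed to be computed elsewhere).
    """
    support = defaultdict(int)
    for sequence, prefix_pos in zip(database, prefix_positions):
        if prefix_pos is None:
            continue
        for item, last_pos in last_positions(sequence).items():
            # An item whose last position is not behind the prefix position
            # cannot extend the prefix in this sequence, so it is skipped.
            if last_pos > prefix_pos:
                support[item] += 1
    return dict(support)


# Toy database of three customer sequences (illustrative data only).
database = [
    ["a", "b", "c", "a"],
    ["b", "a", "c"],
    ["c", "b", "a"],
]
# Earliest position of the prefix <a> in each sequence.
prefix_positions = [0, 1, 2]
support = extension_support(database, prefix_positions)
```

In this toy run, only the comparison between a last position and the prefix position decides whether an item is counted; no projected database is scanned and no join or AND operation is performed. A practical implementation would presumably precompute the last-position table once per sequence instead of rebuilding it on every call.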
Citation:
Zhenglu Yang, Masaru Kitsuregawa, "LAPIN-SPAM: An Improved Algorithm for Mining Sequential Pattern," icdew, pp.1222, 21st International Conference on Data Engineering Workshops (ICDEW'05), 2005