TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases
Hong Kong December 18-December 22
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.161
Sixth IEEE International Conference o ...
Hui Xiong, Rutgers University, USA
Mark Brodie, IBM TJ Watson
Sheng Ma, Vivido Media Inc.
Recently, there has been considerable interest in computing strongly correlated pairs in large databases. Most previous studies require the specification of a minimum correlation threshold to perform the computation. However, it may be difficult for users to provide an appropriate threshold in practice, since different data sets typically have different characteristics. To this end, we propose an alternative task: mining the top-k strongly correlated pairs. In this paper, we identify a 2-D monotone property of an upper bound of Pearson's correlation coefficient and develop an efficient algorithm, called TOP-COP, that exploits this property to prune many pairs without even computing their correlation coefficients. Our experimental results show that the TOP-COP algorithm can be orders of magnitude faster than brute-force alternatives for mining the top-k strongly correlated pairs.
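To make the pruning idea concrete, the following is a minimal sketch and not the TOP-COP algorithm itself. It assumes binary transaction data, for which Pearson's correlation reduces to the phi coefficient, and it assumes the support-based upper bound sqrt(supp(B)(1 - supp(A)) / (supp(A)(1 - supp(B)))) for supp(A) >= supp(B), which follows from supp(AB) <= min(supp(A), supp(B)). The abstract states neither assumption. The function names `phi`, `phi_upper`, and `top_k_pairs` are hypothetical, and the pair enumeration is a plain loop rather than the 2-D monotone traversal, which the abstract does not describe.

```python
import heapq
from itertools import combinations
from math import sqrt


def phi(sa, sb, sab):
    """Phi coefficient of two binary items from their supports (assumed setting)."""
    return (sab - sa * sb) / sqrt(sa * sb * (1 - sa) * (1 - sb))


def phi_upper(sa, sb):
    """Upper bound on phi that needs only the two single-item supports.

    Obtained by replacing supp(AB) with its maximum, min(sa, sb).
    """
    hi, lo = max(sa, sb), min(sa, sb)
    return sqrt(lo * (1 - hi) / (hi * (1 - lo)))


def top_k_pairs(transactions, k):
    """Return the k highest-phi pairs and the number of exact phi computations."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    tids = {i: {idx for idx, t in enumerate(transactions) if i in t} for i in items}
    supp = {i: len(tids[i]) / n for i in items}
    # phi is undefined for items present in every transaction, so skip them.
    items = [i for i in items if 0 < supp[i] < 1]

    heap = []      # min-heap of (phi, a, b); heap[0] is the current k-th best
    computed = 0   # pairs whose exact correlation was evaluated
    for a, b in combinations(items, 2):
        # Prune: the bound cannot exceed the current k-th best score.
        if len(heap) == k and phi_upper(supp[a], supp[b]) <= heap[0][0]:
            continue
        computed += 1
        sab = len(tids[a] & tids[b]) / n
        score = phi(supp[a], supp[b], sab)
        if len(heap) < k:
            heapq.heappush(heap, (score, a, b))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, a, b))
    return sorted(heap, reverse=True), computed


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"c", "d"}, {"d"}]
    best, computed = top_k_pairs(data, k=1)
    print(best, "exact computations:", computed)
```

Because the bound depends only on single-item supports, a pair can be discarded without counting its co-occurrences. No user-supplied threshold is needed, since the k-th best score found so far serves as a rising threshold.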
Citation:
Hui Xiong, Mark Brodie, Sheng Ma, "TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases," icdm, pp.1162-1166, Sixth IEEE International Conference on Data Mining (ICDM'06), 2006