An Approach of Standardization and Searching based on Hierarchical Bayesian Clustering (HBC) for Record Linkage System
Kyoto University Clock Tower, Kyoto, Japan January 24-January 26
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/C5.2007.5
Fifth International Conference on Cre ...
Zin War Tun, University of Computer Studies, Yangon, Myanmar
Nilar Thein, University of Computer Studies, Yangon, Myanmar
Information sources on the Web use different text formats and contain varying inconsistencies. Data from many online sources do not contain enough information to link records accurately. To link records from different data sources, a system must identify the entities common to these sources. The major challenges in record linkage are therefore computational complexity and linkage accuracy. To reduce the number of record pairs to compare, record linkage systems use similarity search techniques to find candidate similar records, and various searching methods have been used for this purpose. In this paper, we propose a record linkage framework that focuses on standardization and enhances the searching method by adopting an advanced cluster-based searching method called Hierarchical Bayesian Clustering (HBC), which not only makes record pair comparison more efficient but also improves record linkage accuracy. The method places similar records into clusters, which restricts the search scope for record comparison and enhances matching accuracy.
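The general idea in the abstract (standardize records, group similar ones, then compare only within a group) can be illustrated with a minimal sketch. This is not the authors' HBC method: cluster assignment is replaced here by a simple, hypothetical key (the first letter of the last name token), and the function names, the sample records, and the 0.85 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def standardize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def cluster_key(std_name: str) -> str:
    """Hypothetical stand-in for cluster assignment (NOT HBC):
    the first letter of the last token of the standardized name."""
    tokens = std_name.split()
    return tokens[-1][0] if tokens else ""


def candidate_pairs(records: dict):
    """Yield record pairs that share a cluster, so the search scope
    is limited to records inside the same cluster."""
    clusters = defaultdict(list)
    for rec_id, name in records.items():
        std = standardize(name)
        clusters[cluster_key(std)].append((rec_id, std))
    for members in clusters.values():
        yield from combinations(members, 2)


def link(records: dict, threshold: float = 0.85) -> list:
    """Return ID pairs whose standardized names are similar enough.
    The threshold is an arbitrary value chosen for this sketch."""
    matches = []
    for (id_a, a), (id_b, b) in candidate_pairs(records):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            matches.append((id_a, id_b))
    return matches


# Made-up sample records for illustration only.
records = {
    1: "John Smith",
    2: "john  smith.",
    3: "Jon Smith",
    4: "Alice Brown",
}
pairs = list(candidate_pairs(records))
matches = link(records)
```

With four records, an exhaustive comparison would examine six pairs; in this sketch only the three pairs inside the shared cluster are compared. A real system would substitute a proper clustering step, such as the HBC approach the paper proposes, for `cluster_key`.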
Citation:
Zin War Tun, Nilar Thein, "An Approach of Standardization and Searching based on Hierarchical Bayesian Clustering (HBC) for Record Linkage System," c5, pp.54-60, Fifth International Conference on Creating, Connecting and Collaborating through Computing (C5 '07), 2007