loading...
Fast Similarity Search in String Databases
Taipei, Taiwan March 25-March 30
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/AINA.2005.18519th International Conference on Adva ...
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Simon Sheu, National Tsing Hua University
Alan Chang, National Tsing Hua University
Webber Huang, National Tsing Hua University
Efficient similarity search in large string databases requires effective index support. Since long strings have each numerous substrings of arbitrary length. the effective index designs are of great challenge. The existing solution, namely MRS [10] employs a low-cost lower bound function to sieve out the most similar candidates from the majority of unlikely database substrings. Therefore, only very small portions of string databases require the expensive true edit distance computation to finalize the query. A significant savings in overall query processing cost can be realized by the filtration feature of lower bound functions. In this paper, we seek to improve MRS to its full potential. Specifically, we propose a very simple method that exchanges the roles of databuse strings and query string in the original MRS design. Despite Simplicity, our solution can further improve the query performance by 10 times in terms of disk page accesses while using only half of the originai index's size.
Citation:
Simon Sheu, Alan Chang, Webber Huang, "Fast Similarity Search in String Databases," aina, vol. 1, pp.617-622, 19th International Conference on Advanced Information Networking and Applications (AINA'05) Volume 1 (AINA papers), 2005
Usage of this product signifies your acceptance of the Terms of Use.