The ed-tree: An Index for Large DNA Sequence Databases
Cambridge, Massachusetts, USA July 09-July 11
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SSDM.2003.1214976
15th International Conference on Scie ...
Zhenqiang Tan, National University of Singapore
Xia Cao, National University of Singapore
Beng Chin Ooi, National University of Singapore
Anthony K. H. Tung, National University of Singapore
The growing interest in genomic research has caused explosive growth in the size of DNA databases, making it increasingly challenging to perform searches on them. In this paper, we propose an index structure called the ed-tree for supporting fast and effective homology searches on DNA databases. The ed-tree is designed to support probe-based homology search algorithms such as Blastn, which generate short probe strings from the query sequence and then match them against the sequence database in order to identify potential regions of high similarity to the query sequence. Unlike Blastn, however, the homology search algorithm we developed for the ed-tree supports a more flexible probe model with longer probes and more relaxed matching. As a consequence, the ed-tree is not only more effective and efficient than the latest Blastn (NCBI Blast2) when supporting homology search, but also requires only moderate storage compared with existing data structures such as the suffix tree. To index a DNA database of 2 giga base pairs (Gbps), the ed-tree takes less than 3 Gb of secondary storage, which is easily handled by a desktop PC. Experimental results are presented in this paper to support these claims.
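The probe-based search that the abstract describes can be illustrated with a minimal sketch. This is not the ed-tree and not Blastn: it assumes a plain in-memory hash index of fixed-length substrings and exact probe matching, whereas the paper's probe model uses longer probes and more relaxed matching. The function names (build_kmer_index, probe_hits, candidate_regions) and the parameter k are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_kmer_index(database, k):
    """Map each length-k substring of every database sequence to its (seq_id, offset) positions.

    Assumption: a simple hash index stands in for a real on-disk index structure.
    """
    index = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for offset in range(len(seq) - k + 1):
            index[seq[offset:offset + k]].append((seq_id, offset))
    return index


def probe_hits(query, index, k):
    """Slide a length-k probe over the query and collect exact matches from the index."""
    hits = []
    for q_offset in range(len(query) - k + 1):
        probe = query[q_offset:q_offset + k]
        for seq_id, db_offset in index.get(probe, []):
            hits.append((q_offset, seq_id, db_offset))
    return hits


def candidate_regions(hits):
    """Count hits per (sequence, diagonal) and return them sorted by descending hit count.

    Hits that share a diagonal (db_offset - q_offset) line up without gaps, so a
    diagonal with many hits marks a region that may be similar to the query.
    """
    diagonals = defaultdict(int)
    for q_offset, seq_id, db_offset in hits:
        diagonals[(seq_id, db_offset - q_offset)] += 1
    return sorted(diagonals.items(), key=lambda item: -item[1])
```

For example, calling build_kmer_index on a small list of DNA strings, passing the result to probe_hits together with a query string, and then passing the hits to candidate_regions yields a ranked list of (sequence, diagonal) pairs that a fuller pipeline would go on to verify with an alignment step.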
Citation:
Zhenqiang Tan, Xia Cao, Beng Chin Ooi, Anthony K. H. Tung, "The ed-tree: An Index for Large DNA Sequence Databases," ssdbm, pp.151, 15th International Conference on Scientific and Statistical Database Management, 2003