loading...
A New Indexing Method for Approximate Search in Text Databases
Shanghai, China September 21-September 23
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CIT.2005.23Fifth International Conference on Com ...
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Fei Shi, Suffolk University
Cread Mefford, Suffolk University

We present an index structure to support the approximate keyword search in text databases. In an approximate keyword search query, the user presents a query word Q and a tolerance value k (k \geqslant 0), and wishes to find all documents in the database that contain the query word Q or any other word in the vocabulary that matches Q approximately (We say that two words match each other approximately if the edit distance between them does not exceed the tolerance value k. In a typical text database application, a user will choose k = 1, 2, 3, or 4). Our index structure is built on the underlying vocabulary of the text database. The new technique has two principal components a new data structure called the V-tree and its partition methods for clustering words in the vocabulary into subgroups. We have implemented our index structure and conducted experiments on real-world data. Our experiments show that even for very large text database, the construction of our index and a search for keywords that match the query word approximately can be done quickly. Our implemntation makes it clear that the V-tree data structure can be easily integrated into existing access structures built on the database such as the inverted index file.

Citation:
Fei Shi, Cread Mefford, "A New Indexing Method for Approximate Search in Text Databases," cit, pp.70-76, Fifth International Conference on Computer and Information Technology (CIT'05), 2005
Usage of this product signifies your acceptance of the Terms of Use.