Peer-to-Peer Information Retrieval (P2P IR) systems using a distributed index on a distributed hash table (DHT) can make highly precise searches for documents relevant to a query. However, these systems require a heavy index construction cost, and cause unfair index management costs due to the unbalanced term frequency distribution. We propose a new node access scheme for P2P IR that we call Huffman-DHT. Huffman-DHT uses an algorithm similar to Huffman coding, and modifies the DHT structure based on the term distribution. Huffman-DHT distributes the index construction cost among the nodes equally, and achieves load balancing.
Index Terms:
Peer-to-Peer, Information Retrieval, Load balancing, Huffman coding
Citation:
Hisashi Kurasawa, Atsuhiro Takasu, Jun Adachi, "Huffman-DHT: Index Structure Refinement Scheme for P2P Information Retrieval," saint, pp.111-117, 2008 International Symposium on Applications and the Internet, 2008