Many emerging applications, such as wide-area network management, need to query large, structured, highly distributed datasets. Seaweed is a scalable, distributed infrastructure for querying such datasets. In this paper we describe its architecture and design features, using the Anemone network management system as a motivating example. The main contribution is a design that supports accurate query planning and efficient execution across a large number of unreliable end systems. In contrast to prior work, Seaweed supports ad hoc querying in addition to continuous querying. The paper describes the solutions adopted by Seaweed: latency-based cost estimation, availability-based scheduling, and metadata aggregation.
Citation:
Richard Mortier, Dushyanth Narayanan, Austin Donnelly, Antony Rowstron, "Seaweed: Distributed Scalable Ad Hoc Querying," 22nd International Conference on Data Engineering Workshops (ICDEW'06), p. 30, 2006