FiVaTech: Page-Level Web Data Extraction from Template Pages
Omaha, Nebraska, USA October 28-October 31
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2007.95
Seventh IEEE International Conference ...
In this paper, we propose FiVaTech, a new approach to the problem of Web data extraction. FiVaTech is a page-level data extraction system that deduces the data schema and templates of input pages generated by a CGI program. It uses tree templates to model the generation of dynamic Web pages, and it can deduce the schema and templates for each individual Deep Web site, whether a Web page contains a single data record or multiple data records. FiVaTech applies tree matching, tree alignment, and mining techniques to accomplish this task. Experiments show encouraging results on the test pages used in many state-of-the-art Web data extraction works.
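To make the tree matching step mentioned above more concrete, the following is a minimal sketch of the classic Simple Tree Matching dynamic program, which counts the largest number of node pairs that can be matched between two ordered trees while preserving sibling order. It is offered only as an illustration of tree matching on DOM-like trees; the abstract does not say which matching algorithm FiVaTech uses, and the names `Node` and `simple_tree_matching` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical DOM-like node: a tag name plus ordered children."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Return the size of the maximum order-preserving matching of a and b.

    Assumption: two nodes may be matched only if their tags are equal,
    and roots that differ yield a score of 0.
    """
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    # best[i][j] = best score using the first i children of a
    # and the first j children of b.
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = simple_tree_matching(a.children[i - 1], b.children[j - 1])
            best[i][j] = max(
                best[i - 1][j],             # skip child i of a
                best[i][j - 1],             # skip child j of b
                best[i - 1][j - 1] + pair,  # match the two children
            )
    return best[m][n] + 1  # +1 for the matched roots


# Two table rows that differ by one repeated cell.
row1 = Node("tr", [Node("td"), Node("td"), Node("a")])
row2 = Node("tr", [Node("td"), Node("a")])
score = simple_tree_matching(row1, row2)
```

A score close to the size of the smaller tree suggests that two subtrees were generated by the same template fragment, which is the kind of signal a template-deduction system can use before aligning the trees.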
Citation:
Mohammed Kayed, Chia-Hui Chang, Khaled Shaalan, Moheb Ramzy Girgis, "FiVaTech: Page-Level Web Data Extraction from Template Pages," Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pp. 15-20, 2007.