loading...
Extracting Objects from the Web
Atlanta, Georgia April 03-April 07
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2006.6922nd International Conference on Data ...
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Zaiqing Nie, Microsoft Research Asia
Fei Wu2, Tsinghua University
Ji-Rong Wen, Microsoft Research Asia
Wei-Ying Ma, Microsoft Research Asia
Extracting and integrating object information from the Web is of great significance for Web data management. The existing Web information extraction techniques cannot provide satisfactory solution to the Web object extraction task since objects of the same type are distributed in diverse Web sources, whose structures are highly heterogeneous. In this paper, we propose a novel approach called Object-Level Information Extraction (OLIE) to extract Web objects. This approach extends a classic information extraction algorithm, Conditional Random Fields (CRF), by adding Web-specific information. The experimental results show OLIE can significantly improve the Web object extraction accuracy.
Citation:
Zaiqing Nie, Fei Wu2, Ji-Rong Wen, Wei-Ying Ma, "Extracting Objects from the Web," icde, pp.123, 22nd International Conference on Data Engineering (ICDE'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.