A General Approach for Partitioning Web Page Content Based on Geometric and Style Information
Curitiba, Parana, Brazil September 23-September 26
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDAR.2007.10
Ninth International Conference on Doc ...
H. Guo, Stony Brook University, NY
J. Mahmud, Stony Brook University, NY
Y. Borodin, Stony Brook University, NY
A. Stent, Stony Brook University, NY
I. Ramakrishnan, Stony Brook University, NY
In this paper, we describe a general-purpose approach for partitioning Web page content. The novelty of our approach lies in the use of detailed layout information from a Web page renderer to determine spatial locality and identify visual separators, and the use of relaxed matching over presentation style information to determine presentation style similarity. We present several examples to illustrate the generality of our approach.
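The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the two ideas it names, not the authors' method. It assumes each rendered element is available as a bounding box with a small dictionary of style attributes. The `Box` type, the thresholds `max_gap` and `min_similarity`, and the greedy top-to-bottom grouping are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A rendered element: bounding box plus style attributes (hypothetical model)."""
    x: float
    y: float
    w: float
    h: float
    style: dict = field(default_factory=dict)


def vertical_gap(a: Box, b: Box) -> float:
    """Vertical whitespace between two boxes; 0 if they overlap vertically."""
    top, bottom = (a, b) if a.y <= b.y else (b, a)
    return max(0.0, bottom.y - (top.y + top.h))


def style_similarity(a: Box, b: Box) -> float:
    """Relaxed matching: fraction of style attributes on which the boxes agree."""
    keys = set(a.style) | set(b.style)
    if not keys:
        return 1.0
    same = sum(1 for k in keys if a.style.get(k) == b.style.get(k))
    return same / len(keys)


def partition(boxes, max_gap=20.0, min_similarity=0.5):
    """Greedy grouping: a box joins the current segment only if it is
    spatially close to the previous box AND similar enough in style.
    A large gap stands in for a visual separator (assumed threshold)."""
    segments = []
    for box in sorted(boxes, key=lambda b: (b.y, b.x)):
        if segments:
            prev = segments[-1][-1]
            close = vertical_gap(prev, box) <= max_gap
            similar = style_similarity(prev, box) >= min_similarity
            if close and similar:
                segments[-1].append(box)
                continue
        segments.append([box])
    return segments


# Example input: four stacked boxes with made-up coordinates and styles.
plain = {"font": "Arial", "size": 12, "color": "black"}
a = Box(0, 0, 100, 20, plain)
b = Box(0, 25, 100, 20, {"font": "Arial", "size": 12, "color": "blue"})
c = Box(0, 120, 100, 20, plain)
d = Box(0, 145, 100, 20, {"font": "Times", "size": 18, "color": "red"})
segments = partition([a, b, c, d])
```

Here `a` and `b` end up in one segment because they are close and differ in only one of three style attributes; `c` starts a new segment because of the large gap, and `d` starts another because its style differs entirely. Exact style equality would have split `a` from `b`, which is why a relaxed similarity score is used in this sketch.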
Citation:
H. Guo, J. Mahmud, Y. Borodin, A. Stent, I. Ramakrishnan, "A General Approach for Partitioning Web Page Content Based on Geometric and Style Information," icdar, vol. 2, pp. 929-933, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2, 2007