Using tree-grammars for training set expansion in page classification
Edinburgh, Scotland August 03-August 06
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDAR.2003.1227778
Seventh International Conference on D ...
Stefano Baldi, DSI - University of Florence - Italy
Simone Marinai, DSI - University of Florence - Italy
Giovanni Soda, DSI - University of Florence - Italy
In this paper we describe a method for expanding training sets composed of XY trees that represent page layout. The approach is suited to page classification based on MXY tree page representations. The basic idea is to use tree grammars to model the variations in the tree that are caused by segmentation algorithms. A set of general grammatical rules is defined and used to expand the training set. Pages are classified with a k-NN approach in which the distance between pages is computed by means of tree-edit distance.
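The classification step described above can be illustrated with a minimal sketch. It assumes unit costs for node insertion, deletion, and relabeling, and it uses a simple memoized forest recursion for ordered trees rather than whatever tree-edit algorithm the authors used. The node labels (`HS`, `VS`, `text`, `image`), the class names, and the function names are all hypothetical and chosen only for illustration; the grammar-based expansion itself is not shown.

```python
from collections import Counter
from functools import lru_cache

# A tree is a nested tuple: (label, (child, child, ...)).
# A forest is a tuple of such trees. Tuples keep everything hashable
# so that the recursion can be memoized.


def forest_size(forest):
    """Count the nodes in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return forest_size(g)  # insert every remaining node of g
    if not g:
        return forest_size(f)  # delete every remaining node of f
    (label_f, kids_f), (label_g, kids_g) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + kids_f, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + kids_g) + 1
    # Match the two rightmost roots, relabeling if the labels differ.
    match = (
        forest_distance(kids_f, kids_g)
        + forest_distance(f[:-1], g[:-1])
        + (label_f != label_g)
    )
    return min(delete, insert, match)


def tree_distance(a, b):
    """Edit distance between two single trees."""
    return forest_distance((a,), (b,))


def knn_classify(query, training, k=3):
    """Return the majority class among the k nearest training trees.

    `training` is a list of (tree, class_label) pairs.
    """
    nearest = sorted(training, key=lambda item: tree_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical layout trees: an internal node is a cut, a leaf is a region.
page_a = ("HS", (("text", ()), ("image", ())))
page_b = ("HS", (("text", ()), ("text", ())))
page_c = ("VS", (("text", ()),))

training_set = [(page_a, "letter"), (page_b, "letter"), (page_c, "invoice")]
predicted = knn_classify(page_b, training_set, k=1)
```

In a setting like the one the abstract describes, trees generated by the grammar rules would simply be appended to `training_set` with the class of the page they were derived from, so that the nearest-neighbor search sees the segmentation variants as well as the original pages.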
Citation:
Stefano Baldi, Simone Marinai, Giovanni Soda, "Using tree-grammars for training set expansion in page classification," Seventh International Conference on Document Analysis and Recognition (ICDAR'03), vol. 2, p. 829, 2003.