Layout and Language: Preliminary Investigations in Recognizing the Structure of Tables
Ulm, Germany, August 18-August 20
DOI: http://doi.ieeecomputersociety.org/10.1109/ICDAR.1997.620668
Fourth International Conference Docum ...
Matthew Hurst, Language Technology Group, Human Communication Research Centre, University of Edinburgh
Shona Douglas, Language Technology Group, Human Communication Research Centre, University of Edinburgh
We describe a prototype system for assigning table cells to their proper place in the logical structure of the table, based on a simple model of table structure combined with a number of measures of cohesion between cells. A framework is presented for examining the effect of particular variables on the performance of the system, and preliminary results are presented showing the effect of cohesion measures based on the simplest domain-independent analyses, with the aim of allowing future comparison with more knowledge-intensive analyses based on Natural Language Processing. These baseline results suggest that very simple string-based cohesion measures are not sufficient to support the extraction of tuples as we require. Future work will pursue the aim of more adequate approximations to a notional subtype/supertype definition of the relationship between value cells and label cells.
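To make the notion of a simple string-based cohesion measure more concrete, here is a minimal illustrative sketch in Python. It is not the measure used by the authors, which the abstract does not specify. It assumes, purely for illustration, that two cells cohere when their strings have a similar "shape" (runs of digits, letters, spaces and punctuation). The names `char_class`, `signature` and `cohesion` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to a coarse class (an assumption of this sketch)."""
    if ch.isdigit():
        return "d"  # digit
    if ch.isalpha():
        return "a"  # letter
    if ch.isspace():
        return "s"  # whitespace
    return "p"      # punctuation or anything else


def signature(cell: str) -> str:
    """Collapse a cell's text into its sequence of character-class runs."""
    classes = (char_class(ch) for ch in cell.strip())
    return "".join(key for key, _ in groupby(classes))


def cohesion(cell_a: str, cell_b: str) -> float:
    """Score in [0, 1]: similarity of the two cells' shape signatures."""
    return SequenceMatcher(None, signature(cell_a), signature(cell_b)).ratio()


if __name__ == "__main__":
    print(cohesion("12.5", "7.25"))   # two numeric value cells
    print(cohesion("12.5", "Total"))  # a value cell against a label cell
```

A measure of this kind sees only surface form. It cannot tell whether a value cell is semantically a subtype of its label cell, which is consistent with the abstract's conclusion that such baselines are insufficient on their own.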
Citation:
Matthew Hurst, Shona Douglas, "Layout and Language: Preliminary Investigations in Recognizing the Structure of Tables," icdar, pp.1043, Fourth International Conference Document Analysis and Recognition (ICDAR'97), 1997