Extracting Document Structure to Facilitate a Knowledge Base Creation for The UML Superstructure Specification
Las Vegas, Nevada, USA April 02-April 04
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ITNG.2007.93
International Conference on Informati ...
Mehrdad Nojoumian, University of Ottawa
Timothy C. Lethbridge, University of Ottawa
The research presented in this paper aims at facilitating the creation of knowledge bases (KBs) for software specifications, of which the UML superstructure specification is our initial target. Our motivation is that such specifications are dense, repetitive and difficult to use. They are written primarily in semi-structured text, but the structure must be maintained manually as they are edited, resulting in inconsistency. End users cannot use them efficiently because of the duplications, numerous concepts connected only implicitly, and general complexity of the document. Our immediate objective is to generate a KB for the UML specification by extracting knowledge from as many sources in the document as possible, such as document structure, embedded natural language, and implicit and explicit cross-references. In this paper our focus is the first step: extraction of the document's logical structure. Many key concepts of a document are expressed in this structure, which includes the headings of the chapters, sections, subsections, etc. By extracting such a structure in XML format, we can form a good infrastructure for the subsequent KB creation steps.
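As an illustration only, and not the authors' actual method, the idea of capturing a document's heading hierarchy in XML might be sketched as follows. The sketch assumes headings have already been pulled out as plain-text lines of the form "7.3.1 Title"; the function name `headings_to_xml` and the `section`, `number` and `title` names are hypothetical choices made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Assumed heading format: a dotted section number followed by a title,
# e.g. "7.3.1 Some Title". Real specifications may need a different pattern.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def headings_to_xml(lines):
    """Nest numbered headings into an XML tree based on numbering depth."""
    root = ET.Element("document")
    stack = [(0, root)]  # (depth, element) pairs for the currently open sections
    for line in lines:
        match = HEADING.match(line.strip())
        if not match:
            continue  # non-heading text is ignored in this sketch
        number, title = match.groups()
        depth = number.count(".") + 1
        # Close any open sections at the same or a deeper level.
        while stack[-1][0] >= depth:
            stack.pop()
        section = ET.SubElement(
            stack[-1][1], "section", number=number, title=title
        )
        stack.append((depth, section))
    return root
```

Calling `ET.tostring(headings_to_xml(lines), encoding="unicode")` would serialize the resulting tree. A stack keyed on numbering depth is used here because it keeps each subsection nested under its nearest enclosing section without a second pass over the input.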
Citation:
Mehrdad Nojoumian, Timothy C. Lethbridge, "Extracting Document Structure to Facilitate a Knowledge Base Creation for The UML Superstructure Specification," itng, pp.393-400, International Conference on Information Technology (ITNG'07), 2007