In the field of information extraction (IE), extraction from documents is usually event-oriented, so many information extraction systems build their domain knowledge around events. However, information extraction is often limited to specific domains because events are detected simply by matching predefined keywords. In this paper, we propose event-detection-driven intelligent information extraction based on the neural network paradigm. The back-propagation (BP) learning algorithm is adopted to train the event detector. To detect potential events in documents effectively, we apply natural language processing technology to select nouns as feature words. Unrelated nouns are filtered out by an analysis of their document frequency distribution. Finally, the selected nouns are grouped into concepts, which are intended to characterize documents appropriately and are stored in an ontology that serves as the knowledge base. In our experiments, we obtained high accuracy in both inside testing and outside testing on Internet documents. With a well-trained event detector, information extraction can be applied to wider domains. Finally, this event detection technology is applied to the delivery and information extraction of e-mails.
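The document-frequency filtering step mentioned above can be illustrated with a minimal sketch. The abstract does not say how the frequency distribution is analyzed, so the band thresholds `min_ratio` and `max_ratio`, the function name, and the toy documents below are all assumptions made for illustration; the nouns are assumed to have been extracted already by a part-of-speech tagger.

```python
from collections import Counter


def filter_nouns_by_document_frequency(docs_nouns, min_ratio=0.2, max_ratio=0.8):
    """Keep nouns whose document frequency ratio lies in [min_ratio, max_ratio].

    docs_nouns: list of documents, each given as a list of nouns.
    The thresholds are hypothetical; the paper's actual criterion is not stated.
    """
    n_docs = len(docs_nouns)
    if n_docs == 0:
        return []
    # Count each noun once per document (document frequency, not term frequency).
    df = Counter()
    for nouns in docs_nouns:
        df.update(set(nouns))
    # Nouns that are too rare or too common carry little event information.
    return sorted(
        noun for noun, count in df.items()
        if min_ratio <= count / n_docs <= max_ratio
    )


# Toy corpus (hypothetical): nouns already extracted from four documents.
docs = [
    ["meeting", "room", "schedule", "mail"],
    ["meeting", "agenda", "mail"],
    ["conference", "paper", "deadline", "mail"],
    ["conference", "deadline", "hotel", "mail"],
]
selected = filter_nouns_by_document_frequency(docs, min_ratio=0.5, max_ratio=0.75)
print(selected)
```

In this toy corpus, "mail" appears in every document and is dropped as uninformative, while nouns that occur in only one document are dropped as too rare. The surviving nouns would then serve as candidate feature words for the event detector.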
Citation:
Heng-Hsou Chang, Yau-Hwang Kuo, Jang-Pong Hsu, "An Event-Driven and Ontology-Based Approach for the Delivery and Information Extraction of E-mails," 2000 International Symposium on Multimedia Software Engineering (MSE), p. 103, 2000.