A Software Infrastructure for Research in Textual Data Mining
Sacramento, California, USA November 03-November 05
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TAI.2003.1250178
15th IEEE International Conference on ...
Lars E. Holzman, Lehigh University
Todd A. Fisher, Lehigh University
Leon M. Galitsky, Lehigh University
April Kontostathis, Lehigh University
William M. Pottenger, Lehigh University

Few tools exist that address the challenges facing researchers in the Textual Data Mining (TDM) field. Some are too specific to a single application, and others are prototypes unsuitable for general use. More general tools often cannot process large volumes of data.

We have created a Textual Data Mining Infrastructure (TMI) that incorporates both existing and new capabilities in a reusable framework conducive to developing new tools and components. TMI adheres to strict guidelines that allow it to run in a wide range of processing environments; as a result, it accommodates the volume of computing and diversity of research occurring in TDM. A unique capability of TMI is support for optimization, which facilitates text mining research by automating the search for optimal parameters in text mining algorithms.

In this article we describe a number of applications that use the TMI. We present several novel results that have not been published elsewhere. We also discuss how the TMI utilizes existing machine-learning libraries, thereby enabling researchers to continue and extend their endeavors with minimal effort. Towards that end, TMI is available on the web at hddi.cse.lehigh.edu.

Citation:
Lars E. Holzman, Todd A. Fisher, Leon M. Galitsky, April Kontostathis, William M. Pottenger, "A Software Infrastructure for Research in Textual Data Mining," ictai, pp.112, 15th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'03), 2003