Feature selection based on the study of text is currently a mainstream approach. Its central research problem is finding a suitable feature assessment method, one that reduces the number of words to be processed as far as possible without decreasing classification precision, thereby improving the speed and efficiency of classification. Building on a study of the classical feature assessment methods in the existing literature, this paper proposes a new feature assessment method, Entropy Ratio. The method considers not only a feature's classification ability but also its generalization ability. It offers a new and better option for improving the classification performance of the centroid-based classifier. Experimental results show that features selected with this method yield clearly better classification results than those selected with other methods, especially when few features are selected.
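The pipeline summarized above (score each word, keep the best few, then classify with class centroids) can be illustrated with a minimal Python sketch. The abstract does not give the Entropy Ratio formula, so the scoring function here is plain document frequency, used only as a stand-in. All function names, the toy corpus, and the choice of cosine similarity over L2-normalised term-frequency vectors are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def document_frequency(term, docs):
    """Stand-in score (NOT Entropy Ratio): number of documents containing the term."""
    return sum(1 for tokens, _ in docs if term in tokens)


def select_features(docs, k, score):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    vocab = {t for tokens, _ in docs for t in tokens}
    ranked = sorted(vocab, key=lambda t: (-score(t, docs), t))
    return set(ranked[:k])


def to_vector(tokens, features):
    """Term-frequency vector restricted to the selected features, L2-normalised."""
    counts = Counter(t for t in tokens if t in features)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def train_centroids(docs, features):
    """Average the document vectors of each class to obtain one centroid per class."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for tokens, label in docs:
        sizes[label] += 1
        for t, w in to_vector(tokens, features).items():
            sums[label][t] += w
    return {
        label: {t: w / sizes[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def classify(tokens, centroids, features):
    """Assign the class whose centroid is most similar to the document."""
    vec = to_vector(tokens, features)
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy corpus: (tokens, label) pairs.
docs = [
    (["ball", "goal", "team"], "sports"),
    (["team", "match", "goal"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "loan", "market"], "finance"),
]
features = select_features(docs, 4, document_frequency)
centroids = train_centroids(docs, features)
print(classify(["goal", "keeper", "team"], centroids, features))
```

Because the scoring function is passed in as a parameter, a different assessment method can be swapped in without touching the classifier, which is the kind of comparison the abstract describes.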
Index Terms:
Text Feature Selection, Centroid-Based Classifier, Automatic Text Classification
Citation:
Yijun Gu, Rong Wang, Jianhua Wang, Jiangde Yu, "A New Chinese Text Feature Selection Method in Centroid-Based Classifier," isip, pp.88-92, 2008 International Symposiums on Information Processing, 2008