Mingkun Li, DOE Joint Genome Institute, Walnut Creek, CA
Iris Bass, Macomb Community College, Warren, MI
This paper investigates combining support vector machine (SVM) and decision tree learning to recognize four emotions in speech: Neutral, Angry, Lombard, and Loud. The base features were pitch, derivative of pitch, energy, speaking rate, formants, bandwidths, and Mel Frequency Cepstral Coefficients. Three methods of combining the learned SVM and decision tree classifiers are proposed: minimum misclassification, maximum accuracy, and dominant class. On the Speech Under Simulated and Actual Stress database, the average accuracies of the minimum misclassification, maximum accuracy, and dominant class methods were 72.4%, 70.8%, and 71.3%, respectively, compared with 63.9% and 67.4% for support vector machine and decision tree learning alone.
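The abstract names three combination rules but does not define them. As a rough illustration of the general idea of combining two trained classifiers, the sketch below resolves disagreements by trusting whichever classifier was more reliable, on held-out data, for the class it predicts. This rule, the function names `per_class_precision` and `combine`, and the toy validation labels are all assumptions made for illustration; they are not the paper's algorithms or data.

```python
from collections import Counter

EMOTIONS = ("Neutral", "Angry", "Lombard", "Loud")


def per_class_precision(y_true, y_pred, classes=EMOTIONS):
    """For each class, the fraction of predictions of that class that were correct."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {c: (correct[c] / predicted[c] if predicted[c] else 0.0) for c in classes}


def combine(svm_label, tree_label, svm_precision, tree_precision):
    """Hypothetical combination rule (an assumption, not taken from the paper).

    If both classifiers agree, return the shared label. Otherwise return the
    label from the classifier with the higher validation precision for the
    class it predicted; ties go to the SVM.
    """
    if svm_label == tree_label:
        return svm_label
    if svm_precision[svm_label] >= tree_precision[tree_label]:
        return svm_label
    return tree_label


# Made-up validation labels and predictions, for illustration only.
val_true = ["Neutral", "Angry", "Loud", "Lombard", "Angry", "Loud"]
val_svm = ["Neutral", "Angry", "Angry", "Lombard", "Angry", "Lombard"]
val_tree = ["Neutral", "Loud", "Loud", "Neutral", "Angry", "Loud"]

svm_prec = per_class_precision(val_true, val_svm)
tree_prec = per_class_precision(val_true, val_tree)

# The two classifiers disagree on a new utterance; the rule picks one label.
decision = combine("Lombard", "Loud", svm_prec, tree_prec)
```

In practice the two label arguments would come from an SVM and a decision tree trained on the acoustic features listed above, and the precision tables would be estimated on a validation split kept separate from the test data.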
Citation:
Thao Nguyen, Mingkun Li, Iris Bass, Ishwar K. Sethi, "Investigation of Combining SVM and Decision Tree for Emotion Classification," Seventh IEEE International Symposium on Multimedia (ISM'05), pp. 540-544, 2005.