Implicit speech segmentation amounts to finding the time instants at which the spectral distortion is large. The Spectral Variation Function (SVF) is a widely used measure of spectral distortion. However, SVF is a data-dependent measure. To make the measurement data-independent, a likelihood ratio is constructed to measure the spectral distortion. This ratio can be computed efficiently with a Bayesian predictive model. The prior of the Bayesian predictive model is estimated from unlabeled data via an unsupervised machine learning technique, the Gaussian Mixture Model (GMM). The experimental results show the effectiveness of this novel method. The performance on the TIMIT corpus indicates potential applications in speech recognition, synthesis, and coding.
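To illustrate the general idea of scoring a candidate boundary with a likelihood ratio, the following is a minimal sketch and not the Bayesian predictive method of the paper. It assumes a plain maximum-likelihood diagonal Gaussian in place of the Bayesian predictive model and the GMM prior, and compares modeling the frames around a candidate instant as two segments against modeling them as one. The function names, the `half_window` parameter, the variance floor, and the synthetic feature matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_loglik(frames):
    """Log-likelihood of frames under a diagonal Gaussian fitted to those frames.

    With maximum-likelihood mean and variance, the log-likelihood reduces to
    -n/2 * sum_d (log(2*pi*var_d) + 1); the small variance floor added below
    (an assumed value) makes this a close approximation.
    """
    n = frames.shape[0]
    var = frames.var(axis=0) + 1e-6
    return -0.5 * n * np.sum(np.log(2.0 * np.pi * var) + 1.0)


def log_likelihood_ratio(features, t, half_window=10):
    """Score a candidate boundary at frame index t.

    Compares "two separate segments" against "one pooled segment" over the
    frames in [t - half_window, t + half_window). Larger values suggest a
    spectral change at t. Stand-in for illustration only.
    """
    left = features[t - half_window:t]
    right = features[t:t + half_window]
    both = features[t - half_window:t + half_window]
    return gaussian_loglik(left) + gaussian_loglik(right) - gaussian_loglik(both)


# Synthetic stand-in for spectral feature vectors: 40 frames of dimension 3,
# with an abrupt change in the mean at frame 20.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, (20, 3)),
    rng.normal(5.0, 1.0, (20, 3)),
])

score_at_change = log_likelihood_ratio(features, 20)  # window straddles the change
score_inside = log_likelihood_ratio(features, 10)     # window lies in one segment
print(score_at_change, score_inside)
```

In this sketch the score is large where the window straddles the change and small inside a homogeneous region, so candidate boundaries could be taken at peaks of the score. A maximum-likelihood fit on short windows is noisy, which is one reason a model with a prior learned from data, as the abstract describes, may be preferable.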
Citation:
Ming Liu, Thomas S. Huang, "A Bayesian Predictive Method for Automatic Speech Segmentation," 18th International Conference on Pattern Recognition (ICPR'06), vol. 4, pp. 290-293, 2006.