Speech Modeling with Magnitude-Normalized Complex Spectra and Its Application to Multisensory Speech Enhancement
Toronto, ON, Canada July 09-July 12
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICME.2006.262741
2006 IEEE International Conference on ...
Amarnag Subramanya, SSLI Lab, University of Washington, Seattle, WA - 98104. asubram@ee.washington.edu
Zhengyou Zhang, zhang@microsoft.com.
Zicheng Liu, zliu@microsoft.com.
Alex Acero, Microsoft Research, One Microsoft Way, Redmond, WA - 98052. alexac@microsoft.com.
A good speech model is essential for speech enhancement, but it is very difficult to build because of huge intra- and extra-speaker variation. We present a new speech model for speech enhancement, which is based on statistical models of magnitude-normalized complex spectra of speech signals. Most popular speech enhancement techniques work in the spectrum space, but the large variation of speech strength, even from the same speaker, makes accurate speech modeling very difficult because the magnitude is correlated across all frequency bins. By performing magnitude normalization for each speech frame, we remove the magnitude variation and can build a much better speech model with only a small number of Gaussian components. This new speech model is applied to speech enhancement for our previously developed microphone headsets that combine a conventional air microphone with a bone sensor. Much improved results have been obtained.
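The per-frame magnitude normalization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so this assumes that each frame's complex spectrum is divided by its own L2 norm; the function names, frame length, hop size, and Hann window are likewise illustrative assumptions and not taken from the paper. The Gaussian modeling step is not shown.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping, Hann-windowed frames (assumed framing)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def magnitude_normalize(frames, eps=1e-12):
    """Return unit-norm complex spectra and the per-frame norms that were removed.

    Assumption: normalization means dividing each frame's complex spectrum
    by its L2 norm, so only the spectral shape and phase remain.
    """
    spectra = np.fft.rfft(frames, axis=1)                    # complex spectrum per frame
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)   # one gain per frame
    return spectra / np.maximum(norms, eps), norms

# The same synthetic signal at two very different levels.
rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
quiet = frame_signal(0.1 * x)
loud = frame_signal(10.0 * x)

norm_quiet, gain_quiet = magnitude_normalize(quiet)
norm_loud, gain_loud = magnitude_normalize(loud)

# After normalization the two versions coincide; the level difference
# survives only in the separately returned per-frame gains.
print(np.allclose(norm_quiet, norm_loud))
print(float(np.mean(gain_loud / gain_quiet)))
```

Under this assumption, frames that differ only in loudness map to the same normalized spectrum, which is the kind of magnitude variation the abstract says the model no longer has to absorb.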
Citation:
Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, Alex Acero, "Speech Modeling with Magnitude-Normalized Complex Spectra and Its Application to Multisensory Speech Enhancement," icme, pp.1157-1160, 2006 IEEE International Conference on Multimedia and Expo, 2006