Multi-sensory speech processing: incorporating automatically extracted hidden dynamic information
Amsterdam, Netherlands, July 6, 2005
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICME.2005.1521611
2005 IEEE International Conference on ...
A. Subramanya, SSLI Lab, Univ. of Washington, Seattle, WA, USA
We describe a novel technique for multi-sensory speech processing that enhances noisy speech and improves noise-robust speech recognition. Both air- and bone-conductive microphones are used to capture speech data; the bone-sensor signal contains virtually noise-free hidden dynamic information about the clean speech in the form of formant trajectories. Distortion in the bone-sensor signal, such as teeth clacking and noise leakage, can be effectively removed by making use of the formant information automatically extracted from that signal. This paper reports an improved technique for synthesizing speech waveforms based on the LPC cepstra computed analytically from the formant trajectories. When this new signal stream is fused with the other available speech data streams, we achieve improved performance for noisy speech recognition.
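The abstract does not spell out how the LPC cepstra are obtained from the formants. A minimal sketch of one standard analytic relationship follows, assuming an all-pole model with one conjugate pole pair per formant, so that c_n = sum over k of (2/n) exp(-pi n b_k / f_s) cos(2 pi n f_k / f_s). The function name, argument names, and example values are hypothetical, and this is not necessarily the exact formulation used in the paper.

```python
import math


def formants_to_lpc_cepstra(freqs_hz, bandwidths_hz, sample_rate_hz, num_coeffs):
    """Return LPC cepstral coefficients c_1..c_N for one frame.

    Assumption (not taken from the paper): each formant k is modeled as a
    conjugate pole pair with radius exp(-pi * b_k / f_s) and angle
    2 * pi * f_k / f_s, and the gain term c_0 is ignored.
    """
    if len(freqs_hz) != len(bandwidths_hz):
        raise ValueError("need one bandwidth per formant frequency")
    if sample_rate_hz <= 0 or num_coeffs < 1:
        raise ValueError("sample rate and coefficient count must be positive")

    cepstra = []
    for n in range(1, num_coeffs + 1):
        c_n = 0.0
        for f, b in zip(freqs_hz, bandwidths_hz):
            # n-th power of the pole radius and n times the pole angle
            radius_n = math.exp(-math.pi * n * b / sample_rate_hz)
            angle_n = 2.0 * math.pi * n * f / sample_rate_hz
            c_n += (2.0 / n) * radius_n * math.cos(angle_n)
        cepstra.append(c_n)
    return cepstra


# Illustrative values only (made up, not from the paper): three formants
# for a single frame sampled at 8 kHz, first 12 cepstral coefficients.
frame_cepstra = formants_to_lpc_cepstra(
    [500.0, 1500.0, 2500.0], [60.0, 90.0, 120.0], 8000.0, 12
)
```

Applying the function frame by frame to a formant trajectory would give a cepstral trajectory. Turning that into a waveform and fusing it with the other streams are separate steps that this sketch does not cover.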
Index Terms:
data stream, multisensory speech processing, automatic hidden information extraction, noisy speech enhancement, speech recognition, air-conductive microphone, bone-conductive microphone, speech data capturing, bone sensor signal distortion, virtual noise-free dynamic information, formant trajectory, speech waveform synthesis, LPC cepstra
Citation:
A. Subramanya, L. Deng, Z. Liu, Z. Zhang, "Multi-sensory speech processing: incorporating automatically extracted hidden dynamic information," 2005 IEEE International Conference on Multimedia and Expo (ICME), 4 pp., 2005