HMM MODELING FOR AUDIO-VISUAL SPEECH RECOGNITION
Tokyo, Japan August 22-August 25
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICME.2001.1237674
2001 IEEE International Conference on ...
Qi Zhi, National University of Singapore
Mustafa Nazmi Kaynak, National University of Singapore
Kuntal Sengupta, National University of Singapore
Adrian David Cheok, National University of Singapore
C. C. Ko, National University of Singapore
Bimodal speech recognition is a robust technique for automated speech analysis and has received considerable attention in the last few decades. In this paper, we analyze the effect of the choice of HMM on the performance of a bimodal speech recognizer, present a comparative analysis of the different HMMs that can be used in bimodal speech recognition, and finally propose a novel model, which has been experimentally verified to perform better than the others. One unique characteristic of our HMM is its novel strategy for fusing the acoustic and visual features, which takes into account the different sampling rates of the two signals. Compared to audio-only recognition, the bimodal speech recognition scheme achieves substantially higher recognition accuracy, especially in the presence of noise.
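The abstract does not describe the proposed fusion strategy, so the sketch below is not the authors' model. It only illustrates the sampling-rate mismatch the abstract mentions, using a generic feature-level approach: visual feature vectors, which usually arrive at a lower frame rate than acoustic ones, are linearly interpolated to the acoustic frame count and then concatenated frame by frame, so that a single HMM could consume one observation stream. The function names, feature dimensions, and frame counts are all hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]


def align_visual_to_audio(visual: Sequence[Frame], n_audio: int) -> List[List[float]]:
    """Linearly interpolate visual feature frames to n_audio frames.

    Assumption: both streams cover the same time span, so frame indices
    can be mapped proportionally from one stream to the other.
    """
    if not visual:
        raise ValueError("visual must contain at least one frame")
    n_visual = len(visual)
    # With a single frame on either side there is nothing to interpolate.
    if n_visual == 1 or n_audio == 1:
        return [list(visual[0]) for _ in range(n_audio)]
    aligned = []
    for t in range(n_audio):
        # Fractional position of audio frame t on the visual time axis.
        pos = t * (n_visual - 1) / (n_audio - 1)
        lo = int(pos)
        hi = min(lo + 1, n_visual - 1)
        w = pos - lo
        aligned.append([(1 - w) * a + w * b for a, b in zip(visual[lo], visual[hi])])
    return aligned


def fuse_features(audio: Sequence[Frame], visual: Sequence[Frame]) -> List[List[float]]:
    """Concatenate each audio frame with its time-aligned visual frame."""
    aligned = align_visual_to_audio(visual, len(audio))
    return [list(a) + v for a, v in zip(audio, aligned)]


# Hypothetical example: 5 acoustic frames (2-D) and 3 visual frames (1-D).
audio = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
visual = [[0.0], [1.0], [2.0]]
fused = fuse_features(audio, visual)
```

Each fused observation has the combined dimensionality of the two streams. Interpolation is only one way to reconcile the rates; repeating visual frames or modeling the streams with separate, coupled state sequences are alternatives, and the abstract does not say which, if any, the paper adopts.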
Citation:
Qi Zhi, Mustafa Nazmi Kaynak, Kuntal Sengupta, Adrian David Cheok, C. C. Ko, "HMM MODELING FOR AUDIO-VISUAL SPEECH RECOGNITION," icme, pp.35, 2001 IEEE International Conference on Multimedia and Expo (ICME'01), 2001