Task-Specific Gesture Analysis in Real-Time Using Interpolated Views
Abstract—Hand and face gestures are modeled using an appearance-based approach in which patterns are represented as a vector of similarity scores to a set of view models defined in space and time. These view models are learned from examples using unsupervised clustering techniques. A supervised learning paradigm is then used to interpolate view scores into a task-dependent coordinate system appropriate for recognition and control tasks. We apply this analysis to the problem of context-specific gesture interpolation and recognition, and demonstrate real-time systems which perform these tasks.
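The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes normalized correlation as the similarity score, takes the view models as given rather than learning them by clustering, and uses a simple Gaussian kernel-weighted average as a stand-in for the supervised interpolation step. All function names, the `width` parameter, and the toy data are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Normalized correlation of two equal-length vectors, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:
        return 0.0  # a constant vector carries no pattern information
    return sum(x * y for x, y in zip(da, db)) / denom


def view_scores(pattern, view_models):
    """Represent a pattern as its vector of similarity scores to each view model."""
    return [normalized_correlation(pattern, model) for model in view_models]


def interpolate(scores, train_scores, train_targets, width=0.5):
    """Map a score vector to a task coordinate by a Gaussian-weighted average
    of labeled training examples (an assumed stand-in for the paper's
    supervised interpolation)."""
    weights = []
    for example in train_scores:
        d2 = sum((x - y) ** 2 for x, y in zip(scores, example))
        weights.append(math.exp(-d2 / (2.0 * width ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Far from every example: fall back to the mean target.
        return sum(train_targets) / len(train_targets)
    return sum(w * t for w, t in zip(weights, train_targets)) / total


# Toy data (hypothetical): two flattened 2x2 view models.
view_models = [[1, 0, 0, 1], [0, 1, 1, 0]]
# Labeled examples: each view model paired with a task coordinate.
train_scores = [view_scores(m, view_models) for m in view_models]
train_targets = [0.0, 1.0]

if __name__ == "__main__":
    new_pattern = [1, 0, 0, 1]
    scores = view_scores(new_pattern, view_models)
    print(scores, interpolate(scores, train_scores, train_targets))
```

In this sketch the score vector, not the raw image, is the input to the interpolation step, which is the separation the abstract describes: an unsupervised view representation followed by a supervised, task-specific mapping.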
Index Terms:
Gesture recognition, real-time image processing, expression analysis, view-based representation, spatio-temporal gestures.
Citation:
Trevor J. Darrell, Irfan A. Essa, Alex P. Pentland, "Task-Specific Gesture Analysis in Real-Time Using Interpolated Views," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 12, pp. 1236-1242, Dec. 1996, doi:10.1109/34.546259