Representing Images Using Nonorthogonal Haar-Like Bases
|
Efficient and compact representation of images is a fundamental problem in computer vision. In this paper, we propose methods that use Haar-like binary box functions to represent a single image or a set of images. A desirable property of these box functions is that their inner product operation with an image can be computed very efficiently. We propose two closely related novel subspace methods to model images: the non-orthogonal binary subspace (NBS) method and binary principal component analysis (B-PCA) algorithm. NBS is spanned directly by binary box functions and can be used for image representation, fast template matching and many other vision applications. B-PCA is a structure subspace that inherits the merits of both NBS (fast computation) and PCA (modeling data structure information). B-PCA base vectors are obtained by a novel PCA guided NBS method. We also show that BPCA base vectors are nearly orthogonal to each other. As a result, in the non-orthogonal vector decomposition process, the computationally intensive pseudo-inverse projection operator can be approximated by the direct dot product without causing significant distance distortion. Experiments on real image datasets show promising performance in image matching, reconstruction and recognition tasks with significant speed improvement.
[1] 2120 H. Davis, Fourier Series and Orthogonal Functions. Dover Publications, 1989.
[2] C. Chui, L. Montefusco, and L. Puccio, Wavelets: Theory, Algorithms, and Applications. Academic Press, 1994.
[3] I. Jollife, Principal Component Analysis. Springer, 1986.
[4] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis. Wiley Interscience, 2001.
[5] B. Olshausen and D. Field, “Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by v1?” Vision Research, vol. 37, no. 23, pp. 3311-3325, Dec. 1997.
[6] D. Lee and H. Seung, “Learning the Parts of Objects by Non-Negative Matrix Factorization,” Nature, vol. 401, no. 6755, pp. 788-791, Oct. 1999.
[7] H. Tao, R. Crabb, and F. Tang, “Non-Orthogonal Binary Subspace and Its Applications in Computer Vision,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. 864-870, 2005.
[8] M. Turk and A. Pentland, “Face Recognition Using Eigenface,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 586-591, 1991.
[9] M. Black and A. Jepson, “Eigentracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation,” Proc. Fourth European Conf. Computer Vision, pp. 329-342, 1996.
[10] R. Basri and D. Jacobs, “Lambertian Reflectance and Linear Subspaces,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 2, pp. 218-233, Feb. 2003.
[11] D. Belhumeur and D. Kriegman, “What Is the Set of Images of an Object under All Possible Lighting Conditions?” Int'l J. Computer Vision, vol. 28, no. 3, pp. 245-260, July 1998.
[12] P. Viola and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. I: 511-I: 518, 2001.
[13] P. Viola, M. Jones, and D. Snow, “Detecting Pedestrians Using Patterns of Motion and Appearance,” Proc. Ninth IEEE Int'l Conf. Computer Vision, pp. II: 734-II: 741, 2003.
[14] O. Veksler, “Fast Variable Window for Stereo Correspondence Using Integral Images,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. I: 556-I: 561, 2003.
[15] Y. Ke, R. Sukthankar, and M. Hebert, “Efficient Visual Event Detection Using Volumetric Features,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. I: 166-I: 173, 2005.
[16] H. Schweitzer, J. Bell, and F. Wu, “Very Fast Template Matching,” Proc. Seventh European Conf. Computer Vision, pp. 358-372, 2002.
[17] T. Mita, T. Kaneko, and O. Hori, “Joint Haar-Like Features for Face Detection,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. II: 1619-II: 1626, 2005.
[18] A. Bell and T. Sejnowski, “The Independent Components of Natural Scenes Are Edge Filters,” Vision Research, vol. 37, no. 23, pp. 3327-3338, Dec. 1997.
[19] P.O. Hoyer, “Non-Negative Matrix Factorization with Sparseness Constraints,” J. Machine Learning Research, vol. 5, pp. 1457-1469, 2004.
[20] P. Georgiev, F. Theis, and A. Cichocki, “Sparse Component Analysis and Blind Source Separation of Underdetermined Mixtures,” IEEE Trans. Neural Networks, vol. 16, no. 4, pp. 992-996, July 2005.
[21] M. Heiler and C. Schnorr, “Learning Non-Negative Sparse Image Codes by Convex Programming,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. II: 1667-II: 1674, 2005.
[22] T. Hazan, S. Polak, and A. Shashua, “Sparse Image Coding Using a 3D Non-Negative Tensor Factorization,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. I 50-I 57, 2005.
[23] S. Chen, D. Donoho, and M. Saunders, “Atomic Decomposition by Basis Pursuit,” SIAM J. Scientific Computing, vol. 20, no. 1, pp. 33-61, Aug. 1998.
[24] D. Donoho and M. Elad, “Maximal Sparsity Representation via $\ell^{1}$ Minimization,” Proc. Nat'l Academy of Sciences, pp. 2197-2202, 2003.
[25] M. Elad and M. Aharon, “Image Denoising via Learned Dictionaries and Sparse Representation,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 329-342, 2006.
[26] J. Friedman and J. Tukey, “A Projection Pursuit Algorithm for Exploratory Data Analysis,” IEEE Trans. Computers, vol. 23, no. 9, pp. 881-890, Sept. 1974.
[27] T. Kolda and D. Leary, “Computation and Uses of the Semidiscrete Matrix Decomposition,” ACM Trans. Math. Software, vol. 26, no. 3, pp. 415-435, Sept. 2000.
[28] K. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, 1990.
[29] J. Shanks, “Computation of the Fast Walsh-Fourier Transform,” IEEE Trans. Computers, vol. c, no. 18, pp. 457-459, May 1969.
[30] Y. Meyer, Wavelets. SIAM, 1993.
[31] R. Gonzalez and P. Wintz, Digital Image Processing, second ed. Prentice Hall, 2002.
[32] G. Davis, S. Mallat, and M. Avellaneda, “Greedy Adaptive Approximation,” J. Constructive Approximation, vol. 13, no. 1, pp.57-98, Mar. 1997.
[33] S. Mallat and Z. Zhang, “Matching Pursuit with Time-Frequency Dictionaries,” IEEE Trans. Signal Processing, vol. 41, no. 12, pp.3397-3415, Dec. 1993.
[34] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal Matching Pursuits: Recursive Function Approximation with Applications to Wavelet Decomposition,” Proc. 27th Asilomar Conf. Signals, Systems, and Computers, pp. 40-44, 1993.
[35] L. Rebollo-Neira and D. Lowe, “Optimized Orthogonal Matching Pursuit Approach,” IEEE Signal Processing Letters, vol. 9, no. 4, pp.137-140, Apr. 2002.
[36] A. Gilbert, S. Muthukrishnan, and M. Strauss, “Approximation of Functions over Redundant Dictionaries Using Coherence,” Proc. 14th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 243-252, 2003.
[37] M. Andrle and L. Rebollo-Neira, “A Swapping-Based Refinement of Orthogonal Matching Pursuit Strategies,” Signal Processing, special issue on sparse approximations in signal and image processing, vol. 86, no. 3, pp. 480-495, Mar. 2006.
[38] L. Di Stefano and S. Mattoccia, “Fast Template Matching Using Bounded Partial Correlation,” Machine Vision and Applications, vol. 13, no. 4, pp. 213-221, 2003.
[39] S. Kilthau, M. Drew, and T. Moller, “Full Search Content Independent Block Matching Based on the Fast Fourier Transform,” Proc. IEEE Int'l Conf. Image Processing, pp. I: 669-I: 672, 2002.
[40] Y. Hel-Or and H. Hel-Or, “Real-Time Pattern Matching Using Projection Kernels,” Proc. Ninth IEEE Int'l Conf. Computer Vision, pp. II 1430-II 1445, 2003.
[41] G. Ben-Artzi, H. Hel-Or, and Y. Hel-Or, “The Gray-Code Filter Kernels,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 3, pp. 382-393, Mar. 2007.
[42] J. Lewis, “Fast Template Matching,” Vision Interface, pp. 120-123, 1995.
[43] J. Ho, K. Lee, M. Yang, and D. Kriegman, “Visual Tracking Using Learned Linear Subspaces,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. I 782-I 789, 2004.
[44] K. Lee, J. Ho, M. Yang, and D. Kriegman, “Visual Tracking and Recognition Using Probabilistic Appearance Manifolds,” Computer Vision and Image Understanding, vol. 99, no. 3, pp. 303-331, Sept. 2005.
[45] D. Ross, J. Lim, and M. Yang, “Adaptive Probabilistic Visual Tracking with Incremental Subspace Update,” Proc. Eighth European Conf. Computer Vision, vol. II, pp. 470-482, 2004.
[46] T. Cootes, C. Taylor, D. Cooper, and J. Graham, “Active Shape Models—Their Training and Application,” Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, Jan. 1995.
[47] T. Cootes, G. Edwards, and C. Taylor, “Active Appearance Models,” Proc. Fifth European Conf. Computer Vision, pp. II 484-II 498, 1998.
[48] M. Irani, “Multi-Frame Correspondence Estimation Using Subspace Constraints,” Int'l J. Computer Vision, vol. 48, no. 3, pp. 173-194, July 2002.
[49] K. Kanatani, “Motion Segmentation by Subspace Separation and Model Selection,” Proc. Eighth IEEE Int'l Conf. Computer Vision, pp.II: 586-II: 591, 2001.
[50] H. Murase and S. Nayar, “Visual Learning and Recognition of 3D Objects from Appearance,” Int'l J. Computer Vision, vol. 14, no. 1, pp. 5-24, June 1995.
[51] M. Tipping and C. Bishop, “Probabilistic Principal Component Analysis,” J. Royal Statistical Soc., vol. 61, no. 3, pp. 611-622, 1999.
[52] B. Scholkopf, A. Smola, and K. Muller, “Nonlinear Component Analysis as a Kernel Eigenvalue Problem,” Neural Computation, vol. 10, no. 5, pp. 1299-1319, July 1998.
[53] R. Vidal, Y. Ma, and S. Sastry, “Generalized Principal Component Analysis (GPCA),” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 12, pp. 1-15, Dec. 2005.
[54] W. Hong, J. Wright, and Y. Ma, “Multi-Scale Hybrid Linear Models for Lossy Image Representation,” Proc. 10th IEEE Int'l Conf. Computer Vision, pp. 674-771, 2005.
[55] A. Shashua, A. Levin, and S. Avidan, “Manifold Pursuit: A New Approach to Appearance Based Recognition,” Proc. 16th Int'l Conf. Pattern Recognition, pp. 11-15, 2002.
[56] http://cat.middlebury.edustereo, Sept. 2007.
[57] http://cbcl.mit.edu/cbcl/software-datasets PedestrianData.html, Sept. 2007.
[58] http://www.itl.nist.gov/iad/humanidferet /, Sept. 2007.
[59] S. Li and A. Jain, Handbook of Face Recognition, chapter 7, Springer, 2004.
Index Terms:
Non-orthogonal subspace, image representations, principal component analysis, image reconstruction
Citation:
Feng Tang, Ryan Crabb, Hai Tao, "Representing Images Using Nonorthogonal Haar-Like Bases," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 2120-2134, June 2007, doi:10.1109/TPAMI.2007.1123