This paper describes information-theoretic methods for determining the optimal subset of pixels for face detection in complex backgrounds. A view-based method is described first; its performance is limited by misalignments. This limitation motivates a modular feature-based method, which minimizes the misalignment problem. The view-based, modular, and sum-of-squared-differences methods are compared empirically on four databases from three universities.