This contribution presents maximum likelihood (ML) based approaches for tracking an a-priori known surface and texture in monocular video streams. In contrast to established tracking algorithms based on homographies, the surface is modeled not as planar or piecewise planar but as a collection of 3-D surface points and surface normals, so that any free-form surface can be represented. The paper introduces a novel description of the image Jacobian in terms of a reference Jacobian, based on the image-constancy (IC) assumption in 3-D. Tracking with this computationally efficient description is compared to the standard ML approach with respect to the region of convergence and the speed of convergence.