In this paper, we propose a new approach that uses a motion-estimation-based framework for video tracking of objects in the presence of self-occlusion in cluttered environments. What distinguishes our work from others is that, instead of carrying out the motion estimation between two adjacent frames, we tackle the self-occlusion problem from the viewpoint of multiple frames. The heart of our approach lies in extracting features appearing in different time frames, which we call genesis frames, and setting up a motion estimation scheme through multiple applications of Kalman filtering based on the different genesis indices. To make the tracked object look visually familiar to the human observer, the system also makes its best attempt at extracting the boundary contour of the object. This is a difficult problem in its own right, since self-occlusion created by any rotational motion of the tracked object would cause large sections of the boundary contour in the previous frame to disappear in the current frame. Our approach has been tested on a wide variety of video sequences, some of which are shown in this paper.