This paper presents an efficient and scalable coding scheme for transmitting a stream of 3D models extracted from a video. As in classical model-based video coding, the geometry, connectivity, and texture of the 3D models have to be transmitted, as well as the camera position for each frame in the original video.
The proposed method exploits the interrelations between these types of information rather than coding them independently, which allows a better prediction of the next 3D model in the stream. Scalability is achieved through wavelet-based representations of both the texture and the geometry of the models.
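To illustrate why a wavelet representation yields scalability, the following minimal sketch uses a one-dimensional Haar transform as a stand-in; it is an assumption made for illustration and not the wavelet scheme used in the proposed coder. The function names `haar_forward` and `haar_inverse` are hypothetical. A decoder that receives only the coarse value and the first few detail levels can already reconstruct a lower-resolution approximation of the signal.

```python
def haar_forward(signal):
    """Decompose a signal (length must be a power of two) into one
    coarse value plus detail levels ordered from coarsest to finest."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = [float(v) for v in signal]
    details = []
    while len(coeffs) > 1:
        half = len(coeffs) // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        det = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        details.insert(0, det)  # keep the coarsest level first
        coeffs = avg
    return coeffs[0], details


def haar_inverse(coarse, details, levels=None):
    """Reconstruct using only the first `levels` detail levels.
    With levels=None, all levels are used and the signal is recovered exactly."""
    used = details if levels is None else details[:levels]
    signal = [coarse]
    for det in used:
        # Each average a and detail d expand to the pair (a + d, a - d).
        signal = [v for a, d in zip(signal, det) for v in (a + d, a - d)]
    return signal


coarse, details = haar_forward([4, 2, 6, 8])
low_res = haar_inverse(coarse, details, levels=1)   # half-resolution approximation
full_res = haar_inverse(coarse, details)            # exact reconstruction
```

Truncating the list of detail levels corresponds to decoding a lower-quality layer of the stream, which is the property a scalable coder relies on for heterogeneous terminals.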
A consistent connectivity is built for all 3D models extracted from the video sequence, which allows a more compact representation and straightforward geometric morphing between successive models. Furthermore, this leads to a consistent wavelet decomposition for the 3D models in the stream.
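The benefit of a consistent connectivity for morphing can be sketched as follows. When two successive models share the same face list, vertex i of one model corresponds to vertex i of the next, so an intermediate model is obtained by interpolating vertex positions alone. Linear interpolation and the function name `morph_vertices` are assumptions chosen for illustration; the proposed scheme may use a different interpolation.

```python
def morph_vertices(vertices_a, vertices_b, t):
    """Linearly interpolate between two vertex lists that share one
    connectivity. t = 0 gives model A, t = 1 gives model B."""
    if len(vertices_a) != len(vertices_b):
        raise ValueError("models must have the same number of vertices")
    return [
        tuple((1 - t) * a + t * b for a, b in zip(va, vb))
        for va, vb in zip(vertices_a, vertices_b)
    ]


# Two models with identical connectivity: a single triangle.
faces = [(0, 1, 2)]  # shared by both models, so it is stored only once
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.0, 2.0, 0.0), (1.0, 2.0, 2.0), (0.0, 3.0, 0.0)]

halfway = morph_vertices(model_a, model_b, 0.5)
```

Because the face list is unchanged across models, only the vertex positions need to be updated from one model to the next, which is also what makes the representation more compact.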
Quantitative and qualitative results for the proposed scheme are compared with the state-of-the-art H.264 video coder, the 3D model-based Galpin coder, and independent MPEG-4-based coding of each type of information. Targeted applications include remote visualization of the original video at very low bitrates and interactive navigation in the extracted 3D scene on heterogeneous terminals.