We present an approach for creating compact video summaries that allows fast and direct access to video data. The video is first segmented into shots and scenes using a previously proposed method, and motion analysis is then used to select representative shots for each scene. In contrast to video indexing approaches that are based on key-frames, we construct mosaics from the representative shots to obtain an efficient mosaic-based scene representation. We use a novel method for mosaic comparison to spatially cluster scenes and create a highly compact non-temporal representation of the video. Our scene-based representation allows accurate comparison of scenes across different video data and serves as a basis for indexing whole video sequences.