This paper proposes a probabilistic framework for semantic video indexing. The components of the framework are multi-objects and multi-nets. Multi-objects are probabilistic multimedia objects [6] that represent semantic features or concepts. A multi-net is a probabilistic network of multi-objects that accounts for the interaction between concepts. The main contribution of this paper is the application of a graphical probabilistic framework to build the multi-net. The multi-net enhances the detection performance of individual multi-objects, provides a unified framework for integrating multiple modalities, and supports inference of unobservable concepts based on their relation to observable concepts. We develop multi-objects for detecting sites (locations) in video and integrate them using a multi-net in the form of a Bayesian network. Detection performance improves significantly when the multi-net is used.
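To make the role of the multi-net concrete, the following is a minimal sketch, not the authors' implementation, of how a joint prior over concepts can refine the output of individual concept detectors. It assumes two binary concepts, hypothetically named "sky" and "outdoor", a hand-picked joint prior, and per-concept detector likelihoods; every number and name in it is an illustrative assumption and none comes from the paper. Inference is done by brute-force enumeration, which is only practical for a handful of concepts.

```python
from itertools import product

# Hypothetical joint prior P(sky, outdoor) over two binary concepts.
# Illustrative assumption only: these values are not taken from the paper.
PRIOR = {
    (0, 0): 0.45,
    (0, 1): 0.15,
    (1, 0): 0.02,
    (1, 1): 0.38,
}


def isolated_posterior(likelihood, prior_on):
    """Posterior of one concept from its own detector alone (Bayes' rule).

    likelihood = (P(features | concept absent), P(features | concept present))
    """
    l_off, l_on = likelihood
    return l_on * prior_on / (l_on * prior_on + l_off * (1.0 - prior_on))


def fused_posteriors(likelihoods, prior=PRIOR):
    """Posterior P(concept_i = 1 | all detectors) for every concept.

    Each joint state is weighted by the prior times the per-concept
    detector likelihoods, then the weights are normalised and marginalised.
    """
    n = len(likelihoods)
    weights = {}
    for state in product((0, 1), repeat=n):
        w = prior[state]
        for i, value in enumerate(state):
            w *= likelihoods[i][value]
        weights[state] = w
    total = sum(weights.values())
    return [
        sum(w for s, w in weights.items() if s[i] == 1) / total
        for i in range(n)
    ]


# An uninformative "sky" detector and a confident "outdoor" detector.
sky_lik = (0.5, 0.5)
outdoor_lik = (0.1, 0.9)

sky_prior = PRIOR[(1, 0)] + PRIOR[(1, 1)]
sky_alone = isolated_posterior(sky_lik, sky_prior)
sky_fused, outdoor_fused = fused_posteriors([sky_lik, outdoor_lik])
```

With these assumed numbers, the "sky" detector alone leaves the posterior at the prior (0.40), while fusing it with the confident "outdoor" detector raises it to about 0.66. A concept with no detector at all can be handled the same way by giving it a flat likelihood such as `(1.0, 1.0)`, which mirrors the inference of unobservable concepts from observable ones.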
Citation:
Milind R. Naphade, Thomas S. Huang, "Semantic Video Indexing Using a Probabilistic Framework," 15th International Conference on Pattern Recognition (ICPR'00), vol. 3, p. 3083, 2000