We show how to learn a concise, interpretable model of scene activity directly from optical flow. The model represents the principal routes and modes of movement in complex scenes such as pedestrian plazas and traffic intersections, and supports a variety of inferences about the observed activities, including annotation, prediction, and anomaly detection. The model takes the form of a novel hidden Markov model generalization that observes a variable number of datapoints per frame (time step). A monotonic entropy-optimizing algorithm determines the parameters and structure of this model, exploiting the duality between learning and compression to produce highly predictive and interpretable models. This approach discovers minimal models of coherent motions and their switching dynamics, without tracking or prior knowledge about the spatial or temporal structure of the scene.
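To make the idea of a hidden Markov model that observes a variable number of datapoints per frame concrete, here is a minimal sketch of a forward pass in which each frame contributes the summed log-likelihood of however many points it contains. It is an illustration under stated assumptions, not the paper's formulation: the diagonal-Gaussian emission model, the conditional independence of points given the hidden state, and the names `frame_log_likelihoods` and `log_forward` are all assumptions introduced here.

```python
import numpy as np


def frame_log_likelihoods(points, means, variances):
    """Log-likelihood of all points in one frame under each hidden state.

    points:    (n_t, d) observations; n_t may differ per frame and may be 0.
    means:     (K, d) per-state means (assumed Gaussian emissions).
    variances: (K, d) per-state diagonal variances (assumed).
    Returns a (K,) array: sum over points of log N(point | state).
    """
    d = means.shape[1]
    points = np.asarray(points, dtype=float).reshape(-1, d)
    diff = points[None, :, :] - means[:, None, :]                  # (K, n_t, d)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)  # (K,)
    quad = -0.5 * (diff ** 2 / variances[:, None, :]).sum(axis=2)  # (K, n_t)
    # Points are assumed independent given the state, so log-densities add.
    return quad.sum(axis=1) + points.shape[0] * log_norm


def log_forward(frames, log_pi, log_A, means, variances):
    """Log-likelihood of a frame sequence via the forward algorithm."""
    log_alpha = log_pi + frame_log_likelihoods(frames[0], means, variances)
    for points in frames[1:]:
        # Log-sum-exp over the previous state i of log_alpha[i] + log_A[i, j].
        scores = log_alpha[:, None] + log_A
        m = scores.max(axis=0)
        log_alpha = m + np.log(np.exp(scores - m).sum(axis=0))
        log_alpha = log_alpha + frame_log_likelihoods(points, means, variances)
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())


# Three frames with 1, 0, and 2 flow vectors respectively (made-up data).
frames = [[[0.0, 0.0]], [], [[1.0, 0.0], [0.0, 1.0]]]
means = np.zeros((2, 2))
variances = np.ones((2, 2))
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.full((2, 2), 0.5))
total = log_forward(frames, log_pi, log_A, means, variances)
```

In this sketch an empty frame contributes a log-likelihood of zero for every state, so it influences the result only through the transition step. Learning the parameters and structure by entropy optimization, as the abstract describes, is not attempted here.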
Citation:
Vera Kettnaker and Matthew Brand, "Minimum-Entropy Models of Scene Activity," 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'99), vol. 1, p. 1281, 1999.