In this paper, we describe a recent enhancement to our HMM-based OCR system that yields a significant increase in speed without any impact on recognition accuracy. Recognition speed is, in part, a function of the number of distinct HMMs that constitute the model set. As a result, recognition is much slower for ideographic scripts such as Chinese and Japanese, which contain thousands of glyphs, than for alphabetic scripts such as Latin and Arabic. Our current OCR system already uses methods such as sub-character modeling and Gaussian shortlists to reduce processing time. The enhancement is a simple character-based duration modeling technique that constrains the number of frames for which a character can stay active. Character durations were obtained from automatically labeled training data and modeled with a probability mass function (histogram). The use of the duration model yielded a 37% improvement in speed with no loss in accuracy.
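As an illustration of the duration model summarized in the abstract, the following Python sketch builds a per-character histogram PMF from labeled durations and derives a frame limit from it. The abstract does not say how the constraint is computed from the PMF, so the cumulative-mass cutoff used here is an assumption, and the names `build_duration_pmfs`, `max_duration`, and `may_stay_active` are hypothetical and not taken from the BBN Byblos system.

```python
from collections import Counter, defaultdict


def build_duration_pmfs(labeled_segments):
    """Build one histogram PMF per character.

    labeled_segments: iterable of (character, duration_in_frames) pairs,
    e.g. taken from automatically labeled training data.
    """
    counts = defaultdict(Counter)
    for char, frames in labeled_segments:
        counts[char][frames] += 1

    pmfs = {}
    for char, hist in counts.items():
        total = sum(hist.values())
        pmfs[char] = {d: n / total for d, n in hist.items()}
    return pmfs


def max_duration(pmf, mass=1.0):
    """Smallest duration whose cumulative probability reaches `mass`.

    Assumption: with mass=1.0 this is the longest duration seen in training;
    a lower value trades a tighter limit for the risk of pruning valid paths.
    """
    cumulative = 0.0
    for d in sorted(pmf):
        cumulative += pmf[d]
        if cumulative >= mass - 1e-12:  # tolerate floating-point rounding
            return d
    return max(pmf)


def may_stay_active(char, frames_active, limits):
    """Return True if a character hypothesis may remain active in the search.

    Characters with no duration statistics are never pruned.
    """
    limit = limits.get(char)
    return limit is None or frames_active <= limit
```

In a decoder, a check like `may_stay_active` would be applied to each active character hypothesis at every frame, so that hypotheses exceeding their duration limit are dropped from the search and fewer models need to be evaluated.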
Citation:
Premkumar Natarajan, Ram Sundaram, Rohit Prasad, Ehry MacRostie, "Character Duration Modeling for Speed Improvements in the BBN Byblos OCR System," Eighth International Conference on Document Analysis and Recognition (ICDAR'05), pp. 1136-1140, 2005.