This paper presents a redundant bit vector approach for indexing and retrieval of handwritten words captured using an electronic pen or tablet. Handwritten words (cursive or print) are first segmented into strokes, and each stroke is featurized using a neural network. Oriented principal component analysis (OPCA) is used for dimensionality reduction while ensuring robustness to handwriting variation (noise). Redundant bit vectors are used to index the resulting low-dimensional representations for efficient storage and retrieval. Experimental results on large datasets with 898,652 handwritten words show good retrieval performance that is robust to handwriting variations and generalizes well over different writers and writing styles.
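The abstract does not spell out the index layout, so the following is only a minimal sketch of the general redundant bit vector idea, under assumptions that are not taken from the paper: each dimension of the low-dimensional representation is split into equal-width bins over a fixed range, each stored item sets its bit in every bin that its tolerance interval overlaps (the redundancy), and a query ANDs one bit vector per dimension to obtain candidate matches. The class name `RedundantBitVectorIndex`, the parameters `radius`, `lo`, `hi`, and `bins`, and the toy two-dimensional vectors are all hypothetical.

```python
from typing import List, Sequence


class RedundantBitVectorIndex:
    """Toy index (illustrative assumption, not the paper's implementation).

    One bit vector per (dimension, bin); bit i refers to stored item i.
    """

    def __init__(self, items: Sequence[Sequence[float]], radius: float,
                 lo: float = 0.0, hi: float = 1.0, bins: int = 8) -> None:
        self.lo, self.hi, self.bins = lo, hi, bins
        self.dims = len(items[0])
        # Python ints serve as arbitrarily long bit vectors.
        self.table: List[List[int]] = [[0] * bins for _ in range(self.dims)]
        for i, vec in enumerate(items):
            for d, x in enumerate(vec):
                # Redundancy: set the item's bit in every bin overlapped by
                # its tolerance interval [x - radius, x + radius].
                first = self._bin(x - radius)
                last = self._bin(x + radius)
                for b in range(first, last + 1):
                    self.table[d][b] |= 1 << i

    def _bin(self, x: float) -> int:
        """Map a coordinate to a bin index, clamped to the valid range."""
        frac = (x - self.lo) / (self.hi - self.lo)
        return min(self.bins - 1, max(0, int(frac * self.bins)))

    def query(self, q: Sequence[float]) -> List[int]:
        """Return candidate item indices: items whose bit is set in the
        query's bin for every dimension."""
        mask = -1  # all bits set; narrowed by one AND per dimension
        for d, x in enumerate(q):
            mask &= self.table[d][self._bin(x)]
        return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


# Hypothetical 2-D feature vectors standing in for reduced stroke features.
stored = [[0.10, 0.10], [0.12, 0.09], [0.90, 0.90]]
index = RedundantBitVectorIndex(stored, radius=0.05)
print(index.query([0.11, 0.10]))  # [0, 1]
print(index.query([0.88, 0.91]))  # [2]
print(index.query([0.50, 0.50]))  # []
```

In this sketch a lookup costs one bitwise AND per dimension regardless of how many items are stored per bin. Because matching is done at bin granularity, the result is a candidate set: every item within `radius` of the query in all dimensions is returned, but some farther items may be included and would need a final distance check.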
Citation:
K. Chellapilla, J. Platt, "Redundant Bit Vectors for Robust Indexing and Retrieval of Electronic Ink," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1, pp. 387-391, 2007.