This paper presents preliminary results on document classification of ancient Hebrew manuscripts. The main goal is to analyze documents of different writing styles in order to identify the location, the date, and the writer of each test document. This analysis depends crucially on good binarization of the corrupted manuscripts.
We propose an accurate method for binarization of the manuscripts. We further propose and test topological features for handwriting style classification based on a selected subset of the Hebrew alphabet. In our preliminary experiments we have used only two characters, Aleph and Lamed.
Our results so far yield 100% correct classification on a set of fourteen documents written by fourteen different writers.