Author: Dr. V. C. Bharathi
Date of Publication: 21st February 2018
Abstract: Unconstrained handwritten document retrieval takes a collection of handwritten documents and a user query keyword, finds words similar to the keyword, and returns the documents that contain them. The proposed work preprocesses each input document and applies contour-based segmentation to isolate the individual words. An index stores the information for every word: the document it belongs to, its position in that document, and its class label. In this paper, we propose unconstrained document retrieval based on a user query. After indexing, each segmented word image is partitioned into 2×2 subblocks, and each subblock is further partitioned into 5×5 subblocks. In each of these smaller subblocks the average pixel intensity is calculated, and the maximum average values are found in the horizontal and vertical directions. In this way 40-dimensional features are extracted from the 2×2 subblocks, and the extracted features are fed to an SVM with an RBF kernel to construct models for all classes. During testing, the user enters a query in the search area. A word image corresponding to the query keyword is randomly selected from the testing samples, and features are extracted from it. The extracted features are passed to the trained models to predict the appropriate class. The class label is then used to look up the corresponding index information and retrieve the matching information from the list of documents.
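The feature extraction step described in the abstract can be sketched as follows. This is a minimal illustration under one assumed reading of the abstract, not the author's implementation: each of the 2×2 regions is divided into a 5×5 grid of cells, the mean intensity of every cell is computed, and the maximum mean along each of the 5 grid rows (horizontal) and each of the 5 grid columns (vertical) is kept. That gives 10 values per region and 4 × 10 = 40 values per word image, which matches the stated dimensionality. The function names `split_bounds`, `block_mean`, and `extract_features` are hypothetical, and the image is assumed to be a grayscale array at least 10 pixels in each dimension.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # grayscale image: rows of pixel intensities


def split_bounds(length: int, parts: int) -> List[Tuple[int, int]]:
    """Return (start, stop) pairs that cut `length` into `parts` near-equal spans."""
    return [(i * length // parts, (i + 1) * length // parts) for i in range(parts)]


def block_mean(img: Image, r0: int, r1: int, c0: int, c1: int) -> float:
    """Average intensity over rows r0..r1-1 and columns c0..c1-1 (0.0 if empty)."""
    values = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values) if values else 0.0


def extract_features(img: Image) -> List[float]:
    """Assumed reading of the abstract: 2x2 regions, 5x5 cells each, 40 features."""
    height, width = len(img), len(img[0])
    features: List[float] = []
    for r0, r1 in split_bounds(height, 2):          # top half, bottom half
        for c0, c1 in split_bounds(width, 2):       # left half, right half
            # Cell boundaries of the 5x5 grid inside this region.
            row_spans = [(r0 + a, r0 + b) for a, b in split_bounds(r1 - r0, 5)]
            col_spans = [(c0 + a, c0 + b) for a, b in split_bounds(c1 - c0, 5)]
            # 5x5 matrix of average cell intensities.
            means = [
                [block_mean(img, ra, rb, ca, cb) for ca, cb in col_spans]
                for ra, rb in row_spans
            ]
            # Horizontal direction: maximum average in each grid row (5 values).
            features.extend(max(row) for row in means)
            # Vertical direction: maximum average in each grid column (5 values).
            features.extend(max(col) for col in zip(*means))
    return features
```

The resulting 40-dimensional vectors would then be passed to an SVM with an RBF kernel, for example scikit-learn's `SVC(kernel="rbf")`; the choice of library is an assumption, since the abstract does not name one.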