Author(s) |
Chatbri, Houssem
Kameyama, Keisuke
Kwan, Paul H
|
Publication Date |
2015
|
Abstract |
We introduce a method for content-based document image retrieval (CBDIR) of handwritten queries that is both segmentation-free and recognition-free. We first show that the method is underpinned by a theoretical model based on Bayes' rule. Next, we present an algorithmic implementation that accounts for real-world retrieval challenges caused by handwriting fluctuations and style variations. The algorithm operates as follows. First, a number of connected components of the query are matched against the connected components of the document image using shape features. A similarity threshold is used to select the connected components of the document image that are most similar to the query components. Then, the selected components are used to detect candidate occurrences of the query in the document image by means of size-adaptive bounding boxes. Finally, a score is calculated for each candidate occurrence and used for ranking. We conduct a comparative evaluation of our method on a dataset of 200 printed document images by executing 40 printed and 200 handwritten queries of mathematical expressions. Experimental results demonstrate competitive performance: P-Recall = 100% and A-Recall = 99.95% for printed queries, and P-Recall = 73.5% and A-Recall = 57.92% for handwritten queries, outperforming a state-of-the-art CBDIR algorithm.
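The first two steps of the algorithm described above (matching connected components by shape features, then keeping those above a similarity threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor (bounding-box aspect ratio and fill density), the similarity function, the default threshold of 0.8, and all function names are assumptions chosen for illustration, since the abstract does not specify them. The bounding-box detection and scoring steps are not covered.

```python
from collections import deque


def connected_components(grid):
    """Return the 4-connected foreground (value 1) components as pixel lists."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def shape_feature(pixels):
    """Toy descriptor (an assumption): aspect ratio and fill density."""
    top, left, bottom, right = bounding_box(pixels)
    height, width = bottom - top + 1, right - left + 1
    return (width / height, len(pixels) / (height * width))


def similarity(f, g):
    """Map the L1 distance between two descriptors into (0, 1]."""
    return 1.0 / (1.0 + abs(f[0] - g[0]) + abs(f[1] - g[1]))


def select_similar(query_grid, doc_grid, threshold=0.8):
    """Keep document components whose best match to any query component
    reaches the threshold; return (bounding_box, similarity) pairs,
    most similar first."""
    query_feats = [shape_feature(p) for p in connected_components(query_grid)]
    selected = []
    for pixels in connected_components(doc_grid):
        feat = shape_feature(pixels)
        best = max((similarity(feat, q) for q in query_feats), default=0.0)
        if best >= threshold:
            selected.append((bounding_box(pixels), best))
    return sorted(selected, key=lambda s: s[1], reverse=True)


# Hypothetical binary images: a square query, and a document containing
# one square and one horizontal stroke.
query = [[1, 1],
         [1, 1]]
document = [[1, 1, 0, 0, 0, 0, 0],
            [1, 1, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0, 0],
            [0, 0, 0, 1, 1, 1, 1]]
matches = select_similar(query, document)
```

In this toy input only the square component of the document passes the threshold; the horizontal stroke is rejected because its aspect ratio differs strongly from that of the query component.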
|
Citation |
Proceedings of the Third IAPR Asian Conference on Pattern Recognition (ACPR 2015), p. 146-150
|
ISBN |
9781479961009
|
Publisher |
Institute of Electrical and Electronics Engineers (IEEE)
|
Title |
Towards a segmentation and recognition-free approach for content-based document image retrieval of handwritten queries
|
Type of document |
Conference Publication
|
Entity Type |
Publication
|