A Modular Approach for Query Spotting in Document Images and Its Optimization Using Genetic Algorithms

Author(s)
Chatbri, Houssem
Kwan, Paul H
Kameyama, Keisuke
Publication Date
2014
Abstract
Query spotting in document images is a subclass of Content-Based Image Retrieval (CBIR) concerned with detecting occurrences of a query in a document image. Because document images are noisy and complex, spotting is a challenging task that is prone to false positives and partially incorrect matches, which reduce the overall precision of the algorithm. A robust and accurate spotting algorithm is essential to our current research on sketch-based retrieval of digitized lecture materials. We recently proposed a modular spotting algorithm in [1]. Compared to existing methods, our algorithm is both application-independent and segmentation-free. However, it still faces the same challenges of image noise and complexity. In this paper, inspired by our earlier research on optimizing parameter settings for CBIR using an evolutionary algorithm [2][3], we introduce a Genetic Algorithm-based optimization step into our spotting algorithm to improve each spotting result. Experiments on an image dataset of journal pages show promising performance: precision improves significantly without compromising the recall of the overall spotting result.
Citation
Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC), pp. 2085-2092
ISBN
9781479914883
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Type of document
Conference Publication
Entity Type
Publication
