An Application-Independent and Segmentation-Free Approach for Spotting Queries in Document Images

Title
An Application-Independent and Segmentation-Free Approach for Spotting Queries in Document Images
Publication Date
2014
Author(s)
Chatbri, Houssem
Kwan, Paul H
Kameyama, Keisuke
Editor
Lisa O'Conner
Type of document
Conference Publication
Language
en
Entity Type
Publication
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Place of publication
Los Alamitos, United States of America
DOI
10.1109/ICPR.2014.498
UNE publication id
une:16660
Abstract
We report our ongoing research on an application-independent and segmentation-free approach for spotting queries in document images. Building on our earlier work reported in [1][2], this paper introduces an image processing approach that finds occurrences of a query, which is a multi-part object, in a document image through five steps: (1) preprocessing for image normalization and connected component extraction; (2) feature extraction from connected components; (3) matching of the feature vectors of the query and document image connected components; (4) voting to determine candidate occurrences in the document image that are similar to the query; and (5) candidate filtering to detect relevant occurrences and filter out irrelevant patterns. Compared to existing methods, our contributions are twofold. First, our approach is designed to deal with any type of query, without restriction to a particular class such as words or mathematical expressions. Second, it does not apply a domain-specific segmentation to extract regions of interest from the document image, such as text paragraphs or mathematical calculations. Instead, it considers all the image information. Experimental evaluation using scanned journal images shows promising performance and the possibility of further improvement.
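The five steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the matching rule, or the filtering criterion, so the descriptor (bounding-box height, width, and pixel count), the tolerance `tol`, the vote threshold `min_votes`, and the function names `connected_components`, `describe`, and `spot` are all assumptions chosen for illustration. Images are assumed to be already binarized lists of 0/1 rows.

```python
from collections import Counter, deque

def connected_components(img):
    """Step 1 (simplified): return 8-connected foreground components as pixel lists."""
    rows, cols = len(img), len(img[0])
    seen, comps = set(), []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and (r, c) not in seen:
                queue, comp = deque([(r, c)]), []
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and img[ny][nx] and (ny, nx) not in seen):
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                comps.append(comp)
    return comps

def describe(comp):
    """Step 2 (assumed toy descriptor): bounding-box size and pixel count."""
    ys = [y for y, _ in comp]
    xs = [x for _, x in comp]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"feat": (height, width, len(comp)), "pos": (min(ys), min(xs))}

def spot(query_img, doc_img, tol=0, min_votes=None):
    """Steps 3-5 (assumed rules): match descriptors, vote for origins, threshold."""
    query = [describe(c) for c in connected_components(query_img)]
    doc = [describe(c) for c in connected_components(doc_img)]
    if min_votes is None:
        min_votes = len(query)  # assumption: every query part must be found
    votes = Counter()
    for qc in query:
        for dc in doc:
            # Step 3: a match is every feature within `tol` of the query's.
            if all(abs(a - b) <= tol for a, b in zip(qc["feat"], dc["feat"])):
                # Step 4: vote for where the query's top-left corner would lie.
                origin = (dc["pos"][0] - qc["pos"][0], dc["pos"][1] - qc["pos"][1])
                votes[origin] += 1
    # Step 5: crude filtering that keeps only well-supported origins.
    return sorted(o for o, n in votes.items() if n >= min_votes)

# A two-part query: a vertical bar plus a separate dot.
QUERY = [
    [1, 0, 1],
    [1, 0, 0],
]
# One occurrence at row 1, column 1, plus two distractor dots.
DOC = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
]
print(spot(QUERY, DOC))
```

In this toy setting the distractor dots each collect a single vote, so only the origin supported by both query parts survives the threshold. A real system would need descriptors that tolerate noise and scale, and a filtering step stronger than a vote count.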
Citation
Proceedings of the 22nd International Conference on Pattern Recognition (ICPR), p. 2891-2896
ISSN
1051-4651
ISBN
9781479952083
Start page
2891
End page
2896