The Multistage Approach to Information Extraction in Degraded Document Images

Author(s)
Chen, Yan
Leedham, Graham
Publication Date
2004
Abstract
Global and local adaptive thresholding techniques have been shown to be effective on particular types of documents, but none produces consistently good results on all types. In this paper a novel method, called the multistage approach, is presented and compared against several existing single-stage algorithms. The multistage approach recursively breaks an image down into sub-regions using quad-tree decomposition and extracts local features from each sub-region until an appropriate thresholding method can be applied to that sub-region. Quantitative analysis using word recall on 300 degraded historical images obtained from the Library of Congress demonstrates that the method is superior to the existing single-stage methods.
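The recursive decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the local features or the thresholding methods, so the function name `multistage_binarize`, the use of intensity standard deviation as the only feature, the midpoint threshold, and the parameters `min_size`, `flat_std` and `clear_std` are all assumptions chosen for illustration.

```python
import numpy as np


def multistage_binarize(gray, min_size=16, flat_std=8.0, clear_std=40.0):
    """Binarize a 2-D grayscale array; returns a bool array, True = dark ink.

    Hypothetical sketch: feature, thresholds and stopping rule are assumed,
    not taken from the paper.
    """
    gray = np.asarray(gray, dtype=np.float64)
    out = np.zeros(gray.shape, dtype=bool)
    min_size = max(1, int(min_size))  # guard so the recursion always shrinks

    def visit(r0, r1, c0, c1):
        block = gray[r0:r1, c0:c1]
        if block.size == 0:
            return
        spread = block.std()  # assumed local feature: intensity spread
        if spread < flat_std:
            # Nearly uniform region: treat as background, leave as False.
            return
        small = (r1 - r0) <= min_size or (c1 - c0) <= min_size
        if spread >= clear_std or small:
            # Clear contrast, or region too small to split further:
            # threshold at the midpoint of the darkest and brightest pixel.
            t = (block.min() + block.max()) / 2.0
            out[r0:r1, c0:c1] = block < t
            return
        # Ambiguous region: split into four quadrants and recurse.
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        visit(r0, rm, c0, cm)
        visit(r0, rm, cm, c1)
        visit(rm, r1, c0, cm)
        visit(rm, r1, cm, c1)

    visit(0, gray.shape[0], 0, gray.shape[1])
    return out
```

Calling `multistage_binarize(image)` on a grayscale page returns a boolean mask of the pixels classified as ink. A fuller version would presumably compute several features per sub-region and choose among different thresholding algorithms, which is the selection step the abstract refers to.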
Citation
Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), v.1, p. 445-449
ISBN
0769521282
ISSN
1051-4651
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Type of document
Conference Publication
Entity Type
Publication