The Multistage Approach to Information Extraction in Degraded Document Images

Title
The Multistage Approach to Information Extraction in Degraded Document Images
Publication Date
2004
Author(s)
Chen, Yan
Leedham, Graham
Editor
Josef Kittler
Type of document
Conference Publication
Language
en
Entity Type
Publication
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Place of publication
Los Alamitos, United States of America
DOI
10.1109/ICPR.2004.1334154
UNE publication id
une:18225
Abstract
Global and local adaptive thresholding techniques have been shown to be effective on particular types of documents, but none produces consistently good results on all types. In this paper a novel method, called the multistage approach, is presented and compared against some existing single-stage algorithms. The multistage approach recursively breaks an image down into sub-regions using quad-tree decomposition and extracts local features from each sub-region until an appropriate thresholding method can be applied to each sub-region. Quantitative analysis using word recall on 300 degraded historical images obtained from the Library of Congress demonstrates that the method is superior to any single existing method.
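The recursive decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not say which local features are extracted or which thresholding methods are chosen, so everything specific here is an assumption made for illustration: the function name binarize_quadtree, the use of intensity standard deviation as the only feature, the parameters min_size, flat_std and clear_std, and the two stand-in thresholders (a global threshold for uniform regions, a min/max midpoint for high-contrast regions). It is not the authors' algorithm.

```python
import numpy as np

def binarize_quadtree(img, global_t, min_size=8, flat_std=4.0, clear_std=60.0):
    """Return a boolean mask (True = ink) for a 2-D grayscale array.

    Hypothetical decision rule, not taken from the paper:
      - nearly uniform region  -> compare against a global threshold
      - high-contrast region, or one too small to split (min_size >= 1)
                               -> threshold at the midpoint of local min/max
      - anything else          -> split into four quadrants and recurse
    """
    h, w = img.shape
    std = float(img.std())  # the single local feature used in this sketch

    if std < flat_std:
        # Uniform region: no local contrast to exploit, use the global threshold.
        return img < global_t

    if std > clear_std or min(h, w) <= min_size:
        # Clear contrast, or recursion floor reached: simple local threshold.
        local_t = (float(img.min()) + float(img.max())) / 2.0
        return img < local_t

    # Ambiguous region: quad-tree split and recurse on each quadrant.
    out = np.empty(img.shape, dtype=bool)
    mh, mw = h // 2, w // 2
    for rows in (slice(0, mh), slice(mh, h)):
        for cols in (slice(0, mw), slice(mw, w)):
            out[rows, cols] = binarize_quadtree(
                img[rows, cols], global_t, min_size, flat_std, clear_std
            )
    return out

# Usage on a synthetic page: light background with one dark stroke.
page = np.full((64, 64), 200, dtype=np.uint8)
page[10:20, 5:25] = 30
ink = binarize_quadtree(page, global_t=128)
```

In this sketch the recursion stops as soon as a region is either uniform or clearly bimodal, so clean areas are handled cheaply at a coarse level and only ambiguous areas are subdivided further.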
Citation
Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), v.1, p. 445-449
ISSN
1051-4651
ISBN
0769521282
Start page
445
End page
449
