Title |
The Multistage Approach to Information Extraction in Degraded Document Images |
|
|
Publication Date |
|
Author(s) |
|
Editor |
|
Type of document |
|
Language |
|
Entity Type |
|
Publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
|
|
Place of publication |
Los Alamitos, United States of America |
|
|
DOI |
10.1109/ICPR.2004.1334154 |
|
|
UNE publication id |
|
Abstract |
Global and local adaptive thresholding techniques have been shown to be effective on particular types of documents, but none produces consistently good results on all types. In this paper a novel method, called the multistage approach, is presented and compared against some existing single-stage algorithms. The multistage approach recursively breaks an image down into sub-regions using quad-tree decomposition and extracts local features from each sub-region until an appropriate thresholding method can be applied to each sub-region. Quantitative analysis using word recall on 300 degraded historical images obtained from the Library of Congress demonstrates that the method is superior to any existing single-stage method. |
|
|
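The recursive decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which local features are extracted or which thresholding methods are selected, so the intensity-range feature, the midpoint threshold, and the names `binarize_multistage`, `min_size` and `range_limit` are all assumptions made for illustration.

```python
def binarize_multistage(img, min_size=8, range_limit=40):
    """Quad-tree binarization sketch.

    img is a non-empty list of rows of grayscale values (0..255).
    Returns an image of the same shape with 1 for ink and 0 for background.
    The feature (intensity range) and both decision rules are assumptions,
    not the rules used in the paper.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]

    def process(top, left, height, width):
        rows = range(top, top + height)
        cols = range(left, left + width)
        pixels = [img[r][c] for r in rows for c in cols]
        lo, hi = min(pixels), max(pixels)

        # Assumed local feature: intensity range of the sub-region.
        # A nearly uniform sub-region is treated as background and left at 0.
        if hi - lo < range_limit:
            return

        # Sub-region is small enough: apply a simple midpoint threshold.
        if height <= min_size or width <= min_size:
            t = (lo + hi) / 2
            for r in rows:
                for c in cols:
                    out[r][c] = 1 if img[r][c] < t else 0
            return

        # Otherwise split into four quadrants and recurse.
        h2, w2 = height // 2, width // 2
        process(top, left, h2, w2)
        process(top, left + w2, h2, width - w2)
        process(top + h2, left, height - h2, w2)
        process(top + h2, left + w2, height - h2, width - w2)

    process(0, 0, h, w)
    return out


# Usage: a 16x16 light page with a short dark stroke in the top-left quadrant.
page = [[200] * 16 for _ in range(16)]
for c in range(2, 6):
    page[3][c] = 20
binary = binarize_multistage(page)
```

In this sketch only the quadrant containing the stroke is thresholded; the three uniform quadrants are classified as background without further splitting, which is the kind of per-region decision the abstract attributes to the multistage approach.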
Link |
|
Citation |
Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), v.1, p. 445-449 |
|
|
ISSN |
|
ISBN |
|
Start page |
|
End page |
|