Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/5612
Title: Decompose algorithm for thresholding degraded historical document images
Contributor(s): Chen, Y (author); Leedham, Graham (author)
Publication Date: 2005
DOI: 10.1049/ip-vis:20045054
Abstract: Numerous techniques have previously been proposed for single-stage thresholding of document images to separate the written or printed information from the background. Although these global or local thresholding techniques have proven effective on particular subclasses of documents, none is able to produce consistently good results across the wide range of document image qualities that exist in general, or across the image qualities encountered in degraded historical documents. A new thresholding structure called the decompose algorithm is proposed and compared against some existing single-stage algorithms. The decompose algorithm uses local feature vectors to analyse a local area and find the best approach to thresholding it. Instead of employing a single thresholding algorithm, it automatically selects an appropriate algorithm for specific types of subregions of the document. The original image is recursively broken down into subregions using quad-tree decomposition until a suitable thresholding method can be applied to each subregion. The algorithm has been trained using 300 historical images obtained from the Library of Congress and evaluated on 300 'difficult' document images, also extracted from the Library of Congress, in which considerable background noise or variation in contrast and illumination exists. Quantitative analysis of the results by measuring text recall and qualitative assessment of processed document image quality are reported. The decompose algorithm is demonstrated to be effective at resolving the problem in historical images of varying quality.
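Illustrative sketch (not part of the published record): the recursive structure described in the abstract, quad-tree decomposition with a per-subregion choice of thresholding method, might look roughly like the Python below. The abstract does not give the feature vector, the selection rules or the candidate thresholding methods, so everything specific here is an assumption made for illustration: the single feature (local standard deviation), the cut-offs `flat_std` and `split_std`, the minimum block size `min_size`, and the use of Otsu's method as the stand-in thresholding algorithm. The function names are hypothetical.

```python
import numpy as np


def otsu_threshold(region):
    """Otsu's threshold for an 8-bit region (stand-in method, an assumption)."""
    hist = np.bincount(region.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                    # pixel count at or below each level
    w1 = hist.sum() - w0                    # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)          # mean of the lower class
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2    # between-class variance (unnormalised)
    return int(np.argmax(between))


def _decompose(image, out, y, x, h, w, min_size, flat_std, split_std):
    region = image[y:y + h, x:x + w]
    std = region.std()                      # placeholder "feature vector": one value
    if std < flat_std:
        # Assumed rule: a near-uniform region is treated as background only.
        out[y:y + h, x:x + w] = 255
    elif std >= split_std and min(h, w) >= 2 * min_size:
        # Assumed rule: high variation in a large region means no single method
        # is suitable yet, so split into four quadrants and recurse.
        h2, w2 = h // 2, w // 2
        for qy, qx, qh, qw in ((y, x, h2, w2), (y, x + w2, h2, w - w2),
                               (y + h2, x, h - h2, w2),
                               (y + h2, x + w2, h - h2, w - w2)):
            _decompose(image, out, qy, qx, qh, qw, min_size, flat_std, split_std)
    else:
        # Region judged suitable for a single method: apply the stand-in.
        t = otsu_threshold(region)
        out[y:y + h, x:x + w] = np.where(region > t, 255, 0)


def decompose_threshold(image, min_size=16, flat_std=8.0, split_std=40.0):
    """Binarise a 2-D uint8 greyscale image: 0 = ink, 255 = background."""
    out = np.empty_like(image)
    h, w = image.shape
    _decompose(image, out, 0, 0, h, w, min_size, flat_std, split_std)
    return out


if __name__ == "__main__":
    page = np.full((64, 64), 200, dtype=np.uint8)   # light synthetic background
    page[8:24, 40:56] = 30                          # dark synthetic "text" block
    binary = decompose_threshold(page)
    print(int((binary == 0).sum()), "pixels classified as ink")
```

A trained version, as the abstract describes, would replace the fixed cut-offs with rules learned from the 300 training images and would choose among several thresholding methods rather than the single stand-in used here.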
Publication Type: Journal Article
Source of Publication: IEE Proceedings on Vision, Image and Signal Processing, 152(6), p. 702-714
Publisher: The Institution of Engineering and Technology
Place of Publication: United Kingdom
ISSN: 1350-245X
Fields of Research (FoR) 2008: 080104 Computer Vision
080199 Artificial Intelligence and Image Processing not elsewhere classified
080106 Image Processing
Socio-Economic Objective (SEO) 2008: 810199 Defence not elsewhere classified
810107 National Security
890299 Computer Software and Services not elsewhere classified
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Appears in Collections: Journal Article
School of Science and Technology

Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.