Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/52933
Title: Generalizability assessment of COVID-19 3D CT data for deep learning-based disease detection
Contributor(s): Fallahpoor, Maryam (author); Chakraborty, Subrata (author); Heshejin, Mohammad Tavakoli (author); Chegeni, Hossein (author); Horry, Michael James (author); Pradhan, Biswajeet (author)
Publication Date: 2022-06
Early Online Version: 2022-04-01
DOI: 10.1016/j.compbiomed.2022.105464
Handle Link: https://hdl.handle.net/1959.11/52933
Abstract: Background: Artificial intelligence technologies for the classification/detection of COVID-19 positive cases suffer from limited generalizability. Moreover, accessing and preparing another large dataset is time-consuming and not always feasible. Several studies have combined smaller COVID-19 CT datasets into "supersets" to maximize the number of training samples. This study aims to assess generalizability by splitting datasets of 3D CT images into different portions and training deep learning models on them.
Method: Two large datasets, including 1110 3D CT images, were split into five segments of 20% each. Each dataset's first 20% segment was separated as a holdout test set. 3D-CNN training was performed with the remaining 80% from each dataset. Two small external datasets were also used to independently evaluate the trained models.
Results: The total combination of 80% of each dataset achieved an accuracy of 91% on the Iranmehr and 83% on the Moscow holdout test datasets. Results indicated that 80% of the primary datasets is adequate for fully training a model. Additional fine-tuning using 40% of a secondary dataset helps the model generalize to a third, unseen dataset. The highest accuracy achieved through transfer learning was 85% on the LDCT dataset and 83% on the Iranmehr holdout test sets when retrained on 80% of the Iranmehr dataset.
Conclusion: While the total combination of both datasets produced the best results, different combinations and transfer learning still produced generalizable results. Adopting the proposed methodology may help to obtain satisfactory results in the case of limited external datasets.
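The split described under Method (five segments of 20% each, with the first segment held out for testing and the remaining 80% used for training) can be sketched as follows. This is a minimal illustration only: the scan identifiers and the function names `split_into_segments` and `holdout_and_training` are hypothetical, and the record does not state whether the authors shuffled or stratified the data before splitting, so the sketch simply splits in the given order.

```python
from typing import List, Sequence, Tuple


def split_into_segments(items: Sequence[str], n_segments: int = 5) -> List[List[str]]:
    """Split items into n_segments contiguous, near-equal parts.

    When len(items) is not divisible by n_segments, the first
    (len(items) % n_segments) segments each receive one extra item.
    """
    if n_segments <= 0:
        raise ValueError("n_segments must be positive")
    base, extra = divmod(len(items), n_segments)
    segments: List[List[str]] = []
    start = 0
    for i in range(n_segments):
        size = base + (1 if i < extra else 0)
        segments.append(list(items[start:start + size]))
        start += size
    return segments


def holdout_and_training(
    items: Sequence[str], n_segments: int = 5
) -> Tuple[List[str], List[str]]:
    """Return (holdout, training): the first segment versus all the others."""
    segments = split_into_segments(items, n_segments)
    holdout = segments[0]
    training = [item for segment in segments[1:] for item in segment]
    return holdout, training


# Hypothetical scan identifiers; a real pipeline would list 3D CT volumes.
scan_ids = [f"scan_{i:03d}" for i in range(100)]
holdout, training = holdout_and_training(scan_ids)
print(len(holdout), len(training))
```

Training on fewer segments (for example 40% of a secondary dataset for fine-tuning) would correspond to concatenating two of the four non-holdout segments instead of all four.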
Publication Type: Journal Article
Source of Publication: Computers in Biology and Medicine, v.145, p. 1-18
Publisher: Elsevier Ltd
Place of Publication: United Kingdom
ISSN: 1879-0534; 0010-4825
Fields of Research (FoR) 2020: 460102 Applications in health; 460103 Applications in life sciences
Socio-Economic Objective (SEO) 2020: 280115 Expanding knowledge in the information and computing sciences; 200499 Public health (excl. specific population health) not elsewhere classified
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Appears in Collections: Journal Article; School of Science and Technology


Scopus citations: 12 (checked on Apr 6, 2024)
Page view(s): 586 (checked on Mar 8, 2023)
Download(s): 2 (checked on Mar 8, 2023)

Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.