Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/9551
Title: Cancer Classification from Microarray Data using Gene Feature Ranking
Contributor(s): Hasan, Abid (author); Morshed, Maruf Golam (author); Shareef, MD (author); Al-Mamun, Hawlader Abdullah (author); Kwan, Paul H (author)
Publication Date: 2011
Handle Link: https://hdl.handle.net/1959.11/9551
Abstract: A significant challenge in DNA (deoxyribonucleic acid) microarray analysis is that datasets contain a large number of features (genes) but only a small number of samples. When applying statistical methods to analyse microarray data, particular care is required to deal with problems such as the low classification accuracy of models, which is brought about by the small number of features that have predictive capability. To overcome these problems, proper approaches for data normalisation, feature reduction, and identifying the optimal set of genes are critical. In this paper, we apply the Gene Feature Ranking (GFR) [5] method to select genes with high trust values from high-dimensional cancer microarray datasets. Our contribution lies in the use of a different metric for calculating trust values that are more domain-specific for cancer datasets. By choosing a pre-defined threshold based on the user's knowledge, only genes that show sufficient trustworthiness are retained for constructing the classification model. Through experimentation on three microarray datasets, namely Acute Lymphoblastic Leukemia (ALL), lymph-node-negative primary breast cancer, and High Grade Glioma, we confirm that the classification accuracy obtained with the genes selected by the modified GFR method is consistently higher than when the method is not used.
Publication Type: Journal Article
Source of Publication: International Journal Of Data Mining And Emerging Technologies, 1(2), p. 54-60
Publisher: IndianJournals.com
Place of Publication: India
ISSN: 2249-3220
2249-3212
Fields of Research (FoR) 2008: 080301 Bioinformatics Software
060405 Gene Expression (incl Microarray and other genome-wide approaches)
080109 Pattern Recognition and Data Mining
Socio-Economic Objective (SEO) 2008: 970108 Expanding Knowledge in the Information and Computing Sciences
920102 Cancer and Related Disorders
970106 Expanding Knowledge in the Biological Sciences
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Publisher/associated links: http://indianjournals.com/ijor.aspx?target=ijor:ijdmet&volume=1&issue=2&article=002
Appears in Collections: Journal Article

Files in This Item:
3 files

Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.