Please use this identifier to cite or link to this item:
https://hdl.handle.net/1959.11/9551
Title: | Cancer Classification from Microarray Data using Gene Feature Ranking |
Contributor(s): | Hasan, Abid (author); Morshed, Maruf Golam (author); Shareef, MD (author); Al-Mamun, Hawlader Abdullah (author); Kwan, Paul H (author) |
Publication Date: | 2011 |
Handle Link: | https://hdl.handle.net/1959.11/9551 |
Abstract: | A significant challenge in DNA (deoxyribonucleic acid) microarray analysis is that datasets contain a large number of features (genes) but only a small number of samples. When applying statistical methods to analyse microarray data, particular care is required to deal with problems such as the low classification accuracy of models brought about by the small number of features that have predictive capability. To overcome these problems, proper approaches to data normalisation, feature reduction, and identification of the optimal set of genes are critical. In this paper, we apply the Gene Feature Ranking (GFR) [5] method to select genes with high trust values from high-dimensional cancer microarray datasets. Our contribution lies in the use of a different metric for calculating the trust values, one that is more domain-specific for cancer datasets. By choosing a pre-defined threshold based on the user's knowledge, only genes that show sufficient trustworthiness to be considered for constructing the classification model are retained. Through experimentation on three microarray datasets, namely Acute Lymphoblastic Leukemia (ALL), lymph node negative primary breast cancer, and High Grade Glioma, we confirm that the classification accuracy obtained with the genes selected by the modified GFR method is consistently higher than when the method is not used. |
Publication Type: | Journal Article |
Source of Publication: | International Journal Of Data Mining And Emerging Technologies, 1(2), pp. 54-60 |
Publisher: | IndianJournals.com |
Place of Publication: | India |
ISSN: | 2249-3220; 2249-3212 |
Fields of Research (FoR) 2008: | 080301 Bioinformatics Software; 060405 Gene Expression (incl Microarray and other genome-wide approaches); 080109 Pattern Recognition and Data Mining |
Socio-Economic Objective (SEO) 2008: | 970108 Expanding Knowledge in the Information and Computing Sciences; 920102 Cancer and Related Disorders; 970106 Expanding Knowledge in the Biological Sciences |
Peer Reviewed: | Yes |
HERDC Category Description: | C1 Refereed Article in a Scholarly Journal |
Publisher/associated links: | http://indianjournals.com/ijor.aspx?target=ijor:ijdmet&volume=1&issue=2&article=002 |
Appears in Collections: | Journal Article |
Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.