Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/19577
Title: Assessing the reproducibility of discriminant function analyses
Contributor(s): Andrew, Rose (author); Albert, Arianne YK (author); Renaut, Sebastien (author); Rennison, Diana J (author); Bock, Dan G (author); Vines, Tim (author)
Publication Date: 2015
Open Access: Yes
DOI: 10.7717/peerj.1137
Handle Link: https://hdl.handle.net/1959.11/19577
Abstract: Data are the foundation of empirical research, yet all too often the datasets underlying published papers are unavailable, incorrect, or poorly curated. This is a serious issue, because future researchers are then unable to validate published results or reuse data to explore new ideas and hypotheses. Even if data files are securely stored and accessible, they must also be accompanied by accurate labels and identifiers. To assess how often problems with metadata or data curation affect the reproducibility of published results, we attempted to reproduce Discriminant Function Analyses (DFAs) from the field of organismal biology. DFA is a commonly used statistical analysis that has changed little since its inception almost eight decades ago, and it therefore provides an opportunity to test reproducibility among datasets of varying ages. Of the 100 papers we initially surveyed, 14 were excluded because they did not present the common types of quantitative results from their DFA or gave insufficient details of their DFA. Of the remaining 86 datasets, there were 15 cases for which we were unable to confidently relate the dataset we received to the one used in the published analysis. The reasons included incomprehensible or absent variable labels, the DFA having been performed on an unspecified subset of the data, and the dataset we received being incomplete. We focused on reproducing three common summary statistics from DFAs: the percent variance explained, the percentage correctly assigned, and the largest discriminant function coefficient. The reproducibility of the first two was fairly high (20 of 26 and 44 of 60 datasets, respectively), whereas our success rate with the discriminant function coefficients was lower (15 of 26 datasets). When considering all three summary statistics, we were able to completely reproduce 46 (65%) of 71 datasets. While our results show that a majority of studies are reproducible, they highlight the fact that many studies still fall short of the carefully curated research that the scientific community and the public expect.
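The three summary statistics named in the abstract (percent variance explained, percentage correctly assigned, and the largest discriminant function coefficient) correspond to standard DFA outputs. As a minimal, hypothetical sketch of how one might recompute them when checking a published result (the record does not specify the authors' software; scikit-learn and the built-in iris dataset are stand-ins, not the study's actual pipeline):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in data; any groups-by-measurements table would do.
X, y = load_iris(return_X_y=True)

# solver="eigen" exposes both explained_variance_ratio_ and scalings_.
dfa = LinearDiscriminantAnalysis(solver="eigen").fit(X, y)

# 1. Percent variance explained by each discriminant function.
pct_variance = 100 * dfa.explained_variance_ratio_

# 2. Percentage of cases correctly assigned to their group
#    (resubstitution accuracy, as commonly reported in DFA tables).
pct_correct = 100 * dfa.score(X, y)

# 3. Largest (in absolute value) coefficient of the first
#    discriminant function.
df1 = dfa.scalings_[:, 0]
largest_coef = df1[np.argmax(np.abs(df1))]

print("% variance explained:", np.round(pct_variance, 1))
print("%% correctly assigned: %.1f" % pct_correct)
print("largest DF1 coefficient: %.3f" % largest_coef)

Comparing a paper's reported values against such recomputed statistics is essentially the per-dataset check the study performed.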
Publication Type: Journal Article
Source of Publication: PeerJ, v.3, p. 1-22
Publisher: PeerJ, Ltd
Place of Publication: United Kingdom
ISSN: 2167-8359
Fields of Research (FoR) 2008: 060299 Ecology not elsewhere classified
060399 Evolutionary Biology not elsewhere classified
Fields of Research (FoR) 2020: 310399 Ecology not elsewhere classified
310499 Evolutionary biology not elsewhere classified
Socio-Economic Objective (SEO) 2008: 970106 Expanding Knowledge in the Biological Sciences
Socio-Economic Objective (SEO) 2020: 280102 Expanding knowledge in the biological sciences
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Appears in Collections: Journal Article

Files in This Item: 2 files

SCOPUS™ Citations: 7 (checked on Mar 30, 2024)
Page view(s): 1,318 (checked on Apr 21, 2024)
Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.