Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/30398
Title: Regarding the F-word: The effects of data filtering on inferred genotype-environment associations
Contributor(s): Ahrens, Collin W (author); Jordan, Rebecca (author); Bragg, Jason (author); Harrison, Peter A (author); Hopley, Tara (author); Bothwell, Helen (author); Murray, Kevin (author); Steane, Dorothy A (author); Whale, John W (author); Byrne, Margaret (author); Andrew, Rose (author); Rymer, Paul D (author)
Publication Date: 2021-07
Early Online Version: 2021-02-10
DOI: 10.1111/1755-0998.13351
Handle Link: https://hdl.handle.net/1959.11/30398
Abstract: Genotype-environment association (GEA) methods have become part of the standard landscape genomics toolkit, yet we know little about how to best filter genotype-by-sequencing data to provide robust inferences for environmental adaptation. In many cases, default filtering thresholds for minor allele frequency and missing data are applied regardless of sample size, having unknown impacts on the results, negatively affecting management strategies. Here, we investigate the effects of filtering on GEA results and the potential implications for assessment of adaptation to environment. We use empirical and simulated data sets derived from two widespread tree species to assess the effects of filtering on GEA outputs. Critically, we find that the level of filtering of missing data and minor allele frequency affects the identification of true positives. Even slight adjustments to these thresholds can change the rate of true positive detection. Using conservative thresholds for missing data and minor allele frequency substantially reduces the size of the data set, lessening the power to detect adaptive variants (i.e., simulated true positives) with strong and weak strengths of selection. Regardless, strength of selection was a good predictor for GEA detection, but even some SNPs under strong selection went undetected. False positive rates varied depending on the species and GEA method, and filtering significantly impacted the predictions of adaptive capacity in downstream analyses. We make several recommendations regarding filtering for GEA methods. Ultimately, there is no filtering panacea, but some choices are better than others, depending on the study system, availability of genomic resources, and desired objectives.
Publication Type: Journal Article
Grant Details: ARC/LP150100936
ARC/DE190100326
Source of Publication: Molecular Ecology Resources, 21(5), p. 1460-1474
Publisher: Wiley-Blackwell Publishing Ltd
Place of Publication: United Kingdom
ISSN: 1755-0998
1755-098X
Fields of Research (FoR) 2008: 060411 Population, Ecological and Evolutionary Genetics
060303 Biological Adaptation
Fields of Research (FoR) 2020: 310403 Biological adaptation
Socio-Economic Objective (SEO) 2008: 970106 Expanding Knowledge in the Biological Sciences
961306 Remnant Vegetation and Protected Conservation Areas in Forest and Woodlands Environments
Socio-Economic Objective (SEO) 2020: 280102 Expanding knowledge in the biological sciences
180604 Rehabilitation or conservation of terrestrial environments
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Description: All data and R code are available on Dryad: https://doi.org/10.5061/dryad.ffbg79ctg (Ahrens, Jordan, et al., 2020).
Appears in Collections: Journal Article
School of Environmental and Rural Science

Files in This Item:
2 files

Scopus citations: 18 (checked on Oct 12, 2024)
Page views: 1,216 (checked on May 7, 2023)
Downloads: 8 (checked on May 7, 2023)

Items in Research UNE are protected by copyright, with all rights reserved, unless otherwise indicated.