Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship

Title
Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship
Publication Date
2017-12-21
Author(s)
Hong Lee, S
Clark, Sam
(author)
OrcID: https://orcid.org/0000-0001-8605-1738
Email: sclark37@une.edu.au
UNE Id une-id:sclark37
van der Werf, Julius H J
(author)
OrcID: https://orcid.org/0000-0003-2512-1696
Email: jvanderw@une.edu.au
UNE Id une-id:jvanderw
Type of document
Journal Article
Language
en
Entity Type
Publication
Publisher
Public Library of Science
Place of publication
United States of America
DOI
10.1371/journal.pone.0189775
UNE publication id
une:1959.11/28967
Abstract
Genomic prediction is emerging in a wide range of fields, including animal and plant breeding, risk prediction in human precision medicine, and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consist of information sources with varying degrees of relationship to the target individuals. A reference set used for genomic prediction can contain both close and distant relatives as well as 'unrelated' individuals from the wider population. The various sources of information were modeled as different populations with different effective population sizes (Nₑ). Both the effective number of chromosome segments (Mₑ) and Nₑ are considered to be functions of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available for each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than less closely related individuals. We also illustrate that when prediction relies on closer relatives, there is less improvement in prediction accuracy with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
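As a rough illustration of how Mₑ and the reference size interact, the sketch below uses the widely cited simple approximation r = sqrt(N·h² / (N·h² + Mₑ)). This is an assumption made for illustration only: it is not the refined theory developed in the article, nor the released software. The function name, the heritability and the two Mₑ values are hypothetical.

```python
import math


def expected_accuracy(n_ref: int, h2: float, m_e: float) -> float:
    """Approximate genomic prediction accuracy (correlation between
    predicted and true genetic value) as sqrt(N*h2 / (N*h2 + Me)).

    n_ref : number of reference individuals (N)
    h2    : trait heritability, between 0 and 1
    m_e   : effective number of chromosome segments (Me)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_ref >= 0, 0 <= h2 <= 1 and m_e > 0")
    signal = n_ref * h2
    return math.sqrt(signal / (signal + m_e))


# Hypothetical inputs: a reference of close relatives is represented by a
# small Me, a reference of distantly related individuals by a large Me.
H2 = 0.5
ME_CLOSE = 100.0
ME_DISTANT = 10_000.0

for label, m_e in (("close relatives", ME_CLOSE),
                   ("distant individuals", ME_DISTANT)):
    r_1k = expected_accuracy(1_000, H2, m_e)
    r_2k = expected_accuracy(2_000, H2, m_e)
    print(f"{label}: r(N=1000)={r_1k:.3f}, "
          f"r(N=2000)={r_2k:.3f}, gain={r_2k - r_1k:.3f}")
```

Under these assumed values, the small-Mₑ reference gives the higher accuracy at a fixed N, and doubling N adds less to it than to the large-Mₑ reference, which matches the qualitative pattern the abstract describes.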
Citation
PLoS One, 12(12), p. 1-22
ISSN
1932-6203
Pubmed ID
29267328
Start page
1
End page
22
Rights
Attribution 4.0 International

Files:

Name
openpublished/EstimationLeeClarkVanDerWerf2017JournalArticle.pdf
Size
3247.354 KB
Format
application/pdf
Description
Published version