SNPQC: an R pipeline for quality control of Illumina SNP genotyping array data

Title
SNPQC: an R pipeline for quality control of Illumina SNP genotyping array data
Publication Date
2014
Author(s)
Gondro, Cedric
OrcID: https://orcid.org/0000-0003-0666-656X
Email: cgondro2@une.edu.au
UNE Id: une-id:cgondro2
Porto-Neto, Laércio R
Lee, Seung Hwan
Type of document
Journal Article
Language
en
Entity Type
Publication
Publisher
Wiley-Blackwell Publishing Ltd
Place of publication
United Kingdom
DOI
10.1111/age.12198
UNE publication id
une:17905
Abstract
In genome-wide association studies, quality control (QC) of genotypes is important to avoid spurious results. It is also important for maintaining long-term data integrity, particularly in settings with ongoing genotyping (e.g. estimation of genomic breeding values). Here we discuss SNPQC, a fully automated pipeline for QC analysis of Illumina SNP array data. It applies a wide range of common quality metrics with user-defined filtering thresholds to generate a comprehensive QC report and a filtered dataset, including a genomic relationship matrix, ready for downstream analyses, which makes it amenable to integration in high-throughput environments. SNPQC also builds a database that stores genotypic, phenotypic and quality-metric data, both to ensure data integrity and to allow samples from subsequent runs to be integrated. The program is generic across species and array designs, providing a convenient interface between the genotyping laboratory and downstream genome-wide association studies or genomic prediction.
Citation
Animal Genetics, 45(5), pp. 758-761
ISSN
1365-2052
0268-9146
Start page
758
End page
761
