SNPQC: an R pipeline for quality control of Illumina SNP genotyping array data

Author(s)
Gondro, Cedric
Porto-Neto, Laércio R
Lee, Seung Hwan
Publication Date
2014
Abstract
In genome-wide association studies, quality control (QC) of genotypes is important to avoid spurious results. It is also important to maintain long-term data integrity, particularly in settings with ongoing genotyping (e.g. estimation of genomic breeding values). Here we discuss SNPQC, a fully automated pipeline to perform QC analyses of Illumina SNP array data. It applies a wide range of common quality metrics with user-defined filtering thresholds to generate a comprehensive QC report and a filtered dataset, including a genomic relationship matrix, ready for downstream analyses, which makes it amenable to integration in high-throughput environments. SNPQC also builds a database that stores genotypic data, phenotypic data and quality metrics, which helps ensure data integrity and allows more samples from subsequent runs to be integrated. The program is generic across species and array designs, providing a convenient interface between the genotyping laboratory and downstream genome-wide association studies or genomic prediction.
Citation
Animal Genetics, 45(5), p. 758-761
ISSN
1365-2052
0268-9146
Language
en
Publisher
Wiley-Blackwell Publishing Ltd
Title
SNPQC: an R pipeline for quality control of Illumina SNP genotyping array data
Type of document
Journal Article
Entity Type
Publication
