Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples.
AuthorMullen, Michael Paul
Creevey, Christopher J.
Berry, Donagh P.
Magee, David A
Howard, Dawn J
Killeen, Aideen P.
Park, Stephen D.
McGettigan, Paul A.
Lucy, Matt C.
MacHugh, David E
Waters, Sinead M.
MetadataShow full item record
StatisticsDisplay Item Statistics
CitationMullen MP, Creevey CJ, Berry DP, McCabe MS, Magee DA, Howard DJ, Killeen AP, Park SD, McGettigan PA, Lucy MC, MacHugh DE, Waters SM. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples. BMC Genomics, 2012, 13:16. DOI:10.1186/1471-2164-13-16.
AbstractBackground: The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results: In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P < 0.01) mean allele frequency differentials between the low and high fertility groups were observed for 720 SNPs (58 NSS). Allele frequencies for 43 of the SNPs were also determined by genotyping the 150 individual animals (Sequenom® MassARRAY). No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions: The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving interval plausibly harbouring causative variants contributing to heritable variation. To our knowledge, this is the first report describing sequencing of targeted genomic regions in any livestock species using groups with divergent phenotypes for an economically important trait.
FunderScience Foundation Ireland; Science Foundation Ireland (SFI) Stokes lecturer scheme; Department of Agriculture, Food and the Marine
Grant Number07/SRC/B1156; 07/SK/B1236A; RSF-06-0353; RSF-06-0409