Historical mating patterns in the U.S. revealed through admixture and IBD patterns from genome-wide data from over 800,000 individuals.
J. M. Granka (1); Y. Wang (1); E. Han (1); J. K. Byrnes (1); A. Kermany (1); R. E. Curtis (2); P. Carbonetto (1); K. Noto (1); M. J. Barber (1); N. M. Myres (2); C. A. Ball (1); K. G. Chahine (2)
1) AncestryDNA, San Francisco, CA; 2) AncestryDNA, Provo, UT.
Within a diverse population like that of the United States, many individuals are admixed, with ancestry from many worldwide regions. Non-random mating and migration can result in non-random combinations of ancestries within admixed individuals (i.e., certain sets of ancestries may be common, and others may be rare); such dynamics can also affect patterns of identity-by-descent (IBD) among admixed and non-admixed individuals. To gain insight into historical mating and migration, we study genome-wide genotype data from over 800,000 AncestryDNA customers, including a subset of over 400,000 born in the US. First, we use a supervised algorithm to estimate individuals’ genetic admixture proportions across 26 global regions. We measure correlations between the estimated ancestries and find that certain sets of ancestries frequently co-occur in individuals’ estimates. Such relationships may reflect historical events; e.g., the association between ancestry from the Americas and the Iberian Peninsula could reflect Colonial Era admixture. In addition to historical mating patterns, however, the admixture inference procedure and the delineation of global regions could also affect such correlations. Second, to assess whether these trends reflect mating patterns and preferences rather than such methodological effects, we examine associations between the estimated ancestries of the parents in over 10,000 trios. The observed correlations agree with many of those identified within individuals, and potentially reflect more recent historical trends. Third, we extend our study to IBD patterns in an inferred IBD network among genotyped individuals. Sub-clusters of the IBD network, which can often be annotated by ethnicity or historical US migration, are often interconnected by bridging IBD connections; we highlight several connected sub-clusters in light of findings from genetic ancestry.
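As an illustration of the first analysis, the co-occurrence of ancestries can be summarized by pairwise correlations between per-region admixture proportions across individuals. The following is a minimal sketch only, not the procedure used in this study: the function names, the use of Pearson correlation, and the toy data are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def ancestry_correlations(estimates, regions):
    """Correlate every pair of regions across individuals.

    `estimates` is a list of dicts mapping region name -> estimated proportion;
    a region missing from an individual's dict is treated as 0.
    """
    columns = {r: [ind.get(r, 0.0) for ind in estimates] for r in regions}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(regions, 2)
    }


# Hypothetical toy estimates (not real data): three regions, four individuals.
toy = [
    {"Americas": 0.5, "Iberia": 0.4, "Scandinavia": 0.1},
    {"Americas": 0.4, "Iberia": 0.5, "Scandinavia": 0.1},
    {"Americas": 0.0, "Iberia": 0.1, "Scandinavia": 0.9},
    {"Americas": 0.0, "Iberia": 0.0, "Scandinavia": 1.0},
]
corr = ancestry_correlations(toy, ["Americas", "Iberia", "Scandinavia"])
```

Because each individual's proportions sum to one, some negative correlation between regions arises from the constraint alone, which is one reason such correlations cannot be read directly as evidence of mating patterns.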
Finally, we corroborate findings from these three analyses, as well as their potential timescales, by examining over 500,000 AncestryDNA customer pedigrees. Associations between the country-level birth locations of the two members of couples in these pedigrees support many of the non-random associations of ethnicities and IBD connections identified using genetic data. Many of the associations we observe reflect historical phenomena and, while not conclusive about their cause, suggest that many individuals with admixed ancestry, including those in the US, have present-day genetic signatures reflecting the migration and subsequent non-random mating of their ancestors.
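One simple way to quantify associations between partners' birth countries is to compare the observed count of each country pairing with the count expected if partners were paired independently of birthplace. The sketch below is an assumption-laden illustration, not the method used in this study: the observed/expected ratio, the function name, and the toy couples are all hypothetical.

```python
from collections import Counter


def pairing_enrichment(couples):
    """Observed / expected count for each (partner 1, partner 2) birth-country pair.

    Expected counts assume the two partners' birth countries are independent,
    i.e. expected = (row total * column total) / number of couples.
    A ratio above 1 means the pairing is more common than expected by chance.
    """
    n = len(couples)
    observed = Counter(couples)
    first_totals = Counter(a for a, _ in couples)
    second_totals = Counter(b for _, b in couples)
    return {
        (a, b): count / (first_totals[a] * second_totals[b] / n)
        for (a, b), count in observed.items()
    }


# Hypothetical toy pedigree couples (not real data).
toy_couples = (
    [("Ireland", "Ireland")] * 3
    + [("Italy", "Italy")] * 3
    + [("Ireland", "Italy")]
    + [("Italy", "Ireland")]
)
ratios = pairing_enrichment(toy_couples)
```

In this toy example, same-country pairings have ratios above 1 and mixed pairings have ratios below 1, which is the kind of non-random association the pedigree analysis looks for.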