race/history/evolution notes: Additional ASHG 2010 abstracts

Selected abstracts from the 2010 meeting of the American Society of Human Genetics.

The male gene pool of the contemporary Mesopotamia marsh population supports their Semitic origin. N. Al-Zahery1, J. A. Irwin2, V. Battaglia1, M. A. Hamod3, V. Grugni1, A. S. Santachiara-Benerecetti1, O. Semino1 1) Department of Genetics and Microbiology, Pavia University, Via Ferrata 1, 27100 Pavia, Italy; 2) Research Department, Armed Forces DNA Identification Laboratory (AFDIL), 1413 Research Blvd, Rockville, MD 20850, USA; 3) Department of Biotechnology, Faculty of Sciences, Baghdad University, Baghdad, Iraq.

The origin of the modern Mesopotamia marsh people, who are locally called “Ma’dan” or “Marsh Arabs”, is a question of great interest. Based on their life-style (living in reed houses, grazing of water buffalo and other aspects) and local archaeological sites, many historians and archaeologists believe they may have Sumerian ancestry. Although little is known about the origin of Sumerians themselves, two main hypotheses have been advanced in this regard. According to the first, Sumerians were a group of populations which migrated from the “South East” following a seashore route through the Arabian Gulf, and settled down in the southern marshes of Iraq. According to the second, the advancement of the Sumerian civilization is the result of migration from the mountainous area of Anatolia to the southern marshes of Iraq where they settled, absorbing previous populations. In order to shed some light on the genetic origin of the Mesopotamia marsh population, we investigated the male gene pool of 145 DNA samples of modern Mesopotamia people, still living in marshes in the south of Iraq. The analyses of Single Nucleotide Polymorphisms (SNPs) and Short Tandem Repeats (STRs) of the paternally transmitted Male Specific region of the Y chromosome (MSY) revealed that more than 80% of marsh Y chromosomes belong to haplogroup (Hg) J1-M267, the autochthonous haplogroup of Middle Eastern/Semitic speakers, with possible recent expansion and/or founder effect reflected by the reduced STR variability. In particular, 90% of them were assigned to the J1e-M267-PAGE08 sub-haplogroup, which is the predominant Y chromosome lineage among Middle Eastern Arab populations (Yemen, Qatar, UAE, and Levant). Thus, these findings testify, at least from the paternal side, to a strong Semitic Arabian component in the contemporary Mesopotamia marshes population, whereas no clear Anatolian and/or South Asian genetic evidence has been detected.

Western Eurasian Y chromosomes found in the Chinese Salar ethnic group. Y. Lu, H. Li MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433 China.

The Salar are a small Western-Turkic-speaking population living mostly in Qinghai province of China. The languages most similar to Salar are all spoken far away, in Turkmenistan. Historical records suggest that they may be descendants of the Turkic nomadic tribes in Central Asia. In this study, 141 Salar Y chromosomes were analyzed for 39 SNP and 14 STR markers to investigate the potential imprints of their western ancestors. The most frequent haplogroup (Hg) in this population sample is Hg R, comprising 40% of all Y chromosomes. Most of these Hg R samples belong to R1a1 (M17), which is distributed across a wide geographic region including South Asia, East Europe, Central Asia, and South Siberia. Four other Western Eurasian haplogroups (G-2%, H-5%, I-3%, J-3%) were also found in the Salar Y chromosome gene pool. These paternal lineages of the Salar are absent in their East Asian neighbors but frequent in Central Asia. Y-STR-based analyses also grouped the Salar with Central Asians. On the other hand, the Salar also have low frequencies of the East Asian specific Hg D and Hg O, suggesting possible gene flow from their neighboring populations. This Y chromosome study demonstrates that the Salar have largely retained the Western Eurasian paternal lineages of their Central Asian ancestors, although they may have migrated to Central China about 800 years ago.

Ancient and recent demographic events influence mitochondrial DNA diversity in an immigrant Basque population. M. Davis, S. Novak, G. Hampikian Department of Biological Sciences, Boise State University, Boise, ID.

The Basques are an ancient people, considered by many anthropologists to represent the oldest extant European population. Because of this, they have been the subject of numerous sociological and biological investigations. The Basque Diaspora, a relatively recent demographic expansion of the Basque population, has until now been overlooked in genetic studies. Samples were taken from 53 individuals with Basque ancestry in Boise, Idaho, and the mitochondrial DNA (mtDNA) sequence variation of the first and second hypervariable regions was determined. Thirty-six mtDNA haplotypes were detected in the sample. Comparing the genetic diversity in the Idaho sample with other Basque populations, signatures of founder effects were observed, consistent with both the recent and ancient history of Basque mitochondrial lineages. There has been a marked alteration of haplogroup frequency and diversity, and there is a slight reduction in other measures of diversity in the NW Basque population compared to the native Basque population. We have found a relatively high percentage of the revised Cambridge Reference Sequence (rCRS) haplotype for hypervariable regions I and II, which is absent in previous studies of Basque mtDNA, and rare in other Spanish populations. The amount of nucleotide diversity is consistent with a sample that is predominantly haplogroup H, which is especially common in the Basque regions of Europe, due to ancient migrations and expansions out of glacial refugia. This is the first report of mtDNA diversity in an immigrant Basque population, and we find that the diversity in NW Basques can be explained by the recent history of migration, as well as the phylogeography and diversity of the major European haplogroups.

HLA Associations with Birth Date and Age and Applications to Disease Association Studies. L. Gragert1, M. Maiers1, W. Klitz2 1) Bioinformatics, National Marrow Donor Program, Minneapolis, MN; 2) Public Health, University of California, Berkeley, CA.

In order to test for HLA disparities based on birth date and age, we split a control sample of European-American donors recruited by the NMDP between 1997 and 2002 into two parts and ran HLA association studies. Comparing the HLA of donors under 50 with those over 50, we found several significant HLA associations. The HLA haplotype most associated with older age was A*02-B*18-DR*04 with an odds ratio of 0.582 and a P-value of 0.0006 after Bonferroni correction. We also found a B-DR haplotype associated with younger age, B*35-DR*02, with an odds ratio of 1.25 and a P-value of 0.013. When comparing donors born before 1950 with those born after 1950, we also detected several significant associations, some of which were distinct from the age association analysis. HLA differences based on age may come about from protective and/or predisposing effects of HLA for disease, resulting in older subjects having a different HLA makeup than younger subjects. HLA differences based on birth date may reflect demographic changes in populations over time, a result of new immigration or higher levels of admixture in younger subjects. Further investigation is needed to test hypotheses for the causes of these HLA disparities. We have previously described significant HLA associations within self-identified race/ethnic (SIRE) categories for geography, gender, and population substructure. Matching controls for all of these factors is important when conducting HLA disease association studies and must be considered part of the research protocol to avoid erroneous conclusions.

The missing heritability of a model common trait - adult height - is partially detectable by a long “polygenic tail” of common variant signals with very small effect. H. Lango Allen1, K. Estrada2, G. Lettre3, S. I. Berndt4, M. N. Weedon1, F. Rivadeneira2, G. R. Abecasis5, M. Boehnke6, C. Gieger7, D. Gudbjartsson8, N. L. Heard-Costa9, A. U. Jackson6, A. V. Smith10, N. Soranzo11, C. Willer5, A. Kumar12, A. Mahajan13, W. Rayner12, N. Robertson12, A. D. Morris14, C. N. A. Palmer14, A. G. Uitterlinden2, C. M. Lindgren12, M. I. McCarthy12, T. M. Frayling1, J. N. Hirschhorn15, The Genetic Investigation of ANthropometric Traits (GIANT) Consortium 1) Peninsula Medical School, University of Exeter, Exeter, United Kingdom; 2) Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands; 3) Montreal Heart Institute (Research Center), Université de Montréal, Montréal, Québec, Canada; 4) Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA; 5) Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA; 6) Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA; 7) Institute of Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany; 8) deCODE genetics, Reykjavik, Iceland; 9) Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, USA; 10) Icelandic Heart Association, Kopavogur, Iceland; 11) Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom; 12) Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom; 13) Institute of Genomics and Integrative Biology, CSIR, Delhi, India; 14) Biomedical Research Institute, University of Dundee, Ninewells Hospital & Medical School, Dundee, UK; 15) Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology, Boston, Massachusetts, USA.

Height is a classic, highly heritable polygenic trait. Since it is an easily and accurately measured phenotype, it is available for a large number of samples, and can be used as a model for other common traits. We previously reported results from a genome-wide association (GWA) analysis of height using 133,600 European individuals from the Genetic Investigation of ANthropometric Traits (GIANT) Consortium. We identified 118 independent common variants (defined as 1 Mb windows on either side of the lead SNP) at P < 5×10⁻⁸. These signals together contributed <10% of phenotypic variation in height, or <12.5% of the heritable component. One of the most important questions emerging from GWA studies has been the location of “missing heritability”. We hypothesized that common variant signals that individually do not reach conventional levels of GWA significance would increase the proportion of heritability explained. We took forward SNPs representing 89 independent signals at 5×10⁻⁸ < P < 5×10⁻⁶ into an in-silico replication set of 50,000 samples of European ancestry. Of the 89 SNPs, 62 reached overall genome-wide significance and 88 were directionally consistent with the initial analysis. We also assessed 227 independent signals at 5×10⁻⁸ < P < 5×10⁻⁴ in a separate sample of 7000 individuals from the UK T2D Genetics Consortium, genotyped on the Metabochip. The following numbers of SNPs showed directional consistency in these P-value ranges: 5×10⁻⁸ < P < 5×10⁻⁷: 25 out of 28 SNPs (89.3%, sign test P = 2.7×10⁻⁵); 5×10⁻⁷ < P < 5×10⁻⁶: 47/54 SNPs (87.0%, P = 2.3×10⁻⁸); 5×10⁻⁶ < P < 5×10⁻⁵: 71/83 SNPs (85.5%, P = 2.4×10⁻¹¹); 5×10⁻⁵ < P < 5×10⁻⁴: 48/61 SNPs (78.7%, P = 7.7×10⁻⁶). Consistent with these results, a deep set of independent variants (in the range of 0.05 > P > 5×10⁻⁸) accounted for up to 16.8% of phenotypic variation in height, or ~20% of the heritable component.
A second potential source of common variation that may increase the proportion of heritability explained is common variants that fall within an already identified locus, but are overlooked because they are in partial linkage disequilibrium with the confirmed variant. Conditional analysis showed that at 19 loci there is at least one additional independent signal (post-conditioning P-values were between 1×10⁻¹⁴ and 2×10⁻⁷). In summary, we provide strong evidence that there are many more additional variants associated with common polygenic traits among loci that miss the strict genome-wide significance threshold.
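The sign-test P-values quoted in the height abstract can be checked with a few lines of Python. This is a minimal sketch, assuming the authors used an exact two-sided binomial sign test against a 50% null (the abstract does not say which variant was used); the helper name `sign_test_two_sided` is introduced here for illustration. Under that assumption the results agree closely with the quoted values. The last lines work through the heritability arithmetic: if <10% of phenotypic variance corresponds to <12.5% of the heritable component, the implied heritability is about 0.8, which puts 16.8% of phenotypic variance at about 21% of the heritable component, in line with the quoted ~20%.

```python
from math import comb


def sign_test_two_sided(consistent: int, total: int) -> float:
    """Exact two-sided binomial sign test against a 50% null (assumed variant)."""
    k = max(consistent, total - consistent)
    # Probability of k or more successes out of `total` fair coin flips.
    tail = sum(comb(total, i) for i in range(k, total + 1)) / 2 ** total
    return min(1.0, 2 * tail)


# (directionally consistent SNPs, total SNPs) per P-value bin, as quoted above.
bins = {
    "5e-8 < P < 5e-7": (25, 28),
    "5e-7 < P < 5e-6": (47, 54),
    "5e-6 < P < 5e-5": (71, 83),
    "5e-5 < P < 5e-4": (48, 61),
}

for label, (consistent, total) in bins.items():
    p = sign_test_two_sided(consistent, total)
    print(f"{label}: {consistent}/{total} = {consistent / total:.1%}, P = {p:.1e}")

# Heritability implied by "<10% of phenotypic variation, or <12.5% of the
# heritable component", then applied to the 16.8% figure.
implied_h2 = 0.10 / 0.125
share_of_heritable = 0.168 / implied_h2
print(f"implied heritability = {implied_h2:.2f}")
print(f"16.8% of phenotypic variance = {share_of_heritable:.0%} of heritable component")
```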

Balancing ethics and genetics: classifying individuals by their ancestry groups. R. Zuvich1,2, E. W. Clayton3, M. Basford4, J. Denny5,6, D. M. Roden6,7, J. L. Haines1,2, M. D. Ritchie1,2 1) Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN; 2) Center for Human Genetics Research, Vanderbilt University, Nashville, TN; 3) Center for Biomedical Ethics and Society, Vanderbilt University, Nashville, TN; 4) Office of Personalized Medicine, Vanderbilt University, Nashville, TN; 5) Department of Biomedical Informatics, Vanderbilt University, Nashville, TN; 6) Department of Medicine, Vanderbilt University, Nashville, TN; 7) Department of Pharmacology, Vanderbilt University, Nashville, TN.

Individuals with different genetic ancestry differ in allele frequencies of various single nucleotide polymorphisms (SNPs), linkage disequilibrium patterns, and disease susceptibility, all of which affect analysis of the data and interpretation of the results. Therefore, characterization of an individual’s genetic ancestry is imperative to minimize spurious association results. The Vanderbilt DNA Databank (BioVU) is a DNA repository of >85,000 DNA samples (both adult and pediatric), which are de-identified and linked to electronic medical records. In a preliminary study, we genotyped 360 SNPs using the Illumina DNA Test Panel, which contains ancestry informative markers (AIMs) on ~1,500 DNA samples from patients whose race/ethnicity was identified by hospital personnel (“observer-reported”). Using an ancestry proportion threshold of 90%, there was 95.7% concordance in people of European ancestry between the observer-reported race/ethnicity and the ancestry group identified by the AIMs; while people of African ancestry had far less concordance (22%). However, lowering the ancestry proportion threshold to 75% increases African ancestry concordance to 74%. This is due to the expected proportion of admixture in African American individuals. About 20% of samples in BioVU do not have a race/ethnicity noted in the EMR and characterization of these samples using AIMs would be of great utility. We genotyped an additional ~7,900 DNA samples from BioVU using the DNA Test Panel to characterize genetic ancestry and determine whether using a smaller subset of AIMs yields the same classifications as the full set of AIMs. While characterizing genetic ancestry is important for association studies, this process raises many ethical and social implications about “labeling” patients based on genetic markers. 
Additionally, with admixed populations, such as those in the United States, AIMs are limited in their ability to explore genetic ancestry and provide no data to describe socially defined categories of race and ethnicity. We suggest that including AIMs in the EMR will be important to interpret and apply the results of genetics research in the clinic, but that it will be critical to declare that these markers of genetic ancestry do not necessarily correspond to a particular race or ethnicity.
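The dependence of concordance on the ancestry proportion threshold can be illustrated with a toy calculation. This is only a sketch: the function names `classify` and `concordance` and the four sample proportions are hypothetical, chosen to show the mechanism, and are not data from the study. A sample is assigned to its largest ancestry component only if that component meets the threshold, so admixed individuals drop out of their observer-reported group at a strict threshold and rejoin it at a looser one.

```python
from typing import Dict, List, Optional, Tuple


def classify(proportions: Dict[str, float], threshold: float) -> Optional[str]:
    """Assign the largest ancestry component, or None if it is below threshold."""
    group, share = max(proportions.items(), key=lambda item: item[1])
    return group if share >= threshold else None


def concordance(samples: List[Tuple[str, Dict[str, float]]], threshold: float) -> float:
    """Fraction of samples whose AIM-based group matches the observer-reported label."""
    matches = sum(1 for label, props in samples if classify(props, threshold) == label)
    return matches / len(samples)


# Hypothetical samples, all observer-reported as "African", with varying admixture.
toy_samples = [
    ("African", {"African": 0.95, "European": 0.05}),
    ("African", {"African": 0.85, "European": 0.15}),
    ("African", {"African": 0.80, "European": 0.20}),
    ("African", {"African": 0.70, "European": 0.30}),
]

for threshold in (0.90, 0.75):
    print(f"threshold {threshold:.0%}: concordance {concordance(toy_samples, threshold):.0%}")
```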

Family-based genome-wide association study for length or height from infancy to early childhood. H. Kim1, E. Lee2, H. Kim1, S. Jung1, B. Han2, J. Lee2, H. Chung3 1) Department of Biochemistry, School of Medicine, Ewha Womans University, Seoul, Korea; 2) Center for Genome Science, Korea National Institute of Health, Korea Centers for Disease Control and Prevention, Seoul, Korea; 3) Department of Obstetrics and Gynecology, School of Medicine, Ewha Womans University, Seoul, Korea.

With increasing intermarriage and the number of admixed infants, understanding the anthropometric variation of admixed infants and children is important. To identify genetic factors that influence infant and childhood height, family-based genome-wide association analyses were conducted using 269,888 single nucleotide polymorphisms (SNPs) in 165 trios composed of a Korean father, a Vietnamese mother, and Vietnamese-Korean offspring of a marriage-based immigrant cohort in Korea. In a single-SNP-based analysis, six SNPs in or near the genes BMP4, MAF, MAGI2, and PTPN7 showed consistent suggestive associations at all height standard deviation scores using Korean, World Health Organization, and Vietnamese growth references. We did not find genome-wide significant associations with height after multiple-testing correction in the single-SNP-based analysis. However, the haplotypes in the linkage disequilibrium blocks containing the SNPs near the suspected loci were significantly associated with height. Similar to the results of contiguous haplotype analysis using tagged SNPs, noncontiguous haplotypes of variable length also showed a significant association near the suspected loci. Our results demonstrate that infant and childhood height may be regulated by genetic variations that differ from those of adults and remind us of the need to look at human height from a different perspective, namely age. This study is the first genetic association analysis on cross-sectional infant and childhood height in admixture families, and it provides a basis for future investigations into the genes acting at each stage of height growth.

Music as a novel marker in the study of prehistoric human migrations. T. Rzeszutek1, P. Savage1, V. Grauer2, S. Brown1 1) Psychology, Neuroscience & Behaviour, McMaster University, Hamilton, Ontario, Canada; 2) Independent Scholar, Pittsburgh, PA.

The study of prehistoric human population history is often fraught with controversy owing to incongruent evidence among various markers of present-day genetic and cultural diversity. While archaeological evidence can be used to calibrate the conclusions drawn from present-day diversity, the fickle nature of the fossil record leaves some migration histories unresolved. Our work analyzes the potential of music - in particular, vocal music - to serve as a novel migration marker, bolstering established migration work and shedding light on regions of the world whose settlement history is contested. One such migration is the recent expansion of Austronesian-speaking peoples across the Pacific within the last 6000 years. The dominant hypothesis posits a recent origin in Taiwan, with a rapid movement southwards and eastwards to populate Polynesia during the following 3500 years. While this model is strongly supported by both archaeological evidence and the present-day distribution of linguistic diversity, our goal was to analyze whether music could serve as a novel line of evidence in the study of Pacific prehistory. A critical concern regarding any migration marker is its time depth. In order to examine this for music, we analyzed correlations between musical diversity and mitochondrial-DNA diversity in 9 Taiwanese aboriginal tribes for which both types of data were available. A sample of 226 choral songs was analyzed using 39 binary characters representing significant structural features of music (e.g., rhythm, interval size, melodic contour, etc.). The musical samples were restricted to ritual musics, which constitute the most conservative (i.e., slowly changing) component of a culture’s repertoire. Mantel tests showed a significant correlation between musical distance and genetic distance among these 9 tribes, suggesting that music may have a time depth comparable to widely-used genetic markers like mitochondrial DNA.
This work demonstrates that music has the potential to enrich the conclusions drawn from other markers, and establishes methods for employing it as a tool in the study of prehistoric human movements throughout the world. At the same time, we want to capitalize on music’s own unique dynamics of change over time and place, particularly its capacity for admixture. In other words, music might not only be able to support the narratives told by other migration markers but shed new light on the histories of population movement and cultural contact.
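For readers unfamiliar with the Mantel test mentioned in the music abstract, the idea is to correlate the entries of two distance matrices and judge significance by repeatedly permuting the row and column labels of one matrix. The following is a minimal sketch of a simple one-sided permutation Mantel test in pure Python, not the authors' actual procedure; the function names and the toy distance matrices (nine hypothetical groups placed on a line) are assumptions for illustration only.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def upper_triangle(matrix: Matrix, order: Sequence[int]) -> List[float]:
    """Flatten the above-diagonal entries after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(dist_a: Matrix, dist_b: Matrix,
                permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-sided permutation P-value) for two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    flat_a = upper_triangle(dist_a, identity)
    observed = pearson(flat_a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the groups in matrix B only
        if pearson(flat_a, upper_triangle(dist_b, perm)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: nine hypothetical groups on a line; "musical" distance is simply
# twice the "genetic" distance, so the two matrices are perfectly correlated.
positions = [0, 1, 3, 7, 12, 20, 33, 54, 88]
genetic = [[abs(a - b) for b in positions] for a in positions]
musical = [[2.0 * d for d in row] for row in genetic]

r, p = mantel_test(genetic, musical)
print(f"Mantel r = {r:.3f}, permutation P = {p:.3f}")
```

Permuting labels rather than individual matrix entries matters because the entries of a distance matrix are not independent: each group contributes to a whole row and column.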
