race/history/evolution notes: ASHG 2010: South Asian genetics

Selected abstracts from the 2010 meeting of the American Society of Human Genetics.

The genetic structure of South Asian populations as revealed by 650 000 SNPs. M. Metspalu1, G. Chaubey1, B. Yunusbayev1,2, I. Gallego Romero4, M. Karmin1, C. Basu Mallick1, E. Metspalu1, K. Thangaraj3, L. Singh3, S. Shanmugalakshmi6, K. Balakrishnan6, R. Pitchappan5, T. Kivisild4,1, R. Villems1 1) Dept. of Evolutionary Biology, Estonian Biocentre and Tartu University, Tartu, Estonia; 2) Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences, Ufa 450054, Russia; 3) Centre for Cellular and Molecular Biology, Hyderabad, India; 4) Leverhulme Centre of Human Evolutionary Studies, The Henry Wellcome Building, University of Cambridge, Fitzwilliam Street, Cambridge, CB2 1QH, UK; 5) Department of Immunology, School of Biological Sciences, Madurai Kamaraj University, India; 6) School of Biotechnology, Bharathidasan University, Trichirappalli, India.

The onset of the era of analyses of dense marker sets covering the whole genome has revolutionised the field of (human) population genetics. Driven largely by the needs of biomedical research, the new data is helping to unveil our demographic past outlined by the study of mtDNA and Y-chromosome variation during the past ca. 20 years. Here we have analysed (Illumina 650K SNPs) over 320 new samples from South and Central Asia and the Caucasus together with the publicly available databases (HGDP panel and our published dataset of ca. 600 Eurasian samples) and illustrate the power of full genome analyses by addressing two specific questions: i) the nature of genetic continuity and discontinuity between South Asia, the Middle East and Central Asia, and ii) genetic origins of the Munda speakers of India. We use principal component and structure-like analyses to reveal the structure in the genome-wide SNP data. The most striking feature of the genetic structure of South Asian populations is the clear separation of the Indus valley and southern India populations. The genetic component prevalent in the latter region is marginal in the former and absent outside South Asia. The component ubiquitous in the Indus valley is, in contrast, also present (ca. 30 - 40%) among Indo-European speakers of the Ganges valley and Dravidic speakers in southern India. Furthermore, this component can also be found in Central Asia and the Caucasus as well as in the Middle East. We explore possibilities to identify the source region for this genetic component. Alternative models put the origins of Munda language speakers either in South Asia (the Munda speakers sport exclusively autochthonous South Asian mtDNA variants) or in Southeast Asia, where the other Austro-Asiatic languages are spoken. Y-chromosome variation supports the latter model through sharing of hg O2a in both regions. We show that in addition to the dominant ancestry component shared between the Indian Dravidic and Munda speakers, the latter retain (up to 30%) an ancestry component otherwise prevalent in East Asia. There is no widespread sign of a South Asian ancestry component in Southeast Asia. This provides genomic support to the model by which Indian Austro-Asiatic populations derive from dispersal from Southeast/East Asia followed by an extensive admixture with local Indian populations.

A genome-wide survey of the history of the Roma people. P. Moorjani1, N. Patterson2, M. Bonin3, O. Riess3, D. Reich1,2,4, B. Melegh5 1) Dept Gen, Harvard Medical School, Boston, MA; 2) Broad Institute, Cambridge, MA; 3) Department of Medical Genetics, University Tuebingen, Tübingen, Germany; 4) Harvard School of Public Health, Boston, MA; 5) Department of Medical Genetics, University of Pécs, Pécs, Hungary.

Previous linguistic studies, as well as studies of mitochondrial DNA and the Y chromosome, have suggested that the European Roma or Romani people are a group of endogamous founder populations that have inherited genetic material from South Asians. However, no previous studies have been able to characterize the date and proportion of mixture. To understand the origin and population history of the Roma, we have analyzed data for about a million single nucleotide polymorphisms (SNPs) and confirm evidence for their South Asian origin. We show that the Roma, like other Indian populations, inherit ancestry from both ‘Ancestral South Indians’ (ASI) and ‘Ancestral North Indians’ (ANI), the latter being closely related to West Eurasians. To estimate the date of the mixture, we develop a novel method that uses summary statistics of admixture linkage disequilibrium. We estimate that the average number of generations since mixture is 29 ± 2, or about 800 years. In addition, we characterize the effect of founder events by calculating a genomic measure of individual homozygosity. We observe that the Roma have significantly higher autozygosity compared to other European populations, which likely contributes to the higher prevalence of recessive disorders seen in this population.
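The conversion from generations to calendar years in the abstract above can be sketched as simple arithmetic. The abstract does not state the generation time used, so the 28 years per generation below is an assumption chosen only because it lands near the quoted figure of about 800 years. The function and constant names are hypothetical.

```python
# ASSUMPTION: mean generation time in years. The abstract does not give this
# value; 28 is picked here only because 29 generations then gives ~800 years.
ASSUMED_YEARS_PER_GENERATION = 28


def generations_to_years(generations, years_per_generation=ASSUMED_YEARS_PER_GENERATION):
    """Convert a number of generations to years under a fixed generation time."""
    return generations * years_per_generation


# Point estimate and the +/- 2 generation bounds quoted in the abstract.
point_years = generations_to_years(29)
low_years = generations_to_years(29 - 2)
high_years = generations_to_years(29 + 2)

print(f"point estimate: {point_years} years")
print(f"range: {low_years} to {high_years} years")
```

A different assumed generation time shifts the result proportionally, so the year figure is only as reliable as that assumption.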

The Genetic Canvas of European Roma. M. Karmin1, M. Baldovič2, N. Jeran3, S. Cvjetan4, M. Reidla1, T. Šaric3, J. Šarac3, M. Cenanovic6, A. Leskovac5, D. Marjanovic6, H. D. Auguštin3, A. Ficek2, G. Chaubey1, S. Rootsi1, V. Ferak2, R. Pavao3, E. Metspalu1, D. M. Behar1, R. Villems1 1) Estonian Biocenter and Department of Evolutionary Biology, University of Tartu, Tartu, Estonia; 2) Comenius University, Bratislava, Slovakia; 3) Institute for Anthropological Research, Zagreb, Croatia; 4) Mediterranean Institute for Life Sciences, Split, Croatia; 5) Vinca Institute of Nuclear Sciences, Belgrade, Serbia; 6) Institute for Genetic Engineering and Biotechnology, Sarajevo, Bosnia.

According to linguistic evidence, the Indian exodus of the ancestors of the European Roma took place most probably around the end of the first millennium. By the 13th - 15th centuries, different groups of Roma had spread throughout Europe. The virtual lack of written records prior to their arrival in Europe has left us with scarce knowledge about their historical migratory routes. Therefore, valuable insight may come from archaeogenetic studies. Here we report the results of a combined mtDNA, NRY and autosomal study of the genetic variation of Roma (Gypsies). We have studied variation of close to 600 Roma mtDNAs from six European countries, including 60 complete mitochondrial genomes from various populations of European Roma, India and the Near East. The most common Indian-specific maternal lineage among Roma is M5a1b1. Reaching from 5% to 35% in different Roma communities, it is, at the same time, present in various linguistically, socially and geographically different populations of India. The analysis of complete mitochondrial genomes shows that U3b1 and X2e1 lineages found in Roma have their closest relatives in the Near East and X2d1 lineages in the Caucasus, suggesting that at least a part of their West Eurasian-specific matrilineages has been picked up by the Roma en route before reaching Europe. We have analyzed over 250 Y-chromosomes from different Roma communities for their Y-STR and Y-SNP variation. Our results confirm previous studies, identifying NRY hg H1a as a clear sign of the Indian-specific contribution within the Y-chromosomal pool of Roma. Whole genome analysis of Roma, carried out using Illumina BeadArrays in the context of Eurasian and North African populations, revealed that Roma individuals exhibit varied levels of shared South Asian - West Eurasian ancestry. Taken together, our mtDNA, Y-chromosomal and genome-wide (GW) analysis of European Roma populations in the context of Indian, Middle Eastern and European populations offers new insights into the demographic history of Roma populations.

Inference of human expansion in Eurasia and genetic diversity in India. J. Xing1, W. S. Watkins1, Y. Hu2, C. D. Huff1, A. Sabo2, D. M. Muzny2, R. A. Gibbs2, L. B. Jorde1, F. Yu2 1) Dept of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT; 2) Human Genome Sequencing Center, Dept of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX.

Genetic studies of populations from the Indian subcontinent are of great interest for many reasons, including India’s large population size, its complex demographic history, and its unique social structure. Despite recent large-scale efforts in discovering human genetic variation, India’s vast reservoir of genetic diversity remains largely unexplored. To address this issue and to study human migration history in Eurasia, we resequenced one of the 100 kb ENCODE regions in 92 samples collected from four groups - representing three castes and one tribal group - from the state of Andhra Pradesh in south India. By comparing the four south Indian populations with eight HapMap populations that are sequenced for the same region, we found that more than 15% of the total SNPs in the twelve populations in this region are Indian-specific (including HapMap GIH), and 30% of all SNPs in the south Indian populations are not seen in any HapMap population. For this 100 kb region, several Indian population samples, such as middle-caste Yadava, and lower-caste Mala/Madiga, have nucleotide diversity as high as HapMap African populations. In contrast to many other Eurasian populations, the diversity levels in the Indian samples are not correlated with their geographic distances from eastern Africa. Using the unbiased allele-frequency spectrum from twelve Old World populations, we were also able to investigate the divergence and expansion of human populations in Eurasia. The divergence time estimates among continental groups suggest that all Eurasian populations in this study diverged from Africans during the same time frame (~100-120 thousand years ago). The divergence times among the individual Eurasian populations were more than 40 thousand years later than their divergence with Africans, supporting the long-term existence of an ancestral Eurasian founding population after the out-of-Africa diaspora.
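Nucleotide diversity, the statistic compared between the Indian and HapMap African samples above, is conventionally defined as the average number of pairwise differences per site among sampled sequences. The following is a minimal sketch of that standard definition on made-up toy haplotypes. It is not the authors' pipeline, and the sequences and function name are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among equal-length sequences."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("all sequences must have the same length")

    pairs = list(combinations(haplotypes, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_differences = sum(
        sum(1 for a, b in zip(first, second) if a != b)
        for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy data only (ASSUMPTION): four aligned 5-bp haplotypes, not real samples.
toy_haplotypes = ["AACGT", "AACGA", "TACGT", "AACGT"]
pi = nucleotide_diversity(toy_haplotypes)
print(f"nucleotide diversity: {pi:.3f}")
```

In this toy set the six pairs differ at 6 positions in total, so the value is 6 / (6 × 5) = 0.2 per site.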

Prevalence of lysosomal storage disorders in India: Our experience. J. Sheth1, M. Mistri1, N. Oza1, U. Dave1, P. Gambhir2, R. Shah1, F. Sheth1 1) Biochem & Molec Bio, Inst Human Gen, Ahmedabad, India; 2) Sasoon General Hospital, Pune, India.

With a population of 1.2 billion, a birth rate of about 29 million per year, and consanguineous marriages in many parts of the country, the prevalence of storage disorders is considered to be high in India. However, due to overlapping clinical phenotypes and the lack of therapeutic options for the majority of LSDs, the subject remains investigative, with less interest from clinicians. On the other hand, with the recent availability of enzyme replacement therapy for some of the storage disorders, there is a growing interest among clinicians in an early diagnosis of these diseases. The present study was carried out in 604 children in the age range of 3 months to 12 years with variable phenotypes like coarse facial features, hepatomegaly, neuroregression, epiphyseal skeletal abnormality, corneal clouding, cherry red spot, impaired motor neuron function/hypotonia, respiratory complications with progressive muscular weakness and regression of learned skills/milestones. All were investigated for seven mucopolysaccharide disorders, glycolipid and lipid storage disorders (Tay Sachs, NPD, Gaucher), storage of sulphatides (MLD and Krabbe), glycogen storage (Pompe), defects in lysosomal transporters (Sialic acid storage disorder) and lysosomal trafficking protein abnormality (Mucolipidosis and NPD-C). Enzyme studies were carried out on leucocytes, plasma and skin fibroblasts using fluorometric and spectrophotometric substrates. 208/604 (34.44%) children were found to have storage disorders: MPS (36.53%), glycolipid and lipid storage (38.46%), sulphatides accumulation (10.57%), glycogen storage (9.66%), lysosomal transporter abnormality (1.93%) and lysosomal trafficking protein abnormality (2.9%). Impaired motor neuron function/hypotonia and neuroregression were the most common phenotypes, observed in 58.65% and 42.78% of children with storage disorders, respectively. Other phenotypes were hepatosplenomegaly (28.36%), coarse facial features (29.32%), cherry red spot (32.21%), regression of learned skills/delayed milestones (32.21%), skeletal abnormality (17.78%), respiratory complications with progressive muscle weakness (9.6%) and corneal clouding (2.4%). Our study suggests that glycolipid and lipid storage disorders are the most common storage disorders in India, followed by MPS, and that this information can be utilized for the national screening programme of storage disorders like Gaucher and NPD A/B to offer an early therapeutic intervention with improved prognostication.
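The category percentages in the abstract above can be checked against the 208 diagnosed children with a short calculation. The abstract reports only percentages, so the per-category counts produced below are inferred by rounding and are an assumption, not reported data. The variable names are hypothetical.

```python
DIAGNOSED = 208   # children found to have a storage disorder (from the abstract)
SCREENED = 604    # children investigated (from the abstract)

# Category shares among the diagnosed children, as quoted in the abstract.
reported_percent = {
    "MPS": 36.53,
    "glycolipid and lipid storage": 38.46,
    "sulphatides accumulation": 10.57,
    "glycogen storage": 9.66,
    "lysosomal transporter abnormality": 1.93,
    "lysosomal trafficking protein abnormality": 2.9,
}

# ASSUMPTION: counts are back-calculated by rounding; the abstract gives none.
implied_counts = {
    name: round(percent / 100 * DIAGNOSED)
    for name, percent in reported_percent.items()
}
implied_total = sum(implied_counts.values())
overall_percent = round(DIAGNOSED / SCREENED * 100, 2)

for name, count in implied_counts.items():
    print(f"{name}: about {count} children")
print(f"implied total: {implied_total}")
print(f"overall detection rate: {overall_percent}%")
```

Under this rounding the implied counts sum back to 208, which suggests the categories are mutually exclusive and cover all diagnosed children.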
