Detecting patterns of global and local positive selection by
examining novel variants in the exomes of 7 world-wide human populations
Laura Botigué1, Jeff Kidd2, Brenna Henn1
1Stony Brook University, Stony Brook, New York, USA, 2University of Michigan, Ann Arbor, Michigan, USA
Recent efforts to identify adaptive loci in humans relied
primarily on single nucleotide polymorphism array data. For many global
populations however, these datasets suffered from ascertainment bias
and did not allow for the identification of novel, adaptive variants
unique to different populations. In this study we use high coverage
exomes and low coverage full genomes from over 50 individuals from 7
human populations of geographically divergent groups from Namibia,
Congo, Algeria, Pakistan, Cambodia, Siberia and Mexico to differentiate
between local and global adaptation. We additionally apply the same
approach to examining 1000 Genomes data. In order to minimize the effect
of demography, we compare the site frequency spectrum of putatively
functional variants with the neutral site frequency spectrum as
estimated from synonymous sites or intergenic loci. We specifically
hypothesize that derived variants with a large predicted functional
impact found at high frequencies are not deleterious and potentially
beneficial. We further hypothesize that derived variants common across
populations are good candidates for adaptive traits common to the
human species, whereas variants that are at high frequency but
population specific are indicative of local adaptation. When we consider
only variants with an extreme functional effect, as predicted by GERP
scores, a total of 6% are shared across all populations, and 16% are
private to a given population at frequencies higher than 10%. We obtain a
subset of candidate genes under selection based on these hypotheses and
assess common features among them using gene ontology. Overall, our results
may shed light on human adaptation at both the species level and the
local level, and improve our understanding of how exposure to
new environmental pressures affected early human expansion across the
globe.
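The comparison of site frequency spectra described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy allele counts, and the use of a simple difference in high-frequency proportions are all assumptions made for illustration. Only the 10% frequency threshold is taken from the abstract.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS as proportions of segregating sites, for derived counts 1..n-1."""
    # Keep only segregating sites (derived allele neither absent nor fixed).
    counts = Counter(c for c in derived_counts if 0 < c < n_chromosomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no segregating sites in input")
    return [counts.get(k, 0) / total for k in range(1, n_chromosomes)]


def high_frequency_excess(functional, neutral, n_chromosomes, min_freq=0.10):
    """Proportion of functional sites above min_freq minus the same proportion
    for the neutral reference class (hypothetical summary statistic)."""
    def tail(sfs):
        return sum(p for k, p in enumerate(sfs, start=1)
                   if k / n_chromosomes > min_freq)
    return (tail(site_frequency_spectrum(functional, n_chromosomes))
            - tail(site_frequency_spectrum(neutral, n_chromosomes)))


# Toy derived-allele counts in a sample of 10 chromosomes (made-up data).
functional_counts = [1, 1, 5, 9]
neutral_counts = [1, 1, 1, 2]
excess = high_frequency_excess(functional_counts, neutral_counts, 10)
```

Because both spectra come from the same individuals, demographic history affects them alike, so a positive excess points to a difference between the functional and neutral classes rather than to demography alone.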
Inference of local ancestry in archaic-modern human admixture and its impact on modern human evolution
Sriram Sankararaman1,2, Swapan Mallick1,2, Michael Danneman3, Kay Prufer3, Janet Kelso3, Svante Paabo3, Nick Patterson1,2, David Reich1,2
1Harvard Medical School, Boston, USA, 2Broad Institute, Cambridge, USA, 3Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Analysis of archaic genomes has documented several examples of
admixture between archaic and modern human groups; for example, these analyses
have revealed that Neandertals interbred with the ancestors of all
non-Africans and the Denisovans interbred with the ancestors of
present-day Melanesians. To understand how these admixture events
shaped the evolution of modern humans, we need to build maps of archaic
ancestry in modern humans.
As a first step, we have developed a statistical method for inferring segments of Neandertal local ancestry in modern humans and applied this method to construct a map of Neandertal ancestry in modern non-Africans, using data from Phase 1 of the 1000 Genomes Project combined with a high-coverage (50×) Neandertal genome. This map reveals the adaptive impact of Neandertal gene flow: we find enhanced Neandertal ancestry in genes involved in keratin filament formation as well as in other biological pathways. We also observe large regions with reduced Neandertal ancestry, consistent with purifying selection against introgressing Neandertal alleles, in part because these alleles contributed to hybrid male sterility.
To extend this approach to other archaic-modern human introgression events, we generated deep genome sequences of 21 people from populations with substantial Denisovan ancestry: 16 Papua New Guineans, 2 Bougainville Islanders, and 3 aboriginal individuals from Australia. We also extend our method to infer Neandertal and Denisovan local ancestry in these populations. We test whether the same evidence for hybrid male sterility is observed in this introgression event as is observed between Neandertals and modern humans.
Fine Atlas of Natural Selection in Human Genome
Hang Zhou1, Sile Hu1, Rostislav Matveev2, Kun Tang1
1CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Science, Shanghai 200031, China, 2Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
In the last few years, a number of genome-wide scans for signals of recent natural selection have been carried out on the human genome. These studies have greatly furthered our understanding of recent human evolution. Nonetheless, some key issues remain largely unresolved. In this study, we developed several coalescent-based likelihood tests that collectively assign all genome fragments to modes of neutrality or of negative, balancing or positive selection, and simultaneously estimate the selection time and coefficient. Simulations showed that this workflow has high power to detect various forms of non-neutral evolution while remaining highly robust to demographic factors. Here we report a fine atlas of natural selection in the human genome, obtained by analyzing the 1000 Genomes data. We detected several hundred regions that have undergone positive selection, along with a number of regions that have undergone negative or balancing selection. We functionally annotated the genes under selection in each category and found that they are enriched in certain functional groups. We also found that the selection times of positively selected genes are highly heterogeneous across functional categories. We further evaluated selection pressure in ENCODE-predicted regulatory elements. Selection pressure was highest in promoter regions, whereas introns and repressed or low-activity regions showed a markedly weaker influence of selection. The spatial distribution of the signals revealed that TSS and CDS lie at the center of the selection signals. Given the fine resolution of these signals, we are now working to understand the different selection pressures our ancestors encountered during the course of recent migration, local adaptation and social transitions.
Inference of selection using extended haplotype homozygosity on polygenic traits
Angeles de Cara, Frederic Austerlitz
Muséum National d'Histoire Naturelle, Paris, France
The fast-growing amount of genome-wide polymorphism data
available has led to considerable efforts for developing methods to
detect the footprints of natural selection at the molecular level.
Finding regions under selection is one of the first steps to understand
the process of adaptation and speciation. Our ability to detect
selection at the molecular level depends critically on the type of data
available and on the robustness of the methods to the underlying
assumptions. Several commonly used methods consist in looking for FST
outlier loci, which are considered to be under selection. However, it
has been shown that these methods fail to clearly identify loci under
weak selection. Conversely, some neutral markers can be inferred to be
under selection (false positives). We study here the efficiency of a
recent method to infer selection, iHS, in simulated data where we
perform artificial selection on a polygenic trait under several genetic
architectures. This iHS method is based on the idea that positive
selection on a given position in the genome will create a region of
extended homozygosity around this position. Our results show that this
method seems to only work when selection is strong and acts on a single
locus, while it fails to identify loci under selection when selection
acts simultaneously on many loci.
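The extended haplotype homozygosity (EHH) underlying iHS can be sketched as follows. This is a minimal illustration, not the authors' simulation code. It assumes phased haplotypes encoded as equal-length strings of "0" (ancestral) and "1" (derived), and the function name and toy data are hypothetical. The full iHS statistic additionally integrates EHH over distance for each core allele and standardizes the log ratio, which is omitted here.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, end_index):
    """Probability that two randomly chosen chromosomes carrying core_allele
    are identical at every site between core_index and end_index (inclusive)."""
    lo, hi = sorted((core_index, end_index))
    # Restrict to carriers of the core allele and cut out the interval of interest.
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        raise ValueError("need at least two carrier chromosomes")
    # Count identical pairs within each distinct haplotype class.
    classes = Counter(carriers)
    return sum(comb(n, 2) for n in classes.values()) / comb(len(carriers), 2)


# Toy phased sample (made-up data): five chromosomes, four sites, core at site 0.
sample = ["1100", "1100", "1101", "0010", "0011"]
derived_near = ehh(sample, 0, "1", 1)
derived_far = ehh(sample, 0, "1", 3)
ancestral_far = ehh(sample, 0, "0", 3)
```

A strongly selected allele on a single locus is expected to keep EHH high over long distances. When selection is spread over many loci, each allele changes frequency more slowly and recombination has more time to erode this signal, which is consistent with the loss of power reported in the abstract.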
Soft shoulders ahead: on the problem of differentiating between hard and soft sweeps
Daniel Schrider1, Mendes Fabio2, Matthew Hahn2, Andrew Kern1
1Department of Genetics, Rutgers University, Piscataway, NJ, USA, 2Department of Biology, Bloomington, IN, USA
Characterizing the nature of the adaptive process at the
genetic level is a central goal for population genetics. In particular,
we know little about the sources of adaptive substitution. Historically,
population geneticists have focused attention on the hard sweep model
of adaptation in which a de novo beneficial mutation arises and
rapidly fixes in a population (e.g. Maynard Smith and Haigh 1974).
Recently more attention has been given to soft sweep models, in which
alleles that were previously neutral, or nearly so, drift until such a
time as the environment shifts and their selection coefficient changes
to become beneficial (e.g. Hermisson and Pennings 2005). It remains an
active and difficult problem however to tease apart the tell-tale
signatures of hard vs. soft sweeps in genomic polymorphism data. Through
extensive simulations of hard and soft sweep models, we show that
indeed the two might not be separable through the use of univariate
summaries of the site frequency spectrum or a recent class of haplotype
based statistics that has been introduced. In particular it seems that
recombination in regions linked to, but distant from, sites of hard
sweeps can create patterns of polymorphism that closely mirror what is
expected to be found near soft sweeps. We show that regions flanking
hard sweeps also resemble partial sweeps, where an allele has begun
sweeping to high frequency but not reached fixation. This problem of
“soft shoulders” suggests that we currently have only a very limited
ability to differentiate soft vs. hard vs. partial sweep scenarios in
molecular population genomics data. We propose an approach that can
distinguish these “shoulders” from true targets of selection.
Signatures of selection surrounding large insertions and
deletions in coding regions identified in the hominid lineage
genome-wide.
Wilfried Guiblet1,2, Kai Zhao3, Daysha Ferrer-Torres1, Christina Ruiz-Rodriguez1,3, Alfred Roca3, Steven Massey4, Juan Martinez-Cruzado1, Taras Oleksyk1
1University of Puerto Rico at Mayaguez, Puerto Rico, Puerto Rico, 2IBIOS Graduate Program Option In BioInformatics and Genomics, The Huck Institute of Life Sciences, Pennsylvania State University, University Park, Pennsylvania, USA, 3Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA, 4Department of Biology, University of Puerto Rico at Rio Piedras, San Juan, Puerto Rico, Puerto Rico, 5Caribbean Genome Center, Biology Department, University of Puerto Rico at Mayaguez, Mayaguez, Puerto Rico, Puerto Rico
Genes have highly conserved sequences and usually show very few
differences between closely related species, such as human and nonhuman
primates. In this study, we focused on >10 bp insertions and
deletions (indels) found when comparing modern human, chimpanzee,
gorilla, orangutan, and rhesus macaque reference genome sequences, with
the purpose of testing indel flanking regions for the signatures of
selection. From 36,422 indels identified by comparing reference genomes
pairwise, we chose 151 indels within coding regions because of the
potentially high impact on protein sequence. Twenty-two of these
fragments had been earlier validated in the laboratory by PCR and
electrophoresis to distinguish real features from computational
artifacts. Ka-Ks values for the genes containing each of these fragments
were computed in pairwise comparison across the hominid lineage. We
also searched for and identified indels within candidate chromosomal
regions showing signals of positive selection, i.e., displaying
unusually low multilocus heterozygosity and high divergence (FST) in
pairwise comparisons between populations or continental groups from the
Human Genome Diversity Panel (HGDP). The comparisons were performed on
populations that geographically were placed along the modern human
migration routes of the Out of Africa event. Our findings were evaluated
against random expectations by a resampling method, where exactly the
same procedures and tests were performed with a dataset of randomly
positioned indels matched by size, distributed across the human
reference genome. The genes examined in our study may have been shaped
by selection in the human or other primate lineages, thus adding to our
understanding of recent human evolution. Some of these may reflect
adaptation to disease, and enable discoveries in future biomedical
studies.
Genetic Origins of Lactase Persistence and the Spread of Pastoralism in Africa
Alessia Ranciaro1, Michael C. Campbell1, Jibril B. Hirbo1, Wen-Ya Ko1, Alain Froment2, Paolo Anagnostou3, Maritha J. Kotze4, Muntaser Ibrahim5, Thomas Nyambo6, Sabah A. Omar7, Sarah A. Tishkoff1,8
1University of Pennsylvania, Philadelphia, PA, USA, 2UMR 208, IRD-MNHN, Musée de l’Homme, 75116 Paris, France, 3Dipartimento di Biologia Ambientale, Universita’ La Sapienza, Roma, Italy, 4Division of Anatomical Pathology, Department of Pathology, Faculty of Health Sciences, University of Stellenbosch, Tygerberg, 7505, South Africa, 5Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, 15-13 Khartoum, Sudan, 6Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania, 7Kenya Medical Research Institute, Centre for Biotechnology Research and Development, 54840-00200 Nairobi, Kenya, 8Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
In humans the ability to digest the sugar in milk,
lactose, declines after weaning because of decreasing levels of the
enzyme lactase-phlorizin hydrolase (LPH) coded for by the LCT
gene. However, some individuals maintain the ability to digest lactose
into adulthood, a trait known as lactase persistence (LP). It is thought that
selection has played a major role in maintaining this
genetically-determined trait (LP) in different human populations who
practice pastoralism. In order to identify novel variants associated
with the LP trait and study its evolutionary history in Africa, we
sequenced introns 9 and 13 of the MCM6 gene, and ~2 kb of the LCT
promoter region in 819 individuals from 63 African populations and in
154 non-Africans from 9 populations. We also genotyped 4 microsatellites
in an ~198 kb region in a subset of 252 individuals to reconstruct the
origin and spread of LP-associated variants in Africa. Additionally, we
performed genotype-phenotype association analyses in 513 individuals
from 50 eastern African populations. We confirm the association between
the LP trait and three common variants in intron 13 (C -14010, G -13907
and G -13915). Furthermore, we identified two additional SNPs in intron
13 and in the promoter region (G -12962 and T -956, respectively)
associated with LP. Using a test of long range linkage disequilibrium
(LD), we detected strong signatures of recent positive selection in East
African populations and in the Fulani from Central Africa. In addition,
haplotype analysis supports an East African origin of the C-14010
LP-associated mutation in southern Africa.
The Genetic Architecture Of Skin Pigmentation In Southern Africa
Alicia R Martin1, Julie M Granka2, Christopher R Gignoux1, Marlo Möller3, Cedric J Werely3, Jeffrey M Kidd4, Marcus W Feldman2, Eileen G Hoal3, Carlos D Bustamante1, Brenna M Henn1,5
1Genetics Department, Stanford University, Stanford, CA, USA, 2Department of Biological Sciences, Stanford University, Stanford, CA, USA, 3Division of Molecular Biology and Human Genetics, Stellenbosch University, Tygerberg, South Africa, 4Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA, 5Department of Ecology and Evolution, SUNY Stony Brook, Stony Brook, NY, USA
Skin pigmentation is one of the most recognizably diverse
phenotypes in humans across the globe, but its genetic basis has
mainly been studied in northern European and Asian populations. The
Eurasian pigmentation alleles are among the most differentiated variants
in the genome, suggesting strong positive selection for light skin
pigmentation. Light skin pigmentation is also observed in the far
southern latitudes of Africa, among KhoeSan hunter-gatherers of the
Kalahari Desert and other populations. The KhoeSan hunter-gatherers are
among the oldest human populations, believed to have diverged from other
populations 100,000 years ago, and maintain extraordinary levels of
genetic diversity. It is unknown whether light skin pigmentation
represents convergent evolution or the ancestral human phenotype. We
have collected ethnographic information, pigmentation phenotypes, and
genotype data from 136 individuals in the ≠Khomani San from the
Kalahari. To understand the genetic basis for light skin pigmentation,
we have also exome sequenced 84 ≠Khomani San individuals to high
coverage, generating one of the largest indigenous African exome
datasets sequenced outside of the 1000 Genomes Project. Because linkage
disequilibrium decay is rapid in this population, we have assessed
parameters influencing phasing and imputation accuracy empirically using
sequencing data from two full genomes since ideal reference panels do
not exist. We have also pursued multiple genotype/phenotype mapping
methods, including a mixed model approach, admixture mapping, and
linkage mapping. After controlling for admixture from European and
Bantu-speaking populations, we find that globally common variants are
not significantly associated with pigmentation. Rather, our results
indicate that there are a multitude of rare variants in known
pigmentation genes, and suggest that previously unidentified genes
acting in canonical pigmentation pathways are involved. Our results
highlight the strength of diverse population studies to explain
phenotypic variation in the context of human evolutionary history.
The distribution of effects of mutations
Thomas Lenormand
CEFE - CNRS, Montpellier, France
Mutations and their diversity of effects are the fuel of
evolution. Yet most population genetics models ignore this fact
and its consequences. It is extremely frequent in these models to
consider a single class of mutations (e.g. deleterious recessive
mutations). Of course all types of mutations occur simultaneously and it
is difficult to ignore this reality if we want to make quantitative
predictions in evolution. The difficulty is to describe in a general and
realistic way how the effect of mutations varies. In addition the
effect of mutations comprises a large array of different problems, among
which are some that raised the longest and fiercest controversies in
evolutionary biology. The debate over dominance is perhaps emblematic in
this respect. There are currently different approaches to predict the
effect of mutations (physiological theory, canalization theory, extreme
value theory, mutational landscape theory). In this talk, I will focus
on mutation models based on a fitness landscape approach. I will present
the rationale of this approach and the different predictions that can
be made using this framework. I will then survey the current data to
test this theory and finish by presenting how it may be
extended.
Cultural transmission of reproductive success: a strong evolutionary force that shapes genetic diversity.
Evelyne Heyer1, Jean-Tristan Brandenburg1,2, Michela Leonardi1, Patricia Balaresque3, Bruno Toupance1, Tatyana Hegay4, Almaz Aldashev5, Frederic Austerlitz1
1CNRS/MNHN/P7 UMR7206, Paris, France, 2INRA/CNRS UMR 0320/UMR 8120, Moulon, France, 3CNRS/Univ Toulouse UMR5288, Toulouse, France, 4Academy of Science, Tashkent, Uzbekistan, 5Academy of Science, Bishkek, Kyrgyzstan
One of the specificities of our species, as acknowledged for a
long time by anthropologists, is to live in an extremely wide range of
social organizations defined mainly by alliance rules, matrimonial
systems, residence rules and descent rules. The hint that social
organization should be taken into account when studying genetic
diversity came mainly from comparisons between mitochondrial DNA (mtDNA)
and Y-chromosome genetic diversity. Initially, it was proposed that
sex-specific behaviours, and particularly differences in migration rates
between men and women due to residence rules, may explain differences
in Y-chromosome diversity versus mtDNA diversity. More recently it has
been shown that the differences in diversity and differentiation levels
between the different genetic systems (X, Y, mtDNA and autosomes) could
not be explained only by differences between male and female migration
rates, but also by differences between male and female effective
population sizes.
We hypothesized that the mechanism by which such a reduction in effective population size can arise is cultural transmission of reproductive success (CTRS). Building on our previous theoretical work, which showed that CTRS can profoundly reduce effective population size, and on a method that we designed to detect such transmission from current DNA sequence polymorphism datasets, we formally tested the extent to which CTRS reduces genetic diversity in Central Asia, where we have previously demonstrated a sex-specific reduction in effective population size: male effective size is much smaller than its female counterpart.
We used mtDNA and Y-chromosome genetic data to infer male and female transmission of reproductive success in 19 Turkic and Indo-Iranian populations from Central Asia known for their contrasting social organisations. Both groups are patrilocal and mildly polygynous, but Turkic populations have patrilineal descent, while Indo-Iranian populations have cognatic descent.
Our results show that patrilineality impacts genetic diversity through cultural transmission of reproductive success. This clearly demonstrates the impact of social organization on human biological evolution. Moreover, beyond showing that transmission of reproductive success is strongly male-biased in patrilineal societies, our genetic approach formally demonstrates that cultural transmission of reproductive success could be a major evolutionary force. Indeed, it reduces within-population genetic diversity and increases among-population differentiation, two key components for the evolution of cooperation.
Linking subsistence strategy, learning practices and demography
Laurel Fogarty, Nicole Creanza, Marcus W. Feldman
Stanford University, California, USA
Human populations vary demographically with population
sizes ranging from small groups of hunter-gatherers with fewer than fifty
individuals to vast cities containing many millions. Here we investigate
how the cultural transmission of traits affecting survival, fertility,
or both can influence the birth rate, age structure, and asymptotic
growth rate of a population. We show that, in a simple model with just
three age classes, the strong spread of such a trait can lead to a
demographic transition, similar to that experienced in Europe in the
late 19th and early 20th centuries, without using ecological or economic
optimizing models. We also show that population subsistence and
learning strategies can be linked using a more realistic model with five
age classes, and can explain some demographic data on modern
hunter-gatherer and small scale farming populations.
We investigate the roles of vertical, oblique, and horizontal learning of a fitness-altering cultural trait and find that, compared to vertical learning alone, horizontal and oblique learning can accelerate the trait’s spread, increase its equilibrium frequency, and lead to faster population growth.
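The link between age-specific vital rates and the asymptotic growth rate can be illustrated with a three-age-class projection. This is a sketch under stated assumptions, not the authors' model: it uses a standard Leslie-matrix projection, omits the cultural transmission dynamics entirely, and the fertility and survival values are invented for illustration. The asymptotic growth rate is estimated by repeated projection (power iteration), which converges when at least two consecutive age classes reproduce.

```python
def project(pop, fertility, survival):
    """One time step: newborns from all age classes, survivors move up one class."""
    births = sum(f * n for f, n in zip(fertility, pop))
    return [births] + [s * n for s, n in zip(survival, pop[:-1])]


def asymptotic_growth_rate(fertility, survival, steps=500):
    """Estimate the long-run per-step growth factor by repeated projection."""
    pop = [1.0] * len(fertility)
    rate = 0.0
    for _ in range(steps):
        nxt = project(pop, fertility, survival)
        rate = sum(nxt) / sum(pop)
        total = sum(nxt)
        pop = [n / total for n in nxt]  # renormalise to avoid overflow or underflow
    return rate


# Hypothetical vital rates: per-class fertility, and survival between classes.
baseline = asymptotic_growth_rate([0.0, 1.5, 1.0], [0.5, 0.5])
# A trait that halves fertility in every class (assumed effect, for illustration).
reduced = asymptotic_growth_rate([0.0, 0.75, 0.5], [0.5, 0.5])
```

In this toy setting the baseline population is stationary and the reduced-fertility population declines. A fuller treatment would make the vital rates depend on the frequency of the cultural trait, as the abstract describes.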
Genome-wide analysis of Oceanian ancestry
Ana T. Duggan1, David Reich2,3, Mark Stoneking1
1Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany, 2Harvard Medical School, Boston, USA, 3Broad Institute, Boston, USA
The history of Oceania, as inferred from
archaeological, linguistic and genetic evidence, points to two major
human expansions through the region. It seems that the first human
settlers arrived in New Guinea and Australia, then joined as the
continent of Sahul, more than 40 thousand years ago and spread to the
Bismarck Archipelago and other nearby islands but did not spread widely
through the Solomon Islands. Present day populations believed to be
descendants of these initial settlers speak very diverse languages of
apparently great time depth (referred to collectively as Papuan),
practice patrilocality and tend to have darker skin pigmentation. The
second wave of human expansion arrived with the Austronesians
approximately 3.5 thousand years ago and touched almost all of Near
Oceania before spreading further into the Pacific and settling Remote
Oceania. The Austronesians brought with them a single proto-language
which has diversified into a group of closely related languages,
possessed a distinctive pottery style, were likely matrilocal and their
descendants have a more Asian phenotype. MtDNA and Y-chromosome data
indicate that Papuan-speaking and Austronesian-speaking populations did
admix extensively in Near Oceania and that the mixture appears to have
been sex-biased. Maternal ancestry of putative Asian origin is high,
even within Papuan speaking populations, and yet Remote Oceanian
populations show high levels of Y-chromosomes of Near Oceanian origin.
While some studies of genome-wide short tandem repeats or polymorphisms
have been conducted, they have been restricted to populations from New
Guinea and Polynesia, who likely represent population extremes. Here we
analyse genome-wide SNP data, collected on the Affymetrix Human Origins
array, from approximately 300 samples from 40 populations across
Southeast Asia and Near and Remote Oceania. We are using these data to
attempt to elucidate the genetic structure of Papuan-speaking and
Austronesian-speaking groups, including the time and extent of admixture
between them, to better understand the dynamics of population contact
that led to the distinctive pattern of uniparental inheritance but
also maintained two very different language groups and cultures within
Oceania.
Genes mirror subsistence in prehistoric Europe
Mattias Jakobsson
Uppsala University, Uppsala, Sweden
The Neolithic transition swept over Europe after the invention of
farming some 11,000 years ago in the Near East to reach its northern
fringe some 6,000 years ago. Genomic information from ancient human
remains is beginning to show its full potential for learning about the
human demographic history, including the debated agricultural
transition. We generate and investigate low- to medium-coverage genomic
data (up to 2.2x coverage) from several Stone-Age Scandinavian and
Iberian individuals, including 10 Scandinavian 5,000 year old
individuals from farming and hunter-gatherer groups, a 7,500 year old
Mesolithic individual from the same region as the Scandinavian
hunter-gatherers, and 5 Iberian 5,000 year old individuals from a farmer
group. The Stone-Age Scandinavian individuals show remarkable
population structure corresponding to their material culture association
and the farmers are genetically most similar to extant southern
Europeans, contrasting sharply to the hunter-gatherers whose genetic
signature is unique, but closest to extant northern Europeans. The
genomic make-up of present-day Scandinavians is intermediate between the
two Neolithic groups suggesting that extensive admixture - perhaps
around the time of the disappearance of the hunter-gatherer lifestyle -
eventually shaped the patterns of variation. Similarly, Iberian farmers
show affinities to modern-day southern Europeans, especially to
Sardinians, in contrast to the published 7,000 year old Iberian
hunter-gatherer from La Brana that is genetically closer to
current-day northern Europeans. Notably, these similarities to
Sardinians seem to be stronger than to the contemporary population of
Spain, which suggests complex changes in genetic ancestry of Iberians
during the last 7,000 years. The pattern of genetic variation in
Stone-Age Europe contrasts to current-day patterns that mirror the
individuals' geographic sampling locations. We further estimate genetic
diversity within the groups and show that diversity was lower among the
hunter-gatherers than among the farmers, suggesting a smaller population
size for the hunter-gatherers, perhaps related to a lower carrying
capacity associated with hunting and gathering lifestyles. These
findings show that lifestyle, rather than geography as in modern-day
Europe, may have been the major determinant of genetic similarity and
diversity in prehistoric Europe, which illuminates the impact of the
agricultural revolution.
Biocultural Analysis of Variation in Blood Pressure among
African Americans in the Health Equity Alliance of Tallahassee (HEAT)
Heart Health Study
Laurel N. Pearson, Sarah M. Szurek, Clarence C. Gravlee, Connie J. Mulligan
University of Florida, Gainesville, FL, USA
Disparities in health and risk of disease are of significant
interest in the United States where African Americans experience some of
the poorest health outcomes and greatest burden of chronic diseases.
Large efforts have been undertaken to identify genetic factors
contributing to differential risk and outcomes experienced across
American populations; however, population structure created over
centuries through immigration, migration, and admixture has added
complexity to simple genetic analysis of health disparities.
Additionally, the effects of cultural variability and their interaction
with underlying genetic factors are poorly understood and only
rarely given sufficient consideration. Well-designed interdisciplinary research
that incorporates genetic and socio-cultural factors and their
interactions will be critical to understanding and addressing health
disparities, especially for complex phenotypes such as hypertension.
The Health Equity Alliance of Tallahassee (HEAT) Heart Health Study uses a community-based participatory research (CBPR) design that engages community members in the planning and collection of research data. A primary aim of HEAT is to investigate the socio-cultural factors that contribute to disparities in health status, especially with regard to cardiovascular phenotypes. Extensive cultural survey data targeted at understanding neighborhood environment, socioeconomic status, exposure to discrimination, and other stressors, as well as phenotypic measures of body composition and blood pressure, were collected from 165 African American research participants from economically diverse neighborhoods in Tallahassee, Florida. DNA derived from saliva samples was genotyped on a custom Affymetrix Axiom array that assays a large panel of ancestry informative markers (3,600 AIMs) for assessment of genomic admixture and for genomic admixture mapping. Additionally, this array includes SNPs in previously reported candidate genes for blood pressure, stress, and skin pigmentation (over 25,000 candidate SNPs). This work aims to address the complex interplay of genetic influences, including candidate genes and genetic ancestry, and socio-cultural factors, such as stress caused by perceived discrimination and community support, on blood pressure variation. We have previously shown that genetic contributions to variation in blood pressure phenotypes are modified by the inclusion of socio-cultural data. The more detailed study design made possible by this interdisciplinary CBPR study reveals the complex interplay of the genome and culture in contributing to health disparities in complex phenotypes.
Parallel trajectories of genetic and linguistic admixture in Cape Verdean Kriolu speakers.
Paul Verdu1, Ethan Jewett2, Trevor Pemberton3, Noah Rosenberg2, Marlyse Baptista4
1CNRS/MNHN/Univ. Paris Diderot/Sorbonne Paris Cite, Paris, France, 2Stanford University, Department of Biology, Stanford, CA, USA, 3University of Manitoba, Department of Biochemistry and Medical Genetics, Winnipeg, MB, Canada, 4University of Michigan, Departments of Linguistics & Afroamerican and African Studies, Ann Arbor, MI, USA
Starting in the 15th Century, European colonization of Africa and
the Atlantic Slave Trade brought together populations of European and
African origin on the islands of Cape Verde, giving rise to an admixed
population. The ways in which the different waves of migration and major
sociohistorical events such as the abolition of slavery influenced the
admixture process, and their impacts on the resulting genetic and
cultural diversity in this population, remain largely unknown. To study
the cultural and demographic history of the Cape Verdean population, we
investigated patterns of genetic and linguistic diversity among 44
unrelated Cape Verdean individuals. Genetic data consisted of genotypes
at ~2.5 million genome-wide SNPs and linguistic data of spontaneous
speech in Cape Verdean Creole (Kriolu) provided by each subject. We
found that individual speech patterns across Cape Verdean Kriolu
speakers were significantly correlated with pairwise levels of
allele-sharing dissimilarities, as well as with the birthplaces of
individuals and their parents. Individual levels of African genetic
admixture were significantly positively correlated with the number of
words of putative African origin used by each individual. These results
suggest that genetic and linguistic admixture followed parallel
evolutionary trajectories in the Cape Verdean archipelago, and they
provide a basis for combining genetic and linguistic information to
reconstruct the complex admixture processes that have shaped the
cultural and biological diversity of Cape Verde. To our knowledge, this
work is the first joint analysis of genetic and cultural variation
within a single population of individuals sharing a common, mutually
intelligible language.
Copy number variation evolution and human disease traits.
James R Lupski1,2
1Baylor College of Medicine, Houston, TX, USA, 2Texas Children's Hospital, Houston, TX, USA
Whereas Watson-Crick DNA base pair changes have long been
recognized as a mechanism for mutations, rearrangements of the human
genome including deletions, duplications, inversions and complex
combinations thereof have been appreciated only more recently as a
significant source for human genetic variation. Diseases that result
from DNA rearrangements have been referred to as genomic disorders
[Lupski, JR (2009) Genomic disorders ten years on. Genome Medicine 1:42.1-42.11].
Rearrangements associated with genomic disorders can be recurrent, with
breakpoint clusters resulting in a common sized deletion/duplication,
or nonrecurrent and of different sizes. Nonallelic homologous
recombination (NAHR) is a major mechanism for recurrent rearrangements,
whereas nonhomologous end-joining (NHEJ) can be responsible for
non-recurrent rearrangements. Genome architectural features consisting
of low-copy repeats (LCRs), also called segmental duplications, can
stimulate and mediate NAHR. There are positional hotspots for the
crossovers within the LCRs. Complex rearrangements can occur by FoSTeS -
Fork Stalling and Template Switching. A newer model,
microhomology-mediated break-induced replication or MMBIR, provides
further molecular mechanistic details and may be operative in all life
forms as a means to process one-ended, double-stranded DNA generated by
collapsed forks. Rearrangements introduce variation into our genome for
selection to act upon and as such serve an evolutionary function for
our genome analogous to base pair changes for genes. Genomic
rearrangements may result in CNVs that range in size from hundreds to
millions of base pairs and include single exons, whole genes, or genomic
segments encompassing many genes or no genes at all! They can be
complex such as DUP-TRI/INV-DEL; the latter stimulated by inverted
repeats. CNVs can cause Mendelian diseases and complex traits such as
obesity and neurobehavioral phenotypes. The mechanisms by which
rearrangements convey phenotypes are diverse and include gene dosage,
position effects, unmasking of coding region mutations (cSNPs) or other
functional SNPs, creating gain-of-function fusion genes at the
breakpoints, and perhaps through effects of transvection. De novo
genomic rearrangements cause both chromosomal and Mendelian disease, as
well as sporadic traits, but our understanding of the extent to which
genomic rearrangements, gene CNV, and/or gene dosage alterations are
responsible for common and complex traits remains rudimentary.
The presence of convergent evolution suggests adaptive roles for genetic variants contributing to the human addictions
Christina Barr1, Carlos Driscoll1, Stephen Lindell1, Kevin Blackistone1, Stephen Suomi2
1NIH/NIAAA, Section of Comparative Behavioral Genomics, Rockville, MD, USA, 2NIH/NICHD, Section of Comparative Ethology, Poolesville, MD, USA
The neurobiological systems that influence addiction
vulnerability in humans may do so by acting on reward pathways,
behavioral dyscontrol, and vulnerability to stress. In certain
instances, genetic variants that are functionally similar or orthologous
to those that moderate risk for human psychiatric and addictive
disorders are maintained across species, and some of our studies have
suggested convergent evolution, or allelic variants
under selection, across primates. We have also shown that the rhesus
macaque (Macaca mulatta) is useful for learning how relatively
common genetic variants, which are associated with traits that may be
adaptive in certain environmental contexts, can also increase
vulnerability to behavioral pathology and alcohol preference. Genomics
approaches can be used to home in on convergences in genetic variations
that promote species-specific behaviors (fixed differences that have
undergone purifying selection) and variable behavioral strategies that
appear to be selected in multiple species (often as a result of
balancing selection). We wanted to perform whole exome sequencing to
identify coding polymorphisms that correlated with individual
differences in behavior in the rhesus macaque. While there are no
commercially available reagents for performing whole exome sequencing in
nonhuman primates, a human whole exome platform is available. Given
that there would likely be more purifying selection in coding regions
and, therefore, less interspecific variation, we used a human exome
capture and sequencing kit for performing exome sequencing for rhesus
macaque subjects that differed in their levels of impulsivity and
aggression. As genetic variation observed in domestic animals may be
powerful for looking at genetic factors that enabled domestication as
well as the reversal of some of those traits through more recent
artificial selection, whole exome sequencing for individuals from
several domestic animal species (canids, felids and equids) was
performed in parallel. I will describe the types of variation we
identified using these approaches and will illustrate how the genes in
which we find functionally similar genetic variation overlap with those
that predict vulnerability to human psychopathology and the addictions.
The presented findings will be discussed in the context of how the high
prevalence of addiction risk alleles in some populations of humans may
be rooted in the fact that the same variants contribute to potentially
adaptive traits in the absence of environmental stressors, recreational
drugs and alcohol.
Patterns of ancient selection in modern humans around candidate sites
Fernando Racimo1,2, Martin Kuhlwilm2, Montgomery Slatkin1
1University of California - Berkeley, Berkeley, CA, USA, 2Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Though the recent sequencing of the high-coverage Denisovan and Neanderthal genomes has allowed us to find the genetic differences that set modern humans apart from archaic humans, the subset of such changes that rose to fixation due to selection is currently unknown. In this study, we look for patterns of positive selection on the modern human lineage at various classes of putatively functional changes using diversity scaled by divergence, as has been done previously on the human lineage since the split from chimpanzees. We also develop an approximate Bayesian computation (ABC) approach incorporating various statistics aimed at identifying ancient patterns consistent with selection around a candidate site. We fail to find an enrichment for signals of positive selection around nonsynonymous changes relative to synonymous changes. It has been argued that the failure to detect this difference in changes on the human lineage may be due to varying levels of background selection, which occlude the signal of positive selection. Indeed, when we control for the intensity of background selection (BS), we observe a significant difference between nonsynonymous changes in regions of low BS and matching regions of the genome, lending support to this hypothesis. We also identify a slight enrichment for positive selection at splice site changes.
Biased gene conversion skews allele frequencies in human populations, increasing the disease burden of recessive alleles
Joseph Lachance, Sarah Tishkoff
University of Pennsylvania, Philadelphia, PA, USA
Gene conversion results in the non-reciprocal transfer of genetic
information between two recombining sequences, and there is evidence
that this process is biased towards G and C alleles. Using
high-coverage whole genome sequences of African hunter-gatherers, other
human populations, and primate outgroups we quantified the effects of
GC-biased gene conversion (gBGC) on population genomic datasets. We
find that genetic distances (Fst and population branch statistics) are
modified by gBGC. In addition, the site frequency spectrum is
left-shifted when ancestral alleles are favored by gBGC and
right-shifted when derived alleles are favored by gBGC. Allele
frequency shifts due to gBGC mimic the effects of natural selection.
Summary statistics of site frequency spectra (Tajima’s D, Fay and Wu’s
H, and mean derived allele frequency) depend strongly on whether alleles
are favored by gBGC. These effects are strongest in high recombination
regions of the human genome. By comparing the site frequency spectra
of unbiased and biased sites the strength of gene conversion was
estimated to be on the order of Ne*b=0.009. We also find that derived
alleles favored by gBGC are much more likely to be homozygous than
derived alleles at unbiased SNPs (+42.2% to 62.8%). This results in a
"curse of the converted", whereby recessive alleles have an increased
disease burden. Taken together, our findings reveal that GC-biased gene
conversion has important population genetic and public health
implications.
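The summary statistics named in this abstract can be computed directly from an unfolded site frequency spectrum. The sketch below is illustrative only and is not the authors' pipeline: the function names and the input layout (a list of site counts indexed by derived allele count) are assumptions made for this example, and Tajima's D follows the standard published formula.

```python
import math

def tajimas_d(sfs, n):
    """Tajima's D from an unfolded SFS.

    sfs[i-1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled chromosomes (i = 1..n-1).
    This input layout is an assumption made for the example.
    """
    if len(sfs) != n - 1:
        raise ValueError("sfs must have n - 1 entries")
    s = sum(sfs)  # number of segregating sites
    if s == 0:
        return float("nan")
    # Mean pairwise differences (pi) computed from the spectrum.
    pi = sum(c * 2 * i * (n - i) / (n * (n - 1))
             for i, c in enumerate(sfs, start=1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

def mean_derived_frequency(sfs, n):
    """Average derived allele frequency over all segregating sites."""
    total = sum(sfs)
    return sum(c * i for i, c in enumerate(sfs, start=1)) / (n * total)

# Toy spectra for n = 6 chromosomes (made-up counts, not data from the study).
neutral_like = [60, 30, 20, 15, 12]   # proportional to 1/i
right_shifted = [12, 15, 20, 30, 60]  # excess of high-frequency derived alleles
rare_excess = [100, 10, 5, 3, 2]      # excess of rare variants
```

Because Tajima's D depends only on the folded spectrum, the mirrored toy spectrum `right_shifted` gives the same D as `neutral_like`, while the mean derived allele frequency separates the two. This is one reason the statistics are usually reported together.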
Coalescence Based Models to Detect Different Types of Selection
Hang Zhou1, Sile Hu1, Rostislav Matveev2, Kun Tang1
1CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Science, Shanghai, China, 2Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Detecting signals of natural selection is a central
problem in population genetics. To date, many mathematical models
have been proposed to describe the dynamics of natural selection. Model-based
methods have also been proposed to detect signals of selection and
estimate the corresponding parameters. Nevertheless, under scenarios of
varying population size, it is not easy to identify selection events,
because population size changes may produce patterns similar to those
produced by natural selection. In addition, powerful methods for
detecting negative or balancing selection are lacking. Recently, the pairwise
sequentially Markovian coalescent (PSMC) method was proposed to estimate
the full population size trajectory from genome sequencing
data; it simultaneously estimates the pairwise time to the most recent
common ancestor (TMRCA). We found that the TMRCA estimates could be
used to reconstruct the local coalescent trees across the whole genome.
Methods can therefore be built directly on the coalescent data to
detect natural selection and to infer the corresponding parameters. The
coalescent trees were first converted to the coalescent time scale by
rescaling against a fine-scale population size trajectory estimated by PSMC.
The resulting standard coalescent distribution is therefore independent
of changes in effective population size. Three coalescent models were
constructed to describe various evolutionary scenarios, namely H0, H1 and H2. H0 is the null hypothesis, which assumes a neutral coalescent; H1
is a one-parameter hypothesis, which assumes that the coalescent rate
changed by a constant scale factor, and is therefore suited to cases of
negative or balancing selection. H2 is a five-parameter
model, which assumes that the coalescent rate changes over three consecutive time
intervals. This model tries to capture the different phases of a
positive selection event. Based on H2, we developed an
algorithm to estimate selection coefficient and selection starting time.
Likelihood-ratio tests were constructed to assign the proper model to
any coalescent tree. A large number of simulations showed that our
approach has strong power and high accuracy in estimating the selection
starting time and selection coefficient. We applied this approach to the
whole-genome data from the 1000 Genomes Project, and built a fine-scale,
genome-wide atlas of recent selection signals.
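The rescaling step described above can be illustrated with a short sketch. The abstract gives no implementation details, so everything here is an assumption made for the example: a piecewise-constant diploid population size history, time measured in generations backwards from the present, and the usual convention that one coalescent unit equals 2N generations. Under those assumptions, the rescaled time is the integral of 1/(2N(s)) from 0 to t.

```python
def to_coalescent_units(t, epochs):
    """Convert a time t (generations ago) to coalescent units.

    epochs is a list of (start_generation, diploid_size) pairs sorted by
    start, with the first start equal to 0. Each size applies until the
    next epoch begins. Both the layout and the 2N scaling are assumptions
    of this sketch, not details taken from the abstract.
    """
    tau = 0.0
    for k, (start, size) in enumerate(epochs):
        end = epochs[k + 1][0] if k + 1 < len(epochs) else float("inf")
        if t <= start:
            break
        # Time spent in this epoch, scaled by that epoch's 2N.
        tau += (min(t, end) - start) / (2.0 * size)
    return tau

# Hypothetical history: N = 10,000 for the last 1,000 generations,
# N = 1,000 before that.
history = [(0, 10_000), (1_000, 1_000)]
rescaled = to_coalescent_units(1_200, history)
```

After this transformation, waiting times between coalescent events follow the constant-size expectation regardless of the original size changes, which is the property the abstract relies on.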
Investigating fine-scale population structure between the Nama and Khomani San of South Africa
Caitlin Uren1, Marlo Moller1, Dean Bobo2, Julie Granka3, Michelle Daya1, Cedric Werely1, Justin Myrick4, Alicia Martin5, Christopher Gignoux5, Brenna Henn2, Eileen Hoal1
1Stellenbosch University, Department of Molecular Biology and Human Genetics, Tygerberg, Cape Town, South Africa, 2Stony Brook University, Department of Ecology and Evolution, Stony Brook, New York, USA, 3Stanford University, Department of Biology, Stanford, California, USA, 4University of California, Los Angeles, California, USA, 5Stanford University, Department of Genetics, Stanford, California, USA
The Cape Coloured population of Cape Town, South Africa (SAC) derives ancestry from multiple global populations, including Europeans and Indonesians. Initial studies also indicated a substantial contribution from the KhoeSan, a diverse group of hunter-gatherers and pastoralists that historically occupied much of southern Africa (Chimusa et al. 2013a, de Wit et al. 2010). The degree of KhoeSan ancestry reflects the role of indigenous KhoeSan in the early establishment of the SAC population (Mountain 2003). Furthermore, we have demonstrated significant evidence of an association between KhoeSan ancestry and tuberculosis (TB) susceptibility that is not confounded by socio-economic status. It was additionally found that the KhoeSan ancestry component in the SAC seems to contribute to the extreme susceptibility to TB in this admixed population. The southern African KhoeSan fall into two genetic groups, roughly corresponding to the northwestern and southeastern Kalahari, which have been shown to have separated within the last 30,000 years (Pickrell et al. 2013). We collected DNA samples from the Nama along the western coast and the Khomani San from the Kalahari Desert (with written informed consent and approval of the Human Research Ethics Committee of Stellenbosch University). SNP genotype data were generated on Illumina OmniExpress platforms (700k-1M arrays) for 120 Khomani San, 25 Cape Coloureds and 13 Nama. Whole-genome sequencing of an additional 106 Nama samples is currently underway in collaboration with the Wellcome Trust Sanger Institute. This is, to our knowledge, the largest genome-wide dataset collected for the purpose of understanding South African genetic diversity. We use principal component analysis, chromopainter and ADMIXTURE to investigate fine-scale population structure among these South African groups.
The burden of private mutations is greatly affected by recent explosive human population growth
Alon Keinan, Feng Gao, Elodie Gazave, Li Ma, Diana Chang, Andrew Clark
Cornell University, Ithaca, NY, USA
Human populations have experienced dramatic growth since the
Neolithic revolution. In this study, we modeled how this growth
increases the effective population size and what it entails for the load
of individual private mutations. Recent studies that sequenced large
numbers of individuals observed an extreme excess of rare variants and
provided evidence of recent rapid growth in effective population size,
although estimates have varied greatly among studies. These studies were
based on protein-coding genes, in which variants have been impacted by
natural selection. Hence, we sequenced loci far from genes that meet a
stringent set of criteria designed to ensure that mutations therein are
likely to be neutral. We used high coverage sequencing and 500
individuals of homogeneous European ancestry to capture very rare
variants, and fit an array of recent demographic history models to the
site frequency spectrum. The best-fitting model estimates 3-4% growth
per generation during the last 3000-4000 years, resulting in an
effective population size increase of two orders of magnitude. Our
models fit the data well only after observing that estimates are
impacted by assumptions of ancient demography, which also explains the
discrepancy among previous studies. We next aimed to quantify the effect
of growth and purifying selection on the burden of private mutations
per individual sample, which also translates to the number of new
variants discovered with the sequencing of each new genome. Hence, we
introduced a statistic (%HP) that is defined as the proportion of
heterozygous sequence variants in an individual that are novel with
respect to a sample of sequenced individuals from the same population.
We predicted this quantity for demographic models and estimated it for
different datasets. We observed a significantly higher empirical %HP
compared with models without recent population growth. Incorporating
growth as estimated above provides a much improved fit, a phenomenon
that is more marked as sample size increases, e.g. for a sample of
10,000 individuals, %HP is 0.25% with recent growth, which is 18-fold
higher than that without growth. This implies that 1 in 400 heterozygous
sites in any 10,001st individual is expected to be private, which
amounts to ~6000 variants, or roughly 100 times the number of de novo
mutations. Finally, we report an increase in %HP due to purifying
selection, e.g. it is ~10-fold higher for nonsense mutations compared to
other genic mutations, for which in turn it is higher compared to the
above putatively neutral mutations.
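The %HP statistic defined above, and the arithmetic that follows from the reported figures, can be sketched as follows. This is a minimal illustration: the function name, the representation of variants as position strings, and the toy inputs are assumptions made for the example rather than the authors' implementation.

```python
def percent_hp(individual_het_sites, reference_sample_sites):
    """Percentage of an individual's heterozygous variants that are novel
    with respect to a reference sample from the same population.

    Variants are represented as hashable identifiers (here, hypothetical
    "chrom:pos" strings); this representation is an assumption.
    """
    het = set(individual_het_sites)
    if not het:
        return 0.0
    novel = het - set(reference_sample_sites)
    return 100.0 * len(novel) / len(het)

# Toy example: four heterozygous sites, one of them unseen in the sample.
individual = {"chr1:100", "chr1:200", "chr2:50", "chr3:7"}
sample = {"chr1:100", "chr1:200", "chr2:50"}
toy_hp = percent_hp(individual, sample)

# Arithmetic on the figures quoted in the abstract: a %HP of 0.25%
# corresponds to 1 private site per 400 heterozygous sites, and about
# 6000 private variants then implies roughly 2.4 million heterozygous
# sites per individual (a derived figure, not one stated in the text).
sites_per_private = round(100 / 0.25)
implied_het_sites = 6000 * sites_per_private
```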
Evolutionary History of ethnic groups from European Russia and Sub-Arctic Transuralic region.
Svetlana Limborska1, Andrey Khrunin1, Denis Khokhrin1, Dmitry Verbenko1, Diana Gerasimova2, Roman Kuchin2, Vladislav Rabinovich3
1Institute of Molecular Genetics, Russian Academy of Sciences, Moscow 123182, Russia, 2Ugra State University, Khanty-Mansiisk 628012, Russia, 3Ugra Research Institute of cellular technology with stem cell bank, Khanty-Mansiisk 628011, Russia
Understanding the genetic structure of populations is important
both from a historical perspective and for the appropriate design and
interpretation of genetic epidemiological studies. Several studies have
examined the fine-scale structure of human genetic variation in Europe.
However, populations of the North-Eastern European area and the Sub-Arctic
Transuralic region have been less investigated. These territories are
inhabited by different indigenous Finno-Ugric peoples (e.g., Veps, Komi,
Khanty and Mansi) and ethnic Russians.
To explore the genetic structure of the region described, we analyzed single nucleotide polymorphisms in the populations mentioned above using different versions of Illumina BeadChips. Principal component analysis, ADMIXTURE clustering and Wright's fixation index (FST) were used to probe genetic variation.
The Mansi were indigenous inhabitants of the Northern European area until the 17th century AD. This ethnic group underwent a trans-Uralic migration and nowadays inhabits the Sub-Arctic region of Western Siberia. The Khanty, closely related to the Mansi by linguistic classification, are the indigenous inhabitants of this region. The Mansi and the Khanty peoples have genomic characteristics that are the most distant from all others, owing to the presence of a different ancestry component.
The Komi live in the farthest corner of North-Eastern Europe. Based on genomic analysis, the Komi form a separate pole of genetic diversity in the northern European gene pool. The Veps, a modern Finno-Ugric minority and one of the oldest peoples of northern Europe, still inhabit some territories of northwest Russia and show genetic similarity to both Finns and Komi.
Russians are the most numerous people in North-Eastern Europe. Principal component analysis showed significant differences between Russians of the Northern European region and Russian populations from the central part of the Russian Plain. The latter Russian populations formed a single cluster on the PC plot. In contrast, Northern Russians showed close relationships with the Veps population.
In general, our data provide a more complete genetic map of Europe and the adjacent northern area, accounting for the diversity in its easternmost and northeasternmost populations. Furthermore, these data contribute to a better understanding of the population genetic history of present-day ethnic groups of the area studied.
A Novel Likelihood Ratio Test for Sex-Biased Demography and the
Effect of Cryptic Sex-Bias on the Estimation of Demographic Parameters
Shaila Musharoff1, Suyash Shringarpure1, Carlos D. Bustamante1, Sohini Ramachandran2
1Stanford University, Stanford, CA, USA, 2Brown University, Providence, RI, USA
Sex-bias is defined as an unequal number of breeding males and
females in a population. This can be caused by variance in reproductive
success, demographic events involving unequal numbers of males and
females, and/or differential selection at sex-linked genomic loci. A
commonly used estimator of the proportion of females is based on the
test statistic Q where Q is the ratio of neutral genetic diversity
estimated from the X chromosome to that estimated from the autosomes.
This is problematic if the population changed in size: because X
chromosomal diversity recovers from size changes at a different rate
than autosomal diversity due to unequal effective population sizes, this
estimator of the proportion of females will be biased. To this end we
present a novel likelihood ratio test for sex-bias in a single
population based on the Poisson random field model. We use the program
dadi to estimate demographic parameters jointly from X chromosomal and
autosomal data and test first for a persistent sex-bias, and then for a
sex-biased demographic event. Our test has more power to detect sex-bias
from unlinked or partially linked sites than the commonly used test
statistic Q for a range of demographic scenarios. Encouragingly, our
test is well powered for events relevant to human history including
recent rapid expansion whereas the test statistic Q is not.
In addition to being of fundamental interest, the presence of sex-bias affects demographic inference. Sex-bias, either in the male or female direction, decreases the effective population size of the X chromosome as well as the autosomes of a population. If this reduction in effective population size is unaccounted for, demographic parameter estimates (e.g., bottleneck times or divergence times) will be inflated. We assess the effect of cryptic sex-bias on the estimation of demographic parameters using simulated data. We propose a correction based on the joint inference of demographic parameters from the X chromosome and the autosomes. These analyses give us a more complete picture of the presence and effect of human sex-biased demography and can be easily applied to other organisms.
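For context, the conventional relationship between Q and the proportion of females can be written down in a few lines. The sketch below uses the textbook expectation for a constant-size population with Poisson-distributed offspring numbers, Q = 9 / (8 (2 - p)), where p is the fraction of breeders that are female. It illustrates the Q-based estimator the abstract critiques, not the authors' likelihood ratio test, and the function names are made up for this example.

```python
def expected_q(female_fraction):
    """Expected X-to-autosome diversity ratio for a given fraction of
    breeding females, assuming constant population size (the very
    assumption the abstract warns can bias this estimator)."""
    if not 0.0 <= female_fraction <= 1.0:
        raise ValueError("female_fraction must lie in [0, 1]")
    return 9.0 / (8.0 * (2.0 - female_fraction))

def female_fraction_from_q(q):
    """Invert the relationship above: p = 2 - 9 / (8 Q).

    Valid only for Q between 9/16 (all breeders male) and 9/8
    (all breeders female)."""
    if not 9.0 / 16.0 <= q <= 9.0 / 8.0:
        raise ValueError("q is outside the range this model can produce")
    return 2.0 - 9.0 / (8.0 * q)

# Equal numbers of breeding males and females give the familiar Q = 3/4.
balanced_q = expected_q(0.5)
```

A Q observed after a size change can fall outside the 9/16 to 9/8 range or map to a misleading proportion, which is consistent with the motivation given above for a joint, model-based test.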
Statistical Inference of Archaic Introgression In Central African Pygmies
PingHsun Hsieh1, Jeffrey Wall2, Joseph Lachance3, Sarah Tishkoff3, Ryan Gutenkunst1, Michael Hammer1
1University of Arizona, Tucson, AZ, USA, 2University of California, San Francisco, CA, USA, 3University of Pennsylvania, Philadelphia, PA, USA
Recent evidence from ancient DNA studies suggests that genetic material introgressed from archaic forms of Homo,
such as Neanderthals and Denisovans, into the ancestors of contemporary
non-African populations. These findings also imply that hybridization
may have given rise to some of the adaptive novelties in anatomically modern
human (AMH) populations as they expanded from Africa into various
ecological niches in Eurasia. Within Africa, fossil evidence suggests
that AMH and a variety of archaic forms coexisted for much of the last
200,000 years. Here we present preliminary results leveraging high
quality whole-genome data (>60X coverage) for three contemporary
sub-Saharan African populations (Biaka, Baka, and Yoruba) from Central
and West Africa to test for archaic admixture. With the current lack of
African ancient DNA, especially in Central Africa due to its rainforest
environment, our statistical inference approach provides an alternative
means to understand the complex evolutionary dynamics among groups of
the genus Homo.
To identify candidate introgressive loci, we scan the genomes of 16 individuals and calculate S*, a summary statistic that was specifically designed by one of us (JDW) to detect archaic admixture. The significance of each candidate is assessed through extensive whole-genome level simulations using demographic parameters estimated by ∂a∂i to obtain a parametric distribution of S* values under the null hypothesis of no archaic introgression. As a complementary approach, top candidates are also examined by an approximate-likelihood computation method. The admixture time for each individual introgressive variant is inferred by estimating the decay of the genetic length of the diverged haplotype as a function of its underlying recombination rate. A neutrality test that controls for demography is performed for each candidate to test the hypothesis that introgressive variants rose to high frequency due to positive directional selection. The present study represents one of the most comprehensive genomic surveys to date for evidence of archaic introgression into anatomically modern humans in Africa.
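The idea of dating admixture from haplotype length decay can be shown with a deliberately simplified sketch. It assumes that an introgressed tract is broken up by recombination at a constant per-base rate, so that its expected length after t generations is about 1 / (r t) base pairs. This ignores the admixture proportion, rate heterogeneity and detection bias, and it is not the estimator used in the study; the function name and inputs are hypothetical.

```python
def admixture_time_generations(tract_length_bp, recomb_rate_per_bp):
    """Rough time since admixture, in generations, from the length of a
    single introgressed haplotype.

    Assumes the expected tract length is 1 / (r * t), where r is the
    local recombination rate per base pair per generation. This is a
    simplification made for illustration only.
    """
    if tract_length_bp <= 0 or recomb_rate_per_bp <= 0:
        raise ValueError("length and recombination rate must be positive")
    return 1.0 / (tract_length_bp * recomb_rate_per_bp)

# Hypothetical tract: 50 kb in a region recombining at 1e-8 per bp per
# generation.
t_hat = admixture_time_generations(50_000, 1e-8)
```

Because a single tract length is a very noisy observation, an estimate of this kind would in practice be pooled over many tracts or embedded in a likelihood framework.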
Tracing the genetic ancestry of enslaved Africans using ancient DNA
Hannes Schroeder1,2, María C. Ávila-Arcos1,4, Pontus Skoglund3, Meredith Carpenter4, Anna Sapfo Malaspinas1, Marcela Sandoval-Velasco1, Jose Víctor Moreno-Mayar1, Morten Rasmussen1,4, Jay B. Haviser2, Ludovic Orlando1, Antonio Salas5, Carlos Bustamante4, Mattias Jakobsson3, M. Thomas P Gilbert1
1Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark, 2Faculty of Archaeology, Leiden University, Leiden, The Netherlands, 3Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden, 4Center for Computational, Evolutionary and Human Genomics, Stanford, California, USA, 5Instituto de Ciencias Forenses 'Luís Concheiro', Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Between the 16th and 19th centuries, over 12 million Africans
were kidnapped in Africa and transported to the Americas as a result of
the transatlantic slave trade. The captives were taken from
various parts of mainly West and West Central Africa but their precise
origins often remained unknown or were deliberately obscured. In this
study, we sequenced enriched DNA libraries from 17th century remains of
three enslaved Africans, who had died on the Caribbean island of Saint
Martin, in an attempt to trace their ancestral origins in Africa. Our
results show that the three captives, who had been buried together, are
genetically related to different populations in Africa, including Bantu
and non-Bantu speakers. This suggests that they might have originated
from different parts of Africa and reflects upon the nature of the
transatlantic slave trade and its role in shaping the population history
of the Americas.
Accurate estimates of heterozygosity in 135 diverse human populations
Niru Chennagiri1,2, Swapan Mallick1,2, Nick Patterson2,1, Susanne Nordenfelt1, Arti Tandon1,2, Iosif Lazaridis1, Guillermo del Angel2, Gabriel Renaud3, Udo Stenzel3, Brenna Henn4, Antti Sajantila5, Aashish Jha6, Richard Villems15, Michael Hammer8, Andres Ruiz-Linares9, Robert Mahley10, Toomas Kivisild11, Sarah Tishkoff12, Lynn Jorde13, Rem Sukernik14, Mait Metspalu15, Svante Pääbo3, Janet Kelso3, David Reich1,2, Simons Genome Diversity Project Consortium16
1Harvard Medical School, Boston MA, USA, 2Broad Institute of Harvard and MIT, Boston MA, USA, 3Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany, 4Stony Brook University, Stony Brook NY, USA, 5University of Helsinki, Helsinki, Finland, 6University of Chicago, Chicago IL, USA, 7Departament de Ciències Experimentals i de la Salut, Barcelona, Spain, 8University of Arizona, Tucson AZ, USA, 9University College, London, UK, 10University of California San Francisco, San Francisco CA, USA, 11University of Cambridge, Cambridge, UK, 12University of Pennsylvania, Philadelphia PA, USA, 13University of Utah, Salt Lake City Utah, USA, 14Russian Academy of Science Siberian Branch, Novosibirsk, Russia, 15Estonian Biocentre, Tartu, Estonia, 16Simons Foundation, New York, USA
Worldwide human variation studies have established that
heterozygosity (genetic diversity) decreases as a function of geographic
distance from East Africa (Ramachandran et al. PNAS 2005). However,
previous estimates of heterozygosity have been based on microsatellites
or linkage disequilibrium, resulting in numbers that are biased or
limited in their resolution.
We have generated whole genome sequences (>30x average) in 280 individuals from 135 worldwide populations, using an identical protocol at a single facility (Illumina, Ltd.). In addition, we have built an informatics pipeline geared towards population genetics that eliminates biases in standard pipelines that might confound population genetics analyses.
We compute a maximum likelihood estimate for the population mutation rate (heterozygosity) in each population using mlrho (Haubold et al. Mol. Ecol. 2010). This provides precise information about how heterozygosity varies across diverse worldwide human populations. These data can be used to test more powerfully the extent to which a serial founder model is sufficient to explain the empirically observed decline in heterozygosity with distance from Africa.
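As a rough illustration of the quantity being estimated, per-genome heterozygosity can be approximated by the fraction of called sites that are heterozygous. The sketch below is a naive count, not mlrho's maximum-likelihood estimator (which accounts for sequencing error); the genotype encoding and function name are assumptions made for this example.

```python
def naive_heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    genotypes: iterable of (allele1, allele2) tuples, with None marking a
    site that has no confident genotype call.
    """
    called = 0
    heterozygous = 0
    for genotype in genotypes:
        if genotype is None:
            continue  # skip missing calls rather than guessing zygosity
        called += 1
        if genotype[0] != genotype[1]:
            heterozygous += 1
    if called == 0:
        raise ValueError("no called sites")
    return heterozygous / called


# Five sites, one missing: two of the four called sites are heterozygous.
example = naive_heterozygosity(
    [("A", "A"), ("A", "G"), None, ("C", "C"), ("T", "C")]
)
```

A count of this kind is sensitive to genotyping error and coverage, which is one reason a likelihood-based estimator is preferable for cross-population comparisons.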
Genomes from late hunter-gatherers and an early farmer from Europe reveal three ancestral populations for modern Europeans
Iosif Lazaridis1,2, Nick Patterson2, Alissa Mittnik3, Gabriel Renaud4, Swapan Mallick1,2, Peter H. Sudmant5, Joschua G. Schraiber6, Sergi Castellano4, Karola Kirsanow7, Christos Economou8, Ruth Bollongino7, Mait Metspalu9, Matthias Meyer4, Evan E. Eichler5, Joachim Burger7, Montgomery Slatkin6, Svante Pääbo4, Janet Kelso4, David Reich1,2, Johannes Krause3, for the Ancient European Genomes Consortium1,3
1Department of Genetics, Harvard Medical School, Boston, MA, USA, 2Broad Institute, Cambridge, MA, USA, 3Institute for Archaeological Sciences, University of Tübingen, Tübingen, Germany, 4Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany, 5Department of Genome Sciences, University of Washington, Seattle, WA, USA, 6Department of Integrative Biology, University of California, Berkeley, CA, USA, 7Johannes Gutenberg University Mainz, Institute of Anthropology, Mainz, Germany, 8Archaeological Research Laboratory, Stockholm University, Stockholm, Sweden, 9Estonian Biocentre, Evolutionary Biology group, Tartu, Estonia
We sequenced two ancient Europeans from around the time of the
Neolithic transition: a ~7.5 thousand year old Linear Pottery farmer
from Stuttgart, Germany, and an ~8 thousand year old Mesolithic
hunter-gatherer from the Loschbour rock shelter, Luxembourg. We also
sequenced at lower coverage seven ~8 thousand year old Mesolithic
hunter-gatherers from Motala, Sweden. We co-analyzed the data from these
ancient Europeans with a published ~7 thousand year old Mesolithic
Iberian genome from La Brana-Arintero, Spain, a ~24 thousand year old
Paleolithic Siberian from Mal'ta, Russia, other lower quality ancient
European genomes, and a large dataset of present-day humans genotyped on
the Affymetrix Human Origins Array.
Our main findings are: (i) early European farmers were of mainly Near Eastern ancestry but with substantial European hunter-gatherer ancestry; (ii) European hunter-gatherers fall outside extant European variation in the direction of Near Eastern-European differentiation; (iii) most modern Europeans do not appear to be a simple mixture of the early European farmers and hunter-gatherers, but rather to have ancestry from at least three ancestral populations: (a) EEF: early European farmers (like the Stuttgart individual), (b) WHG: west European hunter-gatherers (like the Loschbour and La Brana individuals), and (c) ANE: ancient North Eurasians (like the Mal'ta individual). Mediterranean populations like Sardinians most closely resemble EEF individuals, while Baltic populations like Lithuanians most closely resemble WHG individuals.
Unexpectedly, all present-day eastern non-African groups (Oceanians, East Asians, Onge from the Indian Ocean, and Native Americans) are genetically closer to Eurasian hunter-gatherer groups than to the Stuttgart individual. We propose a model of Eurasian prehistory in which EEF possessed a fraction of ancestry from a basal Eurasian population that split off from other Eurasians prior to the split between Eurasian hunter-gatherers and eastern non-Africans.
The Scandinavian Motala hunter-gatherers are the only ancient population showing evidence of ANE ancestry, yet such ancestry is pervasive in present-day populations from both Europe and the Near East. This suggests that ANE ancestry spread across much of West Eurasia after the early Neolithic. Additional migrations from the Near East and East Eurasia affected more limited subsets of Europeans from the Mediterranean and Northeastern Europe respectively.
Our results suggest a dynamic history of the emergence of modern Europeans in which the Neolithic-Mesolithic admixture played a major role, but was supplemented by later admixture processes.
Ancient DNA Insights into the Population History of Seafaring Mid-Holocene Hunter-Gatherers on the Gulf of Maine
Alexander Kim1,2, Susanne Nordenfelt1,2, Nadin Rohland1,2, Nick Patterson1,2, Michèle Morgan3, Steven LeBlanc3, David Reich1,2
1Department of Genetics, Harvard Medical School, Boston, MA, USA, 2Broad Institute of Harvard and MIT, Cambridge, MA, USA, 3Peabody Museum of Archaeology and Ethnology, Harvard University, Cambridge, MA, USA
The Red Paint People, a remarkable manifestation of the Moorehead
Phase (c. 4500-3800 YBP), are an enigmatic pre-Columbian culture of
northeastern North America famed for their distinctive technology and
elaborate, strikingly characteristic ceremonial practices — including
ochre-laden burials, ritual ground-slate bayonets, the hunting of
swordfish and other marine megafauna, and what are potentially the
oldest known tumuli and toggling harpoons ever discovered on the
continent. As one of the earliest maritime cultures on the eastern
seaboard of North America, their extraordinary flowering, abrupt
archaeological disappearance, and situation in a long-range
transportation network of artifacts and raw materials stretching as far
as the Great Lakes evoke numerous questions about seaborne dispersal
capability, coast-interior connectivity, and the extents of genetic
continuity or overturn into and through the Archaic of New England and
Atlantic Canada. We report, for the first time, mitochondrial and
preliminary genome-wide autosomal data from ancient Moorehead Phase
skeletal remains recovered from the Nevin site, a shell midden at Blue
Hill Falls, Maine, and situate this locality and its inhabitants in the
context of earliest North American settlement, patterns of gene flow at
continental and subcontinental scales, and the panorama of social and
ecological specialization by forager populations along Holocene North
America's Atlantic littoral.
Population history of South America: ancient DNA study of extinct people from Tierra del Fuego
Zuzana Faltyskova1, Hannes Schroeder2,3, Carles Lalueza4, Yolanda Espinoza4, Elena Gigli4, Oscar Ramirez4, Alfredo Prieto5,6, Susana Morano5, David Caramelli7, Elena Pilli7, Alessandra Modi7, Giorgio Manzi7, Alessandro Pietrelli8, Ermanno Rizzi8, Aurelio Marangoni9, Guido Barbujani10, Silvia Ghirotto10, Toomas Kivisild1, Maru Mormina1,11
1Division of Biological Anthropology, University of Cambridge, Cambridge, UK, 2Centre for GeoGenetics, University of Copenhagen, Copenhagen, Denmark, 3Faculty of Archaeology, Leiden University, Leiden, The Netherlands, 4Institute of Evolutionary Biology, Pompeu Fabra University, Barcelona, Spain, 5Institute of Patagonia, University of Magallanes, Punta Arenas, Chile, 6Autonomous University of Barcelona, Barcelona, Spain, 7Department of Biology, University of Florence, Florence, Italy, 8ITB CNR Institute for Biomedical Technologies, National Research Council, Milan, Italy, 9Department of Environmental Biology, University of Rome La Sapienza, Rome, Italy, 10Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy, 11Department of Archaeology, University of Winchester, Winchester, UK
The details of the early human settlement of the Americas such as
the dispersal time, number of migrations, and migration routes remain
subject to debate. With many Native populations now extinct, the
Pre-Columbian genetic make-up has been partly lost or blurred by recent
admixture. The present study examines the mitochondrial genetic
diversity of extinct Fuegian populations in order to illuminate the
population history of South America.
The Fuegians lived on the islands of Tierra del Fuego in the Southern Cone of South America in isolation from other Native Americans until their extinction at the beginning of the 20th century, likely maintaining their original genetic signature without recent admixture. Based on the Fuegian robust cranial morphology, a few controversial studies have suggested that Fuegians might be descendants of a putative earlier migration wave preceding the arrival of the other Native Americans.
Using target enrichment and next-generation sequencing, we obtained complete mitochondrial genomes from skeletal remains of 37 Fuegians and 19 individuals from adjacent Patagonia. Comparing them to published sequences of other Native Americans, we estimated the divergence times and past population dynamics in the Southern Cone and we assessed the question of population continuity in the region. The coalescent ages of deep Fuegian-specific clades suggest early human settlement in Tierra del Fuego, probably associated with the initial peopling of the continent. The early arrival of Fuegians to the Southern Cone is consistent with the generally accepted scenario of rapid coastal dispersal throughout the Americas, which is further supported by the presence of Monte Verde, the oldest known South American pre-Clovis archaeological site, in Chilean Patagonia. In this presentation, alternative views on Fuegian origins and their genetic affinities with other Native Americans are considered in the context of the evolutionary history of South American populations.
Hundreds of shared ‘deletions’ in ancient hominins are polymorphic in modern human populations
David Radke1,2, Shamil Sunyaev1,2
1Harvard Medical School, Boston, MA, USA, 2Brigham and Women's Hospital, Boston, MA, USA
Deciphering the genetic uniqueness of modern humans in relation
to distant hominins and other primates is one of the central goals of
human evolutionary genomics. Recently, with the availability of
high-coverage sequence data for both Neanderthal and Denisova, it is now
possible to more precisely determine the particular loci responsible
for modern human uniqueness. While much of the distinguishing variation
may be due to single nucleotide variants, genomic structural variants
may also play a crucial role. Structural variants can be a potent
phenotype-shaping force, particularly for unbalanced events, such as
deletions, as they can alter reading frames and remove regulatory
component space. We find hundreds of ‘deleted’ regions in Neanderthal
and Denisova (including shared deletions), compared to the modern human
reference. Because these deletions are polymorphic in modern human
populations, they may represent regions of modern human-specific
insertion, regions lost in archaic human lineages, or regions influenced
by forces such as drift or selection.
What changes matter? A genomic approach to human evolution
Nicolas Rohner1, Michael Zody2, David Reich1, Steven McCarroll1, Daniel Lieberman3, Clifford Tabin1
1Harvard Medical School, Boston, USA, 2Broad Institute of MIT and Harvard, Cambridge, USA, 3Harvard University, Cambridge, USA
We humans and our closest relatives, the chimpanzees, differ in only
1-2% of our genomes. Despite this genetic similarity we differ in
many anatomical and behavioral traits. Upright walking and larger brains
are just two prominent examples amongst many others that allowed us to
adapt to new environments. Although full genome sequences are now
available for humans, chimpanzees and other primates, surprisingly
little is known about the genetic basis underlying these traits. One
reason is that even within a 1-2% difference lie many genetic changes
potentially driving human evolution. Because open reading frames of
genes tend to be very similar between great apes, it has been argued
that the majority of significant evolutionary changes are
cis-regulatory mutations. To identify regulatory changes specific to the
human lineage we undertook a whole genome approach by aligning human,
chimpanzee, macaque, and mouse genomes and focusing on conserved
non-coding regions. We identified 298 human-specific deletions
potentially removing cis-regulatory elements. We used a mouse transgenic
approach to test if the deletions affect enhancer activity. Indeed, out
of 12 tested elements, 4 showed tissue-specific expression at diverse
developmental stages. We focused on two human-specific deletions for
further study. The first removes an enhancer element near the gene OSR2,
and its expression argues for a role in human palate, cranial base and
jaw development. The second deletion removes a regulatory element in the
gene ACVR2A. Its expression pattern and the phenotype of a full
knockout of ACVR2A in mouse point to its role in the human-specific
shortening of digits 2-5 and the smaller size of upper incisors in
humans. We are currently mimicking the human situation by removing the
corresponding piece in each of two different mouse models to test the
ability to generate human-like phenotypes.
Reproducibility of ancient DNA methodologies within a single laboratory
Eadaoin Harney1,2, Susanne Nordenfelt2, Nadin Rohland2, David Reich1,2
1Howard Hughes Medical Institute, Boston, MA, USA, 2Harvard Medical School, Boston, MA, USA, 3Broad Institute of Harvard and MIT, Cambridge, MA, USA
Recent advances in DNA extraction and targeted
(enrichment) capture methods make it possible to study whole
mitochondrial genomes or subsets of the nuclear genomes of ancient
samples with degraded and/or low levels of endogenous DNA. However,
relatively little is published about the reproducibility of the data
collected within, or even between, labs using the same methods. We
report on the degree of reproducibility observed in replicate samples
processed during ongoing screening of ancient skeletal remains in our
own lab. We are focusing on ancient human bones and teeth—the most
abundant type of ancient remains—dating from 3000-9000 years ago, which
have each undergone multiple bone powder preparations, DNA extractions,
and/or library preparations. As part of our screening, we enrich all
libraries for the complete mitochondrial genome, and sequence the
enriched and un-enriched libraries on a MiSeq Sequencer. We compare
relevant metrics such as percent endogenous reads, contamination rate,
and mitochondrial coverage at a fixed number of reads to assess the
degree of reproducibility for these samples. An important finding of our
work to date is that we obtain relatively little variability in terms
of library preparation success (for example as measured by a variation
in percentage of endogenous DNA of less than 5%) when applying identical
protocols to the same bone powder. The findings of this ongoing study
will shed light on the degree of reproducibility inherent in our
laboratory’s ancient DNA processing, and may help to assess the degree
of optimization of these screening methodologies.
Ancient DNA reveals the complex genetic history of the New World Arctic
Maanasa Raghavan1, Pontus Skoglund2, Michael DeGiorgio6, Anders Albrechtsen4, Ida Moltke5, Helena Malmström2, M. Thomas P. Gilbert1, Mattias Jakobsson2, Rasmus Nielsen3, Eske Willerslev1
1Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen, Denmark, 2Uppsala University, Uppsala, Sweden, 3University of California - Berkeley, Berkeley, California, USA, 4University of Copenhagen, Copenhagen, Denmark, 5University of Chicago, Chicago, Illinois, USA, 6Pennsylvania State University, University Park, Pennsylvania, USA
The New World Arctic (North America and Greenland) was first occupied by modern humans around 5,000 years ago. The PaleoEskimos constituted the first two cultures to have peopled the region: the Pre-Dorset or Saqqaq culture (ca. 3000-800 BC) and the Dorset culture (ca. 800 BC-1300 AD). The NeoEskimos (Thule culture), who are considered to be ancestral to modern-day Inuit, were the latest migrants into the New World Arctic and spread eastwards from northern Alaska around 1000 AD. However, although decades of archaeological research have established when the cultural transitions occurred, there is no consensus on how these people were related to one another or on whether one or several gene pools were represented in these different Arctic traditions. We present results from an ongoing study comprising the largest genomic dataset generated thus far on ancient human samples from sites in Siberia, Alaska, Canada and Greenland. Our research contributes new perspectives to the debate on cultural versus genetic replacement in the New World Arctic and also evaluates the extent to which the PaleoEskimos and the NeoEskimos have shaped the genetic structure of modern populations in the region.
Bayesian methods for estimating homozygous tracking length distribution (HTLD) from single individuals
Xiaoqian Jiang, Michael Lynch
Indiana University, Bloomington, IN, USA
HTLD refers to the frequency distribution of the lengths of spans separating
heterozygous sites, which harbors information on past population
history. At high coverage and low error rate, HTLD could be obtained by
simply assessing consensus genotypes at each site. However, with most
genome data, uncertainties will exist as to whether sites are homozygous
or heterozygous. In this project, a Bayesian method has been developed
for estimating HTLD in an unbiased fashion. The genome-wide estimate of
individual heterozygosity obtained from a likelihood method serves
as the prior for the Bayesian method, and an EM algorithm is then
used to fill in the missing genome data. Compared to the previous arbitrary way
of assigning zygosity to sites with missing data, this method could
provide more accurate information on HTLD. This more accurate HTLD
allows further investigation into the demographic history. In this
project, further mathematical methods will be developed to re-infer the
pattern of population history from both simulated data and individual
genome sequence. Furthermore, we will compare the relative power of the
estimation of HTLD and correlation of heterozygosity in inferring
information about population linkage-disequilibrium pattern.
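The simple high-coverage case described above, where zygosity is known at every site, can be sketched as follows. This is only the naive baseline, not the Bayesian/EM method of the abstract. It assumes, for illustration, that a tract length is the number of sites strictly between two adjacent heterozygous positions; the function names are hypothetical.

```python
from collections import Counter


def homozygous_tract_lengths(het_positions):
    """Lengths of homozygous spans between adjacent heterozygous sites.

    het_positions: genomic coordinates of heterozygous sites on one
    chromosome. Length is counted as the number of sites strictly between
    two neighbouring heterozygous positions (an assumed convention).
    """
    positions = sorted(set(het_positions))
    return [right - left - 1 for left, right in zip(positions, positions[1:])]


def tract_length_distribution(het_positions):
    """Empirical frequency of each tract length (empty dict if no tracts)."""
    lengths = homozygous_tract_lengths(het_positions)
    if not lengths:
        return {}
    counts = Counter(lengths)
    return {length: n / len(lengths) for length, n in sorted(counts.items())}


# Heterozygous sites at 1, 3, 5 and 10 give tracts of length 1, 1 and 4.
distribution = tract_length_distribution([1, 3, 5, 10])
```

With uncertain genotype calls, every miscalled site splits or merges tracts, which is why the naive count becomes biased at lower coverage.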
Predicting the discovery rate of genomic features
Simon Gravel
McGill University, Montreal, Qc, Canada
Successful sequencing experiments require judicious sample
selection. However, this selection must often be performed on the basis
of limited preliminary data. Predicting the statistical properties of
the final sample based on preliminary data can be challenging, because
numerous uncertain model assumptions may be involved. Here, we ask
whether we can predict "omics" variation across many samples by
sequencing only a fraction of them. In the infinite-genome limit, we
find that a pilot study sequencing 5% of a population is sufficient to
predict the number of genetic variants in the entire population within
6% of the correct value, using an estimator agnostic to demography,
selection, or population structure. To reach similar accuracy in a
finite genome with millions of polymorphisms, the pilot study would
require about 15% of the population. We present computationally
efficient jackknife and linear programming methods that exhibit
substantially less bias than the state of the art when applied to
simulated data and sub-sampled 1000 Genomes Project data. Extrapolating
based on the NHLBI Exome Sequencing Project data, we predict that 7.2%
of sites in the capture region would be variable in a sample of 50,000
African-Americans, and 8.8% in a European sample of equal size. Finally,
we show how the linear programming method can also predict discovery
rates of various genomic features, such as the number of transcription
factor binding sites across different cell types.
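To give a flavor of jackknife-based extrapolation, the sketch below implements the classical first-order jackknife richness estimator (observed count plus the number of variants seen in exactly one sample, scaled by (n-1)/n). This is a textbook estimator used purely for illustration; it is not the specific jackknife or linear programming method presented in this abstract, and the input layout is an assumption.

```python
from collections import Counter


def first_order_jackknife(samples):
    """First-order jackknife estimate of the total number of variants.

    samples: list of sets, each holding the variant identifiers carried by
    one sequenced sample. Returns S_obs + Q1 * (n - 1) / n, where S_obs is
    the number of distinct variants observed and Q1 is the number of
    variants found in exactly one sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Count in how many samples each variant appears.
    carriers = Counter(variant for sample in samples for variant in set(sample))
    observed = len(carriers)
    uniques = sum(1 for count in carriers.values() if count == 1)
    return observed + uniques * (n - 1) / n


# Four samples: "a" is shared by all, "b", "c" and "d" each appear once.
pilot = [{"a", "b"}, {"a", "c"}, {"a", "d"}, {"a"}]
estimate = first_order_jackknife(pilot)
```

Many variants seen in a single pilot sample signal that further sequencing will keep discovering new ones, so the estimate rises above the observed count.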
Genetic data supporting an East-South African migration associated with pastoralism
Gwenna Breton1,2, Mattias Jakobsson1,4, Carina Schlebusch1, Himla Soodyall3
1Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden, 2Master Biosciences, École Normale Supérieure de Lyon, Lyon, France, 3Division of Human Genetics, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, and National Health Laboratory Service, Johannesburg, South Africa, 4Science for Life Laboratory, Uppsala University, Uppsala, Sweden
The ability to digest milk into adulthood, lactase persistence,
is heterogeneously distributed. As an example, pastoralist populations
often display higher frequencies of lactase persistence. Lactase
persistence is considered adaptive in populations with pastoralist
practices. The characterization of lactase persistence in southern
Africa is poor. By sequencing a 360 bp region in southern Africans we
characterized the lactase-persistence genotype of these groups; to
confirm the inferred origin of the alleles, we performed a
genome-wide analysis of relationships to other groups.
We sequenced the LCT regulatory region in 267 individuals from 13 populations: 7 Khoe and San groups, the ancestral inhabitants of southern Africa, 3 Bantu-speaking groups and 3 groups with mixed ancestry. Those groups have diverse subsistence patterns. We then searched for signals of past East-South African admixture events using DNA chip data including many Eastern and Southern African populations as well as HapMap reference populations.
We found two previously described lactase persistence alleles in our sample: the European 13910C>T allele in individuals with recent European admixture and the East African 14010G>C allele in the Nama (at a frequency of 35.7% if recently admixed individuals are removed) and in other groups at lower frequency. The Nama are a Khoe group and are pastoralist. To learn about the origin of this variant in southern Africans, we analysed a 54.6 kb window of DNA chip data including the two SNPs. It showed that the 14010C allele in the Nama is on the same haplotype as in the East African Maasai; hence, we concluded that the allele appeared only once, likely in East Africans (where its frequency is greater), and was then brought to southern Africa. Using an ADMIXTURE analysis, we identified an East African component in several Khoe-San groups; again, the highest percentage of East African ancestry (~13%) is found in the Nama. This admixture event likely took place after the 14010C allele appeared in East Africans, i.e. ~3,000–7,000 years BP.
In a nutshell, we investigated a South-East African migration event combining information on a single trait and genome-wide data. This event explains the presence of an East African allele in Khoe-San groups. The groups with the largest East African component are the pastoralist groups, in which being able to digest milk is advantageous. Our findings provide new elements about ancestral migrations and spreading of pastoralism in Africa and complement conclusions of other fields, like archaeology.
Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis
Eric Durand, Nicholas Eriksson, Cory McLean
23andMe, Inc., Mountain View, CA, USA
Analysis of genomic segments shared identical-by-descent (IBD)
between individuals is fundamental to many genetic applications, from
demographic inference to estimating the heritability of diseases. A
large number of methods to detect IBD segments have been developed
recently. However, IBD detection accuracy in non-simulated data is
largely unknown. In principle, it can be evaluated using known
pedigrees, as IBD segments are by definition inherited without
recombination down a family tree. We extracted 25,432 genotyped European
individuals containing 2,952 father-mother-child trios from the
23andMe, Inc. dataset. We then used GERMLINE, a widely used IBD
detection method, to detect IBD segments within this cohort. Exploiting
known familial relationships, we identified a false positive rate over
67% for 2–4 centiMorgan (cM) segments, in sharp contrast with accuracies
reported in simulated data at these sizes. We show that nearly all
false positives arise due to allowing switch errors between haplotypes
when detecting IBD, a necessity for retrieving long (> 6 cM) segments
in the presence of imperfect phasing. We introduce HaploScore, a novel,
computationally efficient metric that enables detection and filtering
of false positive IBD segments on population-scale datasets. HaploScore
scores IBD segments proportional to the number of switch errors they
contain. Thus, it enables filtering of spurious segments reported due to
GERMLINE being overly permissive to imperfect phasing. We replicate the
false IBD findings and demonstrate the generalizability of HaploScore
to alternative genotyping arrays using an independent cohort of 555
European individuals from the 1000 Genomes project. HaploScore can be
readily adapted to improve the accuracy of segments reported by any IBD
detection method, provided that estimates of the genotyping error rate
and switch error rate are available.
What can we learn from the methylation maps of ancient humans?
David Gokhman1, Eitan Lavi1, Kay Prufer2, Mario Fraga3, Jose Riancho4, Janet Kelso2, Svante Paabo2, Eran Meshorer1, Liran Carmel1
1The Hebrew University of Jerusalem, Jerusalem, Israel, 2Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany, 3University of Oviedo and CNB-CSIC, Oviedo, Spain, 4University of Cantabria, Santander, Spain
We have previously presented the full reconstructed DNA
methylation maps of the Neandertal and the Denisovan. Here, we use these
maps to reveal trends in recent hominin epigenetic evolution. We show
that the methylation patterns of transcription start sites (TSS) are the
most conserved, and that the distance from TSS highly correlates
with variation in methylation.
Additionally, we found several genes that are imprinted in present-day humans but are methylated in archaic humans. This includes H19, a gene that encodes a long non-coding RNA that is maternally imprinted in present-day humans. When imprinting is damaged, methylation of this gene causes Beckwith-Wiedemann syndrome, whose symptoms include growth dysregulation, increased susceptibility to cancer and facial features such as a prominent lower jaw and midfacial hypoplasia. Unlike in present-day humans and the Denisovan, in the Neandertal the promoter of H19 and the imprinting-control region (ICR) are both completely methylated. Methylation of the H19 promoter was previously shown to anti-correlate with its expression levels, suggesting that H19 might have had reduced activity in the Neandertal. Interestingly, H19 was also found to be differentially methylated in orangutans. This gene is one of several examples where altered methylation in present-day humans results in abnormal symptoms, whereas in the Neandertal, to our knowledge, the symptoms do not manifest.
Another differentially methylated gene between archaic and modern humans is AUH. Defects in AUH are behind the methylglutaconic aciduria type I syndrome, whose symptoms include speech delay, poor articulation, and forgetfulness. This gene is unmethylated in present-day humans, but is methylated in archaic humans, suggesting differential regulation in both archaic humans. As this gene shows constant methylation levels across 25 human tissues, it is possible that these differences in methylation extend to the brain tissue as well.
Such trends in methylation shed light on the evolutionary constraints that are behind epigenetic regulation in the human lineage and on the mechanisms that lead to disease symptoms in one human group and to a completely healthy individual in another.
A model-based approach for identifying signatures of ancient balancing selection in genetic data
Michael DeGiorgio1, Kirk Lohmueller2, Rasmus Nielsen3
1Pennsylvania State University, University Park, PA, USA, 2University of California, Los Angeles, Los Angeles, CA, USA, 3University of California, Berkeley, Berkeley, CA, USA
While much effort has focused on detecting positive and negative
directional selection in the human genome, relatively little work has
been devoted to balancing selection. This lack of attention is likely
due to the paucity of sophisticated methods for identifying sites under
balancing selection. We designed the first set of likelihood-based
methods that explicitly model the genealogical process under balancing
selection using a coalescent framework. Simulation results show that our
methods for detecting balancing selection vastly outperform previous
approaches based on summary statistics and are robust to demography. We
apply the new methods to whole-genome sequencing data from humans, and
find a number of previously-identified loci with strong evidence of
balancing selection, including various HLA genes. Additionally, we find
evidence for many novel candidates, the strongest of which is FANK1,
an imprinted gene that suppresses apoptosis, is expressed during
meiosis in males, and displays marginal signs of segregation distortion.
We hypothesize that balancing selection acts on this locus to stabilize
the segregation distortion and negative fitness effects of the
distorter allele. Not only are our methods for identifying signatures of
balancing selection the most powerful developed to date, but they can
also be applied to any organism with polymorphism data and an outgroup
sequence. As such, we expect that our methods will be widely used by the
genomics community to uncover the potentially numerous genomic regions
that are under balancing selection in many non-human species.
Ancient DNA and the population history of pre-Columbian Puerto Rico
Maria A Nieves-Colon1, William J Pestle2, Anne C. Stone1
1Arizona State University, Tempe, AZ, USA, 2University of Miami, Coral Gables, FL, USA
The population history of the Caribbean has been recently studied
through the use of large scale genome-wide studies on modern
populations. However, there are inherent limitations to the use of
modern data for making inferences about past population processes.
Ancient DNA may be a useful tool for elucidating the contributions of
indigenous pre-Columbian populations to the genomes of contemporary,
highly admixed Caribbean islanders, as well as for studying the history
of ancient Amerindian peoples in the Caribbean basin.
We present the results of pilot research focused on retrieving ancient DNA from human skeletal remains from Puerto Rico. We performed DNA extraction on 43 individuals from three pre-Columbian Puerto Rican sites dated between 590 and 1280 cal AD. We tested our extracts for the presence of ancient DNA through PCR amplification of an 80 bp fragment of mitochondrial DNA (mtDNA). This preliminary assessment indicates that 42% (n=18) of our samples have amplifiable mtDNA.
However, extensive DNA fragmentation and degradation may affect amplification efficiency in these samples. With the aim of overcoming these issues, we converted 18 of our extracts into sequencing libraries, and enriched them by targeting complete mitochondrial genomes. Preliminary quality assessments with fragment analysis and quantitative PCR methods suggest that we have successfully captured ancient mtDNA in no less than nine of our sequencing libraries.
The recovery of complete mtDNA genomes from these individuals will allow us to begin to characterize the genetic diversity and population history of a pre-Columbian Antillean population. These data may also be used to help estimate the contribution that these ancient groups played in shaping the genetic ancestry of modern Puerto Ricans.
A genomic study of the contribution of DNA methylation to regulatory evolution in primates
Julien Roux1, Irene Hernando-Herraez2, Nicholas Banovich1, Claudia Chavarria1, Amy Mitrano1, Jonathan Pritchard3,4, Tomas Marques-Bonet2, Yoav Gilad1
1University of Chicago, Chicago, USA, 2Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain, 3Howard Hughes Medical Institute, Stanford University, Stanford, USA, 4Departments of Biology and Genetics, Stanford University, Stanford, USA
A long-standing hypothesis is that changes in gene regulation
play an important role in adaptive evolution, notably in primates. Yet,
in spite of the evidence accumulated in the past decade that regulatory
changes contribute to many species-specific adaptations, we still know
remarkably little about the mechanisms of regulatory evolution. In this
study we focused on DNA methylation, an epigenetic mechanism whose
contribution to the evolution of gene expression remains unclear. To
interrogate the methylation status of the vast majority of cytosines in
the genome, we performed whole-genome bisulfite conversion followed by
high-throughput sequencing across 4 tissues (heart, kidney, liver and
lung) in 3 primate species (human, chimpanzee and macaque). Because the 4
tissues are from the same individuals, we are able to monitor
methylation differences between individuals, tissues and species. In
parallel, we collected gene expression profiles using RNA-seq from the
same tissue samples, allowing us to perform a high resolution scan for
genes and pathways whose regulation evolved under natural selection. We
integrated these datasets to characterize better the genome features
whose methylation status leads to expression changes, and we developed a
statistical model to quantify the proportion of variation in gene
expression levels across tissues and species which can be explained by
changes in methylation. Globally, our study leads to a better
understanding of the molecular basis for regulatory changes and
adaptations in primates.
Inferring African population structure and the dynamics of the Out-of-Africa event
Shyam Gopalakrishnan1, Paul Grabowski1, Michael Turchin1, Brenna Henn6, Jeff Kidd4, George Perry2, Cynthia Beall3, A Gebremedhin5, Carlos Bustamante7, Anna Di Rienzo1, Yoav Gilad1, Abraham Palmer1, Jonathan Pritchard7
1University of Chicago, Chicago, IL, USA, 2Penn State University, University Park, PA, USA, 3Case Western Reserve University, Cleveland, OH, USA, 4University of Michigan, Ann Arbor, MI, USA, 5Addis Ababa University, Addis Ababa, Ethiopia, 6SUNY Stony Brook, Stony Brook, NY, USA, 7Stanford University, Stanford, CA, USA
Human population history is an intriguing and complex story
consisting of many events like population growth, bottleneck,
time-dependent and non-homogeneous migration, population splits and
admixture. Estimating complex demographies with a large number of
dependent parameters such as split times, gene flow rates and changing
population sizes, has proven especially challenging. Here we propose a
framework for estimating the demography of a large number of populations
jointly, especially the gene-flow rates and split times between them.
We use coalescent rate estimates obtained from Pairwise Sequentially
Markovian Coalescent (PSMC) as a starting point for our analysis. We
obtain the pairwise coalescent rates for each pair of sampled populations
using a pairwise application of PSMC to each pair of samples. Using a
mathematical model for calculating coalescent probabilities given the
demography, we estimate the demography using the parameters that best
fit the observed coalescent rates obtained from PSMC.
In this study, we focus on African demography, specifically the population structure in Africa going back in time and the dynamics of the Out-of-Africa event. To address these questions, we assembled a dataset of whole genome sequences from 162 individuals, combining in-house sequencing with publicly available sources such as the 1000 Genomes project. These samples span twenty-two populations worldwide. These include eleven African populations, which we use to examine the population substructure in Africa. In addition, we have 2 Middle Eastern, 5 European and 4 East/Central Asian populations, which allow us to estimate the timing of the Out-of-Africa event and the European-Asian split.
We find extensive population structure in Africa extending back to before the Out-of-Africa event. The Ethiopian populations show gene flow back from 15 kya, with the Maasai and Luhya merging with the east African populations ~40 kya. We find evidence for extensive mixing between east and west African populations beginning 50 kya. Among the pygmy populations, we see recent gene flow between the Batwa and Mbuti. All the African populations except for the San merge into a single population around 100 kya. The San exchange migrants with the other African populations starting ~120 kya. We estimate the Out-of-Africa event to have occurred ~75 kya and the European-Asian split ~25 kya. Our findings also suggest a period of sustained gene flow between East African and Middle Eastern populations after the Out-of-Africa event.
Fast, scalable and distributed dimensionality reduction of genome-wide data
Suyash Shringarpure, Carlos Bustamante
Stanford University, Stanford, CA, USA
The increasing size of genomic datasets, especially for
genome-wide association studies (GWAS), presents significant analytical
and computational challenges. Dimensionality reduction methods such as
principal components analysis (PCA) and model-based ancestry inference
are used to obtain informative summaries of genome-wide data that can be
used in GWAS. However, existing methods require considerable
computational time to analyze genomic datasets with tens (or hundreds)
of thousands of individuals genotyped at hundreds of thousands (or
millions) of single nucleotide polymorphisms (SNPs).
We propose random projections as a fast and scalable way of performing dimensionality reduction of large genome-wide SNP datasets. With a sparse implementation, we show that projections can be computed in time linear in the size of the dataset. Using 20,000 individuals simulated from the HapMap Phase 3 CEU, ASW, CHB and YRI populations at 365,466 SNPs, we show that the projected individuals can be used to (a) perform PCA, (b) accelerate convergence of model-based ancestry inference, and (c) compute identity-by-state distance. These projections have the following properties: (a) by definition, the projection directions are independent of the data and hence are robust to outliers; (b) existing projections do not need to be recomputed if individuals are added to or removed from the dataset; (c) the theoretical upper bound on the number of projections required to summarize a dataset is nearly independent of the number of SNPs in the dataset and is proportional to the logarithm of the number of individuals in the dataset. In addition, for large GWAS, where sequencing/genotyping data may be distributed across multiple physical locations, random projections can be computed and shared instead of sharing the genotype data directly. This can reduce data sharing requirements by one to two orders of magnitude.
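As an illustration of the general idea only, the following minimal sketch builds a sparse random projection in the style of Achlioptas (entries +c, 0, -c with probabilities 1/(2s), 1 - 1/s, 1/(2s)) and applies it to genotype vectors coded as 0/1/2 allele counts. The abstract does not specify which sparse construction was used, so the scheme, the function names and the parameters `k`, `s` and `seed` are assumptions made for this example.

```python
import math
import random


def sparse_projection_matrix(n_snps, k, s=3, seed=0):
    """Build k sparse random directions over n_snps SNPs (illustrative).

    Each entry is +sqrt(s/k) or -sqrt(s/k) with probability 1/(2s) each,
    and 0 otherwise. Only the non-zero entries are stored, as
    (snp_index, value) pairs, one list per direction.
    """
    rng = random.Random(seed)  # directions depend on the seed, never on the data
    scale = math.sqrt(s / k)
    columns = []
    for _ in range(k):
        column = []
        for j in range(n_snps):
            u = rng.random()
            if u < 1.0 / (2 * s):
                column.append((j, scale))
            elif u < 1.0 / s:
                column.append((j, -scale))
        columns.append(column)
    return columns


def project(genotypes, columns):
    """Project one individual's genotype vector onto the stored directions."""
    return [sum(genotypes[j] * value for j, value in column) for column in columns]


if __name__ == "__main__":
    n_snps, k = 1000, 20
    columns = sparse_projection_matrix(n_snps, k)
    rng = random.Random(42)
    individual = [rng.choice([0, 1, 2]) for _ in range(n_snps)]
    print(project(individual, columns))
```

Because the directions are drawn once from a seed, a newly added individual can be projected with the same `columns` without touching earlier results, and sites holding the same seed could in principle share projected values rather than genotypes.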
"Genetic Snapshot of "Palaeoamerican Relicts": a characterisation of Fuegan and Pericu populations"
Cristina Valdiosera1,2, María C. Ávila-Arcos1,3, Pontus Skoglund5, Andres Moreno-Estrada3, Ricardo Rodríguez4, Helena Malmström5, Josefina Mansilla6, Morten Allentoft1, Maanasa Raghavan1, Andaine Orlando1, Ilán Leboreiro6, José Luis Vera6, Christoph P. E. Zollikofer7, Marcia S. Ponce de Leon7, Colin Smith2, Carlos Bustamante3, Evelyne Heyer8, Mattias Hakobsson5, Eske Willerslev1
1Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark, 2Department of Archaeology, La Trobe University, Melbourne, Australia, 3Center for Computational, Evolutionary and Human Genomics, Stanford University, Stanford, USA, 4Centro de Investigación Sobre la Evolución y Comportamiento Humanos, Universidad Complutense de Madrid, Madrid, Spain, 5Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden, 6Instituto Nacional de Antropología e Historia, Mexico City, Mexico, 7Anthropological Institute, University of Zurich, Zurich, Switzerland, 8Laboratoire Eco-Anthropologie et Ethnobiologie, Muséum National d'Histoire Naturelle, Centre National de la Recherche Scientifique, Université Paris 7, Heyer, France
Although multiple lines of evidence support the notion that all extant Amerindian populations are of Asian origin, the number of migrations and source populations that gave rise to the first inhabitants of the New World is still contentious. Based on the significant craniofacial discontinuity between the Pleistocene (Paleoamerican) and Holocene (Amerindian) populations, it has been suggested that the Americas were populated twice, from different Asian sources. Under this assumption, a first migration wave originating from Southeast Asia gave rise to the Paleoamericans, whereas all modern Amerindian groups would derive from a second wave of migration originating in Northeast Asia. Pericues in Baja California, Mexico, and the southernmost populations of Patagonia and Tierra del Fuego display Paleoamerican craniofacial traits, leading some researchers to suggest that these are a temporal extension of the first colonizers of the Americas. We have shotgun sequenced DNA from skeletal remains of Pericues and Fuegans to assess their genetic affinity to modern populations.
Inferring the effects of genetic variants on gene expression and splicing
Nilah Ioannidis, Alexis Battle, Stephen Montgomery, Weiva Sieh, Alice Whittemore, Carlos Bustamante
Stanford University, Stanford, CA, USA
Whole-genome and whole-exome sequencing technologies are
increasingly enabling studies of genetic variation in large numbers of
healthy and diseased individuals; however, interpreting the clinical
significance of the many genetic variants identified in these studies
remains a critical challenge. This task is particularly challenging in
the case of rare or novel variants that have no effect on protein
structure, such as noncoding, intronic, and synonymous variants. Here we
develop a method to interpret such variants based on their predicted
regulatory impact on gene expression and splicing, motivated by the
hypothesis that clinically significant variants that do not affect
protein structure are likely to affect cellular function via expression
regulation. We develop a predictive model for the regulatory effects of
genetic variants by training random forest-based learners to recognize
cis- expression quantitative trait loci (eQTLs) and splicing
quantitative trait loci (sQTLs) discovered by the Geuvadis consortium
based on RNA-sequencing of lymphoblastoid cell lines from individuals in
the 1000 Genomes Project [Lappalainen et al, Nature 2013]. Our model
uses genomic features of each variant including its position relative to
the transcription start site and nearby splice sites, conservation,
overlapping functional elements from ENCODE and Ensembl, and position
within these functional elements. We validate the model on additional
eQTL and sQTL datasets and characterize its performance on known
pathogenic noncoding, intronic, and synonymous variants, which are
expected to be enriched for predicted regulatory effects. We anticipate
that this regulatory effects predictive model will be useful in future
studies characterizing regulatory variation within the genome and for
prioritizing the likely clinical significance of rare and novel genetic
variants identified in large-scale clinical sequencing studies.
Initial results from over 400 high coverage complete human genome sequences from ca. 130 populations of predominantly Eurasian origin.
Mait Metspalu1
1Evolutionary biology group, Estonian Biocentre, Tartu, Estonia, 2Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia, 3Estonian Genome Center, University of Tartu, Tartu, Estonia, 4Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia, 5Department of Biological Anthropology, University of Cambridge, Cambridge, UK, 6Department of Integrative Biology, University of California Berkeley, Berkeley, USA, 7Human Genetics Group, Institute of Molecular Biology, National Academy of Sciences, Yerevan, Armenia, 8Institute for Genetic Engineering and Biotechnology, Sarajevo, Bosnia and Herzegovina, 9Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences, Russia, 10Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia, 11Institute of Internal Medicine, Siberian Branch of Russian Academy of Medical Sciences, Novosibirsk, Russia, 12Laboratory of Molecular Biology, North-Eastern Federal University, Yakutsk, Russia, 13Institute of Genetics and Cytology, National Academy of Sciences, Minsk, Belarus, 14Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse, Toulouse, France, 15Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moscow, Russia, 16Research Department of Genetics, Evolution and Environment, University College London, London, UK, 17Center for GeoGenetics, University of Copenhagen, Copenhagen, Denmark
Complete high coverage individual genome sequences carry the
maximum amount of information for reconstructing the evolutionary past
of a species in the interplay between random genetic drift and natural
selection. Here we present a novel dataset of over 400 human genomes
sequenced at 40X on the same platform (Complete Genomics) and processed
with uniform bioinformatic pipelines. Based on SNP-chip data we generally chose three
samples to represent each population of interest. We cover a wide range
of mostly Eurasian populations with additional populations from
Oceania, South America and Africa.
We present here initial results from population genetic analyses on the data.
We use recently developed methods based on length distributions of shared genomic segments to estimate the dynamics of past effective population sizes of regional populations, population split times and subsequent admixture events between various Eurasian population pairs.
We map the geographic and temporal variation of Neanderthal and Denisova introgression among different Eurasian populations.
For Y chromosome data we determined the regions of highest mapping quality and applied phylogenetic methods to determine the order and temporal dynamics of branching events in non-African Y chromosome haplogroups. We show that the relatively short branch lengths distinguishing continental non-African populations are consistent with the model of a rapid initial colonization of Eurasia and Oceania.
Anthropometric trait variation among diverse African populations: deviations from drift
Matthew Hansen, Joseph Lachance, Sameer Soi, Laura Scheinfeldt, Alessia Ranciaro, Simon Thompson, Jibril Hirbo, Sarah Tishkoff
University of Pennsylvania, Philadelphia, PA, USA
The African continent contains an immense amount of phenotypic
variation, which is commonly attributed to adaptation to a wide range of
ecological habitats and lifestyles. There has been intense debate as to
the relative extent to which neutral genetic drift and natural selection
have shaped the human genome. Although a number of adaptive genes have
been identified, relatively little is known about whether particular
phenotypic traits are adaptive. Here, we use Pst-Fst comparisons to
investigate the degree to which human phenotypic variation differs from
that expected by neutral genetic drift. Populations analyzed include
hunter-gatherers, pastoralists, and agriculturalists from eastern and
western sub-Saharan Africa, representing a wide range of lifestyles and
ecological habitats. Sample phenotypes involve multiple health and
general lifestyle related traits, including weight, BMI, grip strength,
blood pressure, lactase response, and glucose levels. For each
population we calculated the amount of phenotypic variance among
populations relative to the total amount of phenotypic variance in the
trait (Pst). We genotyped nearly 700 study participants from over 40
populations using the Illumina 1M-Duo SNP array and calculated Fst
between all pairs of populations. For each trait, at least 17
populations had both phenotype and genotype data, resulting in at least
136 pairwise comparisons per trait. Deviations from expected neutral
phenotypic drift were analyzed in a Pst-Fst framework over the set of
all population pairs. Adaptive traits result in phenotypic distances
between populations that exceed genetic distances between populations
(Pst >> Fst), and these traits are good candidates for follow-up
selection studies and QTL mapping. These comparisons allowed us to
identify adaptive traits on both a population and a continental scale.
In addition, a PCA analysis of correlated phenotypes was performed to
identify trait combinations with orthogonal variance contributions.
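The arithmetic behind the comparison count, and the variance ratio described above, can be sketched as follows. This is a simplified illustration rather than the authors' estimator: Pst is computed here as the among-population sum of squares divided by the total sum of squares for one trait, and published Pst estimators typically include additional scaling terms (for example for heritability) that are omitted. The function names and toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pst(groups):
    """Share of total phenotypic variation that lies among populations.

    groups: one list of trait values per population. Assumes the pooled
    values are not all identical (otherwise the ratio is undefined).
    """
    pooled = [value for group in groups for value in group]
    grand_mean = mean(pooled)
    ss_total = sum((value - grand_mean) ** 2 for value in pooled)
    ss_among = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    return ss_among / ss_total


def pairwise_pst(populations):
    """Compute the ratio for every unordered pair of populations."""
    return {
        (a, b): pst([populations[a], populations[b]])
        for a, b in combinations(sorted(populations), 2)
    }


if __name__ == "__main__":
    # 17 hypothetical populations with two toy trait values each.
    toy = {f"pop{i:02d}": [float(i), float(i + 1)] for i in range(17)}
    results = pairwise_pst(toy)
    print(len(results))  # number of unordered pairs: 17 * 16 / 2
```

With 17 populations the number of unordered pairs is 17 × 16 / 2 = 136, which matches the minimum number of pairwise comparisons per trait quoted in the abstract.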
Integrative Genomic Studies of Evolution and Adaptation in Africa
Sarah Tishkoff
Departments of Genetics and Biology, University of Pennsylvania, Philadelphia, PA, USA
Africa is thought to be the ancestral homeland of all modern human populations. It is also a region of tremendous cultural, linguistic, climatic, and genetic diversity. Despite the important role that African populations have played in human history, they remain one of the most underrepresented groups in human genomics studies. A comprehensive knowledge of patterns of variation in African genomes is critical for a deeper understanding of human genomic diversity, the identification of functionally important genetic variation, the genetic basis of adaptation to diverse environments and diets, and the origins of modern humans. Furthermore, a deeper understanding of African genomic variation will provide the necessary foundation for powerful and efficient genome-wide association and systems biology studies to identify coding and regulatory variants that play a role in phenotypic variation including disease susceptibility. We have used whole genome SNP genotyping and high coverage sequencing analyses to characterize patterns of genomic variation, ancestry, and local adaptation across ethnically and geographically diverse African populations. We have identified candidate loci that play a role in adaptation to infectious disease, diet and high altitude, as well as the short stature trait in African Pygmies. Additionally, our studies shed light on human evolutionary history and African population history.
Origin and Adaptive Evolution of Allelic Variation at TAS2R16 Associated with Salicin Bitter Taste Sensitivity in Africa
Michael Campbell1, Alessia Ranciaro1, Daniel Zinshteyn2, Renata Rawlings-Goss1, Jibril Hirbo1,3, Simon Thompson1, Dawit Woldemeskel1,4, Alain Froment5, Joseph Rucker6, Sabah Omar7, Jean-Marie Bodo8, Thomas Nyambo9, Gurja Belay4, Dennis Drayna10, Paul Breslin11,12, Sarah Tishkoff1
1University of Pennsylvania, Philadelphia, PA, USA, 2Cornell University, Ithaca, NY, USA, 3Vanderbilt University, Nashville, TN, USA, 4Addis Ababa University, Addis Ababa, Ethiopia, 5Musee de L’Homme, Paris, France, 6Integral Molecular, Philadelphia, PA, USA, 7Kenya Medical Research Institute, Nairobi, Kenya, 8Ministry of Scientific Research and Innovation, Yaounde, Cameroon, 9Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania, 10National Institute on Deafness and Other Communication Disorders, NIH, Rockville, MD, USA, 11Monell Chemical Senses Center, Philadelphia, PA, USA, 12Rutgers University, New Brunswick, NJ, USA
Bitter taste perception influences human health and nutrition,
and the genetic variation underlying this trait is thought to play a
role in disease susceptibility. To better understand the genetic
architecture and patterns of phenotypic variability of bitter taste
perception, we examined genotype and sequence data in the promoter and
coding regions of TAS2R16, a bitter taste receptor gene, in ~600
individuals from 74 African populations in West Central, Central and
East Africa. We also performed genotype-phenotype association analyses
of threshold levels of sensitivity to salicin, a bitter
anti-inflammatory compound, in a subset of ~300 individuals from the
above populations in Africa. In addition, we characterized TAS2R16
mutants in vitro to investigate the effects of polymorphic sites
identified at this locus on the receptor. Here, we report striking
signatures of positive selection in the coding region of TAS2R16,
including significant Fay and Wu's H statistics predominantly in East
Africa, indicating strong local adaptation and greater genetic structure
among African populations than expected under neutrality. Furthermore,
we observed a "star-like" phylogeny for haplotypes with the derived
allele at polymorphic site 516 associated with increased bitter taste
perception that is consistent with a model of selection for
"high-sensitivity" variation. In contrast, haplotypes carrying the
"low-sensitivity" ancestral allele at site 516 showed evidence of strong
purifying selection. However, we did not observe signals of selection
in the TAS2R16 promoter. We also demonstrated, for the first time, the
functional effect of nonsynonymous variation at site 516 on salicin
phenotypic variance in vivo in diverse Africans and showed that
variability at this site is strongly correlated with cell surface
expression of the TAS2R16 receptor in vitro, suggesting a molecular
basis for differences in salicin bitter taste recognition. In contrast,
we did not detect a significant association between genetic
variability in the TAS2R16 promoter and salicin bitter taste perception,
indicating that allelic variation in the coding exon mainly influences
bitter taste sensitivity. Additionally, we detected geographic
differences in levels of bitter taste perception in Africa not
previously reported and infer an East African origin for high salicin
sensitivity in human populations. Overall, this study correlates genetic
variants that are targets of selection with phenotypic variability,
demonstrating the connection between functional variation and local
adaptation at a medically-relevant locus in humans.
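As a rough illustration of the Fay and Wu's H statistic mentioned in the abstract above, the sketch below computes H as the difference between two estimators of the population mutation rate, the pairwise diversity and the estimator weighted towards high-frequency derived alleles. It is a minimal sketch and not the authors' pipeline: the function name `fay_wu_h` and the input layout (an unfolded site frequency spectrum given as a plain list of counts) are assumptions made for this example, and no significance testing is included.

```python
def fay_wu_h(sfs):
    """Unnormalised Fay and Wu's H from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by exactly i of the n sampled chromosomes
    (i = 1 .. n - 1). This input layout is an assumption of the sketch.
    """
    n = len(sfs) + 1          # number of sampled chromosomes
    denom = n * (n - 1)

    # theta_pi: average number of pairwise differences
    theta_pi = sum(2 * i * (n - i) * count
                   for i, count in enumerate(sfs, start=1)) / denom

    # theta_H: estimator weighted by the squared derived-allele count
    theta_h = sum(2 * i * i * count
                  for i, count in enumerate(sfs, start=1)) / denom

    return theta_pi - theta_h


# A spectrum dominated by high-frequency derived alleles gives a negative H,
# the pattern usually read as a possible signature of positive selection.
skewed = [0, 0, 3]            # n = 4 chromosomes, 3 sites at derived count 3
print(fay_wu_h(skewed))
```

Under the standard neutral expectation, where the count of sites with derived-allele count i is proportional to 1/i, both estimators agree and H is zero; strongly negative values reflect an excess of high-frequency derived variants.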
Archaeogenetics: Bone to Biomolecule, a study of an Early Medieval Population in Ynys Gybi, Cymru
Ashley Matchett
Inter American University of Puerto Rico, Bayamon, Puerto Rico
The archaeological excavation of the early medieval site at
Towyn-Y-Capel on the island of Anglesey (Ynys Môn) in North Wales, UK,
provided the opportunity to study a large population (122 skeletons) at a
site that was in use over a period of up to 550 years (650-1200 AD). A
multidisciplinary study was performed on the skeletal collection, from
morphology to molecular chemistry and biology, to assess and screen
samples for later genetic analysis.
After sampling, an assessment of skeletal sample condition was used to select material for genetic analysis, and 44% of the skeletal population was selected. The morphology of samples was assessed, and 87% of bones and teeth were considered to be in good or fair condition. A novel technique, Qualitative Light Fluorescence, was also used to compare the teeth to modern standards, showing a loss of 21.8% in fluorescence and indicating inorganic degradation. Histological sections taken from non-human bone finds from the site generally varied less than the gross morphology indicated, showing good to excellent preservation.
Well-preserved skeletal samples were selected for detailed investigations into their biological and chemical condition, principally through amino acid racemisation, amino acid composition, and heavy metal analysis. All samples tested had a D/L aspartic acid ratio of less than 0.1, although 50% of the samples had a ratio over 0.08, which indicated that recovery of DNA from these skeletal samples was feasible, although the DNA was likely to be degraded. The element profiles showed no discernible anomalies due to either diet or diagenesis. To consolidate genographic research, strontium isotope analysis of a small population subset showed three anomalous ratios, which indicated widespread contact in North Atlantic Europe and unexpected residence patterns.
DNA recovery was more successful in teeth than in bones. Amplification was conducted over several rounds using various primers specific for human HV1 and HV2 mtDNA. Of all the samples, only 14.8% of the skeletal tooth samples were amplified, although over 90% of the screened samples were amplified and sequenced. DNA spiking trials demonstrated that some of the samples were affected by inhibition and poor template condition, as validated by sequencing. Independent confirmation of successful samples was attained by sequencing, although the sequences were highly degraded. Haplogroup identification was based on the sequenced HV1 sections and on likelihood. Generally, the site showed a high predominance of haplogroup K(5), followed by H(2) and U(2) haplogroup profiles.