Demographic Inference and Whole Genome Scan for Positive Natural Selection in Pygmies from Central Africa
PingHsun Hsieh1, Krishna R. Veeramah1, Joseph Lachance2, Sarah A. Tishkoff2, Jeff D. Wall3, Michael F. Hammer1, Ryan N. Gutenkunst1 1University of Arizona; 2University of Pennsylvania; 3University of California, San Francisco
African Pygmies are hunter-gatherers residing mostly in Central African rainforests. Many Pygmy populations have been influenced by neighboring Niger-Kordofanian-speaking farmer populations through socio-economic contacts, particularly since the extensive agricultural expansion in sub-Saharan Africa that began five thousand years ago (5 kya). This complex demographic history must be controlled for in order to find true signatures of adaptation to the hot, humid, pathogen- and parasite-rich rainforest habitat of Pygmies. We obtained whole-genome sequences at >40X coverage for Baka Pygmies from Cameroon, Biaka Pygmies from the Central African Republic, and Niger-Kordofanian-speaking Yoruba farmers from Nigeria. We used the model-based demographic inference tool ∂a∂i to infer the history of these populations. Our best-fit model suggests that the farmer and Pygmy ancestors diverged from each other 150 kya and remained isolated from each other until 40 kya. This divergence is more ancient than estimated by previous studies using fewer loci, but is confirmed using PSMC, another demographic inference tool that uses different genomic information from ∂a∂i. Interestingly, our analysis shows that models with bi-directional asymmetric gene flow between farmers and Pygmies are statistically better supported than previously suggested models with a single wave of uni-directional migration from farmers to Pygmies. To identify possible targets of adaptation, we conducted a genomic scan using complementary methods, including the frequency-spectrum-based G2D test, the population-differentiation-based XP-CLR test, and the haplotype-based iHS test. We performed 10,000 simulations based on the above best-fit demographic model to assign significance to each reported target of natural selection.
Preliminary results reveal that genes involved in cell adhesion, cellular signaling, olfactory perception, and immunity were likely targeted by natural selection in the pygmies or their recent ancestors.
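The simulation-based significance step described above can be illustrated with a minimal sketch. It assumes that a scan statistic (for example, a window's maximum score) has been computed both on the real data and on each neutral simulation; the function name and the one-sided, "+1"-corrected estimator are illustrative choices, not necessarily the procedure the authors used.

```python
def empirical_p_value(observed, simulated):
    """One-sided empirical p-value of an observed statistic.

    `simulated` holds the same statistic computed on neutral simulations
    drawn from the fitted demographic model (hypothetical input).
    The +1 in numerator and denominator keeps the estimate above zero.
    """
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    # Count simulations at least as extreme as the observation.
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Toy usage with made-up numbers: two of four simulations reach 2.5.
p = empirical_p_value(2.5, [1.0, 2.0, 3.0, 4.0])
```

With 10,000 simulations, the smallest p-value this estimator can return is 1/10,001, which bounds the resolution available for multiple-testing correction.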
Genome Wide Signals of Pervasive Positive Selection in Human Evolution
David P. Enard, Philipp W. Messer, Dmitri A. Petrov Stanford University
The role of strong positive selection in human evolution remains highly controversial. On one hand, scans for positive selection have identified hundreds of candidate loci and the genome-wide patterns of polymorphism show signatures consistent with frequent positive selection. On the other hand, recent studies have argued that many of the candidate loci are false positives and that most genome-wide signatures of adaptation are due to reduction of neutral diversity by linked recurrent deleterious mutations, known as background selection. Here we analyze human polymorphism data from the 1000 Genomes Project and detect signatures of pervasive positive selection once we correct for the effects of background selection. We show that levels of neutral polymorphism are lower near amino acid substitutions, with the strongest reduction of polymorphism observed specifically near functionally consequential amino acid substitutions. Furthermore, amino acid substitutions are associated with signatures of recent adaptation that are unlikely to be generated by background selection, such as the presence of unusually long and frequent haplotypes (measured with iHS and XP-EHH) and specific distortions in the site frequency spectrum (measured with CLR). We use forward simulations to show that the observed signatures require a very high rate of strongly adaptive substitutions. Strikingly, the observed signatures of positive selection do not typically center at amino acid substitutions but rather at nearby ENCODE regulatory sequences. Our results establish that adaptation was frequent in human evolution and provide support for the hypothesis of King and Wilson that adaptive divergence is primarily driven by regulatory rather than coding changes.
Genomic Variation in the Strength of Background Selection Drives Fine-Scale Population Structure in Humans
Raul Torres, Ryan D. Hernandez University of California, San Francisco
Patterns of observed genetic polymorphism across human populations have provided great insight into those populations' respective demographic histories, their underlying population structure, and the extent to which natural selection has been a driving evolutionary force. The serial bottlenecks that human populations have experienced during their migrations from Africa to Europe, Asia, and the Americas have reduced the effective population sizes of these groups and have increased the rate of genetic drift. Population genetics methods and measures, such as principal component analysis (PCA) and Fst, have uncovered the genetic structure that exists between human populations, especially at the inter-continental scale. However, because of natural selection and recombination, we should not expect the rate of genetic drift to be homogeneous across the entire genome. Rather, diversity-reducing selection (e.g., genetic hitchhiking and background selection) near functional loci has resulted in a mosaic pattern of different effective population sizes across the human genome. Population genetics theory predicts that regions of the genome with reduced effective population size should drift faster, yielding greater rates of population divergence. Leveraging this fact, we use the POPRES dataset, a large genotyped population reference sample, to demonstrate that the loci contributing to the principal components explaining the greatest variance across European populations are strongly correlated with the strength of background selection. The effect of reduced effective population size and increased drift holds even at putatively neutral loci. We also recapitulate similar patterns using Fst. Our analysis highlights the drivers of fine-scale population structure at the intra-continental scale, which does not necessarily require the action of local adaptation.
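The prediction that regions with reduced effective population size diverge faster can be made concrete with a small sketch. It assumes the textbook approximation for two isolated diploid populations of equal size that split t generations ago, Fst ≈ 1 − exp(−t / (2·Ne)), and models background selection as a simple rescaling Ne = B·N0 with 0 < B ≤ 1. The function name, parameter names, and the numbers in the usage lines are illustrative assumptions, not values from the abstract.

```python
import math


def expected_fst(t_generations, n0, b=1.0):
    """Approximate expected Fst under pure drift (no migration, no mutation).

    t_generations: generations since the populations split
    n0:            diploid effective size in the absence of linked selection
    b:             background-selection scaling of Ne (assumed, 0 < b <= 1)
    """
    if n0 <= 0 or not (0 < b <= 1):
        raise ValueError("n0 must be positive and b must lie in (0, 1]")
    ne = b * n0  # local effective size after background selection
    return 1.0 - math.exp(-t_generations / (2.0 * ne))


# Hypothetical comparison: a region with B = 0.5 versus a region with B = 1.0.
fst_neutral_region = expected_fst(1000, 10000, b=1.0)
fst_bgs_region = expected_fst(1000, 10000, b=0.5)
```

Under these assumptions, halving the local effective size roughly doubles the expected Fst for recent splits, without any locus being locally adapted.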
Multiple Instances of Ancient Balancing Selection Shared between Humans and Chimpanzees
Ellen M. Leffler1, Ziyue Gao1, Susanne Pfeifer2, Laure Ségurel1,3, Adam Auton2,8, Oliver Venn4, Rory Bowden2,4, Ronald Bontrop5, Jeffrey D. Wall6, Guy Sella1,7, Peter Donnelly2,4, Gilean McVean2,4, Molly Przeworski1,3 1University of Chicago; 2University of Oxford; 3Howard Hughes Medical Institute; 4Wellcome Trust Centre for Human Genetics; 5Biomedical Primate Research Centre; 6University of California, San Francisco; 7Hebrew University of Jerusalem; 8Albert Einstein College of Medicine
Balancing selection, in which two or more alleles are maintained in a population by selection, is predicted to lead to high diversity and to haplotypes with deep coalescence times. Moreover, if balancing selection pressures are older than species split times, species may share alleles identically by descent. Such long-lived balanced polymorphism is thought to be extremely rare, though modeling suggests that its footprint may be difficult to detect, leaving open the possibility that this mode of selection is more common. Using genome-wide data from 10 Western chimpanzees from the PanMap project and 59 humans from 1000 Genomes Pilot 1, we undertook a genome-wide search for orthologous sites polymorphic for the same alleles in both chimpanzees and humans. We found that SNPs are shared in excess of what is expected by chance after accounting for local variation in the mutation rate. While it is difficult to distinguish balanced polymorphism from recurrent mutation for a single SNP, the short ancestral segments on which a balanced polymorphism resides may contain additional shared ancestral polymorphisms. We therefore focused on cases of two or more shared SNPs in close proximity with the same LD patterns in both species, which is unlikely to occur by recurrent mutation. Besides the MHC, we identified 125 such loci, only two of which overlap exons. Notably, nearby genes are enriched for membrane glycoproteins, which are often found at host-pathogen interfaces. For five of the loci, all noncoding, more than two pairs of SNPs are shared with the same LD pattern and a phylogenetic tree clusters by haplotype rather than by species, providing strong evidence that the polymorphisms are ancestral and pointing to new targets of selection. These results suggest that balancing selection in response to pathogen pressures has acted on regulatory variation in both humans and chimpanzees.
More generally, given our conservative criteria, long-term balancing selection may be more common than previously believed.
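The criterion of shared SNP pairs "with the same LD patterns in both species" can be sketched as follows. This sketch assumes phased haplotypes coded 0/1 with alleles polarized identically in both species, and it treats "same LD pattern" as the LD coefficient D having the same sign in both samples. The function names and toy haplotypes are illustrative assumptions, not the authors' pipeline.

```python
def ld_stats(haplotypes):
    """Return (D, r^2) for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # classical LD coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


def same_ld_pattern(haps_species1, haps_species2):
    """True if the same alleles are coupled (same sign of D) in both samples."""
    d1, _ = ld_stats(haps_species1)
    d2, _ = ld_stats(haps_species2)
    return d1 * d2 > 0


# Toy data (made up): allele 1 at SNP A travels with allele 1 at SNP B in both.
human = [(1, 1), (1, 1), (0, 0), (0, 0)]
chimp = [(1, 1), (0, 0), (0, 0), (1, 0)]
shared_pattern = same_ld_pattern(human, chimp)
```

Two independent recurrent mutations would have no reason to land on matching haplotype backgrounds in both species, which is why a shared coupling is informative.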
Ancient Balancing Selection on ABO in Primates
Laure Segurel, Emma E. Thompson, Ziyue Gao, Carole Ober, Molly Przeworski University of Chicago
The ABO histo-blood group, an important determinant of transfusion incompatibility, was the first genetic polymorphism discovered in humans. Remarkably, A, B and H (O) antigens are found in many other primate species, and the same two amino acid changes are responsible for A and B specificity in all apes, Old World Monkeys and New World Monkeys sequenced to date. We recently showed that these genetic variants have persisted for at least 20 million years and in particular that humans and gibbons share A and B types due to identity by descent from a common ancestor (Segurel et al. 2012, PNAS). Polymorphisms in ABO have been associated with susceptibility to a number of human diseases, from gastric cancers to cholera or artery diseases, but the selective pressures maintaining the polymorphism remain unknown. We present evidence suggesting that variation in ABO has been maintained because of a critical role in host-pathogen interactions, unrelated to its presence in red blood cells. On this basis, we hypothesize that the histo-blood group labels A, B, AB and O do not offer a full description of variants maintained by natural selection, implying that there are unrecognized ABO variants of functional importance in humans and other primates.
Distortions in Genealogies Due to Purifying Selection and Recombination
Lauren Nicolaisen, Michael Desai Harvard University Dept. of Physics, Dept. of Organismic and Evolutionary Biology, and Center for Systems Biology
Pervasive purifying selection can lead to significant distortions in the shapes of genealogies. Recent empirical evidence suggests that these effects of background selection may be common in many natural populations. Although these effects are tempered by recombination, significant distortions can occur even in the presence of high recombination rates. We show that, provided purifying selection and/or recombination are sufficiently strong, these distortions may be characterized by a time-dependent effective population size, which has a simple, analytical form. Our results illustrate how recombination reduces distortions in genealogies, and allow us to quantitatively describe the shapes of genealogies at a single site. We also extend our results to include more complex situations, including a distribution of fitness effects and variation in the recombination rate along the genome.
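For readers unfamiliar with what a time-dependent effective population size implies for genealogies, the standard coalescent relation is sketched below. The abstract does not state the analytical form of N_e(t), so it is left as an unspecified function; the diploid convention (factor of 2) is an assumption of this sketch.

```latex
% Pairwise coalescence time T (in generations, looking backward) when the
% effective size N_e(t) varies with time; diploid convention assumed.
\[
  \Pr(T > t) = \exp\!\left( -\int_0^{t} \frac{ds}{2 N_e(s)} \right),
  \qquad
  f_T(t) = \frac{1}{2 N_e(t)} \exp\!\left( -\int_0^{t} \frac{ds}{2 N_e(s)} \right).
\]
% For constant N_e this reduces to an exponential distribution with mean 2 N_e.
```

Any distortion in genealogy shape relative to the neutral expectation then enters through the way N_e(t) departs from a constant.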
Recent Brain Adaptation Revealed by Meta-analysis of Human Positively Selected Genes
Chen Xie1, Yue Huang1, Adam Y. Ye1, Chuan-Yun Li1,2, Ge Gao1, Liping Wei1 1Center for Bioinformatics, School of Life Sciences, Peking University; 2Institute of Molecular Medicine, Peking University
Since the divergence from our closest relatives, chimpanzees and bonobos, the human lineage has evolved many unique features, especially highly developed cognitive functions. When and how did our "clever" brain evolve? Many previous studies tried to address this question by studying adaptive evolution in brain-related genes, but they reached conflicting conclusions. Here, we compiled a comprehensive list of human positively selected genes and divided them into four groups according to the approaches used to identify them. We found that genes that are highly expressed in the central nervous system are enriched in recent adaptive events, identified by approaches based on intra-species polymorphism data, and that this pattern is robust to different datasets, parameters and analysis pipelines. By performing functional category enrichment analysis, we further demonstrated that synapse-related functions are enriched in genes that experienced recent positive selection, whereas immune-related functions are enriched in ancient positive selection events identified by inter-species coding region comparison. Most of our observations still hold when we control for four genomic characteristics that may bias the identification of positively selected genes: gene length, gene density, GC composition and intensity of negative selection. Our analysis resolves the previous conflict and highlights recent adaptation in the human brain.
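A functional category enrichment analysis of this kind is often carried out with a one-sided hypergeometric (Fisher-type) test. The abstract does not say which test was used, so the following is a minimal sketch under that assumption; the function name, argument names, and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, category, selected, overlap):
    """P(X >= overlap) for X ~ Hypergeometric.

    total:    number of genes in the background set
    category: background genes annotated to the functional category
    selected: number of positively selected genes
    overlap:  selected genes that fall in the category
    """
    if not (0 <= overlap <= min(category, selected) and selected <= total):
        raise ValueError("inconsistent counts")
    upper = min(category, selected)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(category, k) * comb(total - category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)


# Toy example: all 5 selected genes fall in a 5-gene category out of 10 genes.
p_toy = hypergeom_enrichment_p(total=10, category=5, selected=5, overlap=5)
```

In practice the background set would be restricted to genes that could have been detected by the scan, and covariates such as gene length would be matched or controlled, as the abstract describes.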
Improvements In Silica-Based DNA Extraction Increase Access to Highly Fragmented Ancient DNA
Jesse Dabney, Matthias Meyer Max Planck Institute for Evolutionary Anthropology
High-throughput sequencing and library-based sample preparation procedures have greatly improved our ability to generate DNA sequences from trace amounts of highly degraded DNA isolated from ancient biological materials. With these technologies, molecules as short as 30 bp may in principle yield phylogenetically informative sequences. We have noticed, however, that ancient DNA sequencing libraries generate far fewer short sequences than would be expected from random DNA fragmentation. We found that a loss of short molecules occurs during DNA extraction using a common silica-based extraction method. We have developed an improved silica-based extraction protocol that ameliorates this bias and enables the recovery of substantial numbers of DNA sequences from samples that contain almost no molecules longer than 50 bp. We hope this method will increase our access to ancient DNA from difficult sources that have previously failed to yield sufficient sequence information for biological analyses.
General Approaches for Identifying Adaptations Involving Polygenic Traits
Jeremy J. Berg1, Graham Coop1 1University of California, Davis
Much adaptation likely proceeds through selection on complex, polygenic traits. However, statistical approaches to adaptation have focused largely on identifying the population genomic signature of strong selection at individual loci. When selection acts on complex, highly polygenic traits, the response is often expected to be small and noisy at individual loci, with the signal of adaptation only present when shifts in allele frequency are considered in aggregate across many loci. The availability of genome-wide association study (GWAS) data for traits of potential adaptive importance, along with large population genomic datasets, calls for new statistical methods to identify the population genetic signature of selection on complex traits.
We draw on insights from quantitative and population genetics to develop general statistical methods to aggregate the signal of adaptation across many GWAS loci for phenotypes of interest. We demonstrate tests to identify local adaptation correlated with an ecological variable, as well as to identify the signal of local adaptation more broadly. Our approach can control for the confounding effect of population structure for a large number of populations with arbitrary population history, while also controlling for various potential biases introduced by the ascertainment process. We apply these tools to detect the signature of adaptation on both the global and regional scale for multiple traits and diseases in humans. We also work through the theoretical connections between our tests and the familiar quantitative genetics parameter QST.
Lastly, in anticipation of larger and more sophisticated GWAS, we lay the groundwork for methods to test for population genomic evidence of correlated selection response among multiple traits as a result of genetic covariance among traits. Our approach offers a promising way forward for addressing polygenic adaptation across an increasing array of traits and organisms.
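The idea of aggregating small allele frequency shifts across many GWAS loci can be illustrated with a minimal sketch. It assumes an additive model in which a population's estimated mean genetic value is twice the sum, over loci, of the GWAS effect size times the population frequency of the effect allele. The function name and toy inputs are hypothetical, and the abstract does not give the authors' exact statistic or its null model.

```python
def genetic_values(effect_sizes, allele_freqs):
    """Estimated mean additive genetic value per population.

    effect_sizes: list of GWAS effect sizes, one per locus
    allele_freqs: dict mapping population name -> list of effect-allele
                  frequencies, in the same locus order as effect_sizes
    """
    values = {}
    for pop, freqs in allele_freqs.items():
        if len(freqs) != len(effect_sizes):
            raise ValueError(f"locus count mismatch for population {pop!r}")
        # Factor of 2 because each diploid individual carries two alleles.
        values[pop] = 2.0 * sum(a * p for a, p in zip(effect_sizes, freqs))
    return values


# Toy input (made up): two loci, two populations.
z = genetic_values([0.5, -0.2], {"A": [0.5, 0.5], "B": [1.0, 0.0]})
```

A test for polygenic adaptation would then ask whether the variance of these values among populations, or their correlation with an environmental variable, exceeds what drift and shared ancestry alone would produce.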