A Genetic Census of America

More complete AncestryDNA estimates of genetic ancestry by state (interactive maps at link):
Using AncestryDNA results from over a quarter million people, the AncestryDNA science team set out to perform a “genetic census” of the United States. [. . .]

Solely using ethnicity estimated by DNA, these maps reveal spatial patterns that are telling of the ancestral origins of present day Americans: where they came from and where they eventually settled. [. . .]


For example, let’s look at the Scandinavian map. Scandinavian immigrants – from Sweden, Norway, and Denmark – tended to settle in the upper Midwest where geography, culture, and local economics felt familiar to life in the old country.

On the map, these are the greenest regions: the states with the highest amounts of Scandinavian ancestry. In other words, DNA also suggests localized migration of individuals of Scandinavian origin to North Dakota, Minnesota, and neighboring states, with little migration to other U.S. regions. History agrees with genetics!


Look at the Irish ancestry map as another example. The highest statewide averages are concentrated in Massachusetts and other states in the Northeastern U.S. – where many Irish immigrants, forced to leave their homes and lands, settled in the 19th century. Growing numbers of Irish that arrived after the 1820s were often poor and common laborers, and took jobs in the construction of buildings, canals, roads, and railways in cities in the eastern United States.

Many of these cities still show the highest average amounts of Irish ethnicity in the U.S. today! DNA affirms that many descendants of Irish immigrants still live where their ancestors initially settled – in the Northeast.


If you look at the maps for Great Britain and Europe West, you see that other ancestries are more widespread across the whole country. Leading up to the Boston Tea Party and the Declaration of Independence in 1776, large numbers of Europeans arrived in what is now the U.S., in some cases to escape religious persecution. While there were subsequently many waves of immigration, individuals primarily from Western Europe and Great Britain were our first Americans.

That we see British ancestry in many people of the U.S. may be evidence of the long history of individuals from Great Britain migrating to the United States, and far and wide across those states.

As I mentioned, the "Irish" estimates are likely inflated in much of the country, with Scotch-Irish, Scottish, and Welsh probably contributing a considerable part of the "Irish" component outside of the Northeast.

Genetic estimate of percent Irish ancestry in US

'Based on AncestryDNA ethnicity estimates for over 300,000 AncestryDNA customers*, the AncestryDNA science team set out to discover the “most Irish” regions of the U.S.':

States with the highest Irish ancestry

First, for all AncestryDNA ethnicity estimates of people born in the same state, we averaged their fractions of Irish ethnicity. Then, we found the U.S. states whose residents have the highest, and lowest, amounts of Irish ancestry.
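The averaging step described above can be sketched in a few lines. This is only an illustration of the stated procedure (group estimates by birth state, average the Irish fraction, rank the states); the records below are invented toy values, not AncestryDNA data, and the names `records` and `state_averages` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented toy records: (state of birth, estimated Irish fraction).
# These are NOT real AncestryDNA figures; they only illustrate the procedure.
records = [
    ("MA", 0.30), ("MA", 0.25),
    ("ND", 0.05), ("ND", 0.10),
    ("NY", 0.20),
]

def state_averages(rows):
    """Average the Irish fraction over everyone born in the same state."""
    by_state = defaultdict(list)
    for state, fraction in rows:
        by_state[state].append(fraction)
    return {state: mean(values) for state, values in by_state.items()}

averages = state_averages(records)
# States ordered from highest to lowest average Irish fraction.
ranked = sorted(averages, key=averages.get, reverse=True)
print(ranked)
```

A plain mean treats every customer equally, so a state's figure reflects whoever happened to be tested there rather than a weighted sample of its population.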

On the map are the top five states with the highest average Irish ancestry. Massachusetts is #1, and all of the other top states are also in the Northeast.

AncestryDNA estimates its Massachusetts-born customers average 28.5% Irish genetically, which is reasonably close to my surname-based estimate of 26% (using 1940 census data).

AncestryDNA's estimates of Irish ancestry for much of the rest of the country are likely inflated, however. AncestryDNA's "Irish" cluster spills over into Scotland and Wales, and to a lesser extent even into England and France. While (in an analysis shown in the AncestryDNA white paper) 95% of Irish are placed into the "Irish" cluster, only something like 60% of British are placed into the "Great Britain" cluster (with most of the rest presumably being placed into either the "Irish" or "Europe West" clusters). AncestryDNA's estimates rely on ADMIXTURE, an allele frequency-based approach, whereas I think very large data sets and an approach that makes use of haplotype information will be needed to clearly dissect recent ancestry within Northwestern Europe.

Research on recent human evolution will benefit from the implementation of extended genetic genealogical data. The approach to combine deep-rooted pedigrees with genetic information advances the understanding of changes in the human population genetic structure during the last centuries. This recent advance is mainly based on the extensive growth of whole genome sequencing data and available genealogical data of high quality. Moreover, according to the latest genetic genealogical research the historical non-paternity rate in Western Europe is estimated at around 1% per generation within the last four centuries, which means that the expected relationship between the legal genealogy and the genetics of DNA donors exists. Therefore, genetic genealogical data will help with three research aims of human evolutionary studies: (I) detecting signals of (past) population stratification and interpreting the population structure in a more objective manner, (II) obtaining the time scale and impact of particular detected gene flow events more accurately and (III) determining temporal genetic differentiation within a population by combining in-depth pedigree data with haploid markers. Each of these research aims will be discussed with examples of the human population in Flanders (Western Europe). At the end, we will discuss the advantages and pitfalls of using genetic genealogy within studies on human evolutionary genomics.

Most approaches aiming at finding genes involved in adaptive events have focused on the detection of outlier loci, which resulted in the discovery of individually 'significant' genes with strong effects. However, a collection of small effect mutations could have a large effect on a given biological pathway that includes many genes, and such a polygenic mode of adaptation has not been systematically investigated in humans or other mammals. We therefore propose to evidence polygenic selection by detecting signals of adaptation at the pathway or gene set level instead of analyzing single independent genes. Using a gene-set enrichment test, we identify genome-wide signals of recent adaptation among human populations as well as more ancient signals of adaptation in the human lineage and in primates.

Changes in the subsistence mode of a species can lead to adaptive evolution of new functions, while it can also cause relaxed negative selection in previously essential functions. While positive selection in humans has been intensely studied, functional processes subject to relaxed constraints in the human lineage remain largely unknown. Here we present a framework for detecting relaxation of selective constraints that affect a particular functional process specifically in one taxon. Jointly using human and chimpanzee population genomic data with mammalian comparative genomic data, we identify olfactory receptors and proteasome subunits as candidates of relaxed constraints in humans: both gene sets contain high frequency non-synonymous mutations in humans while having conserved amino-acid sequences across other mammals. We further discuss the possible underlying causes of this signal.

Compelling evidence from many animal taxa indicates that male genitalia are often under post-copulatory sexual selection for characteristics that increase a male’s relative fertilization success under sperm competition. There could, however, also be direct pre-copulatory female mate choice based on male genital traits. Before clothing, the non-retractable human penis would have been conspicuous to potential mates. This, in combination with claims that humans have a large penis for their body size compared to other primates, has generated suggestions that human penis size partly evolved due to female choice. We presented women with digitally projected fully life-size, computer-generated animations of male figures to quantify the (interactive) effects of penis size, body shape and height on female assessment of male sexual attractiveness. We generated 343 male figures that each had one of seven possible values for each of the three test traits (7x7x7 = 343). All seven test values per trait were within two standard deviations of the mean based on a representative sample of males. We calculated response (fitness) surfaces based on the average attractiveness rank each of the 343 male figures received. We also calculated individual response surfaces for 105 women (each woman viewed 53 figures). Both methods yielded almost identical results. We discuss our findings in the context of previous studies that have taken a univariate approach to quantify female preferences. We discuss the hypothesis that pre-copulatory sexual selection might play a role in the evolution of genital traits.

The combined use of geometric morphometrics and quantitative genetics provides a set of powerful tools for obtaining quantitative information that is crucial for many important questions concerning the evolution of shape. In particular, the demographic information that is available for human populations makes humans a unique study system for studying the mechanisms of evolutionary change in morphological traits. We investigate skull shape in the population of Hallstatt (Austria), where a collection of human skulls with associated records offers a unique opportunity for such studies. We use an individual-based statistical model to estimate the genetic covariance matrix, and characterize selection using fitness estimates from demographic data. We find clear evidence for directional selection, but not for nonlinear selection (stabilizing or disruptive selection). The predicted response to this selection, computed with genetic parameters from the population, does not match the estimate of secular change over the 150-year range of the data. We discuss possible reasons for the mismatch.

Human facial attractiveness and facial sexual dimorphism (masculinity–femininity) are important facets of mate choice and are hypothesized to honestly advertise genetic quality. However, it is unclear whether genes influencing facial attractiveness and masculinity–femininity have similar, opposing, or independent effects across sex, and the heritability of these phenotypes is poorly characterized. To investigate these issues, we assessed facial attractiveness and facial masculinity–femininity in the largest genetically informative sample (n = 1,580 same- and opposite-sex twin pairs and siblings) to assess these questions to date. The heritability was ~0.50–0.70 for attractiveness and ~0.40–0.50 for facial masculinity–femininity, indicating that, despite ostensible selection on genes influencing these traits, substantial genetic variation persists in both. Importantly, we found evidence for intralocus sexual conflict, whereby alleles that increase masculinity in males have the same effect in females. Additionally, genetic influences on attractiveness were shared across the sexes, suggesting that attractive fathers tend to have attractive daughters and attractive mothers tend to have attractive sons.

Individual identity signaling in humans

Interesting-looking poster title:
Michael Sheehan: Morphological and population genomic evidence of selection for individual identity signaling in human faces

There's no abstract, but one area where I suspect selection of this sort may turn out to be relevant (at least more relevant than Peter Frost-style sexual selection) is in explaining European hair and eye color variation.

"Traits signaling identity should be highly variable, often display polymodal distributions, not be condition dependent (i.e., be cheap to produce and/or maintain), not be associated with fitness differences, exhibit independent assortment of component characters, and often occur as fixed phenotypes with a high degree of genetic determination."

"Is human facial distinctiveness an adaptive signal of individual identity? From a sociobiological perspective, humans seem to have the ‘perfect storm’ of selection pressures that might favor recognizability. We are extremely social, interacting repeatedly with large numbers of individuals, each with varying roles in our lives. We are extremely cooperative, and we make complex decisions about whether and how much to cooperate based on kinship, friendship and social reputation [39,78]."

Friendship and natural selection

NRNB Symposium on Network Biology 2012, Gladstone Institutes, San Francisco: James Fowler presents Friendship and Natural Selection (link)

Friendship and Natural Selection

Nicholas A. Christakis, James H. Fowler

More than any other species, humans form social ties to individuals who are neither kin nor mates, and these ties tend to be with similar people. Here, we show that this similarity extends to genotypes. Across the whole genome, friends' genotypes at the SNP level tend to be positively correlated (homophilic); however, certain genotypes are negatively correlated (heterophilic). A focused gene set analysis suggests that some of the overall correlation can be explained by specific systems; for example, an olfactory gene set is homophilic and an immune system gene set is heterophilic. Finally, homophilic genotypes exhibit significantly higher measures of positive selection, suggesting that, on average, they may yield a synergistic fitness advantage that has been helping to drive recent human evolution.

Modeling European demographic history using neutral genomic regions

Neutral genomic regions refine models of recent rapid human population growth

In this study, we introduce targeted sequencing data for studying recent human history with minimal confounding by natural selection. We sequenced putatively neutral loci that are very far from genes and that meet a wide array of additional criteria. [. . .]

The best-fit model points to Europeans having experienced recent growth from an effective population size of about 4-7 thousand individuals as recently as 120-160 generations (3000-4000 years) ago. Growth over the last 3000-4000 years is estimated at an average rate of about 2-5% per generation, resulting in an overall increase in effective population size of two orders of magnitude.
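As a rough consistency check on those figures, the quoted per-generation rates can be compounded over the quoted number of generations. This is a minimal sketch that assumes constant exponential growth at the average rate, which is my simplification and not the paper's fitted model; the function names and the 140-generation midpoint are likewise my own choices.

```python
def fold_increase(rate: float, generations: int) -> float:
    """Total fold change after compounding a constant per-generation growth rate."""
    return (1.0 + rate) ** generations

def rate_for_fold(fold: float, generations: int) -> float:
    """Constant per-generation rate needed to reach a given fold change."""
    return fold ** (1.0 / generations) - 1.0

# Extremes of the quoted ranges: 2-5% per generation over 120-160 generations.
low = fold_increase(0.02, 120)
high = fold_increase(0.05, 160)

# Rate implied by a 100-fold (two orders of magnitude) increase over
# 140 generations, the midpoint of the quoted range (an assumption).
mid_rate = rate_for_fold(100.0, 140)

print(f"fold increase: {low:.0f}x to {high:.0f}x")
print(f"rate for 100-fold over 140 generations: {mid_rate:.2%}")
```

Under this simplification a 100-fold increase over 140 generations needs a rate of a little over 3% per generation, which sits inside the quoted 2-5% range. The extremes of the ranges bracket that figure widely, so the check only shows that the numbers are mutually compatible.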

Rare functional variants in French Canadians

Whole-Exome Sequencing Reveals a Rapid Change in the Frequency of Rare Functional Variants in a Founding Population of Humans
In this work, we analyze the whole-exome sequences of French-Canadian individuals, a founder population with a unique demographic history that includes an original population bottleneck less than 20 generations ago, followed by a demographic explosion, and the whole exomes of French individuals sampled from France. We show that in less than 20 generations of genetic isolation from the French population, the genetic pool of French-Canadians shows reduced levels of diversity, higher homozygosity, and an excess of rare variants with low variant sharing with Europeans. Furthermore, the French-Canadian population contains a larger proportion of putatively damaging functional variants, which could partially explain the increased incidence of genetic disease in the province. Our results highlight the impact of population demography on genetic fitness and the contribution of rare variants to the human genetic variation landscape, emphasizing the need for deep cataloguing of genetic variants by resequencing worldwide human populations in order to truly assess disease risk.

Charles Sumner, pre- and post-France

Charles Sumner, traveling in Maryland (February 24, 1834): "The whole country was barren and cheerless; houses were sprinkled very thinly on the road, and when they did appear they were little better than hovels [. . .] For the first time I saw slaves, and my worst preconception of their appearance and ignorance did not fall as low as their actual stupidity. They appear to be nothing more than moving masses of flesh, unendowed with any thing of intelligence above the brutes. I have now an idea of the blight upon that part of our country in which they live."

Charles Sumner, studying in Paris (January 13, 1838): "[The lecturer] had quite a large audience, among whom I noticed two or three blacks, or rather mulattoes,— two-thirds black, perhaps, — dressed quite a la mode, and having the easy, jaunty air of young men of fashion, who were well received by their fellow students. They were standing in the midst of a knot of young men; and their color seemed to be no objection to them. I was glad to see this; though, with American impressions, it seemed very strange. It must be, then, that the distance between free blacks and the whites among us is derived from education, and does not exist in the nature of things."

David McCullough, in The Greater Journey: Americans in Paris:

It was for Sumner a stunning revelation. Until this point he is not known to have shown any particular interest in the lives of black people, neither free blacks nor slaves. On his trip to Washington a few years earlier, traveling by rail through Maryland, he had seen slaves for the first time. They were working in the fields, and as he made clear in his journal, he felt only disdain for them. [. . .] He was to think that way no longer.

It would be a while before Sumner's revelation--that attitudes about race in America were taught, not part of "the nature of things"--would take effect in his career, but when it did, the consequences would be profound. Indeed, of all that Americans were to "bring home" from their time in Paris in the form of newly acquired professional skills, new ideas, and new ways of seeing things, this insight was to be as important as any.

James Watson: reduce parental age

DNA pioneer James Watson's genetic prescription: Have kids early

"If you add together all the mental diseases ... your chance of having a child with something bad is about 5 percent," Watson explained [. . .] So here's Watson's prescription: "You could reduce the frequency of this 5 percent — maybe down to one and a half percent, or 1 percent — if everyone had their children or if the DNA came from them when they were 15," he said.

Related: Paternal age and fitness in pre-industrial Finland (SMBE 2013)

Black men have lower sperm counts than white men

Semen parameters in fertile US men: the Study for Future Families

The Study for Future Families (SFF) recruited men who were partners of pregnant women attending prenatal clinics in Los Angeles CA, Minneapolis MN, Columbia MO, New York City NY and Iowa City IA. Semen samples were collected on site from 763 men (73% White, 15% Hispanic/Latino, 7% Black and 5% Asian or other ethnic group) using strict quality control and well-defined protocols. [. . .] Black men had significantly lower semen volume, sperm concentration and total motile sperm counts than White and Hispanic/Latino men.
This is consistent with the other evidence I'm aware of. Lower sperm counts have been noted in Africa, and a study in Rochester, NY, that included a small number of American blacks similarly found:
All sperm parameters were significantly lower in the small subgroup (n = 7) of African-American men compared with other men in this population (p-values for sperm parameters, < 0.001 to 0.016).
Also consistent with these results: the only autopsy studies I'm aware of (at least one of which Rushton knew of before he became selectively forgetful) both suggest black men have smaller/lighter testes than white men.

Estimating the proportion of Irish ancestry in the US and Massachusetts

[See Estimating the proportion of Puritan genes in America's white population for links to census data.]

"A Survey of Irish Surnames 1992-97" (pdf) lists the following as the 10 most common surnames in Ireland in the 1990s:

1. Murphy 2. (O)Kelly 3. Walsh(e) 4. (O)Connor 5. (O)Sullivan 6. (O)Byrne 7. (O)Brien 8. Ryan 9. Smith/Smyth 10. (O)Neill

We'll exclude Smith/Smyth for obvious reasons. The remaining 9 most common names, all of Gaelic origin, cover 7.85% of the 1990s Irish population. (With the 1890 data, the number would be 7.67%; but that's leaving out some of the variants included in the 1990s survey.) Northern Ireland's inclusion in the survey might end up inflating our surname-based Irish Catholic population estimates by something like 10%, but I'm not worried about this level of error right now.

The number of US whites bearing one of the nine most common Irish surnames in 2000, from Census data: 1188571

The extrapolated equivalent total number of Irish individuals among the US white population in 2000: 15141032

Which comes out to 7.78% of the ancestry of the US non-Hispanic white population in 2000.

15 million (or maybe 13.5 million) descendants is certainly a more plausible biological outcome of 4.5 million Irish immigrants than the "40 million Irish Americans" we see from census self-identifications.

But it appears there's considerably less disconnect between levels of Irish ancestry and Irish self-identification in Massachusetts (vs. the US as a whole).

In the 1940 Census (the 2000 Census surname data is not available broken down by state), 87028 Massachusetts whites had one of the nine most common Irish names. Based on that, we can estimate the number of Irish in MA was 1108637 -- or 25.9% of the total 1940 MA white population of 4280019.
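The extrapolation used for both the national and the Massachusetts figures is a single division: surname bearers divided by the share of the Irish population carrying those surnames. A minimal sketch that reproduces the numbers above, using only figures quoted in this post (the function and variable names are mine):

```python
# Share of the 1990s Irish population bearing one of the nine surnames (from the survey).
TOP9_SHARE_IRELAND = 0.0785

def extrapolate_irish(bearers: int, share: float = TOP9_SHARE_IRELAND) -> float:
    """Scale a count of surname bearers up to an Irish-equivalent headcount."""
    return bearers / share

# US whites with one of the nine surnames, 2000 Census.
us_2000 = extrapolate_irish(1_188_571)

# Massachusetts whites with one of the nine surnames, 1940 Census.
ma_1940 = extrapolate_irish(87_028)
ma_share = ma_1940 / 4_280_019  # total 1940 MA white population

print(round(us_2000))     # 15141032
print(round(ma_1940))     # 1108637
print(f"{ma_share:.1%}")  # 25.9%
```

The method assumes the nine surnames are as common among Irish-descended Americans as in 1990s Ireland, and that surnames track ancestry proportionally after intermarriage. Both are approximations.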

The 2005-2009 American Community Survey 5-Year Estimates put the Irish proportion of the Massachusetts population, based on self-identification, at 23.7% (vs. 11.9% for English). Or, considering only the non-Hispanic white population, something like 29% identify as Irish.

This better agreement likely reflects relatively lower levels of intermarriage in MA, as might be expected from the state's greater Irish concentration.

The future of genealogy

An ASHG 2013 abstract from AncestryDNA:

Reconstruction of Ancestral Human Genomes from Genome-Wide DNA Matches.

Individuals who lived long ago may still have much or all of their genome present in modern populations. The genomes of these individuals exist in small segments broken down by recombination and inherited in part by his or her descendants. If such an individual had many children, leading to a large number of descendants today, much of the ancestral genome will be present in modern populations. For the pairs of descendants with the “target” ancestor as their most recent common ancestor (MRCA), any region of their genomes shared identical-by-descent (IBD) most likely represents the corresponding region of the ancestor’s genome. Given a set of pairs of individuals linked to the same MRCA, we develop a novel computational approach to reconstruct the haplotypes of the MRCA from the IBD segments and haplotypes of the descendants. With simulated data we assess the performance of our method, affected by factors such as quality of genealogical trees used to infer the MRCA, reliability of inferred IBD, coverage of IBD segments, number of descendants of the MRCA, and number of sampled descendants. To demonstrate the utility of our method, we examine over 125,000 individuals in the AncestryDNA database with phased genome-wide single nucleotide polymorphism data and detailed genealogical information. After first identifying regions of the genome shared IBD between all individuals, we selected one group of several hundred individuals with an 18th century couple as a known MRCA. Using our method to tile together these individuals’ IBD segments, we are able to reliably construct the ancestral couple’s four haplotypes in large genomic regions with high coverage of IBD segments. In regions of the genome with lower IBD coverage, we are unable to identify and construct all haplotypes with certainty. 
Our study demonstrates the possibility of reconstructing the genomes of human ancestors, with large family sizes and a large number of living descendants, who lived one to even 12 generations ago. The ability to reconstruct the genomes of human ancestors using genetic and genealogical data has exciting implications in the fields of population genetics, medical genetics, and genealogy research.

Blaine Bettinger has a longer post, The Science Fiction Future of Genetic Genealogy, inspired by the abstract.

While the potential for this sort of thing has been apparent for years, it's good to see concrete steps being taken in this direction. A related (perhaps slightly over-optimistic) 2010 post by Tamura Jones:

Evidence both purifying selection and positive selection act on MC1R in S. Europe

Mol Biol Evol (2013) doi: 10.1093/molbev/mst158 First published online: September 17, 2013

Simultaneous purifying selection on the ancestral MC1R allele and positive selection on the melanoma-risk allele V60L in South Europeans

Martínez-Cadenas et al.

In humans, the geographical apportionment of the coding diversity of the pigmentary locus MC1R is, unusually, higher in Eurasians than in Africans. This atypical observation has been interpreted as the result of purifying selection due to functional constraint on MC1R in high UVB radiation environments. By analyzing 3,142 human MC1R alleles from different regions of Spain in the context of additional haplotypic information from the 1000 Genomes (1000G) Project data, we show that purifying selection is also strong in Southern Europe, but not so in Northern Europe. Furthermore, we show that purifying and positive selection act simultaneously on MC1R. Thus, at least in Spain, regions at opposite ends of the incident UV-B radiation distribution show significantly different frequencies for the melanoma-risk allele V60L (a mutation also associated to red hair and fair skin and even blonde hair), with higher frequency of V60L at those regions of lower incident UV-B radiation. Besides, using the 1000G South-European data, we show that the V60L haplogroup is also characterized by an EHH pattern indicative of positive selection. We, thus, provide evidence for an adaptive value of human skin depigmentation in Europe and illustrate how an adaptive process can simultaneously help maintain a disease-risk allele. In addition, our data support the hypothesis proposed by Jablonski and Chaplin (2010), which posits that habitation of middle latitudes involved the evolution of partially depigmented phenotypes that are still capable of suitable tanning.

"Alienated" ethno-religious minorities preferring "inclusive" anti-majority narratives

Even more importantly, says Ryn, Catholics recognize in Straussians figures who share their own “alienation” about living in a predominantly Protestant country. [. . .] Straussians provide a narrative about the American founding that makes ethnic Catholics feel secure about their Americanness.

Paul Gottfried on Leo Strauss, Allan Bloom, and their Catholic dupes:

Arthur Meier Schlesinger channeling Madison Grant (1921)

The swarming of foreigners into the great industries occurred at considerable cost to the native workingmen, for the latter struggled in vain for higher wages or better conditions as long as the employers could command the services of an inexhaustible supply of foreign laborers. Thus, the new immigration has made it easier for the few to amass enormous fortunes at the expense of the many and has helped to create in this country for the first time yawning inequalities of wealth.

Most sociologists believe that the addition of hordes of foreigners to the population of the United States has caused a decline in the birth-rate of the old American stock, for the native laborer has been forced to avoid large families in order to be in a position to meet the growing severity of the economic competition forced upon him by the immigrant. This condition, joined to the tendency of immigrant laborers to crowd the native Americans farther and farther from the industrial centers of the country, has caused the great communities and commonwealths of the Atlantic seaboard, about whose names cluster the heroic traditions of revolutionary times, to change completely their original characters. Puritan New England is today the home of a population of whom two-thirds were born in foreign lands or else had parents who were. Boston is as cosmopolitan a city as Chicago; and Faneuil Hall is an anachronism, a curiosity of bygone days left stranded on the shores of the Italian quarter. In fifteen of the largest cities of the United States the foreign immigrants and their children outnumber the native whites; and by the same token alien racial elements are in the majority in thirteen of the states of the Union. When President Wilson was at the Peace Conference, he reminded the Italian delegates that there were more of their countrymen in New York than in any Italian city; and it is not beside the point to add here that New York is also the greatest Irish city in the world and the largest Jewish city.

Whatever of history may be made in the future in these parts of the country will not be the result primarily of an "Anglo-Saxon" heritage but will be the product of the interaction of these more recent racial elements upon each other and their joint reaction to the American scene. Unless the unanticipated should intervene, the stewardship of American ideals and culture is destined to pass to a new composite American type now in the process of making. [. . .]

To the immigrant must also be assigned the responsibility for the accelerated growth of political and industrial radicalism in this country. While most of the newcomers quietly accepted their humble place in American society, a minority of the immigrants consisted of political refugees and other extremists, embittered by their experiences in European countries and suspicious of constituted authority under whatever guise.

From an essay in which the half-Jewish child of immigrants helpfully explains "The Significance of Immigration in American History" (the inevitable conclusion, naturally, being that America is "a nation of immigrants" -- or something like that). More:

Social changes in New England in the past fifty years (1900)

If some supernatural observer could have taken a bird's-eye view of New England in 1850 and again in 1900, he would read the story of change in plain characters. Approaching New England, as would become a Superior Intelligence, by way of Boston, he would find the region for some fifteen miles around the gilded dome on Beacon Hill so "filled in" as to form a continuous city with a million people, nearly half of them — figuring back for three generations — being Irish, about one-sixth "Old Americans," and the rest Germans, British, Scandinavians, Italians, Frenchmen, Chinamen, and citizens generally. [. . .] In Fall River, with 85 per cent. of foreign population, he might inquire his way half a dozen times before meeting a person who spoke English.

Having left a New England of full-blooded Yankees, which supplied its own wants and sent little abroad, he finds a population half foreign, dependent on others for its corn and grain and beef and mutton, but supplying half the nation with boots and shoes, making three-fourths of its cottons and using half its wool.

Social changes in New England in the past fifty years

By Edwin Webster Sanborn

Fifty years ago the new order of things had made little change in the outward appearance of New England. It was still a compact community, peopled for the most part by direct descendants of the old Puritan stock. It was a land of farmers, and the type of New England life was the country village. Commerce and fisheries were important sources of wealth; but merchants and seafaring men, as well as the minister, lawyer, doctor, and mechanic, generally owned a little land, and helped to make agriculture the prevailing occupation. Factories had been slowly taking the place of household industry, yet manners and way of living belonged to the homespun age.