Was there ever a Ruling Class? 1000 years of Social Mobility

According to Gregory Clark, because of regression to the mean and the lack of any (e.g., racial) barrier preventing gene flow across classes, there was never a persistent ruling class in England.

Notes from a presentation by Clark last year containing "work in progress from a planned book on social mobility over the long run" (pdf):

What is the fundamental nature of human society? Is it stratified into enduring layers of privilege and want, with some mobility between the layers, but permanent social classes? Or is there, over generations, complete mobility between all ranks in the social hierarchy, and complete long run equal opportunity? [. . .]

This book systematically exploits a new method of tracing social mobility over many generations, surnames, to measure the persistence of classes over as much as 800 years, 24 generations. It looks at societies where surnames are inherited, unchanged, by children from fathers. In such cases they thus serve as a tracer of the distant social origins of the modern population (and interestingly also as a tracer of the Y chromosome).

In this role surnames are a surprisingly powerful instrument for measuring long run social mobility. The results they reveal are clear, powerful, and a shock to our casual intuitions.

(1) In England, where we can trace social mobility back to 1066 using surnames, there were never any long persistent ruling and lower classes for the indigenous population: not in medieval England, and not now. About 5-6 generations were, and are, enough to erase most echoes of initial advantage or want. For the English class is, and always was, an illusion. Histories such as those of the Stanley family turn out to be rare exceptions, not the rule.

(2) Paradoxically, while England reveals complete long run mobility, the rates of social mobility per generation, better measured by looking over multiple generations, turn out to be lower than is conventionally estimated. But the mathematics of mobility is such that even such slow regression to the mean, over time, will completely erase initial advantage and want.

(3) The rate of social mobility in England was as high in the middle ages as it is now. The arrival of the whole apparatus of free public education in the late nineteenth century, and the elimination of nepotism in government and private firms, has not improved the rate of social mobility.

(4) The extraordinarily complete long run mobility of England is likely typical of other western European societies. But other countries, in contrast, do exhibit persistent social classes over hundreds of years. In the US, for example, the Black population has persisted at the bottom of the social order, and the Jewish population at the top. In Chile surname evidence shows the indigenous population has remained at the bottom since the Spanish conquest of 1541. [. . .]

(7) Though parents at the top of the economic ladder in any generation in preindustrial England did not derive any lasting advantage for their progeny, there was one odd effect. Surname frequencies show that there was a permanent increase in the share of the DNA in England from rich parents before 1850. After 1850 a frequency effect operated, but in reverse. Surname frequencies show the DNA share of families in England who were rich in 1850 declined relative to that of poor families of the same generation by 2010. [. . .]

What is the meaning and explanation of these results? This is a much more contentious and difficult area. The book argues for the following conclusions:

A. Why can’t the ruling class in a place like England defend itself against downwards mobility? If the main determinants of economic and social success were wealth, education and connections then there would be no explanation of the consistent tendency of the rich to regress to the society mean. Only if genetics is the main element in determining economic success, if nature trumps nurture, is there a built-in mechanism that ensures the observed regression. That mechanism is the intermarriage of the rich with those from the lower classes. Even though there is strong assortative mating, since this is based on the phenotype created in part by chance and luck, those of higher than average innate talent tend to systematically mate with those of lesser ability and regress to the mean.

B. Racial, ethnic and religious differences allow long persisting social stratification through the barriers they create to this intermarriage. Thus for a society to achieve complete social mobility it must achieve cultural homogeneity. Multiculturalism is the enemy of long run equality.
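Clark's point (2), that slow per-generation regression still erases initial advantage in the long run, is easy to check with a little arithmetic. The sketch below assumes a simple geometric model in which a family's expected advantage over the mean shrinks by a constant factor each generation. The persistence values 0.75 and 0.4 are hypothetical, picked only to contrast a "slow" with a "fast" rate of mobility; they are not figures from Clark's notes.

```python
def remaining_advantage(initial_advantage, persistence, generations):
    """Expected advantage over the mean after n generations.

    Assumes a purely geometric model: each generation keeps a fixed
    fraction `persistence` of the previous generation's advantage.
    """
    return initial_advantage * persistence ** generations


def generations_to_fall_below(persistence, fraction):
    """Smallest number of generations after which the expected advantage
    is at most `fraction` of its starting value."""
    generations = 0
    level = 1.0
    while level > fraction:
        level *= persistence
        generations += 1
    return generations


if __name__ == "__main__":
    # Hypothetical persistence rates: 0.75 (slow mobility) vs 0.4 (fast).
    for persistence in (0.75, 0.4):
        for n in (1, 5, 6, 24):
            left = remaining_advantage(1.0, persistence, n)
            print(f"persistence={persistence} generations={n:2d} remaining={left:.4f}")
```

Under these assumptions even the slow rate leaves under a quarter of the original advantage after 5 generations and about a tenth of a percent after 24, which is the sense in which slow regression is still complete over the long run.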

On the Normans:

Another R1b paper

The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269 (full text freely accessible):
Recently, the debate on the origins of the major European Y chromosome haplogroup R1b1b2-M269 has reignited, and opinion has moved away from Palaeolithic origins to the notion of a younger Neolithic spread of these chromosomes from the Near East. Here, we address this debate by investigating frequency patterns and diversity in the largest collection of R1b1b2-M269 chromosomes yet assembled. Our analysis reveals no geographical trends in diversity, in contradiction to expectation under the Neolithic hypothesis, and suggests an alternative explanation for the apparent cline in diversity recently described. We further investigate the young, STR-based time to the most recent common ancestor estimates proposed so far for R-M269-related lineages and find evidence for an appreciable effect of microsatellite choice on age estimates. As a consequence, the existing data and tools are insufficient to make credible estimates for the age of this haplogroup, and conclusions about the timing of its origin and dispersal should be viewed with a large degree of caution.
I find it hard to get too excited about this paper. As discussed previously, amateurs looking at more finely-resolved subclades using larger numbers of STRs do find trends in diversity that seem to point to an E. European origin for W. European R1b. I expect we'll have to wait a couple years, for overwhelming evidence to accumulate in the form of ancient DNA results and SNP-based dating, before seeing the correct route and timing of the entry of R1b into Europe widely agreed upon by academics. BBC article:
The extent to which modern Europeans are descended from these early farmers versus the indigenous hunter-gatherers who settled the continent thousands of years previously is a matter of heated debate. [. . .]

More than 100 million European men carry a type called R-M269, so identifying when this genetic group spread out is vital to understanding the peopling of Europe. [. . .]

A more recent origin for R-M269 than the Neolithic is also possible. But researchers point out that after the advent of agriculture, populations in Europe exploded, meaning that it would have been more difficult for incoming migrants to displace local people.

From the paper:
If the R-M269 lineage is more recent in origin than the Neolithic expansion, then its current distribution would have to be the result of major population movements occurring since that origin. For this haplogroup to be so ubiquitous, the population carrying R-S127 would have displaced most of the populations present in western Europe after the Neolithic agricultural transition.
Although the debate is commonly framed as Paleolithic vs. Neolithic, many lines of evidence suggest the correct answer is the third option: major post-Neolithic population movement.

Adaptive introgression of Denisovan HLA alleles across Asia?

John Hawks:
If it turns out that we have widespread adaptive introgression in Asia today from Denisovans, that will change the game of studying the origins of these populations. Based on the genome-wide comparison, it looks like the genetic interaction that led to the habitation of Asia did not involve Denisovans, who contributed only to populations at the most eastern extreme of habitation in island Southeast Asia. But the only Denisovans we know about lived near the geographic center of the Asian landmass, not at the extreme southeastern extreme.

The HLA pattern may suggest a more widespread pattern of mixture across Asia, which was later overwritten by population movements of people who didn't have Denisovan ancestry. That means that the habitation of Asia was a process of successive migrations and replacements, which imperfectly covered up the evidence of archaic intermixture. The genes that remain as signs of this intermixture are those that had selective advantages in later populations.

The paper: The Shaping of Modern Human Immune Systems by Multiregional Admixture with Archaic Humans

Deary / Visscher IQ paper

Those who read Sailer learned of this study a couple weeks ago. I've finally gotten around to looking at the actual paper, which seems convincing enough to me in doing what it says it does -- demonstrating "human intelligence is highly heritable and polygenic".

TGGP draws attention to comments by a blogger (Kevin Mitchell) who claims the paper "failed to establish the polygenic nature of the trait", but I don't see that Mitchell has a case. Mitchell:

I would interpret these findings very differently. What the authors do is analyse GWAS data in a very unusual way – they are not interested in finding specific SNPs affecting the trait, they simply use the SNPs to measure genetic relatedness between individuals.

As Mitchell then acknowledges, the paper does include a standard GWAS, the results of which are negative: at the level of individual SNPs not a single "replicable genome-wide significant association" is found. This is not surprising given the relatively small sample size and the (for me) expected polygenic nature of intelligence, but it (along with previous negative findings) tends to rule out any significant role for common variants of large effect in determining IQ.

The study uses SNPs across the genome to measure this relatedness and then shows it correlates with phenotypic similarity – i.e., the trait is heritable. We knew that already.

What they claim is that you can break down this effect by chromosome or by subregion. When they use the SNPs along longer chromosomes they seem to get a bigger effect – “explaining more of the phenotypic variance”. The inference is that thousands of SNPs, scattered across the whole genome, contribute to the trait or, more specifically to variance in the trait across the population (the implication is that they contribute to the value of the trait in individuals).

There is an alternative explanation for this effect, however, which is that using more SNPs simply gives a better estimate of genetic relatedness. So, the SNPs on chromosome 1 (the longest) give a better estimate than those on chromosome 21 (the shortest) – they index relatedness with more precision. As a result, they correlate better with phenotypic similarity – this looks like you have “explained more of the variance”. In fact, getting such a signal from SNPs on chromosome 1 does not mean that any of the causal variants are actually on chromosome 1. Nor does the fact that such signals can be derived from anywhere in the genome mean that there are thousands of variants across the genome affecting the trait.

What Mitchell is claiming here is that the results could be explained by cryptic relatedness and/or population structure. However, the researchers address both issues, by excluding samples that appear to be related to other samples nearer than the level of 4th cousins and by including as covariates in their models the first few components of an MDS analysis. For non-close relatives in unstructured populations, how similar two individuals are on chromosome 1 tells us nothing about how similar they are on any other chromosome. Visscher was more explicit on this point in a commentary on the height paper:

What is the evidence that population structure is not causing the observed effects?

We took several steps to avoid population structure inflating the estimate of the variance explained by the SNPs. We excluded one individual from any pair that had an estimated relationship > 0.025 (approximately equivalent to between 3rd and 4th cousins). We fitted the first 20 principal components from the relationship matrix in the statistical model so that any population substructure that they picked up was excluded from the variance explained by the SNPs. Critically, we then estimated the correlation between the relationship matrices estimated from different chromosomes and did not find significant correlation. We tested a set of SNPs that are ancestry-informative in Europe for association with height and did not observe inflation of the test-statistics.

For the purpose of this paper, we performed an additional simulation experiment (inspired by comments from Dan Stram) by assuming that the causal variants were all carried on one set of chromosomes (odd numbers) and another set of chromosomes (even numbers) carried SNPs from which we estimated relatedness. If there is structure in the population then this would imply that a pair of individuals that are closely related on odd chromosomes will also be closely related on even chromosomes. We used the observed genotype data of 3,925 individuals and 295K SNPs as the basis of the simulation, and simulated 1,000 causal variants on the odd chromosomes with a total heritability of 80%. Then we performed a restricted maximum likelihood (REML) analysis of the simulated phenotypes on the genetic relationship matrix estimated from the SNPs on the even chromosomes. The estimates and standard errors (SEs) from 10 simulation replicates are shown in Table 1. Since REML estimates of variance are always positive, if the true variance explained is zero, we expect half the replicates to return an estimate of 0.0 and half to return an estimate with mean value 0.8 times the standard error. This is exactly what happened. Therefore we conclude (again) that there is no structure in the data that would inflate the estimate of the variance explained by the SNPs.
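The logic of the odd/even chromosome check can be illustrated with a toy simulation. This is not the authors' pipeline (they fit REML on real genotypes); it only shows why relatedness estimated from one set of SNPs says nothing about relatedness on another set when individuals are unrelated and the population is unstructured, and why that stops being true once structure is present. The sample sizes, the allele frequency range, and the `structure` parameter are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def simulate_genotypes(n_people, n_snps, rng, structure=0.0):
    """Simulate allele counts (0, 1 or 2) for unrelated individuals.

    Returns a list of SNP columns. With structure > 0 the first and second
    halves of the sample get allele frequencies that differ by 2 * structure.
    """
    half = n_people // 2
    columns = []
    for _ in range(n_snps):
        base = rng.uniform(0.2, 0.8)
        shift = structure * rng.choice((-1, 1))
        column = []
        for person in range(n_people):
            freq = base + shift if person < half else base - shift
            # Two independent allele draws per person.
            column.append(int(rng.random() < freq) + int(rng.random() < freq))
        columns.append(column)
    return columns


def relationship_offdiag(columns):
    """Off-diagonal entries of a simple relationship matrix:
    the mean over SNPs of the product of standardized genotypes."""
    n_people = len(columns[0])
    z_columns = []
    for column in columns:
        sd = pstdev(column)
        if sd == 0:
            continue  # skip SNPs that happen to be monomorphic in the sample
        mu = mean(column)
        z_columns.append([(g - mu) / sd for g in column])
    entries = []
    for i in range(n_people):
        for j in range(i + 1, n_people):
            entries.append(sum(z[i] * z[j] for z in z_columns) / len(z_columns))
    return entries


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def odd_even_correlation(structure, seed=1, n_people=60, n_snps=200):
    """Correlate relatedness from two independent SNP sets, standing in for
    the odd and even chromosomes."""
    rng = random.Random(seed)
    odd = simulate_genotypes(n_people, n_snps, rng, structure)
    even = simulate_genotypes(n_people, n_snps, rng, structure)
    return pearson(relationship_offdiag(odd), relationship_offdiag(even))


if __name__ == "__main__":
    print("unstructured:", round(odd_even_correlation(0.0), 3))
    print("structured:  ", round(odd_even_correlation(0.15), 3))
```

In the unstructured case the correlation should hover around zero, while the two-subpopulation case should give a clearly positive value. The absence of such a correlation in the real data is what the quoted passage takes as evidence against structure inflating the estimates.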

Steve Hsu correctly points out:

If I understand correctly, you want to claim that the observed population variation could be due to a few rare variants of large effect. But then it would be surprising for this study to have found .5 of the total variation to be associated with SNPs — compare to earlier studies using twins/adoptions/siblings that found narrow sense heritability of about .6 or so. I would not expect the rare alleles you hypothesize to be in good LD with SNPs (which are designed to tag common variants), so we would expect to lose a big chunk of the .6 additive heritability.

For example, in the Visscher paper on height they had to hand wave about imperfect LD to recover the full .8 or so of heritability. In this case the global fit comes out very close to .6, which suggests common rather than rare variants (at least, they are well tagged by SNPs). But if they are common variants their individual effect sizes must be small and there are a lot of them. Let me know if I am missing something.


Mitchell responds:

I don’t think the population variation is caused by “a few” rare variants – I think it is (or could be at least) caused by a larger number of rare variants – different ones in different people.

This is getting to be a pretty silly argument: "different ones in different people" would add up to a very large number, which sounds "polygenic" enough to me (regardless of how many people have the major allele at most variable sites). And again: rare variants will be tagged less effectively (if at all) by common SNPs, so the causal variants whose effects are being estimated in this study can't be too rare. The contribution of rare variants to variability in intelligence is likely largely on top of the effect identified here, and probably mostly negative: an unusually high number of rare, deleterious mutations will tend to interfere with brain development and diminish IQ; an unusually low number will result in a higher IQ on average, explaining at least in part the associations commonly found between intelligence and other markers of "good genes" (health, physical attractiveness, and so on). A priori, though, it makes no sense to expect this type of variation to be the only or overwhelming source of genetic variability in IQ. Clearly, a very large number of genes affect brain development, and I expect pretty much all of these genes to be polymorphic. It's also clear tradeoffs affecting IQ exist (such as between brain size and energy expenditure) and that specific IQ-influencing alleles will have varying effects on fitness in different times and places. So it seems obvious to me common variants should be expected to play a major role in inter-individual and inter-population IQ differences.

Incidentally, looking again at the supplementary material for the height paper recently, I noticed the following addition:

In the version of this supplementary file originally posted online, Supplementary Fig. 2a and 2b were incorrect. The legend stated that in Supplementary Fig. 2a, PC1 versus PC2 was plotted when in fact PC2 versus PC3 was shown. Similarly, in Supplementary Fig. 2b, PC4 versus PC5 was plotted rather than PC3 versus PC4 as stated. This error is purely graphical and does not in any way affect the results or conclusions presented in the article.
Dasein spotted the strange-looking PCA at the time. I didn't think it materially affected that paper's conclusion, but I'm pleased to see that confirmed and the issue resolved.

Y haplogroup R1b and light hair in Italy

Via Italian Wikipedia.

Update addressing some questions/comments:

(1) The map specifically shows the frequency of blond hair; so yes the frequency of light hair in general will be higher.

(2) The map is adapted from Biasutti's Razze e popoli della Terra. The data was originally collected by Ridolfo Livi from military conscripts born in 1859-1863.

(3) The Biasutti/Livi map shows a higher frequency of blond hair in Corsica than in Sardinia. In keeping with the apparent pattern elsewhere in Italy, the frequency of R1b appears to be markedly higher in Corsicans than in Sardinians (in this paper, "HG 1" in combination with "HG 22" roughly corresponds to R1b).

(4) "Does R1b necessarily correlate with light hair?" In Italy it pretty clearly does. If you mean am I suggesting a strict correspondence between light hair and haplogroup R1b, obviously I am not. Looking at Europe as a whole, I doubt much of a correlation exists. But the evidence is consistent with the bearers of R1b (or more specifically subclades of R-L11) being lighter than the previous inhabitants of Italy. This doesn't mean the original carriers of R-M417 and some subclades of I weren't probably also lighter-haired, or that as R1b spread throughout Europe and mixing occurred, R1b always remained associated with light hair. It does tend to add yet more weight against attempts to link R1b in Europe to migration of Neolithic farmers from Anatolia, but dispensing with that question for good awaits large, high-resolution studies of ancient and modern DNA.

"haplogroup R1b is found in some of its highest concentrations among European peoples in Spain and Portugal -- two countries hardly known for blondes."

Within Iberia, though, it's certainly possible the pattern will hold. Among Iberians, Basques have some of the highest frequencies of both R1b and blondism. According to Coon: 'The French Basques are by no means all brunet; Collignon finds 22 per cent of blue eyes, 44 per cent of "medium," and 34 per cent of dark. Black hair is found in 7 per cent of the group, brown in 77 per cent, and light brown to blond in 16 per cent. Among the Spanish Basques the incidence of blondism is somewhat lower, but the Basques are still light when compared to most other inhabitants of Spain.'

Variance of R-P312 lineages highest in eastern Europe

I take this as further evidence most R1b arrived in western Europe by way of eastern (but not southeastern) Europe:

Recently-revealed structure in Y haplogroup R1a

Posters at dna-forums.com using data from the 1000 Genomes Project to identify new Y subclades have arrived at the following structure below M417:

Early results from commercial and academic testing suggest the bulk of Central Asian, Middle Eastern, and South Asian R1a will turn out to be Z93+ and L342.2+. An academic, posting at dna-forums:
It appears that Z93 and Z95, which, according to the heuristic tree from the 1K genomes project, are above L342.2 do separate most of the Europeans who are ancestral for Z93 and Z95 from the Pakistanis, Indians, Iranians, Ashkenazi Levites and the Eastern Turks (probably Kurds). [. . .] We do have some very preliminary results on Z93 and Z95 that would indicate that almost all Balkan and East European R1a1's are ancestral for Z93 and Z95. Also most of Western Turkey but not Eastern Turkey. I think that the Tuscans who are derived for Z93 and Z95 must be originally of Ashkenazi ancestry (perhaps also the Iberian).
Note: Ashkenazi Levite R1a is L342.2+. I can see no reasonable grounds on which to propose the Z93+ L342.2- TSI and IBS samples are of Jewish origin. More from the academic:
Most Pakistanis are Z93/Z95. We haven't tested many Indians, but the few we have are Z93/Z95. We haven't genotyped any other Z or L SNPs on R1a1 backgrounds. What amazes me is the clear geographic bifurcation between Middle East/South Asian Z93/Z95 (and by inference L342.2) and European markers such as M458. This points to a very old what we term vicariance pattern between Europe and the Middle East with respect to R1a1. Maybe the original source of R1a1 is somewhere in the middle such as Armenia or Turkey and some R1a1 moved to Europe to become M458 and other newly discovered L# lineages and other R1a1's moved to Iran/Pakistan/India/Central Asia to become Z93/Z95. I think that this bifurcation occurred at least 10,000 years ago, but then of course we tend to use the evolutionary mutation rates on YSTRs.
Another poster points out: "Dividing by 3 [to bring the estimate more in line with real mutation rates] gives an age of 3300 years, almost exactly the estimate from Nordtvedt's spreadsheet." Someone else recently estimated the TMRCA for L342.2+ at around 3,600 years. So: if current patterns hold, the bulk of South Asian R1a unambiguously falls within European R1a variation. While I fully expect, when we eventually see results for these markers in large academic samples published, the papers will feature evolutionary mutation rates and less than parsimonious attempts to fit the distribution of M417 sublineages to archaeology, it's pretty clear to me Z93 and L342.2 originated on the Steppe within the past 4000 years or so and spread with Indo-Iranian.
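The "dividing by 3" step follows from how STR-based ages are computed: the estimate is inversely proportional to the assumed mutation rate, so a rate three times faster gives an age three times younger. Here is a minimal sketch, assuming a simple variance-based estimator (a star-like genealogy and a symmetric single-step mutation model). The variance and rate values are hypothetical, picked so that the slow rate reproduces the 10,000-year figure quoted above.

```python
def tmrca_from_variance(str_variance, mutation_rate, years_per_generation=25):
    """Variance-based TMRCA in years.

    Assumes the expected variance in repeat count grows linearly with time:
    variance = mutation_rate * generations.
    """
    generations = str_variance / mutation_rate
    return generations * years_per_generation


def rescale_tmrca(tmrca_years, rate_ratio):
    """Convert an age obtained under one mutation rate to the age implied
    by a rate `rate_ratio` times faster."""
    return tmrca_years / rate_ratio


if __name__ == "__main__":
    slow_rate = 0.00069          # hypothetical "evolutionary" rate per generation
    fast_rate = 3 * slow_rate    # a rate three times faster, as in the quote
    variance = 0.276             # hypothetical average per-locus STR variance

    old = tmrca_from_variance(variance, slow_rate)
    young = tmrca_from_variance(variance, fast_rate)
    print(round(old), round(young), round(rescale_tmrca(old, 3)))
```

The same observed diversity therefore supports either date, and the whole disagreement comes down to which mutation rate is appropriate.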

Swedish McDonald's

Open thread (7)

Links, off-topic discussion, etc.

Editorial and preliminary paper on People of the British Isles project

Both freely accessible.

A British approach to sampling:

The acronym ‘PoBI’ may not yet be familiar to human geneticists in the way that ‘HGDP’, ‘WTCCC’ or ‘HapMap’ are, but a paper in this issue of EJHG [1] that introduces the ‘People of the British Isles’ project to the scientific community aims to change this. The PoBI project will collect up to 5000 DNA samples from diverse regions of the British Isles, taking great care to sample individuals with several generations of ancestry in rural locations. These samples are intended to serve as controls for future medical genetic studies, and to provide insights into the peopling of the British Isles over the last few millennia. [. . .] Although readers will have to wait for future publications to discover the insights from these large-scale genetic analyses, the current paper describes the sampling strategy and initial 3865 samples in some detail, outlines an approach to investigating fine-scale population structure using surnames, and presents some preliminary genetic analyses of a handful of chosen loci. [. . .]

In addition to collecting blood, the project recorded surnames. Using data from a census performed in 1881, these were classified as ‘local’ or ‘non-local’, and the two classes examined separately. The authors then modelled a population such as that from central England as a mixture between south-western (taken to represent Ancient Britons) and eastern (Anglo Saxon) populations, and estimated the contribution of each population to the central England autosomal genotypes. These contributions differed between the local surname class (mostly eastern) and the non-local class (half and half), which the authors take as evidence of subtle population structure. Published genetic analyses using much larger numbers of markers have already detected low, but significant levels of genetic structure within Britain in more straightforward ways [4, 5], even with less stringently ascertained samples (Figure 1): Europe-wide south-east to north-west gradients extend into the British Isles. We can look forward to deeper insights into genetic differentiation and its causes when large-scale genetic analyses of the PoBI samples are available.

[. . .] anthropological and evolutionary geneticists should rejoice in the assembly of this resource, the foresight of The Wellcome Trust in funding the project over a decade or so, and hope that resources are available for establishing more cell lines and performing more genome-wide sequencing, so that both the full set of samples and their sequences can be made widely available.

It is obvious why British people interested in their ancestry, and medical geneticists working with British subjects should welcome PoBI, but why should others pay attention? PoBI will not provide information about global genetic diversity in the way that HGDP [7] and HapMap [8] do, but its microcosmic survey of genetic variation in a set of small islands off the western coast of the Eurasian continent is revealing the level of differentiation that builds up over millennia via events well documented by archaeology and history, so these alternative data sets can be compared to address questions about the initial peopling of the area, and its subsequent reshaping by internal and external forces. And if the characteristics of the British – politeness, eccentricity, or drunken loutishness, according to your viewpoint and experience – have any genetic basis, perhaps PoBI can provide a starting point for identifying it!
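The mixture modelling the editorial describes (central England as a blend of a south-western and an eastern source) can be illustrated with a simple least-squares estimator on allele frequencies. This is only a sketch of the general idea and not the method used in the PoBI paper, whose details are not given here. The function name and all frequencies below are hypothetical.

```python
def estimate_mixture_fraction(mixed, source_a, source_b):
    """Least-squares estimate of alpha in
        mixed ≈ alpha * source_a + (1 - alpha) * source_b
    where each argument is a list of allele frequencies at the same loci.
    The result is clipped to the interval [0, 1].
    """
    numerator = sum((m - b) * (a - b) for m, a, b in zip(mixed, source_a, source_b))
    denominator = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if denominator == 0:
        raise ValueError("source populations have identical frequencies")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical allele frequencies at five loci in the two source populations.
eastern = [0.62, 0.30, 0.45, 0.80, 0.15]
south_western = [0.50, 0.42, 0.40, 0.70, 0.25]

# Hypothetical central England classes, built as exact mixtures for the demo:
# "local" surnames mostly eastern, "non-local" surnames half and half.
local_class = [0.9 * e + 0.1 * w for e, w in zip(eastern, south_western)]
non_local_class = [0.5 * e + 0.5 * w for e, w in zip(eastern, south_western)]

if __name__ == "__main__":
    print(estimate_mixture_fraction(local_class, eastern, south_western))
    print(estimate_mixture_fraction(non_local_class, eastern, south_western))
```

With real data the frequencies carry sampling noise, and the sources differ only slightly at most loci, so an estimate of this kind is only as good as the number and informativeness of the markers used.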

People of the British Isles: preliminary analysis of genotypes and surnames in a UK-control population:
There is a great deal of interest in a fine-scale population structure in the UK, both as a signature of historical immigration events and because of the effect population structure may have on disease association studies. Although population structure appears to have a minor impact on the current generation of genome-wide association studies, it is likely to have a significant part in the next generation of studies designed to search for rare variants. A powerful way of detecting such structure is to control and document carefully the provenance of the samples involved. In this study, we describe the collection of a cohort of rural UK samples (The People of the British Isles), aimed at providing a well-characterised UK-control population that can be used as a resource by the research community, as well as providing a fine-scale genetic information on the British population. So far, some 4000 samples have been collected, the majority of which fit the criteria of coming from a rural area and having all four grandparents from approximately the same area. Analysis of the first 3865 samples that have been geocoded indicates that 75% have a mean distance between grandparental places of birth of 37.3 km, and that about 70% of grandparental places of birth can be classed as rural. Preliminary genotyping of 1057 samples demonstrates the value of these samples for investigating a fine-scale population structure within the UK, and shows how this can be enhanced by the use of surnames.

Swedish population structure

Via Dienekes, The Genetic Structure of the Swedish Population:
An analysis of genetic differentiation (based on pairwise Fst) indicated that the population of Sweden's southernmost counties are genetically closer to the HapMap CEU samples of Northern European ancestry than to the populations of Sweden's northernmost counties. [. . .] We have shown that genetic differences within a single country may be substantial, even when viewed on a European scale.
The paper is in PLoS ONE (i.e., it's open access). More:

Few of whites' best friends are black

Friends for better or for worse: Interracial friendship in the United States as seen through wedding party photos (pdf):
Four findings stand out. First, the few survey estimates of close adult interracial friendships may overstate their actual prevalence, especially whites’ reporting of close friendships with blacks. My results show that very few whites have black friends who are close enough to be in their wedding party (3.7%), less than all previous estimates among adults. I reasoned that estimates of cross-race friendships for whites based on the wedding party photos would be lower than those based on existing survey measures because wedding parties include only the closest friends who may often have to conform to intergenerational norms about racial contact and the expectations of extended family. Wedding parties also limit the pool of friends to a small number and cannot be exaggerated out of normative pressure. Compared with what would be expected if there were homogenous opportunity for friendships, whites are most likely to have a close E/SE Asian friend and least likely to have a black friend. These results suggest that Jackman and Crane’s (1986: p. 460) declaration using data from 1979 still rings true: “only a tiny minority of whites could rightly claim that ‘some of their best friends’ are black.”

Second, I hypothesized that there would be an asymmetry, by race, of inviting a friend to be in the wedding party and being invited to be in a friend’s wedding party, with whites being invited more than they invite friends of other races. Adjusting for group size, whites and E/SE Asians are equally likely to invite and be invited, but whites invite blacks only half as much as blacks invite whites, and E/SE Asians invite blacks only one-fifth as much as blacks invite E/SE Asians. This finding is consistent with the notion that whites are less accepting of interracial friendships, a finding that is no longer detectable in survey-based attitudinal data.