White Paper 23-04: Global Similarity’s Genetic Similarity Map
Global Similarity: Genetic Similarity Map is a 23andMe feature that situates customers of unknown ancestry in the midst of a two-dimensional plot of reference individuals of known ancestry from around the world. The plot, or map, is constructed from the genetic distances between the reference individuals and the customers, such that the more closely two individuals are related genetically, the nearer they appear in the map. The effect is that, when certain conditions are satisfied, customers appear nearest to the group of individuals to which they are most closely related. For example, a person with four Irish grandparents will tend to cluster amid Irish reference individuals, and be farther from French reference individuals, and farther still from Italian reference individuals.
This document is a technical description of the feature and the procedure used to produce the genetic similarity maps.
Two points in response to Dienekes:
(1) The plots are generated using multidimensional scaling, not PCA.
(2) The white paper points out that the feature "is not well-suited to individuals of mixed ancestry, and does not provide the precision that the feature does with people of homogeneous ancestry", and notes that "Ancestry Painting" is "well-suited to studying customers with mixed ancestry." I see little value in some of the additions Dienekes suggests. Instead, improved STRUCTURE-type analyses with more relevant reference populations might be more appropriate for Hispanic, Jewish, and African American customers.
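
To make point (1) concrete, here is a minimal sketch of classical (Torgerson) multidimensional scaling in Python with NumPy. The excerpt above does not say which MDS variant 23andMe uses, so the choice of classical MDS, the function name `classical_mds`, the labels, and the toy distance matrix are all assumptions for illustration, not 23andMe's procedure or data.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Place points in n_components dimensions from a symmetric distance matrix.

    Hypothetical helper: classical (Torgerson) MDS, assumed here for
    illustration only; the white paper excerpt does not name a variant.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # eigh returns eigenvalues in ascending order; keep the largest ones
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise from non-Euclidean input
    top_vals = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(top_vals)


# Made-up pairwise "genetic distances": two reference groups and one customer.
labels = ["ref_A1", "ref_A2", "ref_B1", "ref_B2", "customer"]
dist = np.array([
    [0.0, 0.1, 1.0, 1.0, 0.2],
    [0.1, 0.0, 1.0, 1.0, 0.2],
    [1.0, 1.0, 0.0, 0.1, 0.9],
    [1.0, 1.0, 0.1, 0.0, 0.9],
    [0.2, 0.2, 0.9, 0.9, 0.0],
])

coords = classical_mds(dist)  # one (x, y) row per individual

# Find the reference individual nearest to the customer on the 2-D map.
customer = coords[-1]
ref_dists = np.linalg.norm(coords[:-1] - customer, axis=1)
nearest = labels[int(np.argmin(ref_dists))]
print("customer plots nearest to:", nearest)
```

The input here is a matrix of pairwise distances rather than a matrix of genotypes, which is the practical difference from PCA. When the distances are ordinary Euclidean distances, classical MDS yields the same coordinates as PCA up to rotation and reflection, so the distinction matters mainly when the genetic distance is defined some other way.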