Analyzing 1,200 individuals from 11 populations genotyped for more than 500,000 SNPs (Population Reference Sample), we present a systematic exploration of the extent to which geographic coordinates of origin within Europe can be predicted, with small panels of SNPs. Markers are selected to correlate with the top principal components of the dataset, as we have previously demonstrated. Performing thorough cross-validation experiments we show that it is indeed possible to predict individual ancestry within Europe down to a few hundred kilometers from actual individual origin, using information from carefully selected panels of 500 or 1,000 SNPs. Furthermore, we show that these panels can be used to correctly assign the HapMap Phase 3 European populations to their geographic origin. [. . .] It is also worth noting that the largest average error was in the German samples and that the most accurately predicted populations were the Southern European and Irish ones. [. . .] Interestingly, within Europe, individual origin seems much easier to predict along the North to South axis than along the East to West axis. This could indicate increased gene flow along the latter axis.

Estimated coordinates for the CEU sample (Utah whites):
Another intra-European AIMs paper
Drineas P, Lewis J, Paschou P (2010) Inferring Geographic Coordinates of Origin for Europeans Using Small Panels of Ancestry Informative Markers. PLoS ONE 5(8): e11892.
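
For readers who want a feel for the approach described in the abstract (pick SNPs that track the top principal components, then predict coordinates from that small panel), here is a rough sketch on simulated data. Everything in it is an assumption made for illustration: the function names (`select_panel`, `fit_linear`, `predict`, `haversine_km`), the scoring rule (sum of squared SNP loadings on the top two components), the plain least-squares predictor, the single train/test split standing in for full cross-validation, and the toy genotypes themselves. It is not the authors' code and may differ from their exact procedure.

```python
import numpy as np

def select_panel(genotypes, n_components=2, panel_size=40):
    """Rank SNPs by summed squared loadings on the top principal components."""
    centered = genotypes - genotypes.mean(axis=0)
    # Rows of vt are the principal axes, i.e. per-SNP loadings.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = (vt[:n_components] ** 2).sum(axis=0)
    return np.argsort(scores)[::-1][:panel_size]

def fit_linear(genotypes, coords):
    """Least-squares fit of (lat, lon) on panel genotypes plus an intercept."""
    design = np.column_stack([np.ones(len(genotypes)), genotypes])
    beta, *_ = np.linalg.lstsq(design, coords, rcond=None)
    return beta

def predict(beta, genotypes):
    design = np.column_stack([np.ones(len(genotypes)), genotypes])
    return design @ beta

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (Earth radius taken as 6371 km)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

# Simulated data (hypothetical): 600 individuals, 300 SNPs, of which the
# first 20 follow a latitude gradient and the next 20 a longitude gradient.
rng = np.random.default_rng(0)
n, m = 600, 300
lat = rng.uniform(36, 60, n)
lon = rng.uniform(-10, 30, n)
z_lat = (lat - 48) / 12          # rescaled to [-1, 1]
z_lon = (lon - 10) / 20          # rescaled to [-1, 1]
freq = np.tile(rng.uniform(0.2, 0.8, m), (n, 1))
freq[:, :20] = 0.5 + 0.35 * z_lat[:, None]
freq[:, 20:40] = 0.5 + 0.35 * z_lon[:, None]
genotypes = rng.binomial(2, freq).astype(float)  # 0/1/2 allele counts
coords = np.column_stack([lat, lon])

# Single holdout split: select the panel and fit on the training set only.
train, test = np.arange(400), np.arange(400, 600)
panel = select_panel(genotypes[train])
beta = fit_linear(genotypes[train][:, panel], coords[train])
pred = predict(beta, genotypes[test][:, panel])

mean_err_km = haversine_km(coords[test, 0], coords[test, 1],
                           pred[:, 0], pred[:, 1]).mean()
# Baseline: guess the training-set centroid for everyone.
center = coords[train].mean(axis=0)
baseline_err_km = haversine_km(coords[test, 0], coords[test, 1],
                               center[0], center[1]).mean()
print(f"panel error: {mean_err_km:.0f} km, baseline: {baseline_err_km:.0f} km")
```

Comparing the held-out panel error against the centroid baseline gives a crude analogue of the paper's "distance from actual origin" measure; real data would call for repeated cross-validation folds and far larger panels, as the abstract describes.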