Current Biology, Volume 24 
Supplemental Information 
Lactase Persistence Alleles Reveal 
Partial East African Ancestry 
of Southern African Khoe Pastoralists 
Gwenna Breton, Carina M. Schlebusch, Marlize Lombard, Per Sjödin, Himla Soodyall, and Mattias 
Jakobsson
Inventory of Supplemental Information 
Figures S1a-c. Haplotype plots. Related to Figure 2 
Figure S2. Genome local ancestry inference. Related to Figure 2 
Figure S3. ADMIXTURE runs. Related to Figure 3 
Figure S4. TREEMIX analysis. Related to Figure 3 
Figure S5. Principal Components Analysis. Related to Figure 3 
Figure S6. Selection scan results. Related to Results and Discussion 
Figure S7. Selection coefficient estimation. Related to Results and Discussion 
Table S1. Population information. Related to Table 1  
Table S2. Polymorphism frequencies. Related to Table 1 
Table S3. Formal tests of admixture and dating of admixture times. Related to Results and 
Discussion 
Table S4. Population information for admixture analyses. Related to Figure 3 
Supplemental Introduction 
Supplemental Experimental Procedures 
Supplemental Discussion 
Supplemental References
Supplementary Figures: 
Figure S1a: Haplotype plots over a 54.6 kb region of chromosome 2, showing the haplotype block surrounding the East 
African LP-variant for the merged dataset. All individuals in the southern African dataset carrying the G variant at position 
136,608,746 were extracted, and only haplotypes containing the G variant at that position (known from the 
southern African-only datasets above) were visualized together with all MKK haplotypes (without any filtering). The consensus 
sequence, showing the major allele in the Nama population, is presented at the top of the figure. Haplotypes of individuals are 
shown below the consensus sequence; positions that differ from the consensus sequence are colored red, while positions 
identical to the consensus sequence are shown in blue. The Y-axis indicates the population groups to which the haplotypes belong, 
and the X-axis gives the base-pair positions of the SNPs on chromosome 2.
Figure S1b: Haplotype plot for southern African individuals over a 54.6 kb region of chromosome 2, showing the haplotype 
block surrounding the European LP-variant. Individuals containing the European variant, 13910T, at position 136,608,646 on 
chromosome 2 (homozygous and heterozygous) were extracted. Thereafter the haplotypes of these individuals were sorted 
according to the variant they contain at position 136,608,646 and visualized here. The consensus sequence, showing the major 
allele in the Coloured Wellington population, is presented on top of the figure. Haplotypes of individuals are shown below the 
consensus sequence and positions that differ from the consensus sequence are colored red, while positions similar to the 
consensus sequence are shown in blue. The Y-axis indicates the population groups to which the haplotypes belong, and the 
X-axis gives the base-pair positions of the SNPs on chromosome 2. The position of the European LP-variant is highlighted in green on 
the X-axis. A green block outlines the common haplotype background, or haplotype block, associated with the European 
LP-variant.
Figure S1c: Haplotype plots over a 54.6 kb region of chromosome 2, showing the haplotype block surrounding the European 
LP-variant for the merged dataset. All individuals of the southern African dataset containing the T variant at position 
136,608,646 have been extracted and their haplotypes sorted according to the variant at the 136,608,646 position before 
visualization together with all CEU haplotypes (without any filtering). The consensus sequence, showing the major allele in the 
Coloured Wellington population, is presented on top of the figure. Haplotypes of individuals are shown below the consensus 
sequence and positions that differ from the consensus sequence are colored red, while positions similar to the consensus 
sequence are shown in blue. The Y-axis indicates the population groups to which the haplotypes belong, and the X-axis gives the 
base-pair positions of the SNPs on chromosome 2. The position of the European LP-variant is highlighted in green on the X-axis.
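The consensus-and-mismatch coloring described in the Figure S1 captions can be illustrated with a minimal sketch. It is not the plotting code used for the figures. The function names, the string encoding of haplotypes, and the toy alleles are assumptions made for illustration only, and the tie-breaking rule (first allele encountered) is likewise an assumption, since the captions do not specify one.

```python
from collections import Counter

def consensus(haplotypes):
    """Major allele at each SNP across equal-length haplotype strings.

    Ties are resolved in favor of the allele seen first (an assumption).
    """
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*haplotypes))

def mismatch_mask(haplotype, cons):
    """True where the haplotype differs from the consensus (drawn red),
    False where it matches (drawn blue)."""
    return [allele != ref for allele, ref in zip(haplotype, cons)]

# Hypothetical reference-population haplotypes (not data from the study).
reference_haplotypes = ["ACGT", "ACGA", "TCGA"]
cons = consensus(reference_haplotypes)

# One hypothetical haplotype compared against that consensus.
mask = mismatch_mask("TCGT", cons)
```

In this toy case the consensus is built from one population only, mirroring the captions, where the Nama or Coloured Wellington population supplies the major alleles against which all plotted haplotypes are compared.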
Figure S2: Genome local ancestry inference surrounding the LCT and MCM6 genes (positions highlighted red on X-axis) 
using parental populations of Ju’/hoansi San (red), East African Maasai (blue), West African Yoruba (green) and European 
HapMap CEU (yellow) populations. The X-axis shows the position along chromosome 2, and the bars along the Y-axis are the 14 Nama 
chromosomes; for each individual, the Y-axis shows the number of assignments to each of the four parental populations. 
Symbols on the right indicate whether the 14010C mutation was present in the particular individual (each individual is 
represented by two horizontal bars, one per chromosome). Symbols are $ for homozygous for 14010C, # for heterozygous at 14010, and 
+ for homozygous for 14010G (variants were typed separately by sequencing the LCT control region).
Figure S3: Full ADMIXTURE results: Genetic clustering analysis. Clustering of 766 individuals from 43 populations (233,363 
SNPs) assuming 2 to 11 clusters (K=2-11).
Figure S4: TREEMIX results. A) Tree of Khoe-San populations with comparative East and West African populations, assuming 3 
migration edges. B) Tree of Khoe-San populations with comparative European, East and West African populations, assuming 6 
migration edges.
Figure S5: PCA results. A) First 4 principal components (PCs) including all Khoe-San populations together with East African, 
West African and European comparative populations. B) First two PCs of only the Nama together with East African, West African 
and European comparative populations.
Figure S6: Selection scans. |iHS| values in the region of chromosome 2 from 135.5 to 137.5 Mb, which includes the LCT and MCM6 
genes. Black dots are individual |iHS| values, while the gray line is an average over 30 SNPs with a step length of 1 SNP. The orange 
horizontal bars give the genome-wide |iHS| average for the particular population and the vertical lines indicate the standard 
deviation. The locations of genes in the region are shown by blue rectangles and the gene names are given above.
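The smoothing behind the gray line in Figure S6 (an average over 30 SNPs, advanced 1 SNP at a time) can be sketched as follows. This is a minimal illustration, not the authors' script. The function name and the input values are hypothetical, and windows that would run past the end of the region are simply dropped, which is an assumption the caption does not address.

```python
from statistics import mean

def sliding_mean(values, window=30, step=1):
    """Mean of each run of `window` consecutive values, advancing by `step`.

    Only complete windows are returned (an assumption).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [mean(values[i:i + window])
            for i in range(0, len(values) - window + 1, step)]

# Hypothetical signed iHS scores for consecutive SNPs (not data from the study).
ihs = [0.2, -1.5, 2.4, -0.7, 1.1, -2.9, 0.4, 3.1]
abs_ihs = [abs(score) for score in ihs]

# A window of 3 keeps the toy example short; the figure uses window=30, step=1.
smoothed = sliding_mean(abs_ihs, window=3, step=1)
```

With a step of 1 SNP, adjacent windows share all but one SNP, so the smoothed line changes gradually and a sustained rise reflects many neighboring SNPs with elevated |iHS| rather than a single outlier.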