Update on Androgen Receptor gene

An Encore for the Repeats: New Insights into an Old Genetic Variant
It is commonly accepted that the length of the polyQ tract influences the transactivation capacity of the receptor in an inverse manner; that is, the longer the tract, the lower the activity. To support this hypothesis, a clear negative impact on AR activity is documented in relationship with pathological expansions of the repeat length (40 or more), known as the Kennedy syndrome (5). This syndrome is characterized by spinobulbar muscular atrophy and hypoandrogenism due to partial androgen insensitivity. On the other hand, controversies still exist about the effect of variations in polyQ within the normal polymorphic range. The normal distribution of the (CAG)n is reported as 6–39 repeats, with a median of 21–22 in White Caucasian, 19–20 in African-American, 22–23 in Asian, and 23 in Hispanic populations. Clinical observations showing a linear correlation between testosterone level and CAG repeat length support the notion of a functional effect of the polymorphism within the normal range. In fact, increased circulating testosterone and estradiol levels in men with a higher number of CAG repeats can be considered as a compensatory mechanism aimed to overcome the weaker AR activity (6, 7). However, such a linear correlation has not been clearly demonstrated by in vitro experiments. The first two functional studies reported that the longest tract (Q31) displayed lower activity when compared with the shortest one (Q15). However, no significant differences were observed by comparing these two types of alleles to an intermediate number of CAG repeats (20 or 24) (8, 9). Quite strikingly, two recent articles provided evidence for the lack of a stepwise reduction in activity with increasing CAG length across the polymorphic range (10, 11). The reporter gene assay with three different CAG lengths (16, 22, and 28) has indeed shown the highest AR activity in the presence of 22 CAG repeats (10).
The other study, performed in a human prostate tumor cell model, has provided mechanistic insights into how both increased and decreased polyQ allele length may negatively affect receptor function (11). This study has revealed a critical polyglutamine size (Q16-Q29) for optimal androgen-induced AR signaling, which corresponds to 91–99% of AR alleles within different ethnic groups. These novel in vitro findings have introduced a new concept for the analysis of AR-CAG repeat length in relationship to AR-related diseases, indicating that linear regression models are likely to be inappropriate.
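
One way to read the "critical size" finding is as a set of bins rather than a slope. The sketch below only illustrates that idea. The thresholds (6–39 for the reported normal distribution, 40 or more for the Kennedy syndrome expansions, Q16–Q29 for the range reported as optimal in the prostate tumor cell model) come from the excerpt above; the function name `classify_cag` and the bin labels are my own hypothetical choices, and the Q16–Q29 window may well not carry over to other tissues or populations.

```python
def classify_cag(repeats: int) -> str:
    """Bin an AR-CAG repeat count using the thresholds quoted in the excerpt.

    Labels are illustrative only (an assumption, not the authors' terminology).
    """
    if repeats >= 40:
        # "pathological expansions of the repeat length (40 or more)"
        return "pathological expansion"
    if repeats < 6:
        # Below the reported normal distribution of 6-39 repeats
        return "below reported range"
    if 16 <= repeats <= 29:
        # Range reported as optimal in the prostate tumor cell model (ref. 11)
        return "normal, within Q16-Q29"
    return "normal, outside Q16-Q29"


for n in (15, 22, 31, 42):
    print(n, classify_cag(n))
```

Coding repeat length this way, as a category instead of a continuous predictor, is one simple alternative to the linear regression models the authors caution against.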

The study by Davis-Dao et al. (12) indicates a disadvantage only in the case of short CAG repeats; however, upcoming investigations will probably shed light on whether the “optimal range” hypothesis can be applied also to this specific pathological context. In fact, the stratified analysis of nearly 4000 subjects, included in articles dealing with male infertility and AR-CAG length, has provided clinical evidence for the potential benefit of a CAG range corresponding to 22–23 triplets in spermatogenesis (24). However, it must be taken into consideration that this specific range may not be the same across different ethnic groups and may even vary in different tissues because the effect of polyQ repeat on transactivation is cell specific, presumably due to distinct profiles of coregulator proteins (11). Moreover, it is possible that spermatogenesis, more than the process of testis descent, depends predominantly on the genomic action of androgens, and thus on the direct consequence of the CAG length on transactivation. Clearly, more functional studies are needed for the interpretation of clinical data in different types of androgen-dependent diseases.

That blacks average fewer AR-CAG repeats has been held up as evidence that blacks are more "masculinized". I was unconvinced one could draw that conclusion even accepting an inverse relationship between CAG repeat length and AR activity (since the CAG repeat represents only one link in androgen-related pathways, and there may be any number of other racial differences in relevant genes). Now it appears that, compared to blacks, whites may in fact be more likely to have "optimal" CAG repeat lengths.

In addition, variation in another polymorphism of the AR gene, GGN repeat length, could conceivably lower AR activity in blacks relative to whites:

Short GGN repeats seem to be associated with decreased semen volume, possibly due to suboptimal AR activity. ["Androgen receptor gene GGN repeat length and reproductive characteristics in young Swedish men"]

Contrary to previously published data from Caucasians and Asian populations, which have the 2 by far most common GGN alleles of 23 and 24 in the former, and 21 and 22 in the latter, we found 4 common alleles of 20, 21, 22, and 23 in our study population with the highest frequency of 20 GGN allele followed by 22, 21 and 23 (GGN)n (Fig. 2). ["Androgen receptor gene CAG and GGN polymorphisms in infertile Nigerian men"]

Some more background from the first article above:

Throughout the human genome there are trinucleotide repeat sequences susceptible to either expansion or contraction during replication, giving rise to length polymorphisms in the general population. The polymorphic CAG repeat, which encodes an uninterrupted polyglutamine (polyQ) tract in the N-terminal transactivation domain of the androgen receptor (AR), is the most extensively studied genetic variant in individuals with disorders of the male reproductive system.

Despite an impressive number of studies, the pathogenic role of this polymorphism and its clinical relevance are still a matter of debate. Although a recent meta-analysis of 33 publications (1) supports a pathogenetic role for longer polyQ length in male infertility, the authors conclude their work stating that there is a need for new, well-designed studies (1). In fact, available data do not allow us to establish what range of AR-CAG repeat lengths predisposes impaired sperm production or to estimate the entity of the associated risk (1). Similar to other genetic variants, the literature related to CAG repeats suffers from an abundance in conflicting case-control association studies and a paucity of functional data (2). There are several plausible explanations for these apparent controversies, mostly related to: 1) poor study design (inappropriate selection of patients and controls, particularly with respect to their phenotype and their ethnic/geographic origin, and underpowered size of the study population); and 2) intrinsic complexity of the interaction between the AR and its endogenous/environmental ligands. An additional intricacy derives from the presence of another polymorphic trinucleotide repeat, (GGN)n, in the first exon of the AR gene, which may modulate the functional effect of the CAG repeat length, stressing the need for a combined analysis of the two AR polymorphisms (3, 4).

"Small is Beautiful: Genetic Studies in the Founder Population of Iceland"

Reports on a talk at the 1000 Genomes Project meeting a couple of days ago:

13 Jul Nicolas Robine @notSoJunkDNA: Augustine Kong (deCode Genetics) at #1000genomes

13 Jul Karol Estrada @karls_es: Augustine Kong: deCode has genotyped 100,000 samples, and whole-genome sequenced 2,200 samples #1000genomes

13 Jul Nicolas Robine @notSoJunkDNA: AK: 100k chip-typed individuals to study "recombination as a phenotype", and examine "transmission distortion" #1000genomes

13 Jul Goncalo Abecasis @gabecasis: Augustine Kong talks about gene mapping in Iceland. A population that is just the right size. Definitely not too small. #1000genomes

13 Jul Karol Estrada @karls_es: AK: imputations with 2200 seq. individuals have high accuracy (r^2>0.90 for variants down to 0.1%! #1000genomes

13 Jul Goncalo Abecasis @gabecasis: AK: Rate of mutation doubles with every 16 year increase in paternal age. #1000genomes

13 Jul Karol Estrada @karls_es: AK: Genomic segments of Norwegian ancestry have 8 fold more singletons than average in deCode's dataset #1000genomes

Alex Forrest-Hay @aforre: #1000genomes Augustine Kong: 400k SNPs would cover the Icelandic genome sufficiently to enable accurate imputation
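
The paternal-age tweet above implies a simple exponential relationship. As a rough sketch, taking the tweeted figure at face value (a doubling with every 16-year increase) and nothing more: the function name, the choice of reference age, and the assumption that the doubling holds uniformly across ages are mine, not from the talk.

```python
def relative_mutation_rate(paternal_age: float,
                           reference_age: float,
                           doubling_years: float = 16.0) -> float:
    """Mutation rate at paternal_age relative to the rate at reference_age.

    Assumes the rate doubles every `doubling_years` years, as reported
    in the tweet; the reference age is an arbitrary, hypothetical choice.
    """
    return 2.0 ** ((paternal_age - reference_age) / doubling_years)


# Hypothetical comparison of fathers aged 20, 36 and 52
for age in (20, 36, 52):
    print(age, relative_mutation_rate(age, reference_age=20))
```

Under that assumption a 36-year-old father would transmit twice as many new mutations as a 20-year-old, and a 52-year-old four times as many.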

Interview with Kari Stefansson:
“We have sequenced the whole genomes of 2,500 people. We have genotyped about 120,000 Icelanders with an Illumina chip. We can impute whole genome sequence down to variants with less than 0.1% frequency into about 370,000 Icelanders -- there are only 320,000 living today!”

“We basically have the whole genome sequence of an entire nation.”