The genomes of most organisms are now known to consist largely of repetitive
DNA. The human genome, for example, is made up of 46% repeats. In this context,
short tandem repeats (STRs), also known as simple sequence repeats (SSRs) or
microsatellites and here simply called tandem repeats (TRs), with a unit length
of 1 to 9 nucleotides (nt) have emerged as possible regulatory elements
(Gemayel et al., 2010). An example of a TR is the
sequence ACTGACTGACTG, where the unit ACTG consisting of four nucleotides is
repeated three times. Compared with non-repeat parts of the genome, TRs have
mutation rates that can be 10 to 100,000 times higher, which makes them very
unstable. Interestingly, most of these mutations are not point mutations but
repeat polymorphisms, in which the number of repeat units varies because one or
more units are added or deleted (Gemayel et al., 2010). At present, there are two
mechanisms that explain how a repeat can change in length – strand slippage and
recombination. Strand slippage occurs during DNA replication when the two single
strands realign out of register. One strand is “looped out”, which results in
the addition or deletion of repeat units (Gemayel et al., 2010, Ellegren, 2004).
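
The notions of repeat unit and repeat polymorphism introduced above can be made concrete with a short Python sketch. The function names, the regular expression and the minimum of three copies are illustrative assumptions rather than an established TR-detection method, and only perfect repeats are recognised.

```python
import re

# Assumed definition for illustration: a perfect tandem repeat is a unit of
# 1-9 nt followed by at least two further copies of itself (>= 3 copies).
TR_PATTERN = re.compile(r"([ACGT]{1,9}?)\1{2,}")


def find_tandem_repeats(seq):
    """Return (start, unit, copies) for each perfect tandem repeat in seq."""
    hits = []
    for match in TR_PATTERN.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


def change_copy_number(unit, copies, delta):
    """Model a repeat polymorphism: add (delta > 0) or delete (delta < 0) units."""
    return unit * max(copies + delta, 0)


print(find_tandem_repeats("ACTGACTGACTG"))  # the example sequence from the text
print(change_copy_number("ACTG", 3, +1))    # one unit added
print(change_copy_number("ACTG", 3, -1))    # one unit deleted
```

The sketch models only the outcome of a length change, not the slippage or recombination event that produces it.
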
Examples of recombination are gene conversion and unequal crossing over (Gemayel et al., 2010). One reason why STRs are
thought to act as regulators is that they are often found in
regulatory regions (e.g. promoters) or coding regions of genes (Legendre et al., 2007, Li et al., 2002).
In the genome of Arabidopsis thaliana, for example, 13.6% of genes have repeats
in their open reading frames (ORFs), with similar percentages found in other
organisms (Gemayel et al., 2010). TR variations can cause severe diseases in
humans, for example Huntington disease (HD) and spinobulbar muscular atrophy
(SBMA). Both diseases are caused by a CAG repeat expansion in a specific gene
(the IT15 gene for HD and the androgen receptor gene for SBMA) (MacDonald et al., 1993, Spada et al., 1991).
The repeat length needs to reach a certain threshold to cause the disease (Rubinsztein et al., 1996, Spada et al., 1991).
Additionally, for HD the length of the repeat is correlated with the onset and
severity of the disease, linking the TR variation to a specific phenotype (Duyao et al., 1993).
Another example in which TR variation is linked to a phenotype is a triplet
repeat expansion in the IIL1 gene of the A. thaliana Bur-0 strain, which
impairs growth at high temperatures (Sureshkumar et al., 2009). All these repeats are
trinucleotide tandem repeats (TNRs); like most repeats in coding regions, their
unit length is a multiple of 3 nt, as in tri- and hexanucleotide repeats. This
is probably because such units avoid frame shifts, which can lead to a loss of gene function (Legendre et al., 2007, Metzgar et al., 2002).
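
The reading-frame argument above reduces to simple arithmetic: adding or deleting units shifts the downstream reading frame by the total length change modulo 3. The helper below is a hypothetical illustration (its name and interface are not taken from the cited work), and it ignores everything except repeat length.

```python
def frame_shift(unit_length, delta_units):
    """Shift of the downstream reading frame (0, 1 or 2 nt) after a copy-number change.

    unit_length: length of the repeat unit in nt
    delta_units: number of units added (positive) or deleted (negative)
    """
    return (unit_length * delta_units) % 3


# Check every unit length in the 1-9 nt range for small copy-number changes.
for unit_length in range(1, 10):
    shifts = {frame_shift(unit_length, d) for d in (-2, -1, 1, 2)}
    status = "frame preserved" if shifts == {0} else "frame shift possible"
    print(f"{unit_length} nt unit: {status}")
```

Only unit lengths of 3, 6 and 9 nt leave the frame intact for every copy-number change, which is consistent with the prevalence of tri- and hexanucleotide repeats in coding regions.
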
However, TR variation can also be beneficial, which may explain why seemingly
detrimental TRs have not been selected against during evolution. Bacteria,
especially pathogenic ones, use TRs to switch between phenotypes and thereby
escape the immune response of the host (Gemayel et al., 2010). This strategy is
called phase variation, and one of the first examples was found in a study of
surface genes from Neisseria gonorrhoeae (Stern et al., 1986). Phase variation
is essentially an ON/OFF switching of phenotypes that occurs during infection.
In N. gonorrhoeae, the region encoding the membrane signal peptide of the P.II
gene family contains a CTCTT repeat that has been shown to vary in length.
Changes in repeat length cause frame shifts, which lead either to correctly
translated proteins (phenotype ON) or to incorrectly translated proteins
(phenotype OFF) (Stern et al., 1986, Gemayel et al., 2010).
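
As a toy model of this switch, the sketch below assumes that the gene is ON at some reference copy number of the 5 nt CTCTT unit and that any copy number leaving the downstream sequence out of frame switches it OFF. The reference value of 9 copies and the function name are placeholders chosen for illustration, not values reported by Stern et al. (1986).

```python
UNIT = "CTCTT"  # pentanucleotide repeat unit named in the text


def phase(copies, in_frame_copies=9):
    """Return "ON" if the copy number keeps the downstream codons in frame.

    in_frame_copies is a hypothetical reference copy number at which the
    protein is translated correctly; it is an assumption, not measured data.
    """
    length_change = (copies - in_frame_copies) * len(UNIT)
    return "ON" if length_change % 3 == 0 else "OFF"


for copies in range(6, 13):
    print(copies, phase(copies))
```

Because the unit is 5 nt long, adding or deleting a single unit always switches an ON gene OFF in this model, and the ON state recurs at every third copy number.
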
This ON/OFF switching differs from cell to cell, and it is believed that the
resulting variability gives the bacterial population the ability to survive in
the host (Kita et al., 1991). The mechanism has so far mainly been shown for
prokaryotes, but with the growing number of genome-wide studies these findings
might be just the starting point for identifying such regulatory processes in
other organisms. These prokaryotic examples clearly show how TRs mediate
phenotypic change, allowing the bacteria to adapt quickly to an unfavourable
environment. Rather than being merely “selfish DNA” or a cause of disease,
repeats might have played a crucial role during evolution as so-called
“evolutionary hot spots” (Orgel and Crick, 1980, Gemayel et al., 2010).
Due to their location in or near promoters and coding regions, repeats might be
able to facilitate the rapid evolution of certain genes, enabling an organism
to adapt much faster to a changing environment (Gemayel et al., 2010). The
mechanisms underlying such rapid adaptation remain unknown, however, and are an
intriguing starting point for further research.