Research Article
Compositional features and codon usage pattern of TP63 gene

https://doi.org/10.1016/j.compbiolchem.2019.107119

Highlights

  • Analysis of compositional features revealed variation in base content across TP63 gene isoforms, which were GC rich.

  • The overall CUB of TP63 gene was low.

  • Among 13 isoforms of TP63 gene, nature selected against the CTA codon in 8 isoforms.

  • Nature favored five over-represented (RSCU>1.6) codons namely CTG, CAG, ATC, AAC and GCC during evolution.

Abstract

The tumor protein p63, encoded by the gene TP63, is a homologue of the p53 protein. The TP63 gene is a transformation factor with two transcription initiation sites and is associated with stress, signal transduction and cell cycle control. Codon usage bias (CUB) is the preferential use of a few codons over other synonymous codons. Natural selection and mutational pressure are the two prime evolutionary forces acting on CUB. Here, a bioinformatic analysis was performed to investigate the base distribution and CUB of TP63 transcript variants (isoforms), as no such work had been reported earlier. Analysis of compositional features revealed variation in base content across TP63 gene isoforms, and the GC content was more than 50%, indicating GC richness of its isoforms. The mean effective number of codons (ENC), a measure of CUB, was 51.83, i.e. the overall CUB of the TP63 gene was low. Among the 13 isoforms of the TP63 gene, nature selected against the CTA codon in 8 isoforms and favored five over-represented (RSCU > 1.6) codons, namely CTG, CAG, ATC, AAC and GCC, during evolution. Correlation between the overall nucleotide composition and the composition at the 3rd codon position revealed that both mutational pressure and natural selection moulded its CUB. Further, the correlation between ENC and aromaticity indicated that variation in CUB was related to the degree of aromaticity of the p63 protein.
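The RSCU threshold quoted above can be illustrated with a short sketch. It assumes the usual definition of relative synonymous codon usage (the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally) and the standard genetic code. The function names `rscu` and `over_represented` and the toy sequence are illustrative assumptions, not the authors' pipeline or data.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs in cds."""
    cds = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # Count sense codons only; stop codons and ambiguous triplets are skipped.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group sense codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in family:
            # observed count / (total for the amino acid / number of synonyms)
            values[c] = counts[c] * len(family) / total
    return values


def over_represented(values, threshold=1.6):
    """Codons whose RSCU exceeds the threshold (1.6 is the cut-off used in the text)."""
    return sorted(c for c, v in values.items() if v > threshold)


# Toy sequence for illustration only (not a TP63 isoform).
toy_values = rscu("ATGCTGCTGCTGCTACAGTAA")
print(over_represented(toy_values))
```

An RSCU of 1 means a codon is used exactly as often as expected under equal synonymous usage, so values above the threshold flag codons used well beyond that expectation.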

Introduction

The amino acid sequence of a protein is defined by the genetic code. Of the 64 codons, 61 encode just twenty amino acids and 3 act as termination signals for the growing polypeptide chain on the ribosome. This reflects the degeneracy of the genetic code, wherein two to six codons usually encode the same amino acid, the exceptions being methionine and tryptophan, each of which is encoded by a single codon. Codons that encode the same amino acid are called synonymous codons. Very often, the synonymous codons for an amino acid are used highly unequally in mRNA transcripts, a trend noticed across organisms and termed codon usage bias or CUB (Grantham et al., 1981). The genetic code is conserved in all domains of life; nonetheless, the direction of codon bias differs from organism to organism. The degree of CUB also varies among genomes and among genes (Hershberg and Petrov, 2008).
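The codon counts in this paragraph can be checked directly against the standard genetic code. The following is a minimal sketch, assuming the standard (nuclear) code table; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Sense codons encode an amino acid; the rest are termination signals.
sense_codons = {c: aa for c, aa in CODON_TABLE.items() if aa != "*"}
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

# Degeneracy: the number of synonymous codons per amino acid (one-letter codes).
degeneracy = Counter(sense_codons.values())
single_codon_amino_acids = sorted(aa for aa, n in degeneracy.items() if n == 1)

print(len(sense_codons), "sense codons for", len(degeneracy), "amino acids")
print("stop codons:", stop_codons)
print("amino acids with a single codon:", single_codon_amino_acids)
```

The amino acids with a single codon are reported as M (methionine) and W (tryptophan), which is why these two carry no information about synonymous codon choice.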

Many features of coding sequences, namely gene length, GC content and properties of the encoded proteins such as hydropathicity and aromaticity, are linked to CUB (Bains, 1987; Bernardi and Bernardi, 1986; Lobry and Gautier, 1994; Lynn et al., 2002). Mutational biases and natural selection are the major evolutionary factors influencing the bias in codon utilization (Bulmer, 1991).

Initially, it was believed that synonymous mutations in coding sequences have no effect because they do not alter the amino acid sequence of a protein; they were therefore referred to as “silent” mutations. However, further research revealed that CUB is associated with many cellular processes and might even affect human ailments (Bali and Bebok, 2015). CUB affects translation and mRNA degradation. A few researchers have linked synonymous mutations with amyotrophic lateral sclerosis, cystic fibrosis and Crohn’s disease (Bali and Bebok, 2015; Bartoszewski et al., 2010; Lazrak et al., 2013). Studies on the MDR1 (Multidrug Resistance 1) gene have shown that a synonymous mutation changes substrate specificity, resulting in a modified structure and function of the protein (Kimchi-Sarfaty et al., 2007). Similarly, a few other good illustrations show that synonymous mutations affect protein function to a certain extent (Carpen et al., 2006; Matsuo et al., 2007). Further, about two decades ago, two bacterial studies (Komar et al., 1998; Komar et al., 1999) showed that synonymous codon substitution alters the translational efficiency of mRNA molecules through different cellular mechanisms. CUB also modulates the proper folding required for a protein to function effectively in the cell (Yu et al., 2015).

There are two main schools of thought on the evolutionary origin of CUB: the selectionist explanation and the neutral (mutational) explanation. The first group holds that CUB is created and sustained by selection pressure because it supports the accuracy and efficiency of protein synthesis. Generally, compared with the less frequent codons in coding sequences, the preferred codons are more readily recognized and bound by the more abundant tRNA molecules loaded with amino acids. It is probable that selection favors the preferred codons because they allow translation to proceed efficiently and precisely, with minimal error in protein synthesis. However, it is not certain whether selection acts on translational accuracy or on efficiency. The second group believes that mutational bias causes CUB in a genome or gene. Mutational bias varies with the organism (Sharp et al., 2005; Stenico et al., 1994). Studies have shown that the more frequent and less frequent codons also differ widely from organism to organism (Chen et al., 2004; Ikemura, 1985).

CUB makes an important contribution to the understanding of genome evolution, besides codon optimization (Burgess-Brown et al., 2008; Sharp and Matassi, 1994). Although codon bias is ubiquitous in nature, its mechanism is not fully understood.

The TP63 (or p63) gene encodes the tumor protein p63, which is a homologue of the p53 protein. The p63 protein was identified almost two decades after p53. It is designated as a transformation-related protein. In terms of structural resemblance, the TP63 and TP73 genes belong to the p53 gene family (Tan et al., 2001; Yang et al., 1998). TP63 is said to be the oldest among them (p53, p63, p73), although it was identified significantly later. The TP63 gene, a transformation factor with two transcription initiation sites, is associated with signal transduction, stress and cell cycle control, in addition to cancer biology (Wu et al., 2003).

Based on the presence of the transactivation domain, the p63 protein can be categorized into two major groups. One class, TAp63, is characterized by the presence of the N-terminal transactivation domain, whereas the other class, ΔNp63, lacks this domain. Each class generates three isotypes, giving as many as six isotypes (α, β and γ for each of TAp63 and ΔNp63) with varied actions (Koster and Roop, 2004; Yang et al., 1998).

In embryonic development as well as in mature epithelia, p63 plays important roles in epithelial differentiation and rejuvenation. The gene regulates cellular proliferation and differentiation, besides determining cell fate; for this reason TP63 is said to have a developmental role (Koster and Roop, 2004; Mills et al., 1999; Yang et al., 1999).

Some studies have demonstrated epithelial defects in p63-deficient mice, in which various ectodermal derivatives, including hair and skin, were missing (Mills et al., 1999; Yang et al., 1999). Mills et al. observed that p63-deficient mice had defective or missing limbs; the skin was insufficiently stratified; and mammary glands, teeth and hair follicles were also missing (Mills et al., 1999). These loss-of-function findings strongly indicated a role for p63 in the formation of stratified epithelia, whereas gain-of-function findings suggested that it directs epithelial cell fate (Koster and Roop, 2004).

p63 plays a crucial part in epidermal morphogenesis. Of the isoforms, the one containing the transactivation domain, TAp63, is expressed in the uncommitted (naïve) surface ectoderm, whereas ΔNp63 is expressed later, during stratification. The right balance of these two isoforms shapes epidermal development (Koster and Roop, 2004).

Deficiency of the p63 protein causes cellular senescence and promotes aging (Keyes et al., 2005). TP63 gene mutations cause many defects, for instance cleft lip and palate, ectrodactyly and Hay-Wells syndrome (Dixon et al., 2011). Additionally, several disorders are set off by p63 functional problems rather than by mutation of this gene. These are ectrodactyly cleft palate syndrome, lacrimo-auriculo-dento-digital (LADD) syndrome and curly hair ankyloblepharon nail dysplasia (Holder-Espinasse et al., 2007).

The p63 protein is also used for diagnostic purposes, as it is effective in differentiating prostatic adenocarcinoma from benign prostate tissue. Normal prostatic glands have basal cells that stain for p63, whereas malignant prostate glands do not stain (Herawi and Epstein, 2007).

With the advent of new technologies and readily available coding sequences of genes, the codon bias pattern can now be studied, along with the frequent, less frequent, over-represented and under-represented codons. Here, we analyzed the potential evolutionary determinants of bias in the codon pattern of the TP63 gene. This study also identified the over-represented and under-represented codons, which might help increase or decrease the expression of the gene. Knowledge of the codon pattern of the TP63 gene gives a better picture of the codon architecture of this important gene, which might be beneficial in synthetic biology or heterologous gene expression studies (Hatfield and Roth, 2007). The study assumes significance in the light of codon optimization, one of the applied approaches for obtaining the desired expression of a gene from its coding sequence.

Sequence retrieval

The coding sequences (cds) of 13 different transcript variants (isoforms) of the human TP63 gene were retrieved in FASTA format from the Nucleotide database of the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/nuccore). Each selected cds had a length that was an exact multiple of three bases, with proper start and stop codons. The accession numbers used in this analysis are reported in S1.
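The selection criteria described above (length an exact multiple of three, proper start and stop codons) could be applied to FASTA records roughly as follows. This is a minimal sketch under stated assumptions: the helper names `parse_fasta` and `is_valid_cds` are hypothetical, the start codon is taken to be ATG, the record names and toy sequences are made up, and the rejection of internal stop codons is an added assumption that the source does not state.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Parse FASTA-formatted text into {header: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts).upper() for n, parts in records.items()}


def is_valid_cds(seq):
    """Check length (multiple of three), ATG start codon and terminal stop codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Assumption (not stated in the source): a cds with an internal stop is rejected.
    return not any(c in STOP_CODONS for c in codons[1:-1])


# Toy records for illustration only (not real accessions or TP63 sequences).
toy_fasta = """>toy_ok
ATGGCCCTG
TAA
>toy_bad_length
ATGGCCCTGTA
"""
for header, seq in parse_fasta(toy_fasta).items():
    print(header, is_valid_cds(seq))
```

Filtering before analysis matters because codon-level statistics are only meaningful when the sequence is read in the correct frame from start to stop.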

Base content

The overall base constitution (A%, C%, T% and G%) and the base constitution at the wobble
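As an illustration of these composition measures, the sketch below computes the overall base percentages of a coding sequence and the percentages at the third (wobble) codon position. The function names and the toy sequence are assumptions for illustration. The sketch counts every codon in the sequence as given, including the stop codon, which may differ from the authors' procedure.

```python
def base_percentages(seq):
    """Percentage of A, C, G and T in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ACGT"}


def third_position_percentages(cds):
    """Base percentages at the third (wobble) position of each codon."""
    # Slice out every third base, starting at index 2 (codon position 3).
    return base_percentages(cds.upper()[2::3])


def gc_percent(composition):
    """GC content from a composition dictionary returned by the functions above."""
    return composition["G"] + composition["C"]


# Toy coding sequence for illustration only (not a TP63 isoform).
toy_cds = "ATGGCCCTGTAA"
overall = base_percentages(toy_cds)
wobble = third_position_percentages(toy_cds)
print("overall:", overall, "GC% =", gc_percent(overall))
print("3rd position:", wobble, "GC3% =", gc_percent(wobble))
```

Comparing the overall composition with the third-position composition is the basis of the correlation analysis mentioned in the abstract, since synonymous changes occur mostly at the third position.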

Nucleotide compositional analysis in different TP63 isoforms

The nucleotide composition of the 13 TP63 isoforms was analyzed to determine the influence of the different nucleotides on the codon usage pattern of the isoforms. The overall nucleotide compositions were calculated, and it was found that, except in TP63 isoform 12 (where mean A% had the highest value, followed by C%, G% and T%), C% had the highest value in all isoforms, followed by A%, G% and T%. However, after calculating the

Discussion

CUB has been studied in organisms ranging from prokaryotes to multicellular eukaryotes (Akashi, 2001; Akashi and Eyre-Walker, 1998; Duret, 2002). However, no such study has so far been reported for the human TP63 gene, which produces a transformation-related protein. Here, we studied the patterns of codon usage in 13 isoforms of the TP63 gene. Because the codon usage pattern is non-uniform in coding sequences, the knowledge about the patterns of codon usage acquires

Conclusion

The study of synonymous codon usage in the 13 isoforms of the human TP63 gene encoding the p63 protein revealed low codon usage bias, as indicated by high ENC values. The gene preferentially used C- and G-ending codons, which strongly supported a significant role of compositional constraint under mutation pressure. The neutrality plot also revealed the dominant role of natural selection in affecting the codon usage pattern of the gene. However, the usage of AT and GC content were not

Human and animal rights and informed consent

This article does not contain any study with human or animal subjects performed by any of the authors.

Declaration of Competing Interest

The authors declare no conflict of interests in this work.

Acknowledgement

We are thankful to Assam University, Silchar, India for providing the lab facility to carry out this research work.

References (57)

  • S. Karlin et al. What drives codon choices in human genes? J. Mol. Biol. (1996)
  • A.A. Komar et al. Synonymous codon substitutions affect ribosome traffic and protein folding during in vitro translation. FEBS Lett. (1999)
  • J. Kyte et al. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. (1982)
  • T.H. Mazumder et al. Transcription factor gene GATA2: association of leukemia and nonsynonymous to the synonymous substitution rate across five mammals. Genomics (2016)
  • P.M. Sharp et al. Codon usage and genome evolution. Curr. Opin. Genet. Dev. (1994)
  • F. Wright. The ‘effective number of codons’ used in a gene. Gene (1990)
  • A. Yang et al. p63, a p53 homolog at 3q27–29, encodes multiple products with transactivating, death-inducing, and dominant-negative activities. Mol. Cell (1998)
  • S. Zhao et al. The factors shaping synonymous codon usage in the genome of Burkholderia mallei. J. Genet. Genom. (2007)
  • N. Altman et al. Points of Significance: association, correlation and causation. Nat. Methods (2015)
  • G. Bernardi et al. Compositional constraints and genome evolution. J. Mol. Evol. (1986)
  • M. Bulmer. The selection-mutation-drift theory of synonymous codon usage. Genetics (1991)
  • J.D. Carpen et al. A silent polymorphism in the PER1 gene associates with extreme diurnal preference in humans. J. Hum. Genet. (2006)
  • S.L. Chen et al. Codon usage between genomes is constrained by genome-wide mutational processes. Proc. Natl. Acad. Sci. (2004)
  • M. Choudhury et al. Nucleotide composition and codon usage bias of SRY gene. Andrologia (2017)
  • M.N. Choudhury et al. Which evolutionary forces dictate codon usage in human testis specific genes. Int. J. Pharm. Pharm. Sci. (2016)
  • M.J. Dixon et al. Cleft lip and palate: understanding genetic and environmental influences. Nat. Rev. Genet. (2011)
  • R. Grantham et al. Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. (1981)
  • M. Herawi et al. Immunohistochemical antibody cocktail staining (p63/HMWCK/AMACR) of ductal adenocarcinoma and Gleason pattern 4 cribriform and noncribriform acinar adenocarcinomas of the prostate. Am. J. Surg. Pathol. (2007)