1 Introduction

The ability of proteins to form specific, stable complexes with other proteins is fundamental to many biological processes (Routledge et al. 2009). Mutations in human genes can change the sequence and structure of a protein, impair its function, and lead to disease (Steward et al. 2003). A wide range of highly debilitating diseases is associated with the failure of proteins to maintain their native structures (Goldman et al. 1997). The alpha thalassemia/mental retardation X-linked (ATRX) protein belongs to the switch 2/sucrose non-fermenting 2 (SWI2/SNF2) family of ATP-dependent helicases, which exhibit chromatin remodeling activity (Iwase et al. 2011; Baumann and De La Fuente 2009; De La Fuente et al. 2011). The ATRX gene is located on the X chromosome at Xq13; it spans 300 kb and contains 36 exons (Dhayalan et al. 2011; Wong et al. 2010). The ATRX protein has a helicase/ATPase domain at its carboxyl terminus, which is characteristic of the catalytic subunits of chromatin remodeling complexes. At its amino terminus, the ATRX protein has a domain composed of a plant homeodomain (PHD) and a GATA-like zinc-finger motif, known as the ATRX-DNMT3A-DNMT3L (ADD) domain because this configuration of zinc fingers has been found in the DNA methyltransferases DNMT3A and DNMT3L (Valadez-Graham et al. 2012; Xue et al. 2003). Mutations within the coding region of the ATRX gene cause ATR-X syndrome, an X-linked genetic disease characterized by variable combinations of severe mental retardation, characteristic dysmorphic facial features, alpha thalassemia, seizures, urogenital abnormalities, and sex reversal (Baumann and De La Fuente 2009; Berube et al. 2008; Baker et al. 2008). ATRX is a large protein (2,492 residues) in which almost all non-truncating mutations associated with ATR-X syndrome fall within 97 conserved residues of the ADD domain or 733 conserved residues of the Snf2 domain.
Such pathogenic mutations are not seen in the remaining 1,662 residues of ATRX, which are poorly conserved and structurally disordered. It seems likely that mutations in these regions are not observed in ATR-X syndrome because they act as neutral polymorphisms rather than lethal mutations (Mitson et al. 2011). The association of ATRX mutations with reduced alpha globin synthesis in alpha thalassemia patients suggests that the protein plays a role in regulating alpha globin gene expression (Wong et al. 2010). Genome-wide analysis has shown that, in euchromatin, the predominant targets of ATRX are sequences containing VNTRs (variable number of tandem repeats). Many of these are G and C rich, with a high proportion of CpG dinucleotides. These observations explain why ATRX mutations, which cause α-thalassaemia, affect the α-globin cluster but not the β-globin cluster. The α-cluster lies in a GC-rich subtelomeric region containing a high density of CpG islands and G-rich TRs (tandem repeats), whereas the β-globin cluster has none of these features (Law et al. 2010). The ADD domain of the ATRX protein is a hot spot for mutations causing ATR-X syndrome (Dhayalan et al. 2011). The disease-causing mutations fall into two groups: the majority affect buried residues and thereby compromise the structural integrity of the ADD domain, whereas the other group affects a cluster of surface residues and is likely to perturb a potential protein interaction site. The effects of individual point mutations on the folding state and stability of the ADD domain correlate well with the levels of mutant ATRX protein in patients, providing insights into the molecular pathophysiology of ATR-X syndrome (Argentaro et al. 2007).

In the present study, we performed sequence- and structure-based in silico mutagenesis to explore the function of the ADD domain of ATRX by investigating its potential interaction with histone H3-peptide through intra- and inter-molecular interactions, docking studies, and normal mode analysis.

2 Materials and methods

2.1 Data sets

The protein sequence and variants (single amino acid polymorphisms/missense mutations/point mutations/nsSNPs) of the ADD domain were obtained from the Swissprot database (Yip et al. 2008), available at http://www.expasy.ch/sprot/, in order to identify the detrimental point mutants. The 3D Cartesian coordinates were obtained from the Protein Data Bank for in silico mutation modeling and docking studies based on the detrimental point mutations.

2.2 Prediction of protein stability upon single amino acid substitution using support vector machine-based tool, I-Mutant 2.0

We predicted the nsSNPs that cause a change in protein stability using I-Mutant 2.0 (Capriotti et al. 2005), a support vector machine (SVM)-based tool for the automatic prediction of protein stability change upon single amino acid substitution, available at http://gpcr.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi. Predictions can be made starting either from the protein structure or, more importantly, from the protein sequence. The output reports the predicted free energy change (DDG) or its sign, calculated as the unfolding Gibbs free energy of the mutated protein minus the unfolding Gibbs free energy of the native protein. A positive DDG value means that the mutated protein is predicted to be more stable than the native protein, and a negative value that it is less stable.
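The sign convention described above can be sketched in a few lines of Python. This is only an illustration of the definition of DDG, not of the SVM predictor itself; the function names and the example free energies are hypothetical.

```python
def ddg(dg_unfold_mutant: float, dg_unfold_native: float) -> float:
    """DDG = unfolding free energy of the mutant minus that of the native protein."""
    return dg_unfold_mutant - dg_unfold_native


def stability_effect(ddg_value: float) -> str:
    """Interpret the sign of DDG: positive = more stable, negative = less stable."""
    if ddg_value > 0:
        return "stabilizing"
    if ddg_value < 0:
        return "destabilizing"
    return "neutral"


# Hypothetical unfolding free energies (illustrative numbers only).
example = ddg(dg_unfold_mutant=1.5, dg_unfold_native=4.0)
print(example, stability_effect(example))
```

Under this convention, every DDG value reported for the 17 variants in this work (all negative) falls in the destabilizing class.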

2.3 Analyzing the functional consequence of nsSNP by sequence homology-based method, SIFT

We used the program SIFT (Ng and Henikoff 2003), available at http://blocks.fhcrc.org/sift/SIFT.html, to detect deleterious single amino acid polymorphisms. Sorting intolerant from tolerant (SIFT) is a sequence homology-based tool that presumes that important amino acids are conserved within a protein family. Hence, changes at well-conserved positions tend to be predicted as deleterious. We submitted the query in the form of protein sequences. SIFT takes a query sequence and uses multiple alignments to predict tolerated and deleterious substitutions for every position of the query sequence, and thus predicts whether a submitted nsSNP affects protein function based on sequence homology and amino acid properties. Substitutions with a tolerance index of ≤0.05 were predicted to be deleterious, and those with a higher index were predicted to be tolerated. The higher the tolerance index, the smaller the functional impact of a particular amino acid substitution. Thus, the calculated score represents the likelihood that a substitution at the given site is tolerated (Ng and Henikoff 2001).

2.4 Simulation for functional change in point mutant by structure homology-based method, PolyPhen

Analyzing damaging point mutations at the structural level is important for understanding the functional activity of the protein concerned. We used the PolyPhen server (Ramensky et al. 2002), available at http://coot.embl.de/PolyPhen/, for this purpose. The server accepts a protein sequence, a SWALL database ID, or an accession number, together with the sequence position and the two amino acid variants. We submitted each query as a protein sequence with the mutational position and the two amino acid variants. PolyPhen analyzes the impact of an nsSNP by mapping the amino acid substitution onto the protein 3D structure to explore whether the substitution is likely to destroy the hydrophobic core of the protein, electrostatic interactions, or other important features of the protein. It calculates position-specific independent counts (PSIC) scores for each of the two variants and then computes the PSIC score difference between them. The higher the PSIC score difference, the higher the possible functional impact of a particular amino acid substitution.

2.5 Modeling single amino acid polymorphism location on protein structure to compute total energy and RMSD

Structure analysis was performed to compute the total energy and to evaluate the structural deviation between the native and mutant types by means of the root mean square deviation (RMSD). We used the Protein Data Bank (Berman et al. 2000) to identify the 3D structure of the ADD domain (PDB ID: 3QLC) of ATRX protein. We confirmed the mutation position and the mutated residue in the ADD domain. The mutation was introduced using SWISSPDB viewer, and energy minimization of the 3D structures was performed with the NOMAD-Ref server (Lindahl et al. 2006). This server uses Gromacs by default for energy minimization, based on the steepest descent, conjugate gradient, and L-BFGS methods (Delarue and Dumas 2004). We used the iFold server for simulated annealing; it is based on discrete molecular dynamics, one of the fastest strategies for simulating protein dynamics, and samples the vast conformational space of biomolecules efficiently in both length and time scales (Sharma et al. 2006; Mohammed Said et al. 2012). The total energy was computed for both the native and the mutant ADD domains with GROMOS96 (Schuler et al. 2001) as implemented in DeepView. Divergence of a mutant structure from the native structure can be caused by mutations, deletions, and insertions (Han et al. 2006; Rajasekaran and Sethumadhavan 2010a), and the deviation between the two structures, which can alter functional activity (Varfolomeev et al. 2002), was evaluated by their RMSD values.

2.6 Computing intra- and inter-molecular interactions in ADD domain of ATRX protein

We used the PIC (Protein Interactions Calculator) server (Tina et al. 2007), available at http://crick.mbu.iisc.ernet.in/~PIC, to compute intra- and inter-molecular interactions for both the native and mutant structures. It accepts the atomic coordinate set of a protein structure in the standard Protein Data Bank (PDB) format. Interactions within a protein structure and interactions between proteins in an assembly are essential considerations in understanding the molecular basis of the stability and function of proteins and their complexes. Several weak and strong interactions render stability to a protein structure or an assembly. PIC computes various interactions, namely interactions between apolar residues; disulphide bridges; hydrogen bonds between main chain atoms; hydrogen bonds between main chain and side chain atoms; hydrogen bonds between two side chain atoms; interactions between oppositely charged amino acids (ionic interactions); aromatic–aromatic interactions; aromatic–sulfur interactions; and cation–π interactions.

2.7 Docking between ADD domain and histone H3-peptide

Protein–peptide interactions are among the most prevalent and important interactions in the cell and play a key role in major cellular processes, including signal transduction, transcription regulation, cell localization, protein degradation, and immune response. We used the Rosetta FlexPepDock web server (London et al. 2011) (http://flexpepdock.furmanlab.cs.huji.ac.il/), a high-resolution protocol for the refinement of protein–peptide complex structures implemented in the Rosetta modeling suite framework. Starting from a coarse model of the interaction, FlexPepDock uses a Monte Carlo-Minimization-based approach to refine all of the peptide’s degrees of freedom (rigid body orientation, backbone and side chain flexibility) as well as the side chain conformations of the protein receptor. The modeling of a protein–peptide interaction is divided into several consecutive steps, each representing a smaller subproblem, in line with prevalent approaches for modeling and docking of globular proteins: (1) model the receptor structure, (2) predict potential binding sites on the receptor surface, (3) model the peptide backbone in the binding site, and (4) refine the peptide–protein complex to high resolution.
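The general idea behind a Monte Carlo-Minimization scheme (perturb, minimize locally, then accept or reject with the Metropolis criterion) can be illustrated on a toy problem. The sketch below is not the Rosetta or FlexPepDock implementation: the one-dimensional energy function, step sizes, temperature, and all names are assumptions chosen purely for illustration.

```python
import math
import random


def energy(x: float) -> float:
    """Toy double-well energy (hypothetical; stands in for a docking score)."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_minimize(x: float, step: float = 0.01, iters: int = 200) -> float:
    """Crude gradient descent using a central-difference numerical gradient."""
    for _ in range(iters):
        grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5
        x -= step * grad
    return x


def monte_carlo_minimization(x0: float, n_steps: int = 200,
                             kt: float = 0.2, seed: int = 0):
    """Perturb, minimize, and accept or reject by the Metropolis criterion."""
    rng = random.Random(seed)
    x = local_minimize(x0)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = local_minimize(x + rng.gauss(0.0, 0.5))  # random move + minimization
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kt):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


start_energy = energy(local_minimize(1.0))
best_x, best_e = monte_carlo_minimization(1.0)
print(start_energy, best_x, best_e)
```

Because the lowest-energy state seen so far is tracked separately, the returned energy can never be higher than that of the minimized starting point, even when uphill moves are accepted along the way.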

2.8 Exploring the flexibility of binding pocket by normal mode analysis

A quantitative measure of the atomic motions in proteins can be obtained from the mean square fluctuations of the atoms relative to their average positions. These are related to the B-factor (Yuan et al. 2005). Analysis of B-factors is therefore likely to provide new insights into protein dynamics, the flexibility of amino acids, and protein stability (Parthasarathy and Murthy 2000). Protein flexibility is important for protein function and for rational drug design (Carlson and McCammon 2000). The flexibility of certain amino acids in a protein is also useful for various types of interactions. Moreover, the flexibility of the amino acids in the binding pocket is considered a significant parameter for understanding binding efficiency. In fact, loss of flexibility impairs binding (Hinkle and Tobacman 2003), and vice versa. Hence, flexibility was analyzed by means of the B-factor, which was computed from the mean square displacement <R²> of the lowest-frequency normal mode using the ElNémo server (Suhre and Sanejouand 2004).
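For reference, the standard isotropic relation between the B-factor and the mean square displacement is B = (8π²/3)<R²>. The sketch below applies this textbook relation only; it assumes isotropic displacements in Å² and does not reproduce any normalization that the ElNémo server may apply to its output. The function names are hypothetical.

```python
import math


def b_factor_from_msd(msd_angstrom2: float) -> float:
    """Isotropic B-factor (Å^2) from the mean square displacement <R^2> (Å^2)."""
    return (8.0 * math.pi ** 2 / 3.0) * msd_angstrom2


def msd_from_b_factor(b_factor: float) -> float:
    """Inverse relation: mean square displacement <R^2> from a B-factor."""
    return 3.0 * b_factor / (8.0 * math.pi ** 2)


print(b_factor_from_msd(1.0))
```

Since the relation is linear, a residue with a lower <R²> in a mutant than in the native structure also has a proportionally lower B-factor.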

3 Results and discussion

3.1 SNP dataset

The ADD domain and a total of 17 variants, namely G175E, N179S, P190A, P190L, P190S, L192F, V194I, C200S, Q219P, C220R, C220Y, W222S, C243F, R246C, R246L, G249C, and G249D, investigated in this work were retrieved from the Swissprot database.

3.2 Identifying the detrimental missense mutations with well-recognized computational tools

In this study, we investigated the functional impact of amino acid substitutions in the ADD domain associated with ATR-X syndrome using the existing computational methods I-Mutant 2.0, SIFT, and PolyPhen. Over the past 5 years, computational approaches to the in silico analysis of amino acid substitutions on the 3D structure of a protein, aimed at understanding the structural and functional impact of the mutant structure, have improved considerably. In this context, we present an in silico protocol combining two diverse approaches that may help experimental biologists as an alternative method for determining the functional nsSNPs in ATR-X syndrome. In this analysis, we used sequence-based (SIFT) and structure-based (I-Mutant 2.0 and PolyPhen) methods, the most common approaches used in SNP prediction. According to I-Mutant 2.0, the more negative the free energy change (DDG value), the less stable the given point mutant is likely to be. Out of 17 variants, 5 variants (C200S, P190A, P190S, C220R, and C220Y), which showed DDG values of −3.27, −3.11, −2.91, −2.65, and −2.60, respectively, were considered to be less stable and deleterious, as listed in Table 1. The other 12 variants showed DDG values ranging from −0.53 to −2.16. Moreover, we categorized the native and mutated residues of these 17 non-synonymous variants into groups, such as non-polar, polar, aromatic, and positively charged amino acids, based on their physico-chemical properties, as shown in Fig. 1.
Of the 17 variants, all of which showed a negative DDG, two variants (C220Y and C243F) changed from polar to aromatic amino acids, two variants (P190A and P190L) changed from polar to non-polar, and two variants (G175E and G249D) changed from non-polar to negatively charged. Six further variants, L192F, G249C, C220R, R246L, R246C, and W222S, changed from non-polar to aromatic, non-polar to polar, polar to positively charged, positively charged to non-polar, positively charged to polar, and aromatic to polar amino acids, respectively. In addition, four variants (N179S, P190S, C200S, and Q219P) retained a polar amino acid and one variant (V194I) retained a non-polar amino acid. Considering amino acid substitutions on the basis of physico-chemical properties alone, we were not able to identify the detrimental effect. Considering sequence conservation together with these properties is more advantageous and more reliable for identifying the detrimental effect of missense mutations.
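The grouping described above can be reproduced with a small lookup table. The sketch below follows the residue classes as they are used in this work (for example, proline and cysteine are treated as polar), which may differ from other classification schemes; the dictionary and function names are hypothetical.

```python
from collections import Counter

# Residue classes as used in this work (only residues present in the 17 variants).
RESIDUE_CLASS = {
    "G": "non-polar", "A": "non-polar", "L": "non-polar",
    "V": "non-polar", "I": "non-polar",
    "N": "polar", "P": "polar", "C": "polar", "Q": "polar", "S": "polar",
    "F": "aromatic", "W": "aromatic", "Y": "aromatic",
    "R": "positively charged",
    "E": "negatively charged", "D": "negatively charged",
}

VARIANTS = [
    "G175E", "N179S", "P190A", "P190L", "P190S", "L192F", "V194I", "C200S",
    "Q219P", "C220R", "C220Y", "W222S", "C243F", "R246C", "R246L", "G249C",
    "G249D",
]


def class_change(variant: str) -> tuple:
    """Return (native class, mutant class) for a variant such as 'G175E'."""
    native, mutant = variant[0], variant[-1]
    return RESIDUE_CLASS[native], RESIDUE_CLASS[mutant]


# Count each type of class change and list variants that keep their class.
change_counts = Counter(class_change(v) for v in VARIANTS)
retained = [v for v in VARIANTS if class_change(v)[0] == class_change(v)[1]]

for (native_cls, mutant_cls), n in sorted(change_counts.items()):
    print(f"{native_cls} -> {mutant_cls}: {n}")
print("class retained:", retained)
```

With this table, five variants keep their class (four polar, one non-polar) and the remaining twelve change class, in agreement with the counts given in the text.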

Table 1 Functionally significant detrimental missense mutations by total energy, RMSD, I-Mutant2.0, SIFT, and PolyPhen
Fig. 1 Native and mutated residues of the ADD domain of ATRX protein, represented as sticks. a Non-polar amino acid residues (G175, L192, V194, and G249) (blue) of the native ADD domain and b their corresponding mutated amino acids (E175, F192, I194, and C249) (blue) at their respective positions. c Polar amino acids (N179, P190, C200, Q219, C220, and C243) (red) of the native ADD domain and d their corresponding mutated amino acids (S179, A190, S200, P219, R220, and F243) (red) at their respective positions. e Aromatic and positively charged amino acid residues (W222 (magenta) and R246 (cyan)) of the native ADD domain, and f their corresponding mutated amino acid residues (S222 (magenta) and C246 (cyan)) at their respective positions (color figure online)

The conservation level of a particular position in a protein was determined using the sequence homology-based tool SIFT. The protein sequences of the 17 variants were submitted independently to the SIFT program to obtain their tolerance indices. The higher the tolerance index, the smaller the functional impact a particular amino acid substitution is likely to have, and vice versa. Out of 17 variants, 12 variants (G175E, P190S, L192F, V194I, C200S, C220R, C220Y, W222S, C243F, R246C, R246L, and G249C) had a tolerance index of 0.00, four variants (N179S, P190A, P190L, and G249D) had a tolerance index of 0.01, and one variant (Q219P) had a tolerance index of 0.02, as outlined in Table 1. A low score (0.00–0.05) indicates that an nsSNP is more damaging to protein function. The 12 variants with a tolerance index of 0.00 had the lowest scores and were therefore the most deleterious. Such scores enable the quantitative comparison and ranking of SNPs in order of their biological significance, and are useful for biologists in deciding which SNPs of a gene to examine first.
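The cut-off and ranking described above can be expressed compactly. The sketch below uses the tolerance indices reported in this section and the 0.05 cut-off; the variable and function names are hypothetical, and the code only post-processes scores rather than computing them.

```python
SIFT_CUTOFF = 0.05  # substitutions at or below this index are called deleterious

# Tolerance indices as reported in the text above.
tolerance_index = {
    "G175E": 0.00, "P190S": 0.00, "L192F": 0.00, "V194I": 0.00,
    "C200S": 0.00, "C220R": 0.00, "C220Y": 0.00, "W222S": 0.00,
    "C243F": 0.00, "R246C": 0.00, "R246L": 0.00, "G249C": 0.00,
    "N179S": 0.01, "P190A": 0.01, "P190L": 0.01, "G249D": 0.01,
    "Q219P": 0.02,
}


def sift_call(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Classify a substitution from its tolerance index."""
    return "deleterious" if score <= cutoff else "tolerated"


# Rank variants from most to least damaging (lowest index first, ties by name).
ranked = sorted(tolerance_index, key=lambda v: (tolerance_index[v], v))
calls = {v: sift_call(s) for v, s in tolerance_index.items()}

for variant in ranked:
    print(variant, tolerance_index[variant], calls[variant])
```

All 17 variants fall at or below the cut-off, so the ranking, rather than the binary call, is what separates them.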

The structural level of alteration was determined with the PolyPhen program. The protein sequence, together with the mutational position and the amino acid variants, of each of the 17 single point mutants investigated in this work was submitted as input to the PolyPhen server, and the results are shown in Table 1. A PSIC score difference of 0.00 and above was considered to be damaging. All 17 variants were considered to be damaging by PolyPhen, with PSIC score differences between 0.193 and 1.000. It was interesting to observe that the sequence- and structure-based computational methods agreed: these 17 variants were predicted to be less stable, deleterious, and damaging by I-Mutant 2.0, SIFT, and PolyPhen, respectively. Therefore, we considered all 17 variants to be detrimental on the basis of the results obtained with these well-known computational tools. We then mapped these 17 variants onto the ADD domain for structural analysis, comparing the total energy, RMSD, inter–intra molecular interactions, docking results, and normal mode analysis of the mutant structures with those of the native structure.

3.3 Computing total energy and RMSD by modeling of mutant structures

Non-synonymous single nucleotide polymorphisms (nsSNPs) of genes introduce amino acid changes into proteins and play an important role in providing genetic functional diversity. To understand the structural characteristics of the nsSNPs, we mapped a set of nsSNPs derived from the Swissprot database onto the structural surface of the ADD domain (PDB ID: 3QLN). Mutation analysis was performed based on the results obtained from I-Mutant 2.0, SIFT, and PolyPhen. Each mutation at the specified position was introduced independently with SWISSPDB viewer to obtain the mutant structures. Energy minimization of all the mutant models and the native structure was carried out using the Nomad-Ref server, followed by simulated annealing using the iFOLD server. Energy minimization preserves the native geometry of the 3D structure of the protein. In many cases, an initial structure obtained by homology modeling or by amino acid substitution has sites in which atoms are positioned too closely, covalent bonds are overextended, or side chains adopt unusual conformations. Simulated annealing was employed to regularize such non-equilibrium spatial structures (Sharma et al. 2006). The total energies of all the mutant and native structures after minimization are listed in Table 1. The total energy of the native protein was −8,991.874 kJ/mol. The change in total energy due to mutation was noticeable: the total energies of the 3QLN mutants ranged from −8,521.195 to −8,614.933 kJ/mol. The higher the total energy, the lower the stability of the protein structure (Rajasekaran and Sethumadhavan 2010b). Since all 17 mutants had a total energy higher than that of the native structure, we considered these missense mutations to have a deleterious effect on structural stability.

RMSD is a measure of the deviation of the mutant structures from the native structure conformation. The higher the RMSD value, the greater the deviation between the native and mutant structures. Among the 17 mutants, V194I exhibited the highest RMSD value of 1.15 Å, followed by C200S and L192F with RMSD values of 1.12 and 1.11 Å, respectively. It was interesting to observe that these three mutants were also predicted to be deleterious and damaging by SIFT and PolyPhen, respectively. The sequence- and structure-based analysis of the ADD domain was therefore useful for understanding the structural and functional impact of the mutations on the protein. The remaining 14 mutants had RMSD values ranging between 0.29 and 0.98 Å, as outlined in Table 1. We assumed that the minimized native structure has the ideal mutual orientation of functional groups, so that the RMSD values characterize the effect of an amino acid substitution on the spatial 3D structure (atom deviations from the initial structure) (Sharma et al. 2006). Structural changes due to amino acid substitution, in turn, affect functional activity by disturbing the binding efficiency of the ADD domain with its interacting partner, leading to ATR-X syndrome. Figure 2 illustrates the native structure superimposed on each of the 17 mutant structures of the ADD domain. It should be noted that 8 variants, namely N179S, C220R, C220Y, C243F, R246C, R246L, G249C, and G249D, which were considered detrimental missense mutations by the various sequence- and structure-based computational methods, are well supported by experimental studies performed elsewhere (Argentaro et al. 2007; Badens et al. 2006).
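As a reminder of what the RMSD values measure, the following minimal sketch computes the RMSD between two sets of atomic coordinates. It assumes that the two structures have already been superimposed and that the atoms are listed in the same order; it is not the procedure used by any particular viewer, and the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between two equally ordered coordinate lists.

    Assumes the structures are already superimposed (no fitting is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: every atom is displaced by 1 Å along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(native, mutant))
```

In practice the coordinates would be read from the minimized native and mutant PDB files, typically restricted to a common atom set such as the Cα atoms.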

Fig. 2 Superimposed structure of native ADD domain (green) with all 17 mutant structures of ADD domain. a ADD domain mutant G175E (blue). b ADD domain mutant N179S (yellow). c ADD domain mutant P190A (magenta). d ADD domain mutant P190L (cyan). e ADD domain mutant P190S (orange). f ADD domain mutant L192F (tint). g ADD domain mutant V194I (grey). h ADD domain mutant C200S (blue). i ADD domain mutant Q219P (yellow). j ADD domain mutant C220R (magenta). k ADD domain mutant C220Y (cyan). l ADD domain mutant W222S (orange). m ADD domain mutant C243F (tint). n ADD domain mutant R246C (grey). o ADD domain mutant R246L (blue). p ADD domain mutant G249C (magenta). q ADD domain mutant G249D (cyan) (color figure online)

3.4 Computing intra-molecular interactions in ADD domain of ATRX protein

Interactions within a protein structure and interactions between proteins in an assembly are essential considerations in understanding the molecular basis of the stability and function of proteins and their complexes. Several weak and strong intra-molecular interactions render stability to a protein structure. These intra-molecular interactions were therefore computed with the PIC server to further substantiate the stability of the protein structures. We found a total of 491 intra-molecular interactions in the native ADD domain, comprising 79 hydrophobic interactions, 392 hydrogen bonds (168 main chain–main chain, 121 main chain–side chain, and 103 side chain–side chain), 8 ionic, 6 aromatic–aromatic, 3 aromatic–sulfur, and 3 cation–π interactions. In our in silico mutagenesis studies, the 17 mutant structures had between 438 and 477 intra-molecular interactions, as can be seen from Table 2. From this result, we concluded that all 17 mutant structures have decreased stability owing to the reduction in intra-molecular interactions compared with the native ADD domain. We also analyzed the inter-molecular interactions between the ADD domain and histone H3-peptide, as described below.
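The bookkeeping behind these totals is simple enough to check directly. The sketch below sums the native interaction counts quoted above and expresses the reported mutant range (438–477) as a percentage loss relative to the native total; the dictionary keys and the helper name are hypothetical.

```python
# Native ADD domain intra-molecular interaction counts, as quoted in the text.
native_intra = {
    "hydrophobic": 79,
    "hbond_main_main": 168,
    "hbond_main_side": 121,
    "hbond_side_side": 103,
    "ionic": 8,
    "aromatic_aromatic": 6,
    "aromatic_sulfur": 3,
    "cation_pi": 3,
}

native_total = sum(native_intra.values())
hbond_total = sum(n for kind, n in native_intra.items() if kind.startswith("hbond"))


def percent_loss(native_count: int, mutant_count: int) -> float:
    """Percentage of native interactions lost in a mutant structure."""
    return 100.0 * (native_count - mutant_count) / native_count


print(native_total, hbond_total)
print(round(percent_loss(native_total, 477), 1), round(percent_loss(native_total, 438), 1))
```

On these numbers the mutants lose roughly 3 to 11 % of the native intra-molecular interactions.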

Table 2 Intra-molecular interactions in ADD domain of ATRX protein

3.5 Rationale of binding efficiency for native and mutant structures of ADD domain with its interacting partner, histone H3-peptide

Large-scale identification of protein–protein interactions in functional complexes represents an efficient route to elucidating the regulatory rules of cellular functions (Du et al. 2009; Manisk Kumar and Krishna 2013). In the present study, we identified the potential interactions between the ADD domain of ATRX and histone H3-peptide to understand how deleterious mutations in the ADD domain affect the binding affinity for H3-peptide and thereby lead to ATR-X syndrome. To this end, we performed protein–peptide docking with the FlexPepDock server to assess the binding affinity in terms of the Rosetta energy score. We selected PDB ID: 3QLC, which contains the ADD domain in complex with H3-peptide (chain A contains the ADD domain and chain C contains the histone H3-peptide). To compare the binding affinity of H3-peptide for the native and mutant ADD domains, we separated the H3-peptide from the ADD domain in 3QLC. We introduced the point mutations for the 17 variants into chain A of 3QLC with SWISSPDB viewer. Energy minimization of all the mutant models and the native structure was carried out using the Nomad-Ref server, followed by simulated annealing using the iFOLD server, to obtain optimized structures. Subsequently, the native ADD domain and all 17 mutant ADD domains were docked with H3-peptide.

Table 3 shows that the Rosetta energy score between histone H3-peptide and the native ADD domain was −88.709 kcal/mol, whereas for the 17 mutants the Rosetta energy score ranged between −41.790 and −84.918 kcal/mol. The binding affinity between each of the 17 mutant structures of the ADD domain and histone H3-peptide was thus lower than that of the native complex. Figure 3 shows the docked complexes of the native and all 17 mutant structures with histone H3-peptide. Moreover, we found computationally that two mutants, V194I and Q219P, displayed a more pronounced reduction in binding to histone H3-peptide than the other mutants, with Rosetta energy scores of −41.790 and −41.915 kcal/mol, respectively, as depicted in Fig. 3h, j.
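One simple way to read these scores is as a difference relative to the native complex. The sketch below computes that difference for the values quoted above, assuming (as in the text) that a less negative score indicates weaker binding; the function name is hypothetical and only the scores stated in this section are used.

```python
NATIVE_SCORE = -88.709  # Rosetta energy score of the native complex (from the text)

# Mutant scores explicitly quoted in the text.
mutant_scores = {"V194I": -41.790, "Q219P": -41.915}
BEST_MUTANT_SCORE = -84.918  # most favorable score among the 17 mutants


def score_change(mutant_score: float, native_score: float = NATIVE_SCORE) -> float:
    """Mutant score minus native score; positive means weaker predicted binding."""
    return mutant_score - native_score


for name, score in mutant_scores.items():
    print(name, round(score_change(score), 3))
print("smallest change:", round(score_change(BEST_MUTANT_SCORE), 3))
```

Every mutant therefore shows a positive change, from about 3.8 up to about 46.9 score units relative to the native complex.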

Table 3 Inter-molecular interactions of ADD domain with histone H3-peptide
Fig. 3 Docked complex structure of ADD domain of native ATRX and all the 17 mutant structures with histone H3-peptide (red). a Native ADD domain (green). b ADD domain mutant G175E (blue). c ADD domain mutant N179S (yellow). d ADD domain mutant P190A (magenta). e ADD domain mutant P190L (cyan). f ADD domain mutant P190S (orange). g ADD domain mutant L192F (tint). h ADD domain mutant V194I (grey). i ADD domain mutant C200S (blue). j ADD domain mutant Q219P (yellow). k ADD domain mutant C220R (magenta). l ADD domain mutant C220Y (cyan). m ADD domain mutant W222S (orange). n ADD domain mutant C243F (tint). o ADD domain mutant R246C (grey). p ADD domain mutant R246L (blue). q ADD domain mutant G249C (magenta). r ADD domain mutant G249D (cyan) (color figure online)

The exploration of protein–peptide interactions at atomic resolution is essential for understanding biological functions, because it provides comprehensive knowledge of the physical basis of affinity as well as an understanding of molecular recognition at the level of the intermolecular interface (Goldman et al. 1997; Talavera et al. 2011; Moreira et al. 2007; Jingyu et al. 2013). We therefore further evaluated how the 17 variants of the ADD domain affect the binding affinity for histone H3-peptide by examining which interactions were disrupted among the interacting residues at the interface. The major inter-molecular interactions observed were hydrophobic interactions, inter-molecular hydrogen bonds (main chain–main chain, main chain–side chain, and side chain–side chain), and ionic interactions. Between the native ADD domain and histone H3-peptide, we found 29 inter-molecular interactions, comprising 3 hydrophobic interactions, 23 hydrogen bonds (8 main chain–main chain, 9 main chain–side chain, and 6 side chain–side chain), and 3 ionic interactions. In contrast, the 17 mutant structures of the ADD domain had between 12 and 24 inter-molecular interactions with histone H3-peptide, as can be seen from Table 3. Hence, the reduction in inter-molecular interactions between the 17 mutant structures of the ADD domain and histone H3-peptide affects binding efficiency.

3.6 Identifying the number of amino acids with decreased flexibility in the interacting residues of ADD domain

We selected PDB ID 3QLC, the complex of the ADD domain (chain A) with histone H3-peptide (chain C). As can be seen from Table 4, 14 amino acids, namely Asp(212), Met(216), Asp(217), Glu(218), Gly(226), Gly(227), Asn(228), Leu(229), Ile(230), Cys(231), Asp(233), Arg(250), Ser(254), and Trp(263), were identified by PDBsum (Laskowski 2001) as the residues of the ADD domain that bind histone H3-peptide. We then performed normal mode analysis with the program ElNémo to compare the flexibility of the binding amino acids of the native and mutant ADD domains and thereby confirm the detrimental effect of all 17 mutants. Table 4 gives the flexibility of the amino acids of the native and mutant types in terms of the normalized mean square displacement <R²>. We took the <R²> of the binding amino acids of the native ADD domain as the reference against which the binding amino acids of the 17 mutants were compared, and sorted the data into three categories of flexibility. A binding amino acid of a mutant whose <R²> was exactly the same as that of the corresponding native residue was labeled ‘identical flexibility’; one whose <R²> was higher than that of the native residue was labeled ‘increased flexibility’; and one whose <R²> was lower than that of the native residue was labeled ‘decreased flexibility.’ The three categories of flexibility for the binding amino acids of the 17 mutants can be seen in Table 5. In ATR-X syndrome, most of the mutations occur in the binding regions of the ADD domain, as mentioned above. Based on the results obtained from the 17 mutant structures, we identified a loss of flexibility in the mutant structures in comparison with the native crystal structure of the ADD domain.
From this result, we found that, of the 238 binding residues in all 17 mutant structures (14 binding residues in each mutant), 2 binding residues had identical flexibility, 39 had increased flexibility, and 197 had decreased flexibility. This result indicates that decreased flexibility in the binding residues of the 17 mutant structures of the ADD domain changes the conformational state of the interfacial residues and could affect the molecular recognition of histone H3-peptide according to the “conformational selection” model (Keskin et al. 2005). Interestingly, two variants, L192F and Q219P, had a more pronounced effect on the ADD domain according to both the computational tools (I-Mutant 2.0, SIFT, PolyPhen) and the structure-based methods (total energy, RMSD, inter–intra molecular interactions, Rosetta energy score, and normal mode analysis), which may lead to ATR-X syndrome.
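The three-way classification of binding residues can be sketched as follows. The <R²> values in the example are hypothetical placeholders (the real values are those of Table 4), the function names are assumptions, and the exact-equality test mirrors the definition of ‘identical flexibility’ used above.

```python
from collections import Counter


def classify_flexibility(native_msd: float, mutant_msd: float) -> str:
    """Compare a mutant residue's <R^2> with the native reference value."""
    if mutant_msd == native_msd:
        return "identical"
    return "increased" if mutant_msd > native_msd else "decreased"


def tally_flexibility(native: dict, mutants: dict) -> Counter:
    """Count flexibility classes over all binding residues of all mutants."""
    counts = Counter()
    for residue_msd in mutants.values():
        for residue, msd in residue_msd.items():
            counts[classify_flexibility(native[residue], msd)] += 1
    return counts


# Hypothetical <R^2> values for three binding residues and one mutant.
native_r2 = {"Asp212": 0.010, "Met216": 0.020, "Trp263": 0.030}
mutant_r2 = {"mutA": {"Asp212": 0.010, "Met216": 0.015, "Trp263": 0.040}}
toy_counts = tally_flexibility(native_r2, mutant_r2)
print(dict(toy_counts))

# Consistency check on the totals reported in the text.
reported = {"identical": 2, "increased": 39, "decreased": 197}
print(sum(reported.values()) == 14 * 17)
```

Applied to the full data set, the same tally covers 14 binding residues in each of 17 mutants, that is, the 238 comparisons reported above.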

Table 4 Comparison of normalized mean square displacement of H3-peptide binding amino acids in native and mutants of ATRX
Table 5 Binding amino acids of mutants with different ranges of flexibility based on <R²>

4 Conclusions

In this study, we computationally analyzed 17 detrimental missense mutations at the sequence and structure levels to understand their structural and functional effects on the ADD domain of the chromatin remodeling protein ATRX. We conclude that all 17 mutants are less stable, deleterious, and damaging according to the I-Mutant 2.0, SIFT, and PolyPhen programs, respectively. The RMSD between the native and mutant ADD domains was in the range of 0.29–1.15 Å. Moreover, docking of histone H3-peptide with the native and mutant structures of the ADD domain gave Rosetta energy scores between −41.790 and −84.918 kcal/mol for the mutants, compared with −88.709 kcal/mol for the native complex. Our computational results suggest that the normal function of the ADD domain requires high-affinity binding to histone H3-peptide and that mutations in the ADD domain that directly affect the binding of histone H3-peptide underlie ATR-X syndrome. Further, the decreased binding affinity of the mutant ADD domains for histone H3-peptide was associated with a pronounced reduction in inter–intra molecular interactions. The lower binding affinity of these 17 mutants for histone H3-peptide was also consistent with normal mode analysis, in which the majority of the binding amino acids of the 17 mutants showed ‘decreased flexibility’; this could be a cause of ATR-X syndrome. The detrimental missense mutations identified by the sequence-based (SIFT) and structure-based (I-Mutant 2.0, PolyPhen) computational tools were in good agreement with the structure-based analysis (total energy, RMSD, inter–intra molecular interactions, docking studies, and normal mode analysis). The overall scope and novelty of our work were (1) to propose a suitable computational protocol for identifying detrimental missense mutations before wet lab experimentation and (2) to provide a path for further clinical and experimental studies to characterize these mutants of the ADD domain in depth.