A knot or not a knot? SETting the record ‘straight’ on proteins

https://doi.org/10.1016/S1476-9271(02)00099-3

Abstract

A novel knot found in the SET domain is examined in the light of five recent crystal structures and their descriptions in the literature. Using the algorithm of Taylor it was established that the backbone chain does not form a true knot. However, only two crosslinks corresponding to hydrogen-bonds were needed to form a knotted structure. Such loosely knotted structures formed by hydrogen-bonded crosslinks were assessed as lying between covalent crosslinks (such as disulphide bonds) and threaded-loops which are formed by close (unbonded) contacts between different parts of the chain. The term pseudo-knot was introduced (from the RNA field) to distinguish hydrogen-bonded ‘knots’.

Introduction

In the field of mathematical topology, the recognition of different knots remains a difficult problem (Menasco and Rudolph, 1995) but there is no dispute that the property of being knotted is something that is either there or is not. Things are not so simple in the more murky world of protein topology and the question of whether the backbone chain of a protein is knotted (ignoring crosslinks) does not have such a simple yes/no answer (Mansfield, 1994). The difference comes from the nature of the pieces of string considered in each field. In mathematics there are no loose ends and knots are only defined in circular pieces of string, whereas proteins (with a few exceptions) are not cyclic and so, strictly, are of little topological interest at all.

Despite the mathematical restriction of a knot to circular strings, there are few fishermen or sailors who accept this constraint and the common definition of a knot as ‘a loop in a string that tightens when pulled’ is also the prevailing view held by protein crystallographers. This approach has been formalised in a recent computer program (Taylor, 2000) where the test for a knot in a protein was to see whether a straight line was obtained after the chain had been repeatedly smoothed and straightened. Assuming that the string cannot pass through itself and is infinitely slippy, the only remaining problem to solve was how to grab the ends. Rather than add an external bridging loop, this was done by keeping the two termini fixed in place and letting the chain shrink between them. This has the advantage that the ends can be altered (like a series of deletion mutants) and the exact location of the knotted core defined.
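The smoothing test described above can be sketched in a few lines of Python. This is only an illustrative reading of the description in the text, not the published program: the function names (`smooth_chain`, `is_straight`, `segment_hits_triangle`), the tolerances, and the choice to replace each residue by the average of itself and its two neighbours are assumptions made for the example. The two termini are held fixed, and a move is rejected whenever another part of the chain would cut through the two small triangles swept out by the move, which is how the 'string cannot pass through itself' rule is approximated here.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]

def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def segment_hits_triangle(p: Vec, q: Vec, a: Vec, b: Vec, c: Vec,
                          eps: float = 1e-12) -> bool:
    """Moller-Trumbore test: does segment p-q pass through triangle a-b-c?"""
    d, e1, e2 = sub(q, p), sub(b, a), sub(c, a)
    h = cross(d, e2)
    det = dot(e1, h)
    if abs(det) < eps:          # segment parallel to the triangle, or triangle degenerate
        return False
    f = 1.0 / det
    s = sub(p, a)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    w = cross(s, e1)
    v = f * dot(d, w)
    if v < 0.0 or u + v > 1.0:
        return False
    t = f * dot(e2, w)          # position of the hit along the segment
    return 0.0 <= t <= 1.0

def smooth_chain(chain: Sequence[Vec], max_cycles: int = 500,
                 tol: float = 1e-6) -> Tuple[List[Vec], int]:
    """Repeatedly smooth a chain with fixed termini; return (points, cycles used)."""
    pts = [tuple(p) for p in chain]
    n = len(pts)
    for cycle in range(1, max_cycles + 1):
        biggest = 0.0
        for i in range(1, n - 1):            # termini (0 and n-1) never move
            old, prev, nxt = pts[i], pts[i - 1], pts[i + 1]
            new = tuple((prev[k] + old[k] + nxt[k]) / 3.0 for k in range(3))
            blocked = False
            for j in range(n - 1):
                if i - 2 <= j <= i + 1:      # skip segments sharing a residue with the move
                    continue
                if (segment_hits_triangle(pts[j], pts[j + 1], prev, old, new) or
                        segment_hits_triangle(pts[j], pts[j + 1], old, new, nxt)):
                    blocked = True
                    break
            if not blocked:
                pts[i] = new
                biggest = max(biggest, math.sqrt(dot(sub(new, old), sub(new, old))))
        if biggest < tol:                    # nothing moved appreciably: converged
            return pts, cycle
    return pts, max_cycles

def is_straight(pts: Sequence[Vec], threshold: float = 1e-3) -> bool:
    """True if every point lies within `threshold` of the line joining the termini.

    Assumes the two termini are distinct points.
    """
    a, ab = pts[0], sub(pts[-1], pts[0])
    length = math.sqrt(dot(ab, ab))
    for p in pts:
        c = cross(sub(p, a), ab)
        if math.sqrt(dot(c, c)) / length > threshold:
            return False
    return True
```

With a list of Cα coordinates in `chain`, a call such as `smoothed, cycles = smooth_chain(chain)` followed by `is_straight(smoothed)` gives the yes/no answer discussed in the text, and trimming residues from either end of `chain` before the call mimics the 'deletion mutant' idea for locating a knotted core. Skipping the segments adjacent to the moving residue is a simplification of this sketch; a careful implementation would need to treat those cases explicitly.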

Clear knots in the protein chain are rather rare and it is always of interest to examine closely those that are identified. Sometimes these must be treated with caution as knots almost always involve loops, which with their greater exposure to solvent are more mobile than other parts of the protein chain and hence less well resolved. This can lead to errors in chain tracing and the erroneous creation of a knot. It is always better if there is more than one independent solution of the structure (preferably at as high a resolution as possible). This was the situation for the most deeply buried knot identified (Taylor, 2000) in the structure of the acetohydroxy acid isomeroreductase which had been solved twice at 1.60 Å (1QMG) (Thomazeau et al., 2000) and 1.65 Å (1YVE) (Biou et al., 1997).

More recently, two new knots have been identified. One of these is a clear knot found in two homologous structures of an RNA methyltransferase (Nureki et al., 2002, Michel et al., 2002) (PDB codes: 1IPA and 1GZO, respectively (Berman et al., 2000)). These have 1.9 Å root-mean-square (RMS) deviation over the 145 equivalent residues in the knotted domain. The other knotted structure is not such an obvious knot in a domain of the structure of a histone lysyl methyltransferase (this is completely unrelated to the RNA methyltransferase). Within a few months, a fragment of the histone methyltransferase structure has been reported independently by five groups (Wilson et al., 2002, Min et al., 2002, Trievel et al., 2002, Zhang et al., 2002, Jacobs et al., 2002) (PDB codes: 1H3I, 1MVH, 1MLV, 1ML9 and 1MT6, respectively). These structures revealed a multi-domain protein, consisting of a catalytic domain (referred to as the SET domain) preceded by another domain (sometimes called a preSET domain) which, depending on the specific protein, is either a relatively ‘structureless’ zinc-binding motif (Zhang et al., 2002, Min et al., 2002) or a topologically simple β-sheet (Wilson et al., 2002, Jacobs et al., 2002). By contrast the SET domain has a novel complex fold consisting of three interconnected β-sheets arranged around a short central 3₁₀ helix. In one structure there is a large plant specific domain inserted (Trievel et al., 2002) within the SET domain (see Roguev et al., 2001 for an overview of the variations in domain structure over this wide family).
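For readers unfamiliar with the RMS deviation quoted above for the two RNA methyltransferase structures, the quantity itself is simple to compute once the structures are superposed. The sketch below assumes that the equivalent residues have already been paired and the coordinates already brought into a common frame; how that was done for 1IPA and 1GZO is not stated here, and the function name `rms_deviation` is chosen only for this example.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]

def rms_deviation(coords_a: Sequence[Vec], coords_b: Sequence[Vec]) -> float:
    """RMS deviation over paired, already-superposed coordinates (same units as input)."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        # squared distance between the two equivalent atoms
        total += (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2
    return math.sqrt(total / len(coords_a))
```

Passing two lists of 145 paired Cα positions in ångströms would return a single value in ångströms, comparable in kind to the figure given in the text.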

Despite the wide variation in domain structure, the SET domain itself is relatively well conserved (Table 1) yet in the published reports it has been variously described as a knot (Jacobs et al., 2002), being knot-like (Trievel et al., 2002) or containing a threaded-loop (Wilson et al., 2002, Zhang et al., 2002, Min et al., 2002). In this paper we examine these descriptions and investigate whether the SET structure may indeed be topologically ambiguous.

Results and discussion

To examine the differing topological descriptions of the SET domain, the algorithm of Taylor (2000) was applied to one of the recent crystal structures: the histone methyltransferase protein SET7/9 (Wilson et al., 2002) residues 135–343. The domain was completely reduced to a straight line by the application of the algorithm in 43 cycles of iteration, which is typical for a globular protein of this size (Fig. 1). The speed of this reduction indicates that the fold does not even contain a

Conclusions

Given the topological complexity of the fold of the SET domain, it is perhaps not surprising that the crystallographers arrived at differing conclusions about its knotted state. Strictly, it is not a knot and is ‘merely’ a threaded-loop. However, our analysis has brought to light the phenomenon that, in general, proteins exhibit a progression of increasingly solid crosslinks from close approach (van der Waals ‘bonds’) through hydrogen-bonds to disulphide (or other) covalent-bonds. Unlike
