NaeI is transformed from DNA endonuclease to DNA topoisomerase and recombinase by a single amino acid substitution. The crystal structure of NaeI was solved at 2.3 Å resolution and shows that NaeI is a dimeric molecule with two domains per monomer. Each domain contains one potential DNA recognition motif corresponding to either endonuclease or topoisomerase activity. The N‐terminal domain core folds like the other type II restriction endonucleases as well as λ‐exonuclease and the DNA repair enzymes MutH and Vsr, implying a common evolutionary origin and catalytic mechanism. The C‐terminal domain contains a catabolite activator protein (CAP) motif present in many DNA‐binding proteins, including the type IA and type II topoisomerases. Thus, the NaeI structure implies that DNA processing enzymes evolved from a few common ancestors. NaeI may be an evolutionary bridge between endonuclease and DNA processing enzymes.
DNA endonucleases, topoisomerases, recombinases and ligases are ubiquitous enzymes essential for genetic processes such as replication, transcription, recombination and repair of DNA. Restriction endonucleases in prokaryotes protect the host against invading genomes, showing a great ability to recognize and cleave short specific DNA sequences hidden within the large background of DNAs. Over 3000 restriction endonucleases are known, representing >200 different sequence specificities (reviewed by Roberts and Macelis, 2000). Restriction enzymes show little sequence homology either with each other or with other protein families, although possible homology was found between EcoRII and the integrase family (Topal and Conrad, 1993). There are three types of restriction endonucleases (reviewed by Wilson, 1991). Type I enzymes are multimeric proteins that contain both endonuclease and methylase activities, recognize specific sequences and cleave at distant random sites outside the recognition sequence. ATP and S‐adenosyl‐methionine are required cofactors. Type II enzymes, major contributors to the biotechnology revolution, are dimeric proteins that recognize and cleave within specific, usually palindromic, DNA sequences 4‐8 bp in length. Type III enzymes are complexes of restriction and methylation subunits, require ATP binding but not hydrolysis, and cleave ∼25 bp away from the recognition sequence.
NaeI was isolated from a strain of the actinomycete Nocardia aerocolonigenes (ATCC 23870) cultured from soil in Japan (Shinobu and Kawato, 1960; Labeda, 1986). NaeI is a type IIe endonuclease (Yang et al., 1994) with unique properties. It is allosteric (Conrad and Topal, 1989; Yang and Topal, 1992), binding two GCC↑CGG recognition sequences to cleave DNA into blunt‐ended products. NaeI is a prototype of the growing number of type IIe endonucleases (Krüger et al., 1988; Conrad and Topal, 1989; Oller et al., 1991; Reuter et al., 1993) and is the first restriction endonuclease found to form a covalent intermediate with its DNA substrate (Jo and Topal, 1995). A 10 amino acid sequence in the N‐terminal domain of NaeI is similar to the active site of DNA ligase I, except for Leu43 in NaeI in place of the lysine essential for ligase function. Remarkably, substitution L43K converts NaeI from endonuclease to topoisomerase and recombinase (Jo and Topal, 1995). Thus, NaeI appears to be a missing link that relates these DNA‐binding proteins. We report here the structure of apoNaeI at 2.3 Å resolution, the first structure of a type IIe endonuclease.
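The recognition and cleavage pattern described above can be illustrated with a short sketch. It assumes only what the text states: a GCCGGC recognition sequence cut after the third base on both strands (GCC↑CGG), giving blunt ends. The function names and the example DNA string are hypothetical, and the allosteric requirement for a second bound site is not modeled.

```python
# Illustrative sketch only: locate GCC^CGG sites and make blunt cuts.
SITE = "GCCGGC"   # NaeI recognition sequence, written 5'->3'
CUT_OFFSET = 3    # the cut falls between GCC and GGC on both strands

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def find_sites(dna):
    """Return the 0-based start position of every recognition site."""
    positions = []
    start = dna.find(SITE)
    while start != -1:
        positions.append(start)
        start = dna.find(SITE, start + 1)
    return positions


def blunt_digest(dna):
    """Split the top strand at every cut position (blunt-ended products)."""
    fragments = []
    previous = 0
    for start in find_sites(dna):
        cut = start + CUT_OFFSET
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


# Hypothetical 20 bp substrate carrying two sites.
example = "ATGCCGGCTTAAGCCGGCAT"
print(find_sites(example))
print(blunt_digest(example))
```

Because the site is its own reverse complement, the same cut position applies to both strands, which is why the products are blunt rather than staggered.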
Results and discussion
Two different DNA‐recognition structural motifs
The structure of the NaeI monomer consists of nine α‐helices, six 3₁₀ helices and 13 β‐strands, which are organized into two domains (Figure 1A). The N‐terminal domain is composed of residues 10‐162 (the first nine residues were not observed in the structure) and has approximate dimensions 27 × 32 × 38 Å³. It contains a core structure of a six‐stranded β‐sheet flanked with five helices on one side and two strands and three helices on another side. Random mutagenesis of NaeI shows that most if not all amino acids involved in DNA cleavage reside in the N‐terminal domain (Holtz and Topal, 1994), suggesting that the N‐terminal domain contains the active site of the endonuclease. We tentatively name the N‐terminal domain the ‘Endonuclease’ (Endo) domain for its topological similarity to other type II restriction endonucleases.
The C‐terminal domain of NaeI (residues 172‐311) consists of a two‐layer β‐sandwich adjacent to four helices (Figure 1), and has approximate dimensions 28 × 31 × 42 Å³. The domain contains the DNA‐binding motif of catabolite activator protein (CAP). The CAP motif is a helix‐turn‐helix (HTH) motif observed in many DNA processing proteins (reviewed by Harrison, 1991; Pabo and Sauer, 1992; Nelson, 1995), including the type IA and type II DNA topoisomerases (reviewed by Berger et al., 1998). Finding this structural link between NaeI and topoisomerases may be especially important because the L43K mutation gives NaeI topoisomerase activity. We tentatively name the C‐terminal domain the ‘Topoisomerase’ (Topo) domain.
The Endo domain of NaeI was well ordered, whereas the Topo domain showed several disordered loops and had relatively higher B‐factors. Topo domain residues 191‐194, 254‐258 and 312‐317 were not observable in the structure, whereas only residues 1‐9 were missing from the Endo domain. The average B‐factor was 61.6 Å² for all atoms of the Topo domain, in comparison with 41.2 Å² for those of the Endo domain. Thus, the Topo domain appears to be loosely packed and to possess significant conformational flexibility. The Endo and Topo domains are linked by a relatively extended hinge loop comprising Tyr163‐Glu171. The two domains pack against one another via van der Waals interactions as well as a few hydrogen bonds.
Two monomeric molecules of NaeI are tightly associated into a dimer via interfacial interactions between helices H4 of each molecule, and are related by a local molecular 2‐fold axis (Figure 2). Dimerization of the Endo domains forms a cleft that has rough dimensions of 15 Å deep by 22 Å wide and that is covered by two fragments around B4 and B7 (Figure 2). This cleft contains the active site residues of NaeI and is the endonuclease site of NaeI, based on its similarity to the active sites of other type II restriction endonucleases. The Topo domains of the NaeI dimer show much less interdomain interaction and form a cleft with ∼13 Å between the HTH motifs (Figure 2B). The 3₁₀ helix before H9 is likely to be the recognition helix for DNA binding in a manner similar to the CAP motifs of the topoisomerases. This suggests that the Topo domain is responsible for the topoisomerase activity of NaeI L43K (see discussion below). The assignment of the Endo domain to substrate binding implies that the Topo domain binds activator DNA and is responsible for initiating the conformational changes that enable DNA cleavage (Conrad and Topal, 1989; Yang and Topal, 1992).
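The per‐domain B‐factor comparison above can be reproduced from a coordinate file along the following lines. This is a minimal sketch, not the procedure used for the reported values: the domain boundaries are taken from the text, the column positions follow the fixed‐column PDB ATOM record layout, and the helper names and synthetic records are hypothetical.

```python
# Illustrative sketch only: mean B-factor per domain from PDB ATOM records.
# Domain boundaries follow the text; all names below are hypothetical.
DOMAINS = {"Endo": (10, 162), "Topo": (172, 311)}


def mean_b_by_domain(pdb_lines, chain=None):
    """Average the B-factor of all ATOM records falling in each domain."""
    totals = {name: [0.0, 0] for name in DOMAINS}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21] != chain:
            continue
        residue_number = int(line[22:26])   # PDB columns 23-26
        b_factor = float(line[60:66])       # PDB columns 61-66
        for name, (first, last) in DOMAINS.items():
            if first <= residue_number <= last:
                totals[name][0] += b_factor
                totals[name][1] += 1
    return {name: (total / count if count else None)
            for name, (total, count) in totals.items()}


def make_atom_line(serial, residue_number, b_factor, chain="A"):
    """Build a synthetic fixed-column ATOM record for the demonstration."""
    return ("ATOM  {:5d}  CA  ALA {}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
            .format(serial, chain, residue_number, 0.0, 0.0, 0.0, 1.0, b_factor))


# Synthetic records (made-up B-factors); residue 165 lies in the hinge loop
# and is therefore counted in neither domain.
demo = [
    make_atom_line(1, 20, 40.0),
    make_atom_line(2, 100, 42.0),
    make_atom_line(3, 165, 50.0),
    make_atom_line(4, 200, 60.0),
    make_atom_line(5, 300, 64.0),
]
print(mean_b_by_domain(demo))
```

Applied to a real coordinate file, the same function would be called with the file's lines and, if desired, a single chain identifier.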
Conformational asymmetry of the NaeI dimer
Careful inspection of the NaeI dimer revealed that the relative orientations of the 2‐fold axes for the Endo and Topo domains in the NaeI dimer are slightly different, with ∼3° rotational offset and >10 Å translation. Superposition of the backbone atoms of the NaeI homodimer revealed root mean square (r.m.s.) displacements of 0.63 and 1.04 Å for the Endo and Topo domains, respectively, and 2.02 Å for the entire homodimer. When the transformation matrix that best superimposed the Endo domains was applied to the Topo domains, an overall r.m.s. separation of 5.04 Å was observed for 504 backbone atoms of the Topo domain (Figure 3). This indicates that the quaternary structure or the domain orientation in each monomer of NaeI is significantly different.
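The comparison described above has two steps: a rigid‐body transformation is fitted on one pair of domains, and the same transformation is then applied to the other pair before the r.m.s. separation is measured. The sketch below covers only the second step, assuming the rotation matrix and translation vector are already known; the fitting step (for example, a least‐squares superposition) is omitted, and the coordinates and function names are hypothetical.

```python
import math


def apply_transform(coords, rotation, translation):
    """Apply x' = R x + t to each (x, y, z) point."""
    moved = []
    for point in coords:
        moved.append(tuple(
            sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return moved


def rmsd(coords_a, coords_b):
    """Root mean square distance between paired atoms of two coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    total = sum(
        (a[k] - b[k]) ** 2
        for a, b in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(total / len(coords_a))


# Toy data, not NaeI coordinates: a pure translation of (3, 4, 0) Angstrom
# moves every atom by the same distance.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 1.0)]
shifted = apply_transform(atoms, IDENTITY, (3.0, 4.0, 0.0))
print(rmsd(atoms, shifted))
```

In the analysis reported here, the transformation would come from superimposing the Endo domains of the two monomers, and the two coordinate sets passed to the r.m.s. calculation would be the transformed Topo domain of one monomer and the Topo domain of the other.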
Since the Endo domain compares well to dimers of other restriction endonucleases and the Topo domain is loosely packed, the conformational asymmetry is most likely to be dependent on the orientation of the Topo domain. The asymmetry does not appear to be due to crystal packing because the Topo domains are not heavily involved in the formation of the crystal lattice, thus implying that the asymmetry represents a native conformational state of the enzyme. The biological implications of this asymmetry are unknown. Comparison of the two hinge loops in the dimer failed to detect a dramatic conformational change associated with a single residue. Instead, the conformational differences appear to be distributed over much of the Tyr163‐Glu171 loop. This reinforces the notion that these residues serve as a hinge region to alter molecular conformation and enable communication between the domains during catalysis.
Evolutionary origin of nucleases
Restriction endonucleases generally show very little sequence homology with each other or with other endo‐ or exonucleases. Reported structures for the type II restriction endonucleases, however, possess a common core topological motif (CCM; Venclovas et al., 1994; Newman et al., 1995). Structural superposition using the DALI (Holm and Sander, 1993) and CCP4 programs detected similarities of NaeI with eight restriction endonucleases, DNA repair endonucleases Vsr (very short patch repair) and MutH (mismatch repair), and the 5′‐3′ λ‐exonuclease (recombination and repair). The superposition between these enzymes showed r.m.s. deviations of ∼2 Å for the backbone atoms of the 21‐71 compared residues, implying structural conservation of the molecular cores of the nucleases.
Structure‐based sequence alignment revealed absolute conservation of three β‐strands in all 12 compared structures: B2, B3 and B5 in NaeI (Figure 4). Two of these β‐strands, equivalent to B2 and B3 in NaeI, bind divalent metals and DNA substrate (McClarin et al., 1986; Kim et al., 1990; Winkler et al., 1993; Cheng et al., 1994; Kostrewa and Winkler, 1995; Newman et al., 1995) and thus their conservation is essential for nuclease activity. NaeI β‐strand B5, or the equivalent in the other endonucleases, is apparently not involved in metal binding and DNA cleavage (Figure 4), but may be critical for formation of the β‐sheet and the hydrophobic core of the nuclease molecules. For example, NaeI Cys116 and Val118 in B5 and Trp105 and Leu107 in B4 form the hydrophobic core of the NaeI Endo domain. This hydrophobic core appears necessary to place loop 102‐109 correctly for DNA recognition by NaeI (see discussion below). Thus, we argue, based on their apparently essential structural and catalytic roles for endonuclease activity, that the three conserved β‐strands are the common origin of divergent evolution for the nucleases. This differs from the prevailing view that the endonuclease CCM consists of a five‐stranded β‐sheet flanked by two α‐helices (Venclovas et al., 1994; Kovall and Matthews, 1999).
Also highly conserved is α‐helix H4 in NaeI (Figure 4). All of the compared nucleases except for EcoRI and MunI have an equivalent helix, although longer or shorter, occupying a similar spatial location. H4, however, differs in function among the different enzymes. H4 in NaeI (residues Lys59‐Phe77) mediates dimerization (Figures 1 and 2). The corresponding helix interacts only with DNA substrate in BamHI, and plays a dual role of dimerization and DNA binding in EcoRV, PvuII and BglI. Thus, helix H4 is apparently not an evolutionarily conserved essential element, but instead a consequence of divergent evolution among nucleases.
Conservation and versatility of catalytic mechanisms
Structure‐based sequence alignment identified four conserved NaeI residues: Glu70, Asp86, Asp95 and Lys97 (Figure 4). Residues equivalent to Asp86 and Asp95 of NaeI have been identified as the divalent metal‐binding site in the enzyme‐DNA‐metal complexes of BamHI (Newman et al., 1995; Viadiu and Aggarwal, 1998), EcoRI (McClarin et al., 1986; Kim et al., 1990), EcoRV (Horton et al., 1998; Martin et al., 1999) and BglI (Newman et al., 1998). The conserved lysine appears to stabilize the doubly charged pentavalent transition state (Kostrewa and Winkler, 1995) and to orient the attacking water molecule (Horton et al., 1998). Alignment of the nuclease catalytic motifs supports a common catalytic mechanism among these enzymes, in which the divalent metal stabilizes the DNA binding and possibly activates a water to attack the scissile phosphate directly, resulting in a pentavalent transition state stabilized by the conserved lysine (reviewed by Kovall and Matthews, 1999). This general scheme of catalysis has also been proposed for other DNA processing enzymes. For example, T7 polymerase uses two magnesium ions for binding of the DNA and an arginine for stabilization of the transition state (Doublie et al., 1998).
The structure‐based sequence alignment also showed that DNA repair enzymes MutH (Ban and Yang, 1998), Vsr (Tsutakawa et al., 1999) and λ‐exonuclease (Kovall and Matthews, 1997, 1998) have Glu56/Glu77/Lys79, Asp21/Asp51/His64 and Glu85/Asp119/Glu129/Lys131, respectively, at or near their metal‐binding sites (Figure 4). This implies that these DNA processing enzymes share a similar catalytic mechanism with the type II restriction endonucleases. In that light, it is interesting that a Vsr homolog is part of the NaeI operon (Taron et al., 1995), strengthening the link between these two activities.
On the other hand, the structural superposition also shows divergence in catalytic residues among the endonucleases. First, a negatively charged residue Glu/Asp at the topological position Glu70 in NaeI (Figure 4), a residue generally considered for metal binding (Kovall and Matthews, 1999), is substituted by Lys61 in BamHI and by Leu39 in PvuII. Secondly, position Asp95 in NaeI is occupied by Ser188 in Cfr10I (Bozic et al., 1996). These substitutions suggest that different protein residues are involved in the metal binding in these enzymes. Thirdly, Lys97 in NaeI, assumed to stabilize the catalytic intermediate in EcoRV (Horton et al., 1998), is substituted by Glu113 in BamHI (Figure 4), suggesting the versatility of the detailed reaction pathway (see the review by Kovall and Matthews, 1999).
Two endonuclease subgroups
The structures of type II endonucleases have been divided into two subgroups based on whether the respective endonucleases cleave their cognate recognition sequences to give 5′ overhanging versus blunt‐ended products (Pingoud and Jeltsch, 1997; Kovall and Matthews, 1999). Our structural alignment, however, suggests that endonucleases are better categorized by the orientation of β‐strand B6 in NaeI and by the occurrence of β‐strand versus α‐helix at position B4 in NaeI (Figure 4). The endonucleases in subgroup I (NaeI, BglI, EcoRV and PvuII) contain two β‐strands that correspond to and have the same chain polarity as B6 and B4 in NaeI. On the other hand, the endonucleases in subgroup II (EcoRI, BamHI, Cfr10I, MunI and FokI) show β‐strands with an opposite chain direction to β‐strand B6 of subgroup I, and α‐helices in place of β‐strand B4 in NaeI (Figure 4). By this classification, the secondary structure elements show maximum comparability within each subgroup but minimal comparability across the subgroups.
Superposition of a monomer of NaeI over that of either EcoRV or PvuII automatically superimposed the second monomer. Similar superposition of NaeI with any other endonucleases of Figure 4, however, placed the second monomer at very different locations. This suggests that NaeI, EcoRV and PvuII have a similar dimerization scheme, probably needed for the blunt cleavage of their recognition sequence.
Two different patterns of DNA recognition
DNA recognition elements have been reported, but not systematically compared for EcoRI (McClarin et al., 1986; Kim et al., 1990), BamHI (Newman et al., 1995), MunI (Deibert et al., 1999), EcoRV (Winkler et al., 1993) and PvuII (Cheng et al., 1994). Our structural superposition demonstrates that these recognition elements, although they have different amino acid components and recognize different DNA sequences, occupy the same location to interact with the major groove of the cognate DNAs (Figure 5). The recognition can be grouped into two patterns. The members in subgroup I, including NaeI, apparently use a β‐strand and a β‐like turn for DNA recognition (R1 and R2 in Figures 2A, 4 and 5A). We tentatively call it β‐strand recognition. In contrast, those in subgroup II use an α‐helix and a loop (Figures 4 and 5B). We tentatively call the latter α‐helix recognition, to differentiate between the two types of DNA recognition by endonucleases. We predict that these two patterns of recognition may apply to most, if not all, type II restriction endonucleases.
It is interesting to note that the DNA mismatch repair enzyme MutH contains two fragments, Leu90‐Val97 and Tyr180‐Leu188, corresponding to the putative recognition sequences Gln102‐Pro109 and Gly141‐Arg149 of NaeI (Figure 4). This implies that MutH probably has a β‐strand recognition mechanism in a manner similar to subgroup I, although monomeric MutH cleaves only one strand of the DNA duplex in comparison to the cleavage of both DNA strands by dimeric type II endonucleases.
CAP motif of NaeI Topo domain
The DNA‐binding motif of the Escherichia coli CAP, also known as an HTH motif, is found in many DNA‐binding proteins (reviewed by Harrison, 1991; Pabo and Sauer, 1992; Nelson, 1995; Berger et al., 1998). In the Topo domain of NaeI, the α‐helices H7 and H8, a 3₁₀ helix before H9 and two β‐strands B10 and B11 form a structural scaffold resembling CAP. A structural comparison program, DALI (Holm and Sander, 1993), revealed that the CAP in NaeI is superimposable with DNA‐binding domains of many DNA processing proteins, as shown in a partial list of topoisomerases Ia and II [Protein Data Bank (PDB) entries 1rva and 1bgw], catabolite gene activator (2cgp), transcription regulators of E2F‐DP (1cf7), MotA (1bja), OMPR (1opc), Smtb (1smt) and LexA (1lea), RNA adenosine deaminase Zα (1qbj), mating‐type protein MATA1 (1akh), nitrate/nitrite regulator NarL (1a04), chromosomal protein histone H5 (1hst) and restriction endonuclease FokI (2fok). DNA binding by each protein commonly involves the third helix of CAP, or the 3₁₀ helix before H9 in NaeI (Figure 6). Structure‐based sequence alignment of the CAP motifs of these proteins showed no conservation of sequence (data not shown). This is not surprising considering the variety of sequences for DNA recognition. The structural homologies imply that the CAP motif of NaeI is a DNA‐binding site. Indeed, the NaeI Topo domain, cloned and isolated from E. coli, specifically binds cognate DNA recognition sequence with an ∼10‐fold lower affinity than wild‐type NaeI (Colandene and Topal, 1998).
Putative topoisomerase mechanism of NaeI
NaeI protein contains the sequence ³⁹TLDQLYDGQR⁴⁸, which is similar to a fragment at the active site of DNA ligase I except for leucine in place of the ligase essential lysine (Jo and Topal, 1995). The single mutation L43K gives NaeI DNA topoisomerase activity (Jo and Topal, 1995, 1998). Structural comparison showed that NaeI and topoisomerases IA and II possess a common CAP motif (Figure 6). The CAP domains of topoisomerases IA and II are implicated in DNA binding and contain an active site tyrosine that forms covalent phosphotyrosyl intermediates for cleavage of DNA (Lima et al., 1994; Berger et al., 1996, 1998; Keck and Berger, 1999). Thus, the CAP motif of NaeI might explain the topoisomerase activity of the NaeI L43K mutant. The catalytic residues Tyr319 of topoisomerase IA and Tyr783 of topoisomerase II, however, correspond to Leu249 and Gly270, respectively, in NaeI. In fact, there is no tyrosine located in the Topo domain of NaeI, suggesting a different mechanism for the topoisomerase activity of the NaeI L43K mutant.
In the NaeI dimer, the two recognition helices of the CAP motif are separated by ∼13 Å (Figure 2B). Modeling the CAP‐DNA structure (Schultz et al., 1991) onto the NaeI Topo domain showed that both 3₁₀ helices simultaneously interact with the same DNA fragment, and NaeI serine residues, instead of tyrosine, reside near the DNA‐binding site (Figure 7), suggesting potential formation of a phosphoserinyl intermediate. The suggestion is consistent with the use of serine in place of tyrosine by the transposon γδ resolvase for covalent bond formation with DNA (Newman and Grindley, 1984). The best serine candidate for covalent interaction with the modeled DNA is Ser234, whose side chain hydroxyl group sits ∼5 Å away from the scissile phosphodiester bond of the NaeI recognition sequence in the modeled DNA (Figure 7).
On the other hand, two alternative mechanisms for the L43K topoisomerase activity are also possible. First, Leu43, as part of helix H2, lies at a position central to the hydrophobic core of the NaeI dimer and too far from DNA binding to be involved in the topoisomerase activity in the present structural conformation (Figure 7). The L43K substitution, however, may deform the NaeI dimer because of the intolerance of the hydrophobic core for the positively charged lysine. Thus, a dramatic conformational change may be induced by the L43K mutation, bringing Tyr44 to a position close to the CAP region to serve as the catalytic residue for topoisomerase activity (Figure 7). This is consistent with the overall picture that conformational change is an essential step for topoisomerase activity (Keck and Berger, 1999). Alternatively, the CAP motif could serve as a recognition site to anchor the DNA substrate while the topoisomerase activity takes place at the endonuclease site of the Endo domain. In this consideration, Tyr87 located near the metal‐binding site might serve as the nucleophile to ligate DNA.
The crystal structure of NaeI at 2.3 Å resolution shows two domains, the Endo and Topo domains, which contain two structural motifs resembling the active site of restriction endonuclease and CAP DNA recognition. Structure‐based sequence alignment demonstrated that the Endo domain of NaeI possesses a conserved core motif and catalytic residues in common with the other restriction endonucleases and with repair nucleases MutH, Vsr and λ‐exonuclease. This suggests a common evolutionary origin of these nucleases. The Topo domain showed a CAP structure in common with other DNA‐binding proteins such as topoisomerases IA and II, implying its role in the topoisomerase activity of the NaeI L43K mutant. The ability of NaeI to bridge the endonuclease and topoisomerase families suggests that NaeI is an evolutionary precursor to the type II restriction endonucleases, which have lost the Topo domain in return for embellishment of the Endo domain.
Materials and methods
Wild‐type NaeI was purified as described previously (Colandene and Topal, 1998). Selenomethionyl‐substituted NaeI was prepared following the method outlined by Doublie (1997). Briefly, E. coli strain CAA1 harboring plasmid pmalc2:NaeI (Colandene and Topal, 1998) was grown in LB culture medium at 37°C overnight. Cells were pelleted by centrifugation and resuspended in modified M9 minimal medium that was pre‐warmed to 37°C. The modified medium was supplemented with the following amino acids (per liter): 100 mg Lys, 100 mg Thr, 100 mg Phe, 50 mg Leu, 50 mg Ile, 50 mg Val and 50 mg SeMet. Cells were grown at 37°C to an A595 of 0.5‐0.7, and isopropyl‐β‐D‐thiogalactopyranoside (IPTG) was added to 0.2 mM. The temperature was adjusted to 20°C and the cells were grown for an additional 24‐30 h. Selenomethionyl NaeI was purified from cell paste in the same way as wild‐type NaeI (Colandene and Topal, 1998).
ApoNaeI and selenomethionyl NaeI were crystallized using microdialysis against a buffer of 20 mM maleic acid, 20 mM NaCl, 5 mM CaCl₂, 1 mM β‐mercaptoethanol, 2% PEG3350 and 1.5% ethanol at 4°C and pH 6.1. The crystals belong to space group P2₁ with a = 99.4, b = 55.9, c = 59.0 Å and β = 95.7°, and contain two molecules of NaeI in the crystallographic asymmetric unit. The crystals measure ∼0.6 mm in each dimension and have an almost perfect shape, but diffract only to moderate resolution.
NaeI endonuclease was crystallized after considerable difficulty. The crystals showed a relatively large mosaicity of ∼1°, which suggests significant conformational flexibility of the NaeI molecule. Significant flexibility was also implied by the results of isoelectric focusing. Multiple isoelectric focusing bands were observed to cover ∼2 pH units for a single molecular weight of NaeI (a single band in SDS‐PAGE). Precise control of pH, crystal growth conditions and cryo‐cooling protocols gave crystals that diffracted to a resolution of ∼2.3 Å and enabled solution of the three‐dimensional structure of apoNaeI.
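As a consistency check on the reported cell, the unit cell volume and an approximate packing density can be estimated as sketched below. The cell constants, space group and number of molecules per asymmetric unit are taken from the text. The molecular weight is an assumption (317 residues, inferred from the residue numbering, at a nominal 110 Da per residue), so the resulting Matthews coefficient and solvent fraction are rough estimates only; the function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_degrees):
    """Volume of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_degrees))


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight):
    """Crystal volume per unit of protein mass (cubic Angstrom per Da)."""
    return cell_volume / (molecules_per_cell * molecular_weight)


# Cell constants reported in the text (Angstrom, degrees).
volume = monoclinic_cell_volume(99.4, 55.9, 59.0, 95.7)

# P2(1) has two symmetry-equivalent positions per cell; the text reports
# two molecules per asymmetric unit.
molecules_per_cell = 2 * 2

# ASSUMPTION: 317 residues at ~110 Da each; this value is not from the text.
assumed_molecular_weight = 317 * 110.0

vm = matthews_coefficient(volume, molecules_per_cell, assumed_molecular_weight)

# Standard Matthews estimate of the solvent fraction.
solvent_fraction = 1.0 - 1.23 / vm

print(round(volume), round(vm, 2), round(solvent_fraction, 2))
```

Substituting the molecular weight calculated from the actual sequence for the assumed value would sharpen the estimate.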
Data collection and structure determination
For data collection, the crystals were soaked for 1 day in stabilization buffer, which contains the same components as the crystallization buffer but with 15% PEG3350, and then transferred to 30% PEG3350 buffer for 2‐6 h. A cryosolvent of 30% PEG3350 plus 5% PEG400 was used to freeze the crystals in liquid nitrogen. The best diffraction data for the native and selenomethionyl NaeI were collected at Brookhaven using synchrotron beam line X12C (Table I). All data were processed with the program HKL (Otwinowski and Minor, 1997). The structure of NaeI was determined by multiwavelength anomalous diffraction (MAD) using selenomethionine as the anomalous scatterer. The automatic phasing program SOLVE 1.17 (Terwilliger and Berendzen, 1999) yielded an overall figure of merit of 0.52 for 20 893 reflections at 2.5 Å resolution and a partially traceable electron density map. The MAD phases were further improved by solvent flattening, histogram matching and local 2‐fold symmetry averaging, as implemented in the CCP4 program package. The structure was modeled using the program FRODO and was refined by CNS (Brünger et al., 1998).
The atomic coordinates of NaeI have been deposited into the PDB with entry code 1ev7.
We would like to thank Dr R. Huber for the coordinates of MunI and Dr M. Newman for the coordinates of BglI. We thank Dr Sweet for help with diffraction data collection at beamline X12C at the National Synchrotron Light Source. This research is partially supported by NIH grant GM52123.
Copyright © 2000 European Molecular Biology Organization