Crystal structure of restriction endonuclease BglI bound to its interrupted DNA recognition sequence

Matthew Newman, Keith Lunnen, Geoffrey Wilson, John Greci, Ira Schildkraut, Simon E.V. Phillips

Author Affiliations

  1. Matthew Newman*,2,
  2. Keith Lunnen3,
  3. Geoffrey Wilson3,
  4. John Greci3,
  5. Ira Schildkraut3 and
  6. Simon E.V. Phillips*,1
  1 School of Biochemistry and Molecular Biology, and North of England Structural Biology Centre, University of Leeds, Leeds, LS2 9JT, UK
  2 Present address: Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London, WC2A 3PX, UK
  3 New England Biolabs, 32 Tozer Road, Beverly, MA, 01915, USA
  * Corresponding authors. E-mail: M.Newman{at} or E-mail: SEVP{at}


The crystal structure of the type II restriction endonuclease BglI bound to DNA containing its specific recognition sequence has been determined at 2.2 Å resolution. This is the first structure of a restriction endonuclease that recognizes and cleaves an interrupted DNA sequence, producing 3′ overhanging ends. BglI is a homodimer that binds its specific DNA sequence with the minor groove facing the protein. Parts of the enzyme reach into both the major and minor grooves to contact the edges of the bases within the recognition half‐sites. The arrangement of active site residues is strikingly similar to other restriction endonucleases, but the co‐ordination of two calcium ions at the active site gives new insight into the catalytic mechanism. Surprisingly, the core of a BglI subunit displays a striking similarity to subunits of EcoRV and PvuII, but the dimer structure is dramatically different. The BglI–DNA complex demonstrates, for the first time, that a conserved subunit fold can dimerize in more than one way, resulting in different DNA cleavage patterns.


Type II restriction endonucleases comprise one of the major families of endonucleases (Roberts and Halford, 1993; Pingoud and Jeltsch, 1997). They usually recognize a short palindromic DNA sequence between 4 and 8 base pairs in length, and in the presence of Mg2+, specifically catalyse the hydrolysis of phosphodiester bonds at precise positions within or close to this sequence. Although type II restriction endonucleases are ubiquitous within prokaryotes, there is generally no sequence similarity among them.
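The palindromic character of a recognition sequence can be checked by comparing it with its reverse complement. The following is a minimal sketch; the function names are illustrative, and the example sequences in the usage note (GAATTC for EcoRI, GGATCC for BamHI) are the commonly documented sites rather than values quoted in this text.

```python
# Watson-Crick complements; N stands for an unspecified base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)
```

For example, `is_palindromic("GAATTC")` and `is_palindromic("GGATCC")` both hold, whereas a sequence such as `"GAATTA"` fails the test.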

Despite this lack of sequence similarity, crystallographic analyses of several restriction endonucleases have revealed considerable three‐dimensional similarity, correlating well with the type of cleavage pattern produced by these enzymes. EcoRV (Winkler et al., 1993) and PvuII (Athanasiadis et al., 1994; Cheng et al., 1994) have conserved subunit cores, consisting of a central five‐stranded β‐sheet flanked by α‐helices, which interact to form structurally similar homodimers. They bind their uninterrupted 6‐bp recognition sequences with the minor groove facing the protein and cleave the DNA to produce blunt‐ended fragments. BamHI (Newman et al., 1995) and EcoRI (McClarin et al., 1986) also have conserved subunit cores, consisting of five β‐strands and two α‐helices, but they are topologically distinct from the EcoRV‐like enzymes and form very different dimer structures. They bind their 6‐bp recognition sequences with the major groove facing the protein and cleave to produce four base 5′ overhangs. Cfr10I (Bozic et al., 1996) and the catalytic domain of FokI (Wah et al., 1997) have a similar subunit fold to the BamHI‐like enzymes, and also produce DNA products with four base 5′ overhangs. Despite the structural differences between the EcoRV‐ and BamHI‐like enzymes, their active sites are well conserved, consisting of a triad of charged amino acid residues (Aggarwal, 1995; Pingoud and Jeltsch, 1997).

BglI, the type II restriction endonuclease from Bacillus globigii (Wilson and Young, 1976; Lee and Chirikjian, 1979), produces a different cleavage pattern from those of known structures. It recognizes the interrupted DNA sequence GCCNNNNNGGC and cleaves between the fourth and fifth unspecified base pair to produce 3′ overhanging ends (Bickle and Ineichen, 1980; Lautenberger et al., 1980; Van Heuverswyn and Fiers, 1980). A subunit of BglI has a molecular mass of 35 kDa, and consists of 299 amino acid residues. It displays no significant sequence homology to any other restriction endonucleases. Owing to the nature of its cleavage pattern, it was anticipated that BglI would represent a new structural class of restriction endonucleases.
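The cleavage pattern described above can be illustrated with a short sketch that locates GCCNNNNNGGC sites and reports the resulting fragments. The function names and cut offsets are illustrative assumptions: the top-strand cut is placed after GCCNNNN (between the fourth and fifth unspecified base), and the bottom-strand cut is taken to be the two-fold symmetry mate of that position.

```python
import re

# Recognition sequence from the text: GCC, five unspecified bases, GGC.
# The lookahead allows overlapping sites to be reported.
SITE = re.compile(r"(?=(GCC[ACGT]{5}GGC))")
TOP_CUT = 7     # top strand cut after GCCNNNN (between the 4th and 5th N)
BOTTOM_CUT = 4  # assumed symmetry-related cut, in top-strand coordinates

def bgli_sites(seq):
    """Return 0-based start positions of each site on the top strand.

    The site is its own reverse complement, so one strand is enough.
    """
    return [m.start() for m in SITE.finditer(seq.upper())]

def cut_top_strand(seq):
    """Split the top strand at every BglI cleavage position."""
    seq = seq.upper()
    bounds = [0] + [s + TOP_CUT for s in bgli_sites(seq)] + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

def overhang(seq, start):
    """Bases left single-stranded at the 3' end of the left-hand fragment."""
    return seq.upper()[start + BOTTOM_CUT:start + TOP_CUT]
```

With the hypothetical input `"AAGCCTACGTGGCTT"`, a single site is found at position 2, the top strand is split into `AAGCCTACG` and `TGGCTT`, and the three-base 3′ extension is `ACG`.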

To understand further the molecular basis of DNA recognition and hydrolysis by type II restriction endonucleases, the structure of BglI bound to a DNA oligonucleotide containing its specific recognition sequence was determined to a resolution of 2.2 Å. We have compared the structure of BglI with those of EcoRV and PvuII. Surprisingly, despite the fact that BglI recognizes an interrupted DNA sequence and produces 3′ overhanging ends, whereas EcoRV recognizes a contiguous 6‐bp sequence and produces blunt ends, a subunit of BglI shows significant structural similarity to subunits of the EcoRV‐like enzymes. It appears that dramatically different cleavage patterns are achieved by BglI and EcoRV using conserved subunit folds, combined with alternative modes of dimerization. In addition, we propose a two‐metal‐ion catalytic mechanism based on the presence of two calcium ions that are co‐ordinated at the active site of BglI.

Results and discussion

Structure determination

The gene coding for the BglI restriction endonuclease was cloned from B.globigii genomic DNA and overexpressed in an Escherichia coli strain that co‐expressed the BglI methyltransferase. Purified BglI endonuclease had a specific activity of ∼1 250 000 U/mg. Crystals were obtained of BglI bound to a 17 bp DNA oligonucleotide containing its specific recognition sequence. In order to obey the symmetry imposed by the crystal lattice, the DNA oligonucleotide that was used in the final crystallization experiments incorporated an A:A mismatch at the central base pair (Figure 1). The structure was solved using the technique of multiple isomorphous replacement with anomalous scattering. An experimental electron density map calculated at 2.5 Å resolution was of extremely high quality and enabled most of the protein and all the DNA to be built. The structure has been refined to 2.2 Å resolution with good statistics and stereochemistry (Table I). Two hundred and fifty three solvent atoms were included in the final refined model. All protein main chain torsion angles are located in the energetically allowed regions for L‐amino acids.

Figure 1.

Sequence of the 17 base pair DNA fragment used in the BglI–DNA complex. Recognition half‐sites are boxed. The sites of cleavage are indicated by arrows. The position of the crystallographic two‐fold is shown and the symmetry related strand is indicated by #. To avoid crystallographic disorder, an A:A mismatch was incorporated at the centre of the oligonucleotide.

Table 1. Data collection and refinement statistics

Overall structure

The structure of BglI bound to a DNA oligonucleotide containing its specific recognition sequence is shown in Figure 2A. There is only one protein subunit plus one DNA strand in the asymmetric unit of the crystal structure, and thus the model of the biologically active homodimer is generated by the application of a crystallographic two‐fold operation. The BglI dimer measures ∼75×55×55 Å. There is a large DNA binding cleft, ∼45 Å in length, which is long enough to accommodate 14 base pairs of a DNA duplex. The DNA is bound with the minor groove at the centre of the 17 bp oligonucleotide facing the protein, as in the EcoRV and PvuII protein–DNA complexes (Winkler et al., 1993; Cheng et al., 1994). Thus, it appears that enzymes that cleave their DNA to produce blunt‐ended or single‐stranded 3′ overhanging ends approach the DNA from the minor groove side, so that the active site is oriented correctly with respect to the scissile phosphodiester bond (Cheng et al., 1994).

Figure 2.

(A) Structure of the BglI dimer bound to DNA containing its recognition sequence, with a crystallographic two‐fold running vertically through the centre of the complex. The DNA helical axis is horizontal. α‐helices and 3₁₀‐helices are shown in purple, β‐strands in green and the DNA in red. (B) Secondary structural assignment of a BglI subunit and comparison with EcoRV and PvuII (PDB codes 1rva and 1pvi, respectively). Secondary structural elements were defined using the Kabsch and Sander (1983) algorithm implemented in PROCHECK (Laskowski et al., 1993), but were modified in some regions according to the hydrogen‐bonding criteria of Baker and Hubbard (1984). 3₁₀‐helices are shown, but are not labelled. The N‐ and C‐terminal ends are labelled. Only relevant labels are given for EcoRV, as in Winkler et al. (1993), and PvuII, as in Athanasiadis et al. (1994). Common elements of secondary structure between BglI, EcoRV and PvuII are shown in purple for α‐helices and green for β‐strands. The dimerization sub‐domains are coloured blue for EcoRV and PvuII. Produced using MOLSCRIPT (Kraulis, 1991) and RASTER3D (Merritt and Murphy, 1994).

A subunit of BglI has an α/β structure with a large central six‐stranded β‐sheet flanked by α‐helices (Figure 2B). The central sheet contains strands β1, β2, β3, β7, β8, β9 and β12 (β7 and β9 form a single pseudo‐continuous β‐strand). β1, β2 and β3 are anti‐parallel and form a β‐meander which contains the active site residues Asp116, Asp142 and Lys144. α‐helix α4 packs against the concave surface of the central β‐sheet, and contributes residues to both the dimer interface and the active site (Glu87). α4 has an inserted residue (Ala89) that is accommodated through the formation of a bulge that protrudes from the side of the helix (Kavanaugh et al., 1993; Harrison et al., 1994). The helix does not incorporate any proline residues, and does not contain any other distortion. A smaller three‐stranded anti‐parallel β‐sheet (β4, β10 and β11) is situated at the bottom of the larger β‐sheet. This smaller sheet contains most of the residues involved in specific DNA recognition.

The dimer interface is extremely large and is formed by residues comprising one side of the BglI subunit. The interface area calculated using the method of Lee and Richards (1971) is ∼3100 Å², which is significantly larger than that for proteins of a similar molecular weight (Janin et al., 1988). The equivalent areas for EcoRV and BamHI were ∼2200 and 1600 Å², respectively. The dimer interface is formed by several segments of the polypeptide chain: the N‐terminus, the C‐terminal region of helix α3, helix α4, the loop between β1 and β2, the region just before strand β5, and the loops between β7 and β8, and between β8 and β9 (Figure 3). The dimer interface contains 40 hydrogen bonds, including 12 salt bridges. Additionally, 22 water molecules are involved in water‐mediated hydrogen bonds between the two subunits.

Figure 3.

Schematic diagram showing the secondary structural topology of BglI, EcoRV and PvuII. α‐helices and 3₁₀‐helices are shown as cylinders and β‐strands as broad arrows. α‐helices and β‐strands are labelled according to Figure 2B. 3₁₀‐helices are unlabelled. The N‐ and C‐terminal ends are labelled. The extent of the secondary structural element is indicated by residue numbers. Residues thought to be involved in either the active site or specific DNA recognition are shown. For EcoRV, the recognition R‐loop is between βi and βj, and the Q‐loop is between βc and βd.

Similarity to EcoRV and PvuII

Surprisingly, despite the fact that BglI is the first structure of a type II restriction endonuclease that binds an interrupted recognition site and hydrolyses its phosphodiester bonds to produce 3′ overhanging ends, analysis of the subunit fold reveals extensive similarities to EcoRV (Winkler et al., 1993) and PvuII (Athanasiadis et al., 1994; Cheng et al., 1994). A region consisting of five β‐strands of the central β‐sheet (incorporating the β‐meander), an α‐helix and two β‐strands of the smaller β‐sheet is shared by the three enzymes (β1, β2, β3, β8, β9, α4, β4 and β11 in BglI; βc, βd, βe, βg, βh, αB, βf and βj in EcoRV; and βa, βb, βc, βe, βf, αB, βd and βg in PvuII; Figures 2B and 3). This common region includes residues that comprise the active sites and DNA recognition elements. A pair‐wise least‐squares superposition of BglI and EcoRV gives a root mean square (r.m.s.) difference of 2.1 Å from 91 Cα pairs that can be aligned closer than 3.8 Å. BglI and PvuII are less similar (PvuII being significantly smaller), giving an r.m.s. difference of 1.5 Å for 41 Cα pairs that can be aligned closer than 3.8 Å. Outside the common region, however, BglI shows significant differences to EcoRV and PvuII, many of which occur in parts of the protein structure involved in dimerization (Figure 2B).
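The r.m.s. differences quoted above are computed over equivalent Cα pairs that lie within a distance cutoff after superposition. A minimal sketch of that final step is given below. It assumes the two coordinate sets have already been superposed and paired residue by residue; the least-squares fitting itself is not reproduced, and the function name and the input format (lists of x, y, z tuples in Å) are illustrative assumptions.

```python
import math

def rmsd_of_aligned_pairs(coords_a, coords_b, cutoff=3.8):
    """RMSD over paired atoms closer than `cutoff` (in Å) after superposition.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    already superposed, equivalent C-alpha atoms.
    Returns (rmsd, number_of_pairs_used).
    """
    squared = []
    for a, b in zip(coords_a, coords_b):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        if d2 < cutoff ** 2:  # keep only pairs that align within the cutoff
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs lie within the cutoff")
    return math.sqrt(sum(squared) / len(squared)), len(squared)
```

Because pairs beyond the cutoff are discarded, the r.m.s. value and the number of pairs must be read together: a lower r.m.s. difference over fewer pairs does not by itself indicate a closer overall match.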

DNA structure

The DNA is primarily B‐form but has a slight curvature, resulting in a 20° bend away from the protein (Figure 2A). There are no major bends or kinks of the type seen in EcoRI or EcoRV (specific) DNA complexes (McClarin et al., 1986; Winkler et al., 1993). The average DNA helical parameters for the BglI–DNA complex are 32° for twist and 3.5 Å for rise (Table II), with most of the sugars in the standard C2′‐endo conformation for B‐DNA. The A:A mismatch at the centre of the oligonucleotide does not distort the B‐form structure significantly, with both bases remaining in the anti conformation, but results in a localized shift of the adenine bases within the DNA helix. The DNA helical parameters at the mismatch have similar values to those obtained from an identical mismatch in the NF–κB p50 homodimer complex (Mueller et al., 1995).

Table 2. DNA helical parameters

Some of the largest deviations from B‐form DNA occur within, or adjacent to, the recognition half‐site. There is a 16° roll between base pairs 3 and 2 opening up the minor groove [other roll angles are generally small (<|8°|)]. There are positive propeller angles at base pairs 3 and 2 (7° and 13°, respectively, where the mean value for B‐DNA is −13°). Base step 3–2 also has the smallest twist of 26°. Surprisingly, there are large tilts within the recognition half‐site, associated with base steps 5–4 and 4–3. These are of similar magnitude but of different sign, thus the resultant effect on the DNA helix is negligible. Tilt angles about the short axis of a base pair are rarely seen as they are unfavourable energetically. The major groove width varies from 10.6 Å at the centre of the oligonucleotide, to 13.9 Å in the recognition half‐site, and the minor groove width varies from 4.6 Å at the ends of the duplex to 9.4 Å at the recognition half‐site. The average values for B‐form DNA are 11.7 and 5.7 Å for the major and minor groove widths, respectively (Saenger, 1984). The widening of both the major and minor grooves probably provides improved access for the recognition elements of the protein.

DNA recognition

A BglI subunit sits entirely on one DNA half‐site, and cleaves within that half of the DNA oligonucleotide (Figure 4A). There is no cross‐over mode of binding of the type seen in BamHI, where a protein subunit forms contacts to both DNA half‐sites (Newman et al., 1995). It has been proposed that the cross‐over mode of binding is important as it guarantees that specific recognition leads to the correct formation of both active sites, followed by concerted double‐strand cleavage (Pingoud and Jeltsch, 1997). Although there is no cross‐over binding in BglI, there are extensive subunit–subunit contacts (see above), which include an intricate intersubunit hydrogen‐bonding network that involves the active site region. The β1–β2 loop of one subunit, which contains the active site residue Asp116, interacts with conserved helix α4 of the other subunit. It is possible that these intersubunit contacts could fulfil the same function in BglI as the cross‐over mode of binding does in BamHI, i.e. non‐specific binding in one half‐site leads to inactivated catalytic centres in both protein subunits.

Figure 4.

(A) A simplified view of BglI and EcoRV showing the DNA recognition and active site elements. Only residues 62–159 and 262–282 are shown for BglI and only residues 34–109 and 177–193 are shown for EcoRV. The intersubunit 2‐fold is oriented vertically for both enzymes. One subunit is coloured green and the other is purple. Active site residues (Asp116 and Asp142 for BglI, and Asp74 and Asp90 for EcoRV) are shown as a ball‐and‐stick representation. The DNA is shown in red by bonds drawn through the O5′ positions for simplicity. Cyan spheres indicate the position of the active site phosphates. (B) A view of the interactions between the outer (G:C), middle (C:G) and inner (C:G) base pairs of the recognition sequence of BglI. Dashed lines indicate hydrogen bonds. Selected water molecules are shown as red spheres. Interactions are shown for both the major and minor groove edges of the base pairs. Buttressing interactions, which serve to orient and immobilize recognition groups, are also shown. For the inner base pair, part of the hydrogen‐bonding network with the scissile phosphodiester bond is shown. The calcium ions plus their liganded waters (including Wat4) have been omitted for clarity, as has the base of Ade2. Produced using MOLSCRIPT (Kraulis, 1991) and RASTER3D (Merritt and Murphy, 1994).

Although BglI contacts bases within both the major and minor grooves, it does not surround the DNA like EcoRV. This is reflected by a smaller BglI dimer–DNA interface than that for EcoRV (1671 versus 2354 Å², respectively). The only direct base contacts made by BglI are within the recognition sequence, with most occurring in the major groove. There are no direct contacts to the unspecified five base pairs between the two recognition half‐sites, although there are contacts to the sugar‐phosphate backbone within this region.

Major groove. The major groove contacts involve residues situated on, or near to, the small three stranded recognition β‐sheet, comprising strands β4, β10 and β11 (Figures 3 and 4A). Strands β4 and β11 have counterparts in both EcoRV (Winkler et al., 1993) and PvuII (Athanasiadis et al., 1994; Cheng et al., 1994), forming structurally conserved DNA recognition regions. The details of specific base contacts vary, however: EcoRV recognizes DNA via the R‐loop between strands βi and βj (Winkler et al., 1993), whereas PvuII uses anti‐parallel strands βd and βg (Cheng et al., 1994). Within the recognition half‐site of BglI there are eight direct protein–DNA hydrogen bonds and one water‐mediated hydrogen bond (Figure 4B). Thus, the hydrogen‐bonding potential with the major groove is satisfied totally. In addition to these hydrogen bonds, there are numerous buttressing interactions which help to correctly orient and immobilize protein groups involved in DNA recognition.

The outer G:C (‐5#:5) base pair is contacted by Arg279 from strand β11 and Asp150 from just before strand β4. The guanidinium group of Arg279 lies in the plane of Gua‐5#, donating two hydrogen bonds [Nη1...N7 (3.4 Å), Nη2...O6 (2.8 Å)]. Arginine–guanine interactions are probably the most commonly observed interactions in protein–DNA complexes (Pabo and Sauer, 1992). The outer cytosine base, Cyt5, is involved in a single hydrogen bond to Asp150 [N4...OD1 (3.2 Å)]. Contacts to the middle C:G (‐4#:4) base pair arise solely from Lys266, situated on the loop between strands β10 and β11. Lys266 main chain torsion angles are in the αL conformation and the side chain is extended in the plane of the C:G base pair. Cyt‐4# is involved in a single hydrogen bond to the main chain carbonyl group of Lys266 [N4...O (2.8 Å)]. The side‐chain amino group of Lys266 is involved in a bifurcated hydrogen bond to Gua4 [NZ...N7 (2.8 Å), NZ...O6 (3.4 Å)]. Main chain and side‐chain hydrogen bonds from a single lysine to both the guanine and cytosine bases in a G:C base pair are uncommon, but have been observed previously in the λ repressor–operator complex (Beamer and Pabo, 1992). Only the guanine base of the inner C:G (‐3#:3) base pair is involved in direct protein contacts. The guanidinium group of Arg277 (situated just before strand β11) lies in the plane of Gua3 and donates two hydrogen bonds to the N7 and O6 atoms [NH1...N7 (3.0 Å), NH2...O6 (2.6 Å)]. Cyt‐3# is hydrogen bonded to a highly ordered water molecule [N4...OH2 (3.0 Å), B = 9 Å²]. This water molecule is further hydrogen bonded to residues Asp267 [OD2...OH2 (3.1 Å)] and Asp268 [N...OH2 (2.8 Å)], which are situated on the loop between strands β10 and β11. In addition, an intricate hydrogen‐bonding network, involving two highly ordered water molecules, links Arg277 to the active site phosphate. This feature, coupling DNA‐recognition elements to the active site phosphate, has also been observed within the recognition interface of the BamHI–DNA complex (Newman et al., 1995).

Minor groove. BglI contacts bases within the minor groove of the recognition half‐site via the large loop (residues 59–82) between helices α3 and α4 (Figures 3 and 4A). PvuII forms specific minor groove base contacts using a similar part of its structure, between helices αA and αB (Asp34, Figure 3), but contacts a different half‐site from BglI (Nastri et al., 1997). EcoRV also forms minor groove base contacts, but uses a different part of its structure to achieve this, known as the Q‐loop (Winkler et al., 1993; Figures 3 and 4A). BglI forms a direct protein–DNA hydrogen bond between the main‐chain amide of Lys73 and Cyt‐3# [N...O2 (3.2 Å)], and three water‐mediated hydrogen bonds to Gua3, Gua4 and Cyt5 (Figure 4B). Although these hydrogen bonds confer no DNA specificity, the GCC sequence may be recognized by a process of indirect read‐out owing to its preference for a wide minor groove (Goodsell et al., 1993). Other DNA sequences without such a wide minor groove may give restricted access to the α3–α4 loop, disrupting hydrogen bonds to bases and nearby phosphate groups.

Phosphate contacts. There are extensive contacts between the enzyme and the sugar‐phosphate backbone of the DNA. The BglI dimer contacts both DNA strands at all phosphate groups from −7 to 6, except for nucleotides −2 and −1. Each subunit makes 17 direct hydrogen bonds to the phosphate groups of the DNA duplex, with 21 more being mediated by water molecules. Of the direct phosphate contacts, 10 involve side‐chain groups (five of which are from arginine or lysine residues) and seven are from main chain NH groups. The residues involved in direct or water‐mediated hydrogen bonds with the DNA backbone come from several segments of the enzyme: the loop between α3 and α4, the loop between β1 and β2, the C‐terminal end of β3, residues near the small three‐stranded recognition β‐sheet (involving strands β4, β10 and β11), the β‐hairpin between β5 and β6 and the C‐terminal end of β8 (Figure 3).

Active site

Type II restriction endonucleases require only Mg2+ as a cofactor to catalyse the hydrolysis of a phosphodiester bond, producing 5′ phosphate and 3′ hydroxyl groups. Hydrolysis proceeds with stereochemical inversion of configuration at the phosphorus atom (Connolly et al., 1984; Grasby and Connolly, 1992). This possibly involves a direct in‐line attack on the phosphate by an activated water molecule, leading to a pentavalent transition state stabilized by Mg2+ ions. Although several catalytic mechanisms have been proposed (Jeltsch et al., 1993; Kostrewa and Winkler, 1995; Vipond et al., 1995), there is currently no detailed understanding of the steps involved in phosphodiester hydrolysis. For example, it is not clear whether one or two Mg2+ ions are involved at the active site (Vipond et al., 1995; Groll et al., 1997). Also, it is not known which group activates the attacking water molecule, or which group is responsible for protonation of the leaving group.

BglI represents the first structure of a type II restriction endonuclease that cleaves its DNA substrate producing 3′ overhanging ends, and analysis of the three‐dimensional structure reveals that it too possesses the conserved triad of charged amino acid residues adjacent to the scissile phosphodiester bond. Residues Asp116, Asp142 and Lys144 in BglI can be aligned spatially with active site residues Asp74, Asp90 and Lys92 in EcoRV (Winkler et al., 1993), with only small differences in the positions of side‐chain groups (Figure 5A). These residues are situated on the edge of a β‐meander in the central β‐sheet (comprising strands β1, β2 and β3 in BglI). Residues Asp74 and Asp90 are essential for the catalytic function of EcoRV, and are probably involved in co‐ordinating Mg2+‐cofactor ions. Substitution of either of these residues by alanine renders EcoRV inactive (Selent et al., 1992; Groll et al., 1997). Importantly, inspection of the experimental electron density map (calculated using MIR phases), and subsequent 2|Fo|−|Fc| maps (calculated using model phases), identified two Ca2+ ions at the active site of BglI (Figure 5B, Table III). Calcium 1 has octahedral co‐ordination geometry, involving ligands from the carboxylate groups of Asp116 and Asp142, the carbonyl of Ile143 and a non‐bridging oxygen O2P of the active site phosphate Ade2. The remaining two ligands are water molecules. Calcium 2 has pentagonal bipyramidal co‐ordination geometry that is typical of a Ca2+ ion, but not of Mg2+. Ligands are from the carboxylate of Asp116, a non‐bridging oxygen O2P of the active site phosphate Ade2, O3′ of Thy1, and four water molecules. The ligand distances lie within the ranges 2.3–2.7 Å and 2.3–2.8 Å, for calcium ions 1 and 2, respectively. No DNA cleavage product was observed in the BglI–DNA crystal structure, as Ca2+ ions do not generally support hydrolysis of DNA by restriction endonucleases (Vipond and Halford, 1995).

Figure 5.

The active site of BglI. (A) Comparison of the active sites of BglI and EcoRV. Both views are in a similar orientation and show similar regions of the structures. The central β‐sheet is green and an α‐helix (α4 in BglI and αB in EcoRV) is purple. Only five bases of a single strand of DNA are shown for clarity, with the sugar–phosphate backbone in red. Active site residues and the active site phosphate are shown as a ball‐and‐stick representation. Calcium ion positions for the BglI–DNA complex are shown as cyan spheres. (B) Stereo view of the co‐ordination geometry of the Ca2+ ions in the active site of BglI. Active site residues are shown as a ball‐and‐stick representation. Only a small segment of the DNA phosphate backbone on either side of the scissile phosphodiester bond is shown. DNA bases have been omitted for clarity. Calcium ions are shown as cyan spheres. Atoms co‐ordinated to the calcium ions are shown connected by solid grey lines. (C) The active site of the 3′–5′ exonuclease domain of DNA polymerase I (PDB code 1kfs) shown as a ball‐and‐stick representation. Divalent cations, shown as cyan spheres, are Zn2+ at position 1 and Mg2+ at position 2. The attacking water molecule is not shown. Produced using MOLSCRIPT (Kraulis, 1991) and RASTER3D (Merritt and Murphy, 1994).

Table 3. Active site calcium ion interactions

The Ca2+ positions are very similar to the divalent cation positions in the 3′–5′ exonuclease domain of DNA polymerase I (Beese and Steitz, 1991; Figure 5C). They are also similar to the proposed position of two ions in the BamHI–DNA complex, identified from two highly ordered water molecules (Newman et al., 1995). However, the positions of the Ca2+ ions observed at the active site of BglI differ from the divalent cation positions identified in the EcoRV–DNA complex (Kostrewa and Winkler, 1995), although this complex was non‐productive, as soaking the crystals in solutions containing Mg2+ did not lead to DNA cleavage. For BglI, the metal ions lie along a direction parallel to the scissile phosphodiester bond, whereas for EcoRV, they are positioned on a line approximately perpendicular to it, with the Mg2+ ions co‐ordinated by Asp74 and Asp90, and Glu45 and Asp74, respectively (Figure 5A). The BglI–DNA structure reveals that Glu87 superimposes with Glu45 of EcoRV (Figure 5A), both residues being situated on a conserved α‐helix (α4 in BglI and αB in EcoRV). In the BglI active site, however, Glu87 does not participate in Ca2+ binding directly, but instead forms a hydrogen bond to the calcium‐1‐bound water molecule Wat3 (Figure 5B). In EcoRV, Glu45 has been shown to be important for catalysis, but unlike Asp74 and Asp90, substitution of this residue by an alanine does not abolish activity entirely (Selent et al., 1992; Groll et al., 1997). PvuII does not have an acidic residue at a structurally equivalent position on conserved helix αB (Athanasiadis et al., 1994; Cheng et al., 1994).

Wat4 provides a possible candidate for the nucleophilic water (Figure 5B), and is in a very similar position to a water observed in the crystal structure of the BamHI–DNA complex (Newman et al., 1995). In addition to being co‐ordinated by calcium 1, Wat4 is hydrogen bonded to Lys144 Nζ (2.9 Å), Ade2 O5′ (2.8 Å) and Gua3 O1P (2.7 Å). It is well positioned for in‐line attack opposite the O3′ leaving group, being 3.2 Å from the phosphorus atom of the scissile phosphodiester bond, with the Wat4–P‐O3′ angle almost linear (164°). There is, however, no convenient acidic group that could act as a general base to deprotonate the attacking water molecule (Wat4). A substrate‐assisted catalysis model has been proposed for restriction endonucleases, where the pro‐Rp oxygen atom of the phosphate 3′ to the scissile phosphodiester bond deprotonates the attacking water molecule (Jeltsch et al., 1993). The main problem with this model is that although Gua3 O1P in BglI is well placed to deprotonate Wat4, the phosphate oxygen would be a poor proton acceptor owing to its unfavourably low pKa (pKa ≤2).
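The in‐line geometry described above reduces to two measurements: the distance from the water oxygen to the phosphorus atom, and the water–P–O3′ angle. The following sketch shows how both could be computed from atomic coordinates. The function names are illustrative, and any coordinates passed in are assumed to be (x, y, z) tuples in Å; no coordinates from the refined model are reproduced here.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as (x, y, z)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    u = [p - q for p, q in zip(a, vertex)]
    v = [p - q for p, q in zip(c, vertex)]
    dot = sum(p * q for p, q in zip(u, v))
    norm = distance(a, vertex) * distance(c, vertex)
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def in_line_geometry(water_o, phosphorus, o3_prime):
    """Return (water-P distance in Å, water-P-O3' angle in degrees)."""
    return (distance(water_o, phosphorus),
            angle_deg(water_o, phosphorus, o3_prime))
```

An angle of 180° corresponds to an ideal in‐line arrangement of nucleophile, phosphorus and leaving group, so the reported value of 164° is within 16° of that ideal.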

Based on the active site structure of BglI, one possible mechanism is that a Mg2+ ion at site 1 could help to activate the attacking water molecule (Wat4), a Mg2+ at site 2 could help to stabilize the negative charge on the 3′ oxyanion leaving group, and both ions could be involved in stabilizing the pentavalent transition state. Although Ca2+ appears to mimic the role of the active cofactor Mg2+ in specific DNA binding by EcoRV (Vipond and Halford, 1995), it must be borne in mind that the BglI–DNA structure is properly considered a non‐productive complex, and that Ca2+ may be a poor model for the binding of Mg2+ (Engler et al., 1997). The co‐ordination distance of Mg2+ is 2.0 Å, whereas Ca2+ has a longer co‐ordination distance of 2.5 Å. Substitution of Ca2+ by Mg2+ would therefore require small positional adjustments of the groups involved in co‐ordinating the divalent cations.

Cleavage pattern

Despite the fact that a subunit of BglI displays a significant degree of structural similarity to subunits of EcoRV and PvuII, BglI has a remarkably different dimer structure. All three enzymes have a conserved α‐helix at their dimer interface: α4 in BglI, and αB in EcoRV and PvuII (Figures 2B and 3). However, the remaining subunit–subunit interactions differ considerably between the three enzymes. In EcoRV, dimerization occurs primarily through residues situated on a β‐hairpin (involving strands βa–βb) and a large loop between strands βg and βh. These two loops form the dimerization sub‐domain (Winkler et al., 1993). This sub‐domain interacts with the corresponding sub‐domain from another protein subunit. In PvuII, the dimerization sub‐domain is replaced by α‐helix αA and the loop between αA and αB (Athanasiadis et al., 1994; Cheng et al., 1994). In BglI, there is no equivalent of the dimerization sub‐domain (Figure 2B), and as a result of the altered subunit–subunit interactions, one BglI subunit is rotated with respect to the other by 86° compared with EcoRV (101° compared with PvuII). This creates a BglI dimer that is less ‘U‐shaped’ than either the EcoRV or PvuII dimers. The BglI subunits are also ∼11 Å further apart along a direction almost parallel to the DNA helical axis, creating the water‐filled cavity at the dimer interface (Figure 6). The resulting ‘screw’ transformation of the BglI subunits correctly places the active sites and DNA recognition elements adjacent to the scissile phosphodiester bonds and DNA half‐sites, respectively. The two active sites of BglI are separated by ∼12 Å along a direction parallel to the DNA helical axis, which is the correct distance to produce DNA fragments with single‐stranded three base extensions.

Figure 6.

Comparison of the dimer structure of BglI and EcoRV, viewed down the molecular two‐fold axis, with the DNA above the protein. One subunit is coloured green and the other purple. The DNA is indicated in red by bonds drawn through the O5′ positions. Active site residues (Asp116 and Asp142 for BglI, Asp74 and Asp90 for EcoRV) are shown in red as a ball‐and‐stick representation. The position of the active site phosphate is shown as a cyan sphere. The DNA recognition sheet is coloured blue for BglI, as is the equivalent sheet, plus recognition R‐loop, for EcoRV. The N‐ and C‐terminal ends of the protein, and the 5′‐ and 3′‐ends of the DNA, are labelled. Conserved helices α4 in BglI and αB in EcoRV are labelled at their N‐terminal ends. Produced using MOLSCRIPT (Kraulis, 1991) and RASTER3D (Merritt and Murphy, 1994).

The structure of the BglI–DNA complex shows that dimeric restriction endonucleases can use a conserved EcoRV‐like subunit fold, combined with alternative modes of dimerization, to generate cleavage patterns with blunt or 3′ overhanging ends. The observation that proteins use conserved folds combined in alternative ways to recognize different DNA sequences was previously made for the structures of the MetJ (Somers and Phillips, 1992) and Arc (Raumann et al., 1994) repressor complexes. The repressor homodimers display significant structural similarity to each other, but cooperative contacts between pairs of homodimers involve dramatically different regions in the two structures, because the half‐sites in the MetJ and Arc operator sequences have different spacings. It remains to be seen whether restriction endonucleases that generate 5′ overhanging ends of various lengths can achieve these cleavage patterns by alternative dimerization of a conserved, BamHI‐like subunit fold.

Materials and methods

Cloning and purification

The BglI restriction/modification system was cloned from a 4.8 kb EcoRI fragment of B.globigii genomic DNA using methylase selection (Lunnen et al., 1988). Independent clones were isolated and shown by DNA sequencing to contain an open reading frame (orf) that coded for the N‐terminus of purified BglI endonuclease. These clones lacked any R.BglI endonuclease activity, which was attributed to various frameshift errors in the R.BglI orf. An overexpressing plasmid coding for the full‐length R.BglI gene was constructed (DDBJ/EMBL/GenBank accession No. AF050216) from these clones and was introduced into an E.coli strain (NEB #815) which harboured a compatible plasmid expressing the BglI methylase gene. BglI was purified on Heparin Sepharose and Q‐Sepharose columns. Purified BglI (protein concentration 15 mg/ml in 20 mM Tris–HCl pH 7.5, 10 mM β‐mercaptoethanol, 200 mM NaCl) had a specific activity of ∼1 250 000 U/mg. SDS–PAGE analysis showed a single band at >95% purity.


DNA oligonucleotides for use in crystallization trials were synthesized using an in‐house service and purified on a DIONEX anion exchange HPLC (final concentration 20 mg/ml in 50 mM NaCl, Tris–HCl pH 7.5, 1 mM EDTA). Diffraction‐quality crystals were obtained by co‐crystallizing BglI with a self‐complementary 17mer (protein:DNA ratio was 1:2, Figure 1) using the vapour diffusion method. Crystals measuring up to 600×300×100 μm were obtained from 7–12% PEG4000, 75–150 mM Li2SO4, 100 mM Tris–HCl pH 8.5 at room temperature. The crystals were subsequently transferred to a stabilizing solution containing 8–10% PEG8000, 125–150 mM Ca(acetate)2 and 100 mM PIPES pH 6.5. The space group was C2221 and unit cell dimensions were a = 78.5 Å, b = 81.6 Å, c = 117.1 Å. Heavy‐atom screening was performed on a Rigaku RU200 rotating anode generator with a Siemens Xentronics X100A area detector. Two heavy‐atom derivatives were identified from crystals soaked in PCMBS and K2OsCl6. In addition, modified oligonucleotides with 5‐bromo‐dU (Br) were purchased from Cruachem: Br1 5′‐ABrCGCCBrAATAGGCGAT‐3′; Br2 5′‐ABrCGCCTAABrAGGCGAT‐3′; and Br3 5′‐ATCGCCBrAABrAGGCGAT‐3′.
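The three brominated oligonucleotides can be compared with a short Python sketch. It assumes that each 5‐bromo‐dU replaces a thymine of the parent strand (the usual substitution, but an assumption here) and that the recognition sequence can be written as GCCNNNNNGGC; the helper names are hypothetical.

```python
import re

# Oligonucleotides as written in the text; "Br" marks a 5-bromo-dU position.
OLIGOS = {
    "Br1": "ABrCGCCBrAATAGGCGAT",
    "Br2": "ABrCGCCTAABrAGGCGAT",
    "Br3": "ATCGCCBrAABrAGGCGAT",
}

# Assumed pattern for the interrupted recognition sequence GCCNNNNNGGC.
SITE = re.compile(r"GCC[ACGT]{5}GGC")


def parent_sequence(oligo):
    """Unmodified strand, assuming every Br-dU stands in for a thymine."""
    return oligo.replace("Br", "T")


def bromine_positions(oligo):
    """1-based nucleotide positions of the Br-dU substitutions."""
    tokens = re.findall(r"Br|[ACGT]", oligo)
    return [i for i, tok in enumerate(tokens, start=1) if tok == "Br"]


for name, oligo in OLIGOS.items():
    parent = parent_sequence(oligo)
    match = SITE.search(parent)
    site = match.group(0) if match else None
    print(name, parent, len(parent), bromine_positions(oligo), site)
```

Under these assumptions all three oligonucleotides derive from the same 17‐nucleotide parent strand, each carries two bromine atoms at a different pair of positions, and the parent strand contains one copy of the interrupted recognition sequence.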

Data collection

Data were collected at BM14 ESRF using a CCD detector from crystals frozen at 100 K. X‐ray fluorescence spectra for the bromine K‐edge and osmium LIII‐edge were measured from BglI–DNA derivatized crystals to select the appropriate wavelengths for the maximum anomalous signal. X‐ray data were collected by the oscillation method (Δφ = 0.5°) and reduced to profile‐fitted intensities using the HKL suite (Otwinowski and Minor, 1997) (Table I). Data were placed on an approximately absolute scale using TRUNCATE, before derivative‐to‐native anisotropic scaling using SCALEIT (Collaborative Computational Project Number 4, 1994).
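For orientation, the approximate wavelengths of the two absorption edges can be estimated from λ (Å) ≈ 12.398 / E (keV). The edge energies in the Python sketch below are tabulated values for the free elements, quoted here as assumptions; they are not the wavelengths used in this work, which were chosen from the measured fluorescence spectra because the edge position shifts with chemical environment.

```python
# h*c expressed in keV * Angstrom, so that wavelength = HC / energy.
HC_KEV_ANGSTROM = 12.398

# Approximate tabulated edge energies in keV (assumed reference values,
# not taken from the paper).
EDGES_KEV = {
    "Br K": 13.474,
    "Os LIII": 10.871,
}


def wavelength_angstrom(energy_kev):
    """Convert a photon energy in keV to a wavelength in Angstrom."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


for edge, energy in EDGES_KEV.items():
    print(f"{edge}-edge: {energy:.3f} keV ~ {wavelength_angstrom(energy):.3f} A")
```

With these reference energies the bromine K‐edge lies near 0.92 Å and the osmium LIII‐edge near 1.14 Å.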

Phasing and refinement

The BglI–DNA complex was solved using the technique of multiple isomorphous replacement with anomalous scattering. Heavy‐atom positions were identified by inspection of either isomorphous or anomalous difference Pattersons, or by calculation of cross‐phased difference Fouriers using phases derived from the PCMBS heavy‐atom positions. The maximum‐likelihood phase refinement program SHARP (De la Fortelle and Bricogne, 1997) was used to produce initial phases to 2.5 Å resolution (Table I), which were improved by solvent flattening using SOLOMON (Abrahams and Leslie, 1996). The experimental electron density map was of extremely high quality and enabled all the DNA and 298 out of 299 amino acids to be built using the interactive graphics program O (Jones et al., 1991). The structure was refined using the program X‐PLOR version 3.86 (Brünger et al., 1987), employing a bulk solvent correction in the final stages. Several rounds of refinement and manual rebuilding were performed, resulting in 253 solvent molecules being included in the model, with excellent final statistics and stereochemistry (Table I). DNA structure was analysed using the program CURVES (Lavery and Sklenar, 1988, 1989).


We thank E.Fanchon, L.Smith, A.Thomson, E.Vinecombe and C.Wilmot for helping with data collection at ESRF and J.Jäger for useful discussions. We thank beam‐line staff at both the SRS, Daresbury and ESRF, Grenoble. This work was supported in part by BBSRC, MRC and HHMI grants. Full co‐ordinates and structure factors are being deposited in the Brookhaven Protein Data Bank.

