
Crystal structures of open and closed forms of binary and ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I: structural basis for nucleotide incorporation

Ying Li, Sergey Korolev, Gabriel Waksman

Author Affiliations

Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, Campus Box 8231, 660 South Euclid Avenue, St Louis, MO 63110, USA

*Corresponding author: Gabriel Waksman. E-mail: waksman{at}gwiris1.wustl.edu

Abstract

The crystal structures of two ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I (Klentaq1) with a primer/template DNA and dideoxycytidine triphosphate, and that of a binary complex of the same enzyme with a primer/template DNA, were determined to a resolution of 2.3, 2.3 and 2.5 Å, respectively. One ternary complex structure differs markedly from the two other structures by a large reorientation of the tip of the fingers domain. This structure, designated ‘closed’, represents the ternary polymerase complex caught in the act of incorporating a nucleotide. In the two other structures, the tip of the fingers domain is rotated outward by 46° (‘open’) in an orientation similar to that of the apo form of Klentaq1. These structures provide the first direct evidence in DNA polymerase I enzymes of a large conformational change responsible for assembling an active ternary complex.

Introduction

The family of DNA polymerase I (DNA Pol I) enzymes plays a role in the repair of DNA lesions in prokaryotic organisms (Kornberg and Baker, 1991). DNA Pol I enzymes catalyze the addition of mononucleotide units derived from deoxynucleoside 5′ triphosphates (dNTP) to the 3′ hydroxyl terminus of a primer chain in a reaction that requires a template chain which directs the enzyme in its selection of the specific incoming nucleotide (Kornberg and Baker, 1991). These enzymes are characterized by a multidomain architecture which supports not only the polymerase activity, but also a proof‐reading 3′→5′ and/or a 5′→3′ exonuclease activity (Delarue et al., 1990).

The mechanism of DNA polymerization by DNA Pol I enzymes has been the subject of extensive biochemical and structural studies (Johnson, 1993; Joyce and Steitz, 1994; Brautigam and Steitz, 1998). The first crystal structure of a member of this family of proteins, the large fragment (Klenow) of Escherichia coli DNA Pol I, revealed that the polymerase domain has a shape reminiscent of a right hand in which the palm, fingers and thumb form the DNA‐binding crevice (Ollis et al., 1985). The active site, composed of three acidic residues, is located in the palm, which forms the base of the crevice. Subsequently, complexes of this protein fragment with DNA described the enzyme in its editing mode (Freemont et al., 1988; Beese and Steitz, 1991; Beese et al., 1993). On the basis of the configuration of the binding partners at the 3′‐5′ exonuclease active site, it was proposed that the mechanism of catalysis for the polymerase activity involved two metal ions which promote the deprotonation of the 3′ OH of the primer strand and assist the leaving of the pyrophosphate (Steitz, 1993).

During synthesis by DNA Pol I enzymes, the primer/template DNA is translocated with each cycle of polymerization to present a new base to the polymerase active site. This event occurs at a rate of several hundred bases per second, depending on the intrinsic processivity of the enzyme (Carroll and Benkovic, 1990; Johnson, 1993). Whether translocation of the DNA requires a conformational change in the protein is not known. However, solution studies have identified a slow rate‐limiting step before chemistry which suggests that at least one conformational transition is required before incorporation of the nucleotide, possibly to assemble a productive protein‐DNA‐dNTP ternary complex (Kuchta et al., 1987, 1988; Dahlberg and Benkovic, 1991; Patel et al., 1991; Wong et al., 1991). Recently, a conformational change in DNA Pol I enzymes has been documented by comparing the structure of a quaternary complex of the T7 DNA polymerase bound to thioredoxin, a template/primer DNA, and an incoming dideoxynucleoside triphosphate (ddNTP) with structures of DNA‐bound or apo forms of other DNA Pol I enzymes (Doublié et al., 1998). This comparison revealed an orientation of the fingers domain in the T7 DNA polymerase quaternary complex that is different from the other structures, corresponding to a rotation inwards (closed) by ∼41° towards the primer/template DNA. Such an open to closed conformational transition may be responsible for assembly of a productive ternary complex.

The crystal structure of the quaternary complex of the T7 polymerase, as well as that of an active binary complex of the DNA polymerase I from Bacillus stearothermophilus bound to a template/primer DNA (Kiefer et al., 1998), demonstrated that the terminal base pair is contained within a binding pocket, the geometry of which is incompatible with a mismatched base pairing (Doublié et al., 1998; Kiefer et al., 1998). In none of these structures were the DNAs seen to cross the crevice formed by the fingers, palm and thumb domains. Instead, in both, the first single‐stranded template base is flipped out at a 90° angle, indicating that during polymerization the DNA remains on one side of the protein.

Since the argument for a large conformational change affecting the fingers domain, as proposed by Doublié et al. (1998), rested on the comparison of different DNA Pol I structures, the direct experimental proof for such a motion remained to be provided. In this report, we describe the structures of two ternary complexes of the large fragment of Taq DNA Pol I (Klentaq1) bound to a primer/template DNA and ddNTP. These structures represent the open and closed forms of the enzyme and capture Klentaq1 in the act of incorporating a nucleotide at the active site. We also present the structure of a binary complex of Klentaq1 bound to a template/primer DNA. These three structures together with those of the apo (Korolev et al., 1995) and the dNTP‐bound (Li et al., 1998) forms of the enzyme provide new insight into the structural basis of nucleotide incorporation during DNA polymerization.

Results and discussion

Structure determination

The strategy used to obtain ternary complexes of Klentaq1 was similar to that first developed by Pelletier et al. (1994) for DNA polymerase β. Klentaq1 was mixed prior to crystallization with a primer/template DNA composed of strands that were 11 and 16 nucleotides in length, respectively, and subsequently reacted against an excess of ddNTP (Materials and methods). The design of the template strand was such that the first two single‐stranded bases of the template were guanosines allowing (i) the incorporation of a dideoxycytidine monophosphate (ddCMP), and (ii) the productive positioning of a dideoxycytidine triphosphate (ddCTP) at the active site.

Two ternary complex crystals were obtained, the structures of which differed by a large reorientation of the fingers domain (designated ‘open’ and ‘closed’ below). Both crystals were obtained using identical crystallization conditions, diffracted to similar resolution (2.3 Å) and were in the same space group (P3121) with the same unit cell dimensions. However, the ‘open’ crystal form was obtained from a selenomethionine‐derived protein and was incubated after growth in a stabilizing solution deprived of protein, DNA and ddCTP for 5 days. In contrast, the ‘closed’ crystal form was obtained from the wild‐type protein and was used in data collection immediately after growth. Therefore, we believe that the ‘open’ form was obtained by depleting, at least partially, the ‘closed’ crystal form of its ddCTP component, a hypothesis consistent with the fact that the occupancy for the ddCTP in the open form was found to be low (Materials and methods).

The binary complex crystals were obtained by incubating the closed ternary complex crystals in the stabilizing solution described above for an extended period (1 month). This treatment resulted in complete release of the ddCTP and consequently in a binary enzyme‐DNA complex.

The structure of the open ternary complex was determined using the method of multiwavelength anomalous diffraction (MAD) and was refined against data to 2.3 Å resolution with free‐R and R values of 28.8 and 22.4%, respectively. The structures of the closed ternary complex and of the binary complex were determined by difference Fourier methods (Materials and methods). The closed ternary complex structure was refined against 2.3 Å data, with values for the free‐R and R‐factors of 27.5 and 21.8%, respectively, whereas the binary enzyme‐DNA complex structure was refined against 2.5 Å data, with values for the free‐R and R‐factors of 29.8 and 22.7%, respectively (Figures 1 and 2, Table I).
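For readers less familiar with the statistics quoted above, the following is a minimal sketch of how a conventional R‐factor and a free‐R value are usually computed from observed and calculated structure factor amplitudes. It is not the refinement procedure used in this work: the function names, the toy amplitudes and the test‐set flags are all invented for illustration, and the scale factor that a refinement program applies between observed and calculated amplitudes is omitted.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over a set of reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_free):
    """Split reflections into a working set and a test ('free') set.

    The free set is excluded from refinement, so free-R is the same formula
    evaluated only on reflections flagged in is_free.
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free


# Toy amplitudes, invented for illustration (NOT data from this study).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 38.0]
is_free = [False, False, False, True]

r_work, r_free = work_and_free_r(f_obs, f_calc, is_free)
print(f"R = {r_work:.4f}, free-R = {r_free:.4f}")
```

Because the free set never influences the model, a free‐R that sits a few percentage points above R, as for the three structures reported here, is the usual pattern.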

Figure 1.

Representative regions of the electron density. (A) The GT5:ddC base pair and the flipped GT4 base of the open ternary complex. ddC indicates the incorporated ddCMP (notation as in Figure 5). (B) Same region as (A) in the closed ternary complex. Note that GT4 is now in a stacking arrangement with GT5. (C) The H1H2 loop in the thumb region. In (A) and (C), electron density results from a map calculated using the experimental MAD solvent‐flattened phases. In (B), electron density results from a simulated annealing omit map where the region shown has been deleted from the model (Hodel et al., 1992). Residues in the protein and the DNA are color coded according to atom type with carbon and phosphorus atoms in yellow, oxygen atoms in red and nitrogen atoms in blue. Generated using the program O (Jones et al., 1991).

Figure 2.

Stereo ribbon diagram (Carson, 1997), and surface of the closed ternary and open binary complexes of Klentaq1. (A and C) The closed ternary complex. (B and D) The open binary complex. In (A) and (B), the N‐terminal, palm, fingers and thumb domains are indicated as ribbons, and color coded in yellow, magenta, green and deep blue, respectively. The O helix in the fingers domain is shown in red. The ribose and base in the primer (silver) and template (clear blue) strands are shown in stick representation, with the ribose‐phosphate backbones shown as ribbons. The incoming ddCTP in (A) is shown in stick representation and is colored black. Gold spheres in A indicate metal ions. The notation for the secondary structural elements in the polymerase domain is indicated. In (C) and (D), the surface was contoured and displayed using GRASP (Nicholls et al., 1991). Color coding of DNA atoms is as in Figure 1 except for carbons (white). The template and primer strand backbones are shown in blue and red, respectively. Helices O1 and Q, and loop H1H2 are indicated. The double cyan arrow in D indicates the only possible direction for DNA motion in the open binary complex.

Table I. Refinement statistics

Two conformations of the fingers domain

The two ternary complexes of Klentaq1 bound to a primer/template DNA and ddCTP differ by a large conformational change affecting the tip of the fingers domain (Figure 3A). In one ternary complex structure (Figure 2A), the fingers domain closes the crevice formed by the thumb, palm and fingers (Figure 2C): this conformation is referred to as the closed form of the enzyme. In the second ternary complex structure, as in the binary Klentaq1‐DNA complex structure (Figure 2B), the fingers domain is seen in a totally different conformation, such that the crevice is clearly visible (Figure 2D). This conformation is referred to as the open form of the polymerase. Therefore, the three structures presented here are those of an open and a closed ternary complex, and that of an open binary complex.

Figure 3.

Comparison of the Cα tracings of the open and closed ternary complex forms and of the apo and open binary DNA‐bound forms of Klentaq1 (Carson, 1997). (A) Superimposition of the structures of the open (magenta) and closed (yellow) forms of the ternary complex. (B) Superimposition of the structures of the open binary Klentaq1/DNA complex (magenta) and apo‐Klentaq1 (green).

The open to closed conformational change affecting the fingers domain can be deconvoluted into two rotations successively affecting different parts of the fingers domain (Figure 3A). First, a 6° rigid body rotation of helices N, O, O1 and O2 results in a partial closing of the crevice (see Figure 4 for definition of secondary structures). This motion is amplified by a second rotation of 40°, affecting the N and O helices only.
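The way two successive rotations combine into a single net rotation can be illustrated numerically. The sketch below builds rotation matrices with Rodrigues' formula, composes them, and recovers the net angle from the trace of the product. The rotation axes are hypothetical: the text does not give them, and treating the 6° and 40° rotations as sharing one axis is an assumption made here only because it reproduces the 46° quoted for the tip of the fingers domain.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rodrigues' formula: rotation by angle_deg about the given axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    t = math.radians(angle_deg)
    c, s, v = math.cos(t), math.sin(t), 1.0 - math.cos(t)
    return [
        [c + x * x * v, x * y * v - z * s, x * z * v + y * s],
        [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
        [z * x * v - y * s, z * y * v + x * s, c + z * z * v],
    ]


def matmul(a, b):
    """3x3 matrix product a @ b (b is applied first)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_angle_deg(r):
    """Net rotation angle from the trace: cos(theta) = (tr(R) - 1) / 2."""
    cos_t = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


# ASSUMPTION: a common, arbitrary axis for both rotations (not stated in the text).
axis = (0.0, 0.0, 1.0)
rigid_body = rotation_matrix(axis, 6.0)    # helices N, O, O1 and O2
n_and_o = rotation_matrix(axis, 40.0)      # helices N and O only
net_same_axis = rotation_angle_deg(matmul(n_and_o, rigid_body))

# Hypothetical second axis tilted by 30 degrees, to show that the angles
# no longer simply add when the axes differ.
tilted = (0.0, math.sin(math.radians(30.0)), math.cos(math.radians(30.0)))
net_tilted = rotation_angle_deg(matmul(rotation_matrix(tilted, 40.0), rigid_body))

print(f"common axis: {net_same_axis:.1f} deg, tilted axis: {net_tilted:.1f} deg")
```

With a common axis the angles add to 46°; with non‐parallel axes the net angle is smaller, which is why the simple sum holds only to the extent that the two rotation axes coincide.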

Figure 4.

Sequence alignment of the polymerase domains of Taq, E.coli and T7 DNA polymerases. Secondary structural elements and domain boundaries are indicated above the aligned sequences, with β‐strands and α‐helices indicated by blue and red open boxes, respectively. Notation for each of these elements is included in the boxes. Note that the region corresponding to helix J in Klenow is not helical in the complexes presented here. The catalytic triad is shown in gray boxes spanning all three sequences. Residues in Klentaq1 involved in DNA and ddCTP binding are indicated in colored unframed boxes: green, residues interacting with DNA in both the open and closed ternary complexes, and in the open binary complex; pink, residues interacting with the DNA in the closed ternary complex only; dark blue, interactions with DNA (such as Tyr671) in the open ternary and the open binary complexes only; cyan, residues interacting with the ddCTP in both the open and closed ternary complexes; orange, residues interacting with the ddCTP in the closed form only; dark red, residues interacting with the ddCTP only in the open ternary complex.

The open to closed transition affects dramatically the orientation of the O helix. Solution studies of DNA Pol I have demonstrated the involvement of many residues in the O helix in dNTP binding and incorporation (Astatke et al., 1995; Suzuki et al., 1996). The effect of the open to closed conformational transition is to position the O helix in two different orientations. In the first orientation, that seen in the open form, the O helix is in a configuration similar to that observed in the apo and dCTP‐bound forms of the enzyme (Korolev et al., 1995; Li et al., 1998). Tyr 671, at the C‐terminus of the O helix, is inserted into the stacking arrangement of the template bases and lies on top of the first base pair of the duplex part of the primer/template DNA (Figures 5, 6B and D). Tyr 671 may act as a positioning device for the DNA, such that the first base pair of the primer/template can register itself against the active site. Insertion of Tyr 671 into the stacking arrangement of the template bases also has consequences for the local structure of the DNA in this region (see below).

Figure 5.

Schematic diagram of the contacts between the polymerase and the DNA. Only direct contacts between the protein and the DNA are shown. Residues in green boxes make similar contacts with the DNA in all three complexes. Pink boxes indicate contacts only observed in the closed ternary complex. AT3 is shown in pink since it is only observed in this complex. GT4 is shown both in pink and deep blue since it is seen in two different conformations in the open (deep blue) binary and ternary complexes, and in the closed (pink) ternary complex. Tyr671 is in deep blue since it is observed stacking against the 5′ end template base in the open binary and ternary complexes only. Distances between interacting atoms are shown. This figure also defines the notation for the bases used in the text.

Figure 6.

Stereo diagrams and surfaces of the contacts between the polymerase and the ddCTP (Carson, 1997). (A) Protein‐ddCTP contacts in the closed form. (B) Protein‐ddCTP contacts in the open form. (C) Surface diagram of ddCTP‐binding site in the closed form. (D) Surface diagram of the ddCTP‐binding site in the open form. Residues interacting with the ddCTP and with the first base pair are shown and labeled. Color coding and display of the ribbons are as in Figure 2. Metal ions are indicated as gold spheres. Water molecules are not shown. Color coding of the protein side chains, the DNA and the ddCTP in (A) and (B) is by atom type with the carbons in white, oxygens in red, nitrogens in blue and phosphorus in yellow. In (C) and (D), protein side chains are in gray, except for residues in the O helix which are in red; atoms in the template and primer strands are in clear blue and silver, respectively; the incoming nucleotide is colored by atom types as in (A) and (B). The solvent‐accessible surface [in (C) and (D)] is represented by gray dots. In all panels, atoms are represented as ball‐and‐stick models, except in C and D where the ddCTP and the DNA are represented as space‐filling models.

In its second orientation, that seen in the closed form of the ternary complex, the O helix has moved in and is now much closer to the active site formed by the three carboxylates located in the palm domain (Figures 2A and C, and 6A and C). One effect of the rigid body motions affecting the O helix in the open to closed transition is to release the side chain of Tyr 671 from its stacking arrangement with the template bases (Figure 6A). As a result, the place which Tyr 671 occupied in the open form is vacated, allowing the first single‐stranded DNA base of the template to position itself in front of the incoming ddCTP (Figures 1B and 6A). Another effect of the conformational change affecting the O helix is to bury the ddCTP, thereby assembling a productive ternary complex poised for chemistry (see below).

Conformational transitions between the unbound and DNA‐bound states

The conformation of the thumb is affected greatly by binding of DNA. Figure 3B compares the structure of the apo form of the enzyme (Korolev et al., 1995) with that of the open binary Klentaq1‐DNA complex (this work) and illustrates the conformational change affecting the thumb domain upon interactions with the primer/template DNA. This conformational change can be deconvoluted into two parts: first, a rigid body rotation of the thumb domain by a 17° angle along an axis perpendicular to the view plane of Figure 3B results in an opening of the DNA‐binding crevice; secondly, a rotation by 12° about the same axis brings the tip of the thumb domain, i.e. only helices H1 and H2, closer to the DNA. This second rotational component is in the opposite direction to the first rotation. Other significant conformational changes, mostly corresponding to disorder to order transitions, are apparent between the unbound and the DNA‐bound states of the thumb. These are mostly localized to the H1H2 loop, which in the apo‐ and dNTP‐bound forms of the enzyme is disordered, but well defined in electron density in all three complexes described here (Figure 1C).

The result of the conformational changes affecting the thumb domain is the formation of a cylinder that almost completely surrounds the DNA (Figure 2C and D). This structure is observed in all three complexes. Solution studies have established that the rate of DNA dissociation during polymerization by DNA Pol I enzymes is very slow and that the DNA does not necessarily dissociate from the protein after chemistry (Kuchta et al., 1987; Patel et al., 1991). These data suggest that the observed wrapping of the thumb domain around the DNA may be maintained during nucleotide assembly, nucleotide incorporation and DNA translocation.

DNA conformations and residues involved in DNA binding

The overall conformation of the duplex part of the primer/template is very similar in all three complexes (Figure 5). The DNA is mostly in the B‐form, except for the three base pairs at the end of the duplex DNA adjacent to the O helix, which are A‐form. The resulting widening of the minor groove in this region has also been noted for DNAs complexed with the Taq, T7 and B.stearothermophilus DNA polymerases and DNA polymerase β (Pelletier et al., 1994; Eom et al., 1996; Doublié et al., 1998; Kiefer et al., 1998). Here, however, as in the quaternary complex of T7 polymerase, the DNA is distorted to assume an S shape. A first bend is caused by the interactions with the palm at the active site, whereas a second bend results from interactions with the thumb (Doublié et al., 1998).

The DNA duplex interacts with the protein between base GT5 and GT13 in the template, and base CP6 and ddC in the primer (ddC refers to the incorporated ddCMP; notation for the bases is given in Figure 5), i.e. nine and seven nucleotides at the 5′ end of the template and the 3′ end of the primer, respectively, participate in contacts with the protein (Figure 5). This observation is consistent with solution studies by Catalano et al. (1990) which have measured a site size of seven base pairs using a marker located on the primer strand. The part of the duplex DNA distal to the active site and the O helix makes contact with residues in the thumb domain, whereas the other end of the duplex DNA mostly interacts with residues in the palm. Two loops in the thumb domain, HH1 and H1H2, and the I helix form most of the contacts with the DNA between the T12‐P5 and T9‐P8 base pairs (Figure 5). Interactions in this region are polar or charged, and occur mostly with the ribose phosphate backbones of each strand. The palm domain also presents binding surfaces to both the primer and the template strands, with β‐strands 7 and 8 running parallel to the template, and β‐strands 12 and 13 parallel to the primer (Figure 2A and B).

In all three structures, neither the duplex DNA nor the single‐stranded region of the template strand passes through the crevice between the thumb and the fingers. Instead, the first unpaired base of the template is flipped out of the stacking arrangement with the duplex by a sharp angle in the template ribose phosphate backbone which positions the single‐stranded template base on the same side of the crevice as the duplex DNA (Figure 2).

Most of the differences in DNA conformation and binding between the open and closed ternary complexes lie in the single‐stranded and ddCTP‐paired regions of the template (Figure 6A and B). In the closed form, a sharp turn in the template's phosphate backbone between AT3 (the first single‐stranded template base) and GT4 (the base paired with the ddCTP) positions AT3 between the O1 and Q helices, away from the duplex part of the DNA (Figure 2C). In contrast, in the open form, AT3 is disordered and GT4 is displaced from the ddCTP‐paired position by the insertion of Tyr671. Surprisingly, the displaced GT4 base lies on the side of the DNA duplex and is stabilized uniquely by contacts with the template bases GT5 and GT6 (Figures 1A and 6B). This configuration of GT4 is also observed in the open binary Klentaq1‐DNA complex.

Binding of the dideoxynucleoside triphosphate in the two forms of the ternary complex

In both forms of the ternary complex, the nucleoside triphosphate is located on top of the 3′ end base of the primer strand. Its binding configuration is also very similar in the two structures (Figure 6). However, due to the reorientation of the O helix from open to closed, the interactions of ddCTP with the protein are very different between the two ternary complexes (Figure 6A and B). Whereas, in the open form, the ddCTP is readily accessible to solvent, in the closed form, the incoming ddCTP is buried completely (Figure 6C and D). As illustrated in Figures 4 and 6, residues in the O helix contribute most of the additional contacts observed in the closed form. Interestingly, contacts with the O helix in this form are similar to those observed between residues in the O helix and the dCTP in the open Klentaq1‐dCTP binary complex described previously by Li et al. (1998): the base of the nucleotide stacks against Phe667, whereas the triphosphate moiety makes electrostatic interactions with basic residues on the same face of the O helix. Furthermore, when the motion which brings the O helix from its open to closed configuration is applied to the coordinates of the dCTP of the open Klentaq1‐dCTP binary complex, the dCTP is moved to within 1.2 Å of the ddCTP of the closed ternary complex (result not shown). This observation, together with the fact that in the closed form the ddCTP is located near the three active site carboxylates (Asp785, Glu786 and Asp610), lends support to the suggestion that a role for the conformational change affecting the O helix from the open to the closed form is to deliver the incoming nucleotide to the active site, thereby assembling a productive polymerase machinery poised for chemistry.

In the closed form of the ternary complex, two metal ions (Mg2+) are octahedrally coordinated by the triphosphate moiety of the ddCTP and carboxylate side chains in the active site (Figure 7). One metal (metal B) is ligated in the basal octahedral plane by four oxygen atoms, contributed by the β‐ and γ‐phosphates and the carboxylate groups of Asp610 and Asp785 (Figure 7). The coordination sphere of the metal ion is completed on each side of the octahedral plane by interactions with oxygen atoms in the α‐phosphate and the carbonyl of Tyr 611. The other metal ion (metal A) is coordinated in the octahedral plane by oxygen atoms from the carboxylate of Asp785, the α‐phosphate of the incoming nucleotide and two water molecules (Figure 7). On one side of the octahedral plane, metal A is ligated by an oxygen from the carboxylate of Asp610. On the other side of the plane, however, the coordinating position is vacant: a ribose 3′‐hydroxyl at the 3′ end of the primer strand in a natural substrate would occupy this position and complete the coordination sphere of metal ion A (Figure 7). The coordination architectures of metal ions A and B are similar to those observed in the T7 polymerase quaternary complex (Doublié et al., 1998) and are consistent with the ‘two metal ion’ mechanism for nucleotide addition proposed by Steitz and colleagues (Steitz, 1993; Steitz et al., 1994). In this mechanism, metal ion A is thought to lower the affinity of the 3′ OH for the hydrogen, thereby facilitating the 3′ O attack on the α‐phosphate. Metal ion B may assist the leaving of the pyrophosphate.

Figure 7.

The active site in the closed ternary complex. The three acidic side chains involved in the transfer reaction are shown as well as the ddCTP, the first incorporated ddCMP base (ddC) at the 3′ end of the primer strand and the metal ions (yellow spheres). Red lines with distances indicate interactions between atoms. Red stars indicate water molecules. The cyan star indicates the position of the deoxyribose 3′‐OH in a natural substrate. Generated using the program O (Jones et al., 1991).

Conclusions

Solution studies of DNA Pol I enzymes have established that the mechanism of DNA polymerization consists of a series of single dNMP incorporation reactions (Carroll and Benkovic, 1990; Johnson, 1993). Nucleotide incorporation begins with the binding of a dNTP to the enzyme‐DNA (E·Dn) complex. The binding of a correct dNTP induces a rate‐limiting conformational change which results in the formation of a tight ternary complex (E·Dn·dNTP ↔ E*·Dn·dNTP). The chemical reaction that follows is fast and results in the formation of a tightly bound enzyme‐product complex (E*·Dn·dNTP ↔ E*·Dn+1·PPi); this complex then undergoes a second conformational change which relaxes the tightly bound enzyme‐product complex (E*·Dn+1·PPi ↔ E·Dn+1·PPi). This step facilitates PPi release and allows translocation of the DNA for the next cycle of polymerization. This mechanism suggests that the E·Dn complex fluctuates periodically between a loose and a tight binding state such that the binding and release of the substrates and products, as well as the linear diffusion or sliding of the DNA, would occur in the loose binding state, whereas the chemical reaction would occur in the tight binding state (Kuchta et al., 1987; Dahlberg and Benkovic, 1991; Patel et al., 1991).
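The individual steps described in this paragraph can be written as one linear scheme. The typeset form below is only a restatement of those steps with the subscripts restored; the first and last arrows (dNTP binding and PPi release) are described in words in the text rather than as explicit equilibria, and no rate constants are implied.

```latex
\[
E\cdot D_{n} + \mathrm{dNTP}
\;\rightleftharpoons\; E\cdot D_{n}\cdot\mathrm{dNTP}
\;\rightleftharpoons\; E^{*}\cdot D_{n}\cdot\mathrm{dNTP}
\;\rightleftharpoons\; E^{*}\cdot D_{n+1}\cdot\mathrm{PP_i}
\;\rightleftharpoons\; E\cdot D_{n+1}\cdot\mathrm{PP_i}
\;\rightleftharpoons\; E\cdot D_{n+1} + \mathrm{PP_i}
\]
```

Here $E$ and $E^{*}$ denote the loose and tight binding states of the enzyme, and $D_{n}$ and $D_{n+1}$ the primer/template before and after incorporation of one nucleotide.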

Since the kinetic data can be described fully by assuming only two conformations for the enzyme, one ‘relaxed’ (E) and one ‘tight’ (E*), we propose that the open and closed structures described in this report represent the E and the E* forms of the enzyme, respectively. In this proposal, the E·Dn state is represented by the open binary Klentaq1‐DNA complex. The E·Dn·dNTP state may be approximated by the structure of the open binary Klentaq1‐DNA complex, to which a dNTP bound to the O helix in the configuration described in Li et al. (1998) is added. Alternatively, one cannot rule out the possibility that the open ternary complex may represent an E·Dn·dNTP state. Finally, the E*·Dn·dNTP state corresponds to the closed ternary complex where the conformational change affecting the O helix brings the dNTP within the active site of the enzyme, thereby assembling a ‘tight’ ternary complex. Hence, a description of the structures of three of the five kinetic states invoked by Patel et al. (1991) for a single nucleotide incorporation is provided for the Klentaq1 system, making this DNA polymerase the best structurally documented DNA Pol I enzyme.

Solution studies of DNA Pol I enzymes have demonstrated that the formation of a ‘tight’ (i.e. closed) ternary complex (E*·Dn·dNTP) only occurs when the correct complementary dNTP is selected (Carroll and Benkovic, 1990; Johnson, 1993). The enzyme may be able to sample all dNTPs in a fast process by delivering them to the active site through an open to closed transition. However, only the correct dNTP locks the enzyme in the tight closed ternary complex form. This process can be understood by examining the structure of the closed ternary Klentaq1 complex and that of the quaternary T7 polymerase complex. In these structures, the terminal base pair is contained within a binding pocket, the geometry of which is incompatible with a mismatched base pairing (Figure 6 and Doublié et al., 1998). Hence, formation of a mismatched base pair would possibly result in an unstable structure which could open rapidly to sample another dNTP.

During DNA synthesis, the primer/template must translocate with each cycle of nucleotide incorporation. Kinetic studies have suggested that linear diffusion or sliding of the DNA may occur in the relaxed E·Dn state (Kuchta et al., 1987; Patel et al., 1991). This suggests that in the open form (i.e. the E·Dn form), the DNA may be free to slide within the structure formed by the thumb, the palm and the fingers domains of the enzyme. Such a structure, as shown in Figure 2, resembles a cylinder that almost completely wraps around the DNA. The axis of this cylinder (indicated by the double arrow in Figure 2D) is perpendicular to the plane of the nascent base pair, suggesting that a motion of the DNA along this axis would generate the space required for the next template base to insert itself in the stacking arrangement of the duplex DNA. The motions of the DNA within the cylinder need not be unidirectional. However, in the direction of the fingers, the DNA would wedge against the ring of Tyr671 and position itself in a favorable geometry: the open to closed transition would then displace Tyr671, allow the first single‐stranded base of the template to insert itself in its place, assemble the incoming dNTP to base‐pair with the template base and finally, position the metal ions and the water molecules in a configuration favorable for the reaction to occur. These latter events may result in tighter binding and transient immobilization of the DNA.

Materials and methods

Crystallization

Klentaq1 was purified according to Korolev et al. (1995). The resulting material was further purified by size‐exclusion chromatography followed by high‐resolution cation exchange chromatography. The protein was then dialyzed in a buffer containing 20 mM Tris pH 8.0, 20 mM NaCl, 1 mM EDTA and 1 mM 2‐mercaptoethanol, and concentrated to 25 mg/ml (0.4 mM). Production of the selenomethionine‐derived protein was carried out in the methionine‐auxotroph E.coli strain DL41 and protein purification proceeded as for wild‐type. Incorporation of the selenium was confirmed by electrospray ionization mass spectrometry.

The primer/template DNA consisted of 5′‐GACCACGGCGC‐3′ for the primer and 5′‐AAAGGGCGCCGTGGTC‐3′ for the template. The duplex was formed by mixing these two oligonucleotides (3.0 mM) in a 1:1 molar ratio in a buffer containing 10 mM Tris‐HCl pH 8.0, 10 mM NaCl and 5 mM MgCl2, and was annealed using standard procedures. The ternary complex of Klentaq1 was formed by reacting a mixture of the protein, the primer/template DNA and the ddCTP in a molar ratio of 1:7.5:50 and in 20 mM MgCl2.

Wild‐type crystals were grown by vapor diffusion using the hanging drop method against a reservoir containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate and 10% (w/v) PEG4000 at 20°C. Sizable crystals (0.7×0.15×0.15 mm) appeared within 1 week. After cryoprotecting progressively (15 min total) to a final solution containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate, 20% (v/v) glycerol and 22.5% (w/v) PEG4000, the crystals were flash‐frozen to liquid nitrogen temperature. The crystals diffracted to 2.3 Å in the laboratory setting (Rigaku Raxis IV image plate mounted on a Rigaku RU200 rotating anode X‐ray generator). Crystals were in space group P3₁21 with unit cell dimensions a = b = 108.3 Å and c = 90.4 Å, and one complex in the asymmetric unit. A complete data set to a resolution of 2.3 Å (Table II) was collected using a single crystal with an oscillation range of 1.5° and exposure time of 45 min/frame. These crystals corresponded to the closed form of the ternary Klentaq1 complex.
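The unit cell dimensions and the statement of one complex per asymmetric unit allow a rough packing estimate. The Python sketch below computes the cell volume for a cell on hexagonal axes (a = b, γ = 120°) and a Matthews coefficient. It is a back-of-the-envelope illustration, not part of the original analysis: the mass of the complex is an assumption (a protein mass of about 62.5 kDa implied by the concentrations quoted above, plus 27 nucleotides at an assumed average of 330 Da each), and the six symmetry-related positions are those of the general position in this space group.

```python
import math


def hexagonal_axes_cell_volume(a: float, c: float) -> float:
    """Cell volume (Å^3) for a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume: float, copies_per_cell: int, mass_da: float) -> float:
    """Crystal volume per unit of molecular mass (Å^3/Da)."""
    return cell_volume / (copies_per_cell * mass_da)


A_AXIS, C_AXIS = 108.3, 90.4        # Å, from the text
SYMMETRY_POSITIONS = 6              # general positions in the trigonal space group
COMPLEXES_PER_ASU = 1               # from the text
# Assumption: ~62.5 kDa protein plus an 11-mer primer and 16-mer template
# at an assumed average of 330 Da per nucleotide.
ASSUMED_COMPLEX_MASS_DA = 62_500 + 27 * 330

volume = hexagonal_axes_cell_volume(A_AXIS, C_AXIS)
vm = matthews_coefficient(volume, SYMMETRY_POSITIONS * COMPLEXES_PER_ASU,
                          ASSUMED_COMPLEX_MASS_DA)
print(f"cell volume: {volume:.0f} Å^3")
print(f"Matthews coefficient (assumed mass): {vm:.2f} Å^3/Da")
```

Because the complex mass is only an estimate, the resulting coefficient should be read as an order-of-magnitude plausibility check on the stated asymmetric unit content.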

Table 2. Data collection

Selenomethionine‐derived crystals were grown as described for wild‐type. However, crystals were incubated for 5 days in a stabilizing solution containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate and 15% (w/v) PEG4000 prior to being transported to the Advanced Photon Source (APS) at Argonne National Laboratory. Just before data collection [at the Structural Biology Center (SBC) beamline], these crystals were cryoprotected as described above and flash‐frozen. The selenomethionine‐derived crystals were in the same space group with the same unit‐cell dimensions as wild‐type. MAD data were collected to a resolution of 2.3 Å at four wavelengths (details below) using a single crystal. The oscillation range was 1.5° and the exposure time was 8 s/frame. These crystals corresponded to the open form of the ternary Klentaq1 complex.

Binary complex crystals were derived from closed ternary complex crystals incubated for 30 days in the stabilizing solution described above. A complete native data set to a resolution of 2.5 Å was collected at Stanford Synchrotron Radiation Laboratory (SSRL; beamline 7.1) using a single frozen crystal (1.5° oscillation range, 20 s exposure).

All data were processed and reduced using DENZO and SCALEPACK (Otwinowski, 1993).

Structure determination

The first attempt to solve the structures of each of the Klentaq1 complexes was carried out by molecular replacement (MR) (Rossmann, 1972). The structure of the apo‐form of Klentaq1 (Korolev et al., 1995) was used as a search model and the MR method was implemented using the program AMoRe (Navaza, 1994). Both the rotation and translation functions showed distinct peaks which could have constituted possible solutions for the MR problem. However, incorporating these models in refinement (program XPLOR, Brünger, 1992a) did not result in a significant decrease in either the R‐factor or the free R‐factor.

Multiwavelength anomalous diffraction (MAD) data were collected from a single selenomethionine‐substituted crystal (Hendrickson et al., 1990). The crystal was oriented such that Bijvoet pairs could be collected on the same frame. Scaling of data sets and anomalous pairs was performed using local scaling (program HEAVYv4.5, Terwilliger and Eisenberg, 1983). The anomalous differences recorded at the wavelength where f″ is maximum (SeMet‐3 in Table II) were first used in an anomalous difference Patterson synthesis (program HASSP, Terwilliger et al., 1987) which yielded three strong peaks. These selenium positions matched the sulfur positions of three methionines in the MR model obtained using the SeMet‐3 data, and therefore were used to generate phases solely based on the MAD data (program HEAVYv4.5). The resulting phases were used in a difference Fourier against the same anomalous differences to generate six additional selenium positions. The validity of the selenium positions was confirmed using difference Fourier techniques by systematically omitting each of the nine positions. These positions were subsequently used to generate MAD phases using the resolution range 30.0‐2.7 Å (program SHARP, De La Fortelle and Bricogne, 1997). Values for f′ and f″ were from Hall et al. (1995). Subsequent solvent flattening and phase extension to a resolution of 2.3 Å (program SOLOMON, Abrahams and Leslie, 1996) resulted in an electron‐density map which was of excellent quality (Figure 1) in the N‐terminal, palm and thumb domains, but of poor to moderate quality in the fingers domain.

An apo‐Klentaq1 model was adjusted in the electron density using the program O (Jones et al., 1991). During model building, partial models consisting of the N‐terminal, palm and thumb domains were used as external phase information and combined with the MAD phases, resulting in improved electron‐density maps in the fingers region which could then be built at least partially (see Refinement below). At this stage, it was clear that the configuration of the fingers was similar to that observed in the apo form of the enzyme, i.e. open (Korolev et al., 1995).

The phases calculated from a partial model where the fingers domain was omitted were also used in the calculation of an electron density map using |Fo‐Fc| coefficients, where Fcs were the structure factor amplitudes calculated from the partial model, and Fos were the amplitude data collected using the native ternary (closed) complex crystal (Table II). Remarkably, this electron‐density map clearly showed density for the fingers domain which, once modeled, positioned the fingers domain in a configuration similar to that observed in the T7 polymerase quaternary complex structure, i.e. closed (Doublié et al., 1998).

In both forms of the ternary complex, electron density for the DNA was clearly interpretable (Figure 1) and a model could be built readily. Electron density for the entire ddCTP was well defined only in the closed form. For the open form, experimental and omit electron‐density maps showed clear density only for the base and the ribose of the ddCTP, whereas electron density for the triphosphate moiety was patchy with density only for the α‐ and γ‐phosphates. This could be rationalized by the absence of the O helix to stabilize the phosphate moiety, whereas the remaining contacts such as stacking contacts and interactions with Tyr671 (Figure 4) may have been sufficient to stabilize the base. Building of the two metal ions in the closed form relied on the excellent quality of the electron density in this region and on the optimal coordination sphere observed with the surrounding protein and ddCTP atoms. In the open form of the ternary complex, metal ion B only was built in density and was found to coincide with the same metal ion in the closed form of the ternary complex. Metal ion A could not be identified unambiguously in this form.

The open binary Klentaq1‐DNA complex structure was determined by directly evaluating the open ternary complex model (minus the ddCTP) against the native binary complex data. Rigid body refinement (program XPLOR) was instrumental in finding the correct orientation of the model in the asymmetric unit. Simulated annealing omit maps for the region around the reacted 3′ end primer base (ddC) demonstrated the absence of electron density for the ddCTP or the metal ions.

Refinement

Both conjugate gradient minimization and simulated annealing in cartesian space were used during refinement (program XPLOR, Brünger et al., 1987). Progress was assessed by monitoring the free R‐factor (Brünger, 1992b). Individual B‐factor refinement was used only at the later stages of the process, resulting in tightly restrained B‐factors (Table I).
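For readers less familiar with the statistics monitored here, the crystallographic R‐factor is conventionally defined as R = Σ||Fo| − |Fc|| / Σ|Fo|, and the free R‐factor is the same quantity computed over a test set of reflections excluded from refinement. The Python sketch below illustrates these definitions on made-up amplitudes. It is not the XPLOR implementation: it assumes the calculated amplitudes are already on the same scale as the observed ones, applies no bulk solvent correction, and all function names and data values are hypothetical.

```python
def r_factor(f_obs, f_calc, sigmas=None, sigma_cutoff=None):
    """Return sum(||Fo| - |Fc||) / sum(|Fo|) over the accepted reflections."""
    numerator = 0.0
    denominator = 0.0
    for i, (fo, fc) in enumerate(zip(f_obs, f_calc)):
        # Optional |F|/sigma|F| filter: reflections at or below the cutoff are skipped.
        if sigmas is not None and sigma_cutoff is not None:
            if abs(fo) / sigmas[i] <= sigma_cutoff:
                continue
        numerator += abs(abs(fo) - abs(fc))
        denominator += abs(fo)
    if denominator == 0.0:
        raise ValueError("no reflections passed the cutoff")
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free_flags and return (R_work, R_free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test])
    return r_work, r_free


# Hypothetical amplitudes, for illustration only.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 84.0, 26.0]
free_flags = [False, False, False, True]
r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Keeping the test set out of refinement is what makes the free R‐factor a useful guard against overfitting: a model change that lowers R but not free R is unlikely to be a genuine improvement.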

The atomic model of the open binary complex was refined using the ‘native (binary)’ data (program XPLOR, Table II). After bulk solvent correction, the refinement of this complex converged to a final R‐factor of 22.7% with an R‐free of 29.8% (30.0‐2.5 Å resolution range; |F|/σ|F| > 2.0). The model for this structure contains protein residues from 294 to 832 with no interruption, 13 template nucleotides, 11 primer nucleotides and one incorporated ddCMP, and 103 well‐defined water molecules. Electron density was poor for 29 side chains at the surface of the protein, which were built as Ala.

The atomic model of the open ternary complex was refined using the ‘SeMet‐2’ data, whereas the model for the closed ternary complex was refined against the ‘native (ternary)’ data (program XPLOR; Table II). After bulk solvent correction, the refinement converged to a final R‐factor of 21.8% with an R‐free of 27.5% for the closed form, and to a final R‐factor of 22.4% with an R‐free of 28.8% for the open form (30.0‐2.3 Å resolution range; |F|/σ|F| > 2.0). The closed form model contains protein residues from 293 to 831 with no interruption, 14 template nucleotides, 11 primer nucleotides, one incorporated ddCMP, one ddCTP, two Mg metal ions and 149 well‐defined water molecules. Side chains for 26 residues, mostly solvent exposed, were not well defined in density and consequently were built as Ala. The open ternary complex model contains protein residues from 295 to 832, 13 template nucleotides, 11 primer nucleotides, one incorporated ddCMP, one ddCTP, one Mg metal ion and 150 well‐defined water molecules. The occupancy for the ddCTP in this complex was set to 0.3 in the final refined structure, resulting in B values for the entire ddCTP similar to the B values of the surrounding model atoms. Electron density for side chains in the fingers region between residues 636 and 699 was poor, and these residues were built as Ala, except for Phe667, Tyr671 and Met673, which were ordered. The open ternary complex model is also interrupted between residues 645 and 654. Twenty‐one additional residues at the protein surface were built as Ala.

Only one residue (His784) in all three structures has (psi, phi) angles in the disallowed region of the Ramachandran plot (Ramachandran and Sasisekharan, 1968). This residue is near the active site and a similar conformation has been observed for the corresponding residue in other polymerase structures (Doublié et al., 1998; Kiefer et al., 1998). The coordinates of the three structures reported here have been deposited (PDB entry codes 2KTQ for the open ternary complex, 3KTQ for the closed ternary complex and 4KTQ for the binary Klentaq1‐DNA complex).
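The backbone angles referred to above are torsion angles defined by four consecutive atoms: phi by C(i−1), N, Cα and C, and psi by N, Cα, C and N(i+1). As a minimal illustration of how such an angle is obtained from atomic coordinates, the Python sketch below computes a signed dihedral from four points. The helper names and the coordinates are hypothetical, and the sketch does not attempt to classify angles into allowed or disallowed Ramachandran regions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond for atoms p0..p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)   # normal to the plane of p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so the answer is easy to verify by eye.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a real residue, the four points would be the coordinates of the backbone atoms listed above, taken from the deposited model.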

Acknowledgements

We thank J.Kuriyan, K.Johnson, J.Majors, F.S.Mathews and T.M.Lohman for comments on the manuscript; K.Fütterer for help in data collection and phasing; D.Mosbaugh for discussions; T.Ellenberger for the coordinates of the quaternary complex of T7 DNA polymerase; and the staff at the Structural Biology Center beamline of the Advanced Photon Source (APS) at Argonne National Laboratory and the staff of beamline 7.1 at the Stanford Synchrotron Radiation Laboratory for assistance during synchrotron data collection. This work was supported by National Institutes of Health grant GM54033.

References