The non‐ribosomal synthesis of the cyclic peptide antibiotic gramicidin S is accomplished by two large multifunctional enzymes, the peptide synthetases 1 and 2. The enzyme complex contains five conserved subunits of ∼60 kDa which carry out ATP‐dependent activation of specific amino acids and share extensive regions of sequence similarity with adenylating enzymes such as firefly luciferases and acyl‐CoA ligases. We have determined the crystal structure of the N‐terminal adenylation subunit in a complex with AMP and L‐phenylalanine to 1.9 Å resolution. The 556 amino acid residue fragment is folded into two domains with the active site situated at their interface. Each domain of the enzyme has a similar topology to the corresponding domain of unliganded firefly luciferase, but a remarkable relative domain rotation of 94° occurs. This conformation places the absolutely conserved Lys517 in a position to form electrostatic interactions with both ligands. The AMP is bound with the phosphate moiety interacting with Lys517 and the hydroxyl groups of the ribose forming hydrogen bonds with Asp413. The phenylalanine substrate binds in a hydrophobic pocket with the carboxylate group interacting with Lys517 and the α‐amino group with Asp235. The structure reveals the role of the invariant residues within the superfamily of adenylate‐forming enzymes and indicates a conserved mechanism of nucleotide binding and substrate activation.
A number of oligopeptides, some of which have important medical and biotechnological applications, are produced by fungi and bacteria via a non‐ribosomal mechanism. Peptides such as the cyclic gramicidin S and cyclosporin A, the lactone actinomycin, the branched bacitracin and the linear precursor of both penicillin and cephalosporin, are synthesized by large multifunctional enzymes which act as protein templates for the growing polypeptide chain. Peptide synthetases catalyse the repetitive activation and condensation of the constituent amino acids to yield the peptide product. Each amino acid is activated by adenylation of its carboxylate group with ATP and then transferred to the thiol group of an enzyme‐bound phosphopantetheine cofactor for possible modification and the elongation reaction (Stachelhaus and Marahiel, 1995a; Kleinkauf and von Döhren, 1996).
The cloning and sequencing of several peptide synthetase genes have revealed a conserved and ordered modular organization. Each module encodes a functional building unit containing ∼1000 amino acids, which specifically recognizes a single amino acid. Within such a protein template‐directed peptide biosynthesis, the occurrence and specific order of the modules in the genomic DNA dictate the number and sequence of the amino acids to be incorporated into the resulting oligopeptide. The modular arrangement of peptide synthetases closely parallels the multienzyme complexes responsible for the biogenesis of fatty acids and of the polyketide family of natural products. Furthermore, peptide synthetases, fatty‐acid synthetases and polyketide synthetases all use enzyme‐bound phosphopantetheine cofactors as acyl carriers, in a thiotemplate mechanism first proposed by Lipmann more than 20 years ago (Lipmann, 1971) and revised recently (Stein et al., 1996).
In particular, the synthesis of the cyclic antibiotic gramicidin S has been studied in detail. Gramicidin S is produced by the Gram‐positive bacterium Bacillus brevis and consists of two identical pentapeptides joined head to tail. It is synthesized by the multienzyme complex gramicidin S synthetase, which is encoded by the 19 kb grs operon that includes the genes grsA, grsB and grsT. The grsT gene, which is located at the 5′‐end of the grs operon, encodes a 29 kDa protein homologous to fatty‐acid thioesterases. The grsA gene product, gramicidin S synthetase 1 (GrsA) is a protein composed of 1098 amino acids (Hori et al., 1989; Krätzschmar et al., 1989). GrsA activates l‐phenylalanine to the corresponding acyl‐adenylate and catalyses the inversion of configuration of the amino acid. d‐phenylalanine is then transferred to the grsB gene product, gramicidin S synthetase 2 (GrsB), a 510 kDa polypeptide chain which sequentially activates proline, valine, ornithine and leucine and forms the peptide bonds in the elongation reaction, releasing the decapeptide (d–Phe‐Pro‐Val‐Orn‐Leu)2 after cyclization.
Each of the five modules in which the grs operon is organized encodes highly conserved functional subunits. The major one is a 60 kDa fragment which recognizes a specific amino acid and catalyses the adenylation of the amino acid carboxylate group with the α‐phosphate of ATP. This adenylation subunit is not only conserved within all known peptide synthetases, but also shares extensive sequence similarity with firefly luciferases and acyl‐CoA ligases. Common to all these enzymes is the ATP‐dependent activation of substrates as acyl adenylates. On the other hand, the adenylation subunit shares no sequence homology with enzymes involved in the ribosomal synthesis of polypeptides, despite the fact that the formation of aminoacyl‐adenylates is chemically analogous in the two systems. Indeed, the crystal structure of firefly luciferase has indicated a structural framework unrelated to those of both class I and class II aminoacyl‐tRNA synthetases (Conti et al., 1996).
We report here the crystal structure of the phenylalanine‐activating subunit of gramicidin synthetase 1 (PheA) in a ternary complex with phenylalanine and AMP. The structure reveals the role of residues which are highly conserved in the superfamily of adenylate‐forming enzymes. In addition, the presence of the substrate provides details of the amino acid specificity and allows a sequence‐based comparison to be made with other peptide synthetases. A comparison of the structure with that of unliganded firefly luciferase reveals that both a domain rotation and a conformational change of a loop in the N‐terminal domain must occur for luciferase to form an active complex with luciferin and ATP.
Results and discussion
Crystal structure determination
The crystal structure of PheA was determined by the multiple isomorphous replacement method, together with real‐space non‐crystallographic symmetry averaging and refined against 1.9 Å resolution diffraction data to a crystallographic R‐factor and R‐free (Brünger, 1992) of 21.4% and 24.6% respectively. The model for 512 residues has good stereochemistry and includes phenylalanine and AMP bound at the active site. No interpretable electron density is present for the 16 N‐terminal residues, the 33 C‐terminal residues, nor for a loop containing residues 192–196. The two copies of the molecule in the asymmetric unit have a very similar conformation: after superposition the r.m.s. difference in the position of the main chain atoms of residues 21–530 is 0.26 Å.
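The r.m.s. difference quoted above for the two copies in the asymmetric unit can be illustrated with a minimal sketch. It assumes that the two sets of main chain atoms have already been superposed and paired one to one; the function name `rmsd`, the optional `cutoff` argument and the toy coordinates are illustrative assumptions, not part of the original analysis or of any crystallographic package.

```python
import math

def rmsd(coords_a, coords_b, cutoff=None):
    """Root-mean-square distance between paired, already-superposed atoms.

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples in Å.
    cutoff: if given, only pairs closer than this distance (Å) are counted.
    Returns (rmsd_in_angstrom, number_of_pairs_used).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be paired one-to-one")
    squared = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
        if cutoff is None or d2 < cutoff ** 2:
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs within the cut-off")
    return math.sqrt(sum(squared) / len(squared)), len(squared)

# Toy coordinates (hypothetical, for illustration only).
copy_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
copy_2 = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
value, n_pairs = rmsd(copy_1, copy_2)
print(f"r.m.s. difference = {value:.2f} Å over {n_pairs} atom pairs")
```

The optional distance cut-off corresponds to the kind of filter used when only structurally equivalent atoms, for example Cα pairs separated by less than 3 Å, are included in a comparison between more distantly related structures.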
Description of the overall structure
The polypeptide chain folds into two compact domains (Figure 1). There are very few direct protein–protein interdomain contacts and instead the interactions between the structural domains are mediated by a network of hydrogen bonds between the side chains of the protein and a sandwiched layer of ordered water molecules. The much larger N‐terminal domain comprising residues 17–428 contains three subdomains: a distorted β‐barrel and two β‐sheets which pack together to form a five‐layered αβαβα tertiary structure (Figure 2). Subdomain A contains a six‐stranded β‐sheet and three helices formed by a single segment of the polypeptide chain (residues 91–203) while a seventh strand is formed by an insertion in the β‐barrel subdomain (Figure 3). The β‐sheet B contains eight strands, of which the first two (B1–B2) are formed by residues occurring before β‐sheet A in the polypeptide chain, while the remaining six strands (B3–B8) and four helices form a contiguous polypeptide segment located before the β‐barrel subdomain in the sequence. Strands 1–6 in the two β‐sheets share a similar topology, with strands A1–A4 in sheet A corresponding to strands B3–B6 in sheet B while strands A5–A6 correspond to strands B1–B2.
The C‐terminal domain (residues 429–530) includes two helices which pack against one side of a three‐stranded antiparallel β‐sheet E as well as an additional small sheet containing two β‐strands. The polypeptide chain at the C‐terminus of the protein loops back towards the N‐terminal domain and then packs against the remaining face of β sheet E (Figure 1). Residues at both the N‐ and C‐termini of the polypeptide chain project out from the surface of the molecule and are relatively less well ordered.
The crystal structure shows unambiguous electron density for the ligands bound at the active site (Figure 4). In spite of the presence of Mg‐ATP and phenylalanine in the crystallization conditions, the electron density is not consistent with the product of the activation reaction: the phenylalanyl adenylate has been hydrolysed to the corresponding amino acid and AMP. Substrate recognition is accomplished by an extensive network of hydrogen bonds with a number of charged or polar amino acid residues. Most of the protein residues involved in substrate recognition are contributed by the large N‐terminal domain. However, it is a charged residue of the C‐terminal domain, the strictly invariant Lys517, which is involved in two key polar interactions with both the amino acid and the adenosine, fixing their position in the active site and clamping the C‐terminal domain in a productive orientation. The lysine residue is located at the bottom of the large loop that projects down into the active site from the C‐terminal domain (Figure 1). The key role played by this residue has been demonstrated by site‐directed mutagenesis studies where the replacement of the corresponding lysine by a glutamine in the valine‐activating domain of surfactin synthetase 1 results in the reduction of the reverse rate of adenylate formation by 94% (Hamoen et al., 1995).
The adenylate is bound in a cleft present on the surface of the large N‐terminal domain, between residues from β‐sheet B and the β‐barrel subdomains. A stereo diagram of the adenylate binding site is shown in Figure 5, while Figure 6 shows the hydrogen bonding interactions. The adenine moiety lies in a slot sandwiched between the side chains of Tyr323, Tyr425 and Ile348 on one side and the main chain atoms of residues 302–304 on the other. The binding of the base is mediated not only by the large area of hydrophobic and van der Waals interactions on the sides of the slot, but also by the hydrogen bonds of the N6 amino group with the main chain carbonyl oxygen of Ala322 and the side chain oxygen of Asn321. Hydrogen bonding to the exocyclic nitrogen of the adenine is the major specificity determinant by which the enzyme discriminates against guanine. No other ring nitrogen is in direct contact with the protein; of the possible hydrogen bonding interactions with the acceptor groups at positions 1, 3 and 7 of the purine ring, only N1 contacts a well‐ordered water molecule (B = 22 Å2) which is in turn at 3.0 Å from the side chain nitrogen of Asn321. This pattern of interactions accounts for the catalytic activity displayed by the peptide synthetase in the presence of ATP analogues such as 7‐deaza‐ATP (Pavela‐Vrancic et al., 1994).
The ribose moiety is held in the C3′‐endo conformation. The two hydroxyls of the sugar are involved in hydrogen‐bonding interactions with the carboxylate of Asp413, which is a strictly invariant residue within the superfamily of adenylate‐forming enzymes. The 2′ hydroxyl also forms a rather long (3.2 Å) hydrogen bond with the side chain of Tyr425 while Tyr323, which packs edge‐on against the adenine ring, also hydrogen bonds to Asp413. In site‐directed mutagenesis studies on the highly homologous tyrocidine synthetase 1, Gocht and Marahiel (1994) find that the replacement of the invariant aspartate by an asparagine residue reduces the ATP‐PPi activity to 78% of that of the wild‐type enzyme, while the substitution of a serine residue at this position reduces the activity to just 12%. Gramicidin synthetase 1 shows a much higher activity when ATP is replaced by 2′‐deoxy‐ATP (a 40% reduction) than when 3′‐deoxy‐ATP (85% reduction) is used in the ATP‐PPi exchange reaction (Pavela‐Vrancic et al., 1994). Although believed to be poor acceptors (Moodie et al., 1996), both the ribose O‐4′ and O‐5′ are hydrogen‐bonded to the invariant Lys517.
The α‐phosphate of AMP has slightly weaker electron density, indicating that the binding site is more disordered. Interaction is with Thr326, a highly conserved residue in the superfamily of adenylate‐forming enzymes, Thr190, also well conserved although replaced by a serine in the luciferases, together with the invariant Glu327. Glu327 points towards O1 of the phosphate and is bridged presumably by a magnesium ion, at 2.54 Å from the carboxylate and 2.26 Å from the phosphate.
The amino acid binding site of the peptide synthetase is a pocket with an entrance on the concave surface of the large domain near the intersection of the three subdomains. A stereo representation of the phenylalanine binding site of PheA is shown in Figure 7. The side chain of Asp235 and the main chain carbonyl oxygens of Gly324 and Ile330 are well placed to form hydrogen bonds with the α‐amino group of the phenylalanine substrate. The aspartic acid is conserved in all peptide synthetases apart from the l‐α‐aminoadipate activating domain of ACV synthetase. In the PheA structure, Ile330 has dihedral angles (φ = 74°, ψ = −64°) outside the allowed regions of a Ramachandran plot (Ramakrishnan and Ramachandran, 1965). As is commonly observed in protein structures, the energetically unfavourable main chain dihedral angle is associated with a region of the molecule having a functional role (Herzberg and Moult, 1991). The α‐carboxylate group of the substrate amino acid is stabilized by an electrostatic interaction with the invariant Lys517 from the C‐terminal domain.
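The main chain dihedral angles quoted for Ile330 are torsion angles defined by four consecutive backbone atoms: φ by C(i−1), N(i), Cα(i) and C(i), and ψ by N(i), Cα(i), C(i) and N(i+1). The following is a minimal sketch of how such an angle can be computed from atomic coordinates. The function name `dihedral` and the test coordinates are illustrative assumptions; the coordinates are not taken from the PheA model.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond.

    Positive values correspond to a clockwise rotation of the p2-p3 bond
    relative to the p0-p1 bond when looking from p1 towards p2.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal to the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal to the plane p1, p2, p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Four hypothetical points arranged to give a right-angle torsion.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the backbone atoms of each residue, the resulting (φ, ψ) pairs are what is plotted in a Ramachandran diagram.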
The specificity pocket for the phenylalanine side chain is surrounded by residues from the strands and a helix associated with β‐sheet B (Figures 2 and 7). The pocket is lined at the bottom by the indole ring of Trp239, on one side by Ala236, Ile330 and Cys331 and on the opposite side by Ala322, Ala301 and Thr278. The two sides of the pocket are appropriately separated to accommodate an aromatic residue but at one end of the pocket (towards the viewer in Figure 7) there is a water‐filled channel that connects with the solvent.
The phenylalanine binding pocket can accommodate both stereoisomers of the amino acid with no significant change in the protein conformation. In the 2.0 Å refined crystal structure of a ternary complex containing d‐phenylalanine and AMP, the polar interactions between the ligand and the protein are identical to those in the l‐phenylalanyl complex but the benzene ring of the side chain is rotated by 30° about an axis perpendicular to the plane of the ring. This rotation leads to a 1.3 Å displacement in the relative position of the Cβ atoms, but the Cα atoms are within 0.5 Å of one another and the oxygen atoms that interact with the side chain of Lys517 are within 0.26 Å.
Over 50 sequences are now known for the amino acid activating modules of peptide synthetases and although the enzymes differ in their substrate specificity, they show extensive regions of sequence similarity to PheA. Of the modules listed in Table I, the percentage of identical residues ranges from 26% for module 3 of the HC‐toxin synthetase to 56% for the phenylalanine‐activating module of tyrocidine synthetase. With this level of sequence identity, the main chain conformation of the enzymes is likely to be very similar and the differing substrate specificities will be mainly determined by the nature of the amino acids lining the substrate binding pocket. Table I lists these amino acids for several different peptide synthetases. A number of enzymes contain charged residues near the binding pocket and for the activation of substrates with charged side chains such as ornithine, aspartate and glutamate, there are protein side chains at either position 239 or 278 with opposite charge. The first module of ACV synthetase adenylates the δ‐carboxylate rather than the α‐carboxylate of the l‐α‐aminoadipate side chain, and it is possible that the α‐amino and α‐carboxylate groups of the substrate bind at the bottom of the pocket and interact with the arginine at position 239 and the glutamate at position 322. This mode of binding could explain the absence of an aspartate residue at position 235 to interact with the α‐amino group of the amino acid substrate. It can also be seen from Table I that charged residues are sometimes present near the pocket in enzymes that activate amino acids with neutral side chains, while synthetase modules that are specific for the same substrate can have different amino acids around the substrate binding pocket.
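As an illustration of how a percentage of identical residues can be derived from a pairwise alignment, the sketch below counts matching positions in two pre-aligned sequences. It is an assumption-laden example: the convention of ignoring gapped positions is only one of several in use and is not necessarily the one applied for Table I, and the second sequence in the usage example is a hypothetical variant of the conserved motif, not a real synthetase sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped aligned positions.

    aligned_a, aligned_b: equal-length strings from a pairwise alignment.
    Positions where either sequence has a gap are skipped (an assumed
    convention; other denominators are also in common use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared

# Conserved motif of PheA against a hypothetical variant (illustration only).
print(percent_identity("TSGTTGNPKG", "TSGSTGLPKG"))
```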
Comparison with the structure of firefly luciferase
As was anticipated from the 16% sequence identity and the presence of short highly conserved amino acid motifs equidistantly separated in the two sequences (Figure 8), the topology of each of the structural domains of PheA is very similar to that observed in the crystal structure of the unliganded firefly luciferase (Conti et al., 1996). PheA is slightly smaller, is missing an extra strand in sheet A, an α‐helix near the C‐terminus and, where the luciferase enzyme has a β‐strand at the N‐terminus of the molecule, PheA has an α‐helix. The most striking difference between the two crystal structures is the relative orientation of the N‐ and C‐terminal domains. When compared with the luciferase structure, the C‐terminal domain of PheA is rotated by 94° relative to the N‐terminal domain and is 5 Å closer to it (Figure 9). The loops 436–440 and 524–528 (luciferase numbering) near the domain interface which are disordered in luciferase have well‐defined electron density in the PheA crystal structure. Optimal superposition of the N‐terminal domains of the two enzymes results in an r.m.s. separation of 1.5 Å for the 287 pairs of equivalent Cα atoms with a separation <3 Å (Figures 8 and 9), while superposition of the C‐terminal domains gives an r.m.s. separation of 1.42 Å for the 77 Cα atoms within 3 Å. The structure of PheA in a binary complex with Mg‐AMP‐PNP in which the PNP moiety is disordered (data not shown) reveals a similar domain orientation to the one observed for the ternary complex, indicating that the presence of the amino acid is not required to obtain this productive conformation of the protein.
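A relative domain rotation of this kind is commonly expressed as the rotation angle of the 3×3 matrix that superposes one pair of domains once the other pair has been brought into coincidence. The sketch below shows only the last step, extracting the angle from a rotation matrix through its trace; how the matrix itself was obtained for PheA and luciferase is not described here, and the helper `rotation_about_z` exists purely to generate a test matrix.

```python
import math

def rotation_angle_deg(r):
    """Rotation angle (0-180 degrees) of a proper 3x3 rotation matrix.

    Uses the identity trace(R) = 1 + 2*cos(theta).
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(angle_deg):
    """Hypothetical helper: rotation matrix about the z axis."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

# A matrix built for a 94 degree rotation returns the same angle.
print(round(rotation_angle_deg(rotation_about_z(94.0)), 3))
```

The angle alone does not describe the screw axis or the accompanying translation, so a figure such as the 5 Å approach of the two domains has to be derived separately.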
There is a major difference in the entrance to the cavity which in PheA is occupied by the amino acid ligand. Although most of the main chain atoms of the two enzymes superpose well around the active site, the loop containing residues 314–319 in firefly luciferase (residues 300–305 in PheA) has a different conformation and obstructs both the entrance to the large water‐filled luciferin binding pocket and the binding of the adenine ring of the nucleotide. A significant conformational change of this loop must occur in the firefly luciferase molecule to accommodate the binding of the ligands. The mutation of the highly conserved glycine residue in this loop (Gly302 in PheA) to a glutamic acid in the valine activating domain of GrsB results in the complete loss of PPi‐ATP exchange activity (Saito et al., 1995). The addition of a side chain at this position is likely to compromise the movement of the loop and cause a change in the conformation of the main chain as the dihedral angles for Gly302 (φ = 99°, ψ = −41°) are unfavourable for non‐glycine residues.
The most highly conserved sequence of amino acids in the superfamily of adenylate‐forming enzymes involves residues 190‐TSGTTGNPKG‐199 which in the PheA structure form a loop between β‐strands 5 and 6 in subdomain A. The absence of significant electron density for residues 192–196 implies that the central residues of the loop have conformational flexibility. The corresponding loop is also disordered in the crystal structure of the firefly luciferase. These residues are not involved in the binding of the AMP moiety; their position with respect to the AMP binding site suggests that they are likely to interact with the pyrophosphate leaving group (Figure 5). Glycine‐rich loops are often present in ATP‐ and GTP‐binding proteins and are generally found to form an anion hole which accommodates the phosphate of the nucleotide (Pai et al., 1989; Knighton et al., 1991). As discussed above, Thr190 at the beginning of the conserved peptide interacts with the α‐phosphate, while the side chain of Lys198 is poorly ordered and projects into solvent. The importance of Lys198 in the PPi‐ATP exchange reaction has been demonstrated by site‐directed mutagenesis of the corresponding lysine residue in tyrocidine synthetase 1 (Gocht and Marahiel, 1994). Both Lys198 and the invariant arginine at position 428 (Figure 5) are probably involved in coordinating the pyrophosphate group, while the lysine at position 517 is likely to stabilize the negatively charged pentavalent transition state in a manner analogous to that of the invariant arginine of class II aminoacyl‐tRNA synthetases. The arginine residue forms an ion pair with the α‐carboxylate moiety of the amino acid substrate but interacts with the α‐phosphate when the aminoacyl‐adenylate intermediate is formed (Onesti et al., 1995).
The knowledge of the residues involved in forming the substrate specificity pocket provided by this study combined with the wealth of amino acid sequence information available for other adenylate‐forming domains provides a structural basis for understanding the specificity of peptide synthetases. These findings should allow the manipulation of these enzymes for the synthesis of novel peptides with modified biological activity. Indeed, the introduction of alternate amino acid‐activating modules by recombinational integration in the Bacillus system has already demonstrated the possibility of biosynthesis of different antibiotics (Stachelhaus et al., 1995).
Materials and methods
Crystallization and data collection
The recombinant phenylalanine‐activating domain of gramicidin synthetase 1 (PheA) from B. brevis ATCC 999 was expressed in overproducing Escherichia coli cells and purified as described previously (Stachelhaus and Marahiel, 1995b). The PheA construct consists of residues 1–557 and a C‐terminal hexahistidine tag which was not cleaved after purification. The enzyme was co‐crystallized with Mg‐ATP, in the presence of either l‐ or d‐phenylalanine, under conditions identified by sparse matrix screening (Jancarik and Kim, 1991).
The crystals of PheA complexed with Mg‐ATP and l‐phenylalanine used for the structure determination were grown by vapour diffusion techniques. The protein was concentrated to 15 mg/ml in 10 mM Tris–HCl, pH 7.8, 50 mM NaCl, 15% glycerol, 2 mM ATP, 2 mM l‐phenylalanine and 4 mM MgCl2. The hanging drops contained equal volumes of the protein solution and of a reservoir solution consisting of 28–32% (w/v) methoxy polyethylene glycol (MePEG) 5000, 200 mM ammonium sulfate and 100 mM N‐(2‐acetamido)‐iminodiacetic acid (ADA) at pH 6.5. Equilibration at 18°C yielded crystals that grew as plates, reaching their maximum size (0.5×0.5×0.2 mm) in 2 or 3 weeks. They belong to the primitive monoclinic spacegroup P21, with cell dimensions a = 61.9 Å, b = 155.6 Å, c = 65.8 Å and β = 94°. The volume to mass ratio calculation (Matthews, 1968) suggests the presence of two molecules per asymmetric unit, resulting in a Vm of 2.5 Å3/Dalton and a solvent content of 50%.
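The volume to mass ratio can be reproduced approximately from the cell dimensions given above. The sketch below is a minimal worked example under stated assumptions: the molecular mass of about 63 kDa for the tagged construct is an estimate that is not quoted in the text, the factor 1.23 is the usual approximation for the solvent fraction (1 − 1.23/Vm), and the function names are illustrative.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (Å^3) of a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(cell_volume, n_symops, n_mol_asu, mol_mass_da):
    """Vm in Å^3/Da: cell volume divided by the total protein mass in the cell."""
    return cell_volume / (n_symops * n_mol_asu * mol_mass_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (standard 1.23/Vm estimate)."""
    return 1.0 - 1.23 / vm

# Cell dimensions from the text; P21 has two symmetry operators.
volume = monoclinic_cell_volume(61.9, 155.6, 65.8, 94.0)
# ASSUMPTION: about 63 kDa for residues 1-557 plus the hexahistidine tag.
vm = matthews_coefficient(volume, n_symops=2, n_mol_asu=2, mol_mass_da=63000.0)
print(f"Vm = {vm:.2f} Å^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```

With two molecules per asymmetric unit and the assumed mass, the calculation gives a Vm close to 2.5 Å3/Da and a solvent fraction of about one half, in line with the values quoted above. One molecule per asymmetric unit would double Vm to about 5 Å3/Da, which is unusually high for a protein crystal.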
All data collection was carried out at cryogenic temperatures. Crystals were frozen using standard techniques (Teng, 1990) in a stream of nitrogen gas at 100 K produced by an Oxford Cryosystem (Cosier and Glazer, 1986). Successful freezing of the PheA crystals was achieved by transferring them for <1 min into a cryoprotectant solution containing 20% MePEG 5000, 100 mM ADA at pH 6.5 and 30% glycerol. For heavy atom screening, the crystals were first stabilized in a harvesting solution consisting of 20% MePEG 5000, 200 mM ammonium sulfate, 100 mM ADA at pH 6.5 and 15% glycerol. Screening for heavy atom derivatives was carried out to low resolution on a MarResearch (Hamburg, Germany) image plate detector, using CuKα radiation from an Elliott GX21 rotating anode X‐ray generator. Higher‐resolution synchrotron data for the native and derivatives used for phasing were obtained at SRS (Daresbury, UK) and DESY (Hamburg, Germany). The 1.9 Å resolution data used in the refinement were measured on the wiggler BW7B line at DESY. Diffracted intensities were evaluated and integrated using a modified version of MOSFLM for processing image plate data (A.G.W.Leslie, personal communication) and the CCP4 suite was used for data reduction (CCP4, 1994). A summary of the data collection statistics is given in Table II.
The presence of two molecules in the asymmetric unit, suggested by the calculation of the Vm value, was confirmed by self‐rotation function studies which were carried out with the program POLARRFN (CCP4, 1994). An unambiguous peak which was six times as high as the r.m.s. of the map was identified and corresponded to a rotation of 180° around an axis in the ac plane. Attempts to solve the structure by the molecular replacement method using the firefly luciferase coordinates were unsuccessful. Phases were determined by the method of multiple isomorphous replacement using a uranyl acetate derivative, a mercury acetate derivative and a uranyl/mercury double derivative. Initial SIR phases were calculated from the positions of two uranyl sites which had been determined with the real‐space Patterson search program RSPS (CCP4, 1994). Additional heavy atom sites were located by difference Fourier methods. Refinement of heavy atom parameters and phase calculations were performed using the program MLPHARE (Otwinowski, 1991). The refinement of the heavy atom sites and occupancies of the double derivative was carried out independently to avoid artificial sharpening of the phase probability distributions. The final MIR phases had an overall figure of merit of 0.538 for data between 20 and 3.2 Å resolution. The phasing statistics are shown in Table II.
The MIR phases were improved by density modification procedures. After solvent flattening and solvent flipping (Abrahams and Leslie, 1996), the 3.2 Å phases were used to compute an electron density map which clearly showed the presence of two molecules in the asymmetric unit. The RAVE package (Kleywegt and Jones, 1994) was used for creating a protein envelope from the skeletonized electron density map for one molecule, and for subsequent mask manipulation and averaging protocols. The non‐crystallographic symmetry (ncs) operator was derived from the heavy atom positions and agreed with the results of the self‐rotation function calculations. The ncs operator was refined by real‐space density correlation with RAVE, to give a final density correlation coefficient of 0.71. Ten cycles of iterative averaging yielded an electron density map with many recognizable features resembling the luciferase structure.
The backbone of the luciferase molecule was used as a guide when building the model into the 3.2 Å averaged electron density map. A polyalanine model for 451 amino acid residues was traced by fitting fragments from a database of highly refined structures using the graphics program O (Jones and Thirup, 1986; Jones et al., 1991). The amino acid sequence could be fitted unambiguously to give an initial model which included 78% of the atoms.
The PheA model was refined with the program X‐PLOR (Brünger et al., 1987) against a 1.9 Å resolution data set which had not been used for MIR phasing due to lack of isomorphism with the derivative data sets. Low‐resolution data to 20 Å were included and a bulk solvent correction was applied throughout the refinement procedure. A random sample containing 5% of the data was excluded from the refinement and the agreement between calculated and observed structure factors for these reflections (R‐free) was used to monitor the course of the refinement (Brünger, 1992). The model was restrained with the Engh and Huber stereochemical parameters (Engh and Huber, 1991).
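The agreement statistics used to monitor refinement follow the standard definition R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, evaluated over the working reflections for the R‐factor and over the excluded test reflections for R‐free. The sketch below illustrates this definition and a random 5% partition. It is not the X‐PLOR implementation: it assumes that the calculated amplitudes are already on the same scale as the observed ones, and the function names, the seed and the toy amplitudes are illustrative assumptions.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired amplitudes.

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must be paired")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def select_free_set(n_reflections, fraction=0.05, seed=0):
    """Indices of a random test set excluded from refinement (for R-free)."""
    rng = random.Random(seed)
    n_free = round(n_reflections * fraction)
    return set(rng.sample(range(n_reflections), n_free))

def split_r(f_obs, f_calc, free_indices):
    """Return (R-work, R-free) given the indices of the test reflections."""
    work = [i for i in range(len(f_obs)) if i not in free_indices]
    free = sorted(free_indices)
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free

# Toy amplitudes (hypothetical values, for illustration only).
print(r_factor([100.0, 200.0, 300.0], [90.0, 210.0, 300.0]))
```

Because the test reflections never contribute to the refinement target, a gap between the two values that widens during refinement is the usual warning sign of over‐fitting.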
An initial round of rigid‐body refinement was carried out to 3.2 Å resolution, with each PheA molecule treated as a separate rigid body, giving a free R‐factor of 39.3%. A new matrix corresponding to the ncs operators was derived and strict non‐crystallographic 2‐fold symmetry was enforced until the later stages of refinement. The model was subjected to cycles of positional refinement to 3.2 Å resolution, followed by rounds of refinement in which the maximum resolution of the data was gradually increased. Beyond 2.4 Å resolution the refinement of the atomic positions was alternated with the refinement of the atomic temperature factors. After simulated annealing, the resulting molecular model had an R‐free of 32.9% and an R‐factor of 29.7% at 1.9 Å resolution. Manual rebuilding into an averaged electron density map allowed most gaps in the polypeptide chain to be closed and the electron density corresponding to the substrates AMP and phenylalanine could be easily located in the active site of the enzyme. Water molecules were added at geometrically reasonable positions at which the electron density was >4 σ in the averaged difference map. Subsequent refinement with ncs constraints and rebuilding in the averaged electron density map was performed until the free R‐factor had dropped to 29.9%. A round of rigid‐body refinement with each of the two molecules subdivided into two domains decreased the R‐free to 25.8% and subsequent positional and individual temperature factor refinement was carried out independently for the two molecules in the asymmetric unit. The last round of refinement was carried out including the 5% of the data previously omitted.
The refined model contains 512 amino acid residues and 284 water molecules per subunit. There is no interpretable electron density for the 16 N‐terminal and the 33 C‐terminal residues, as well as for the 192–196 loop. The crystallographic R‐factor is 21.3% for all reflections with no σ cut‐off between 20 Å and 1.9 Å resolution. The free R‐factor is 24.6% using a random sample of 3656 reflections out of the total 91142. The r.m.s. deviation from ideality is 0.008 Å in bond lengths and 1.37° in bond angles. The main chain dihedral angles for the majority of the residues in the refined model (90.1%) lie in the most favoured regions of the Ramachandran plot (Ramakrishnan and Ramachandran, 1965; Laskowski et al., 1993). Only two amino acid residues (Ile330 and Glu61) have energetically unfavourable dihedral angles, corresponding to disallowed regions in the Ramachandran plot. The average temperature factor for all protein atoms is 33 Å2. The refined B‐factors for the atoms of the substrates fall within the 17.6–30.7 Å2 range, but the α‐phosphate of AMP is modelled as only half occupied.
The refined coordinates of the ternary complex were used as a starting model in the refinement against a 2.0 Å resolution diffraction data set collected from a crystal grown in the presence of ATP and d‐phenylalanine (Table II). The resulting molecular model has a crystallographic R‐factor of 21.1% using data between 20.0 Å and 2.0 Å. The value of R‐free using a random sample containing 5% of the diffraction data is 24.5%. The r.m.s. deviations in bond lengths and bond angles from their target values are 0.007 Å and 1.25° respectively. The d‐phenylalanine substrate has low temperature factors (15–20 Å2) and the α‐phosphate of AMP has been modelled as being fully occupied. The structure of the protein in the complex containing d‐phenylalanine is very similar to that containing l‐phenylalanine: after superposition the r.m.s. difference in the positions of the main chain atoms is 0.19 Å.
We are very grateful to Silvia Onesti and Nicholas Franks for helpful discussions. We thank the staff at both the Synchrotron Radiation Source, Daresbury Laboratory and at DESY, Hamburg. We also thank the Deutsche Forschungsgemeinschaft, the EC (contract Nr. BIO4‐CT95‐01769) and Fonds der Chemischen Industrie for support. The European Union supported the work at EMBL Hamburg through the HCMP Access to Large Installations Project, Contract Number CHGE‐CT93‐0040. Figures 1, 2, 5, 7 and 9 were prepared using MOLSCRIPT (Kraulis, 1991).
- Copyright © 1997 European Molecular Biology Organization