The transsulfuration enzyme cystathionine γ‐synthase (CGS) catalyses the pyridoxal 5′‐phosphate (PLP)‐dependent γ‐replacement of O‐succinyl‐L‐homoserine and L‐cysteine, yielding L‐cystathionine. The crystal structure of the Escherichia coli enzyme has been solved by molecular replacement with the known structure of cystathionine β‐lyase (CBL), and refined at 1.5 Å resolution to a crystallographic R‐factor of 20.0%. The enzyme crystallizes as an α4 tetramer with the subunits related by non‐crystallographic 222 symmetry. The spatial fold of the subunits, with three functionally distinct domains and their quaternary arrangement, is similar to that of CBL. Previously proposed reaction mechanisms for CGS can be checked against the structural model, allowing interpretation of the catalytic and substrate‐binding functions of individual active site residues. Enzyme‐substrate models pinpoint specific residues responsible for the substrate specificity, in agreement with structural comparisons with CBL. Both steric and electrostatic designs of the active site seem to achieve proper substrate selection and productive orientation. Amino acid sequence and structural alignments of CGS and CBL suggest that differences in the substrate‐binding characteristics are responsible for the different reaction chemistries. Because CGS catalyses the only known PLP‐dependent replacement reaction at Cγ of certain amino acids, the results will help in our understanding of the chemical versatility of PLP.
The sulfur‐containing amino acids l‐cysteine, l‐homocysteine and l‐methionine are metabolically linked via the transsulfuration and reverse transsulfuration pathways. In transsulfuration, employing the sequential action of cystathionine γ‐synthase (CGS) and cystathionine β‐lyase (CBL), l‐cysteine condenses with activated l‐homoserine to form the intermediate l‐cystathionine (CTH), which subsequently is split asymmetrically into l‐homocysteine and pyruvate. Homocysteine may be methylated to yield l‐methionine. Different organisms seem to display distinct spectra of transsulfuration enzymes: plants and microbes employ only the forward pathway from cysteine to methionine, in mammals only the reverse transsulfuration is found, and fungi support transsulfuration in both directions.
CGS (EC 4.2.99.9), encoded by the metB gene, catalyses the first step in transsulfuration, which at the same time can be regarded as the first specific step in l‐methionine biosynthesis, i.e. a pyridoxal 5′‐phosphate (PLP)‐dependent γ‐replacement leading to the formation of CTH from a homoserine ester and l‐cysteine. CGS has been purified from a variety of bacteria and plants, including Salmonella typhimurium (Kaplan and Flavin, 1966b), Escherichia coli (Tran et al., 1983), Arabidopsis thaliana (Ravanel et al., 1998), wheat (Kreft et al., 1994) and spinach (Ravanel et al., 1995). While it has been found that the enzymes from prokaryotes and eukaryotes use different substrates, i.e. O‐succinyl‐l‐homoserine (OSHS) and O‐phospho‐l‐homoserine, respectively, in vitro, the plant enzymes also accept the non‐physiological esters, albeit with a 10‐fold higher Km (Datko et al., 1974; Ravanel et al., 1995). In all cases investigated, active CGS has been identified as a tetramer of identical or closely related 40‐50 kDa subunits with one PLP cofactor bound per monomer via a Schiff base linkage to an active site lysine (Kaplan and Flavin, 1966b). In the E.coli CGS, PLP is covalently linked to Lys198 (Martel et al., 1987) and gives rise to a strong absorption of the holoenzyme at 422 nm and a weaker absorption at 325 nm, originating from the ketoenamine and enolimine forms of the Schiff base, respectively. The sequences of the enzymes from different kingdoms are ∼30% identical, and the highest homologies are found for the ∼12 residues comprising the consensus PLP‐binding site (Fearon et al., 1982).
Steady‐state kinetics suggest that CGS employs a ping‐pong mechanism, commonly encountered in PLP‐dependent enzymes (Litzenberger Holbrook et al., 1990), with a slightly alkaline pH optimum (Kaplan and Flavin, 1966a; Litzenberger Holbrook et al., 1990). Besides the replacement, which is the only relevant reaction in vivo, CGS in the absence of l‐cysteine catalyses in vitro a γ‐elimination, breaking up OSHS into α‐ketobutyrate, succinate and ammonia, as well as a variety of other non‐physiological reactions (Guggenheim and Flavin, 1969b). Both chemical intuition (requirement for the elimination of the γ‐substituent in either route) and pre‐steady‐state kinetic analyses (absorbance maximum at 300 nm, appearing concomitantly with the bleaching of the 422 nm absorbance of the holoenzyme) suggest a common partitioning intermediate in the replacement and elimination reactions, presumably an α‐imino β‐γ‐unsaturated pyridoxamine derivative (Brzovic et al., 1990).
The enzymes involved in methionine biosynthesis are interesting targets for the development of antibiotics and herbicides. Our group has therefore set out to acquire high resolution structural data for these proteins in order to elucidate their reaction mechanisms and modes of interaction with inhibitors. Recently, the structure of E.coli CBL, the enzyme which catalyses the second step in the methionine pathway, has been solved at 1.82 Å resolution, both alone and in complex with the inhibitors aminoethoxy‐vinylglycine (AVG) and trifluoroalanine (Clausen et al., 1996, 1997a). CGS exhibits 37% similarity (29% amino acid identity) to CBL, and a common evolutionary origin for the two enzymes has been suggested (Belfaiza et al., 1986). We subsequently have been able to purify recombinant E.coli CGS (386 residues, 40 kDa) in large amounts and have obtained well diffracting single crystals (Wahl et al., 1997). The initial analysis of the diffraction data of one crystal form (space group P1) indicated the presence of two independent CGS tetramers per asymmetric unit, both of which exhibited local 222 symmetry. Herein, we describe the crystal structure of the native enzyme in a newly discovered crystal form with space group C2, refined at 1.5 Å resolution, and discuss its implications for the specificity and mode of action of CGS.
Results and discussion
Quality of the model
Because the structures of the enzyme in the two different space groups are almost identical, we refer only to the higher resolution C2 structure in this and the following discussions. Accordingly, the refined model comprises one CGS homotetramer, four covalently bound PLP molecules, 1301 water oxygens and one 2‐methyl‐2,4‐pentanediol (MPD) molecule, and maintains small deviations from standard geometry (Table I). The final crystallographic R‐factor and free R‐factor (Rfree) are 20.0 and 25.5%, respectively, for the diffraction data between 8.0 and 1.5 Å. The mean positional error of the atoms as determined by a Luzzati plot (Luzzati, 1952) is 0.19 Å. Especially for the β‐strands, the internal side chains and the region around the cofactor, the error is likely to be lower. The main chain dihedral angles of all non‐glycine residues are within energetically allowed regions of the Ramachandran plot (Ramachandran and Sasisekharan, 1968), except for two serine residues (Ser326 and Ser178); 89.7% of the amino acids lie in the most favoured area of this plot. Both outlying serine residues are integral parts of the active sites and their conformation is determined unequivocally by the excellent electron density in these regions of the tetramer. A 2Fo‐Fc map contoured at 1.2σ shows continuous density for all main chain atoms with the exception of two flexible residues at the N‐terminus. Only the side chain of a single residue, Glu40, which is positioned on the surface of the molecule, displays no electron density.
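The R‐factor and Rfree quoted above are both the conventional residual between observed and calculated structure factor amplitudes, evaluated on the working and test reflections, respectively. As a minimal sketch in Python (the function name `r_factor` and the toy amplitudes are illustrative assumptions, not output of the refinement programs used here, and no scale factor between the two amplitude sets is applied), the statistic could be computed as follows:

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional residual R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes both amplitude lists are already on the same scale.
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy amplitudes (hypothetical): the same function gives R for the working
# set and Rfree for the reflections that were excluded from refinement.
work_obs, work_calc = [10.0, 20.0, 30.0], [9.0, 22.0, 30.0]
free_obs, free_calc = [12.0, 8.0], [10.0, 9.0]
r_work = r_factor(work_obs, work_calc)
r_free = r_factor(free_obs, free_calc)
```

Rfree is informative only if the test reflections are kept out of every refinement step, which is why it is monitored separately from the working R‐factor.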
Monitoring of the free R‐factor, model building into non‐crystallographic symmetry (NCS)‐averaged density and incorporation of a round of simulated annealing should have eliminated the model bias introduced from the CBL structure. The general correctness and quality of the solution was also attested by the ‘omit’ density which appeared for the PLP cofactor and improved continuously during the refinement (Figure 1).
All significant electron density patches were accounted for by the refined model, except for a small region which was symmetrical about the crystallographic 2‐fold axis and appeared in the late stages of the refinement. According to the composition of the crystallization buffer, it most probably corresponds to an MPD molecule. The co‐crystallization of this molecule is presumably one of the determinants of the change in space group from P1 to C2.
Overall structure of the CGS monomers
Secondary structural elements were assigned on the basis of characteristic hydrogen‐bonding patterns and φ/ψ backbone torsion angles [DSSP (Kabsch and Sander, 1983), O (Jones et al., 1991)]. Accordingly, the enzyme consists of 44.2% helical structure, 22.5% β‐bends, 17.7% β‐sheets and 15.6% unclassified coil structure. The short helix 7 comprising residues 184‐188 exhibits a hydrogen‐bonding pattern typical for 3₁₀ helices.
Like most other PLP enzymes, each CGS subunit can be divided into three domains (Figure 2). (i) Each subunit has an extended N‐terminal domain (residues 1‐51) composed of helix 1 and an extended loop structure comprising 38 residues. (ii) The large PLP‐binding domain (residues 52‐247) has an open, mainly parallel, seven‐stranded β‐sheet (β‐strands a, g, f, e, d, b and c with directions +, −, +, +, +, + and +, respectively) at its centre, which forms a curved plane around helix 3. All β‐strand cross‐overs are right‐handed. The β‐sheet in the PLP‐binding domain is sandwiched between seven helices; helices 2, 5, 6 and 7 are located on the solvent‐accessible side of the sheet, helices 3, 4 and 8 lie on the opposite side and are involved in the formation of the intersubunit and interdomain interface. At the latter interface, PLP is covalently attached via a Schiff base linkage to Lys198, located near the N‐terminus of helix 3, and the tails of strands d, e and f. The PLP‐binding domain is connected by the long α‐helix 9 to the C‐terminal domain. The connecting helix consists of 34 amino acids and has a kink near Thr247, adopting a characteristic structure which is also found in CBL (Clausen et al., 1996). (iii) The central part of the C‐terminal domain (residues 248‐385) is a five‐stranded, mainly antiparallel β‐sheet with a topology in Richardson notation (Richardson, 1981) of +1, +3X, −1X, −1. The formation of a five‐stranded β‐sheet in the C‐terminal domain has not been observed in any other related PLP enzyme structure. The cross‐overs are right‐handed, and helices 10, 11, 12 and 13 of the C‐terminal domain are all located on the solvent‐accessible side of the β‐sheet.
Quaternary structure and crystal packing
The crystal structure reveals the presence of a homotetramer with local 222 symmetry. The NCS axes could be identified unequivocally in a locked self‐rotation search performed with GLRF (Tong and Rossmann, 1990) and were exploited in early stages of the model building. The crystal structure is in accord with the native tetrameric state of CGS in solution (Tran et al., 1983). The quaternary assembly displays approximate dimensions of 90×85×75 Å (Figure 2). Despite their crystallographic independence, any pair of monomers (AB, AC and AD) can be superimposed with r.m.s. deviations of 0.276, 0.283 and 0.241 Å, respectively.
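The r.m.s. deviations quoted for the monomer pairs are root‐mean‐square distances between equivalent atoms after superposition. The following Python sketch illustrates only the final step, under the assumption that both coordinate sets are already in a common frame and list equivalent atoms in the same order; the optimal superposition itself is not shown, and the function name `rmsd` and the coordinates are hypothetical:

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square distance between paired atoms (same units as input).

    Assumes the two sets are already superimposed and equally ordered.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: one atom displaced by 1 A, one identical.
monomer_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
monomer_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
example = rmsd(monomer_a, monomer_b)  # sqrt(1/2)
```

Because the deviation is averaged over all paired atoms, a few flexible residues can dominate the value, so the set of atoms included should always be stated alongside the number.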
The CGS monomers are oval, beetle‐shaped molecules of ∼50×55×70 Å. The compact monomer shape is achieved by packing the PLP‐binding domain against the C‐terminus, while an N‐terminal extended loop protrudes from the bulk of the subunit, forming a clamp to the neighbouring monomer. The clamping occurs mainly through hydrophobic interactions: Phe35 and Phe38 are bound to a hydrophobic pocket on the C‐terminal domain of the second subunit, which is lined with residues Leu311*, Leu315*, Leu323*, His336* and Met340* (the asterisks indicate residues from the second subunit of the active dimer). These interactions enable residues 42‐50 to come in close contact with the neighbouring active site, and to participate in cofactor binding and formation of the active site entrance. Furthermore, the guanidinium group of Arg114* of the second monomer docks via hydrogen bonds to three carbonyl oxygens at the C‐terminus of helix 3 (PLP phosphate‐binding helix). With its N‐terminus binding the PLP phosphate, helix 3 effectively mediates an intersubunit salt bridge between the two peripheral ionic groups via its helix dipole. Overall, it can be seen that an intricate network of interactions orchestrates contributions from residues of all three domains of a monomer and from the two subunits of the active dimer to both symmetrical active sites.
Within the homotetramer, the four monomers clearly assemble into two catalytically active dimers (AC and BD, Figure 2), engaging in extensive intradimer contacts and in a smaller number of interdimer contacts. The three different monomer‐monomer associations bury 5127 Å2 of the solvent‐accessible surface in the tight (AC) connection, and 2288 Å2 (AB) and 3369 Å2 (AD) in the weaker connections, as calculated with GRASP (Nicholls et al., 1991). The monomer‐monomer interactions within an active dimer encompass 32 hydrogen bonds, two salt bridges and various hydrophobic contacts, while the dimer‐dimer interactions include 76 hydrogen bonds and 12 salt bridges, rendering the dimer‐dimer interface highly polar. Only in the centre of the tetramer is there a striking cluster of hydrophobic residues which seems to play a pivotal role in the stabilization of the quaternary assembly. One phenylalanine (Phe236), one isoleucine (Ile27) and two leucines (Leu29 and Leu240) from each monomer come together, engaging in van der Waals and stacking interactions. The intimacy of the contacts is protected by the shielding effect of peripheral Tyr239 residues and salt bridges between Arg243 and Asp205.
The above quaternary association leads to the formation of four active sites per tetramer, two made up within each active dimer. These two active sites are quite close to each other (P‐P distance of the PLP cofactors = 21.4 Å). For several enzymes of the γ‐family, evidence has been obtained suggesting that only one active site per dimer is actually operating or being inhibited by mechanism‐based inactivators (Abeles and Walsh, 1973; Silverman and Abeles, 1977; Johnston et al., 1979). Interestingly, the present structure reveals a corresponding asymmetry with respect to the B‐factors for residues 34‐50, which are thought to be essential for substrate binding. For the monomers of one of the active dimers (BD), this region is significantly more disordered (average B‐values 65 Å2 versus 40 Å2 in AC). The more flexible active site environment could lead to an easier acceptance of the substrates via an induced‐fit mechanism.
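A comparison of average B‐values over a residue range, as made above for residues 34‐50, can be reproduced from any coordinate file in PDB format. The sketch below is one possible way to do this in Python and is not the procedure used for the values in the text; it assumes standard fixed‐column ATOM records (chain identifier in column 22, residue number in columns 23‐26, B‐factor in columns 61‐66), and the helper `demo_atom_line` exists only to build toy records for the example:

```python
from collections import defaultdict
from typing import Dict, Iterable


def demo_atom_line(chain: str, resseq: int, b_factor: float) -> str:
    """Build a minimal fixed-width ATOM record (hypothetical demo helper)."""
    return "ATOM  %5d  CA  ALA %s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        1, chain, resseq, 0.0, 0.0, 0.0, 1.0, b_factor)


def mean_b_by_chain(lines: Iterable[str], first: int, last: int) -> Dict[str, float]:
    """Average the B-factor of all ATOM records in residues first..last, per chain."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for line in lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, TER and other record types
        chain = line[21]              # column 22: chain identifier
        resseq = int(line[22:26])     # columns 23-26: residue number
        if first <= resseq <= last:
            sums[chain] += float(line[60:66])  # columns 61-66: B-factor
            counts[chain] += 1
    return {chain: sums[chain] / counts[chain] for chain in sums}


# Toy records (hypothetical values): residue 60 of chain B lies outside the range.
records = [
    demo_atom_line("A", 34, 40.0),
    demo_atom_line("A", 50, 42.0),
    demo_atom_line("B", 40, 65.0),
    demo_atom_line("B", 60, 10.0),
]
averages = mean_b_by_chain(records, 34, 50)
```

Averaging per chain rather than per dimer keeps the four crystallographically independent monomers separate, so that an asymmetry between the AC and BD dimers would show up directly in the result.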
The active site design is shown schematically and spatially in Figure 3. The boundaries which enclose the cofactor are made up of the N‐terminal ends of helices 3 and 4, the C‐terminal tails of strands d, e and f, and their associated loops. The opposite border of the active site lining the bound substrate is constructed by the C‐terminal loops of strands C and D of the C‐terminal domain, and an N‐terminal segment (44*‐50*) of the neighbouring subunit.
Continuous electron density shows that the aldehyde functionality of the uncomplexed PLP and the ϵ‐amino group of Lys198 have condensed to form the internal aldimine (Figure 1). The nitrogen of this Schiff base is protonated as deduced from the absorption maximum of the CGS crystals at 425 nm. The positive charge is stabilized by interaction with the deprotonated PLP hydroxyl group at C3.
Besides the covalent bond, the cofactor is anchored predominantly in the active site through its phosphate group, for which seven hydrogen bonds to protein residues are discernible, i.e. to the main chain amide nitrogens of Gly76 and Met77, and to the side chains of Ser195, Thr197, Tyr46* and Arg48*. The latter hydrogen bonds are strengthened by charge‐charge interactions. Binding of the PLP phosphate is improved further by its interaction with the positive end of the helix 3 macrodipole, as mentioned above. The OP4‐C5′‐C5‐C4 torsion angle exhibits an energetically favourable 84°, fixing the ester oxygen at the A‐face (Ford et al., 1980; in the direction of the protein interior) of the cofactor.
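A torsion angle such as OP4‐C5′‐C5‐C4 is defined by the positions of four consecutive atoms. As a hedged illustration in Python (the function name `torsion_deg` and the test coordinates are assumptions made for this sketch, and the sign follows the usual convention in which a clockwise rotation of the front bond onto the rear bond, viewed along the central bond, is positive), it could be evaluated as:

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_deg(p1: Vec, p2: Vec, p3: Vec, p4: Vec) -> float:
    """Signed torsion angle p1-p2-p3-p4 in degrees, in the range -180..180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)   # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical right-angle geometry: gives +90 degrees.
angle = torsion_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                    (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

Using `atan2` rather than the arccosine of the angle between the plane normals preserves the sign of the torsion and avoids loss of precision near 0° and 180°.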
The PLP pyridine is sandwiched between Tyr101 and Thr175/Ser195 which impede any vertical movements of the cofactor with respect to the pyridine ring plane. The latter two residues are in van der Waals contact (shortest distances to PLP atoms 3.8 and 3.7 Å, respectively) with all atoms of the cofactor ring system. The phenol ring of Tyr101 is located 3.7 Å above the PLP pyridine ring with a tilt angle of ∼10° between the best planes through both rings. Apart from restraining the cofactor, the resulting ring‐stacking interactions should, in analogy to the situation in aspartate aminotransferase (Hayashi et al., 1990), increase the electron‐sink character of the cofactor. The hydroxyl group of Tyr101 is positioned close to the guanidinium group of Arg48* (3.1 Å) and to the positively charged nitrogen atom of the internal aldimine (4.4 Å). It can therefore be inferred that Tyr101 exists as a phenolate in the pH range of 8.2‐9.5, over which the CGS activity is maximal.
Asp173 constitutes another catalytically important residue. It builds a strong hydrogen bond/salt bridge (2.71 Å) to the PLP pyridine nitrogen (N1), thereby stabilizing its positive charge and increasing the electrophilic character of the cofactor. As a component of an extensive hydrogen‐bonding network (Figure 3A), the carboxylate group of Asp173 is fixed in the geometrically optimal position for the contact to N1. Furthermore, the hydrogen bonds which originate from Asp173 should permit charge dissipation during the catalytic cycle into the protein periphery.
Besides Lys198, Asp173 and the structurally important Gly76, Arg361 is largely conserved in the α‐ and γ‐families of PLP‐dependent enzymes, with the exception of those members whose substrates lack the α‐carboxylate group. Accordingly, this residue is assumed to bind the α‐carboxylate group of the incoming substrate. The orientation and position of the Arg361 side chain is fixed in the present structure by interaction with the amide oxygen of Asn148. In the CGS active site, two water molecules (O597 and O659) are found in the vicinity of the guanidinium group of Arg361 (Figure 3). The locations of these water molecules could point to the positions of the substrate α‐carboxylate oxygens.
The C2′‐methyl group and the OH3′‐hydroxyl group of the cofactor are not involved in any specific interactions with the protein. At the corresponding side of the cofactor, the binding pocket has a significantly hydrophobic character which originates from the apolar residues Phe176 and Leu327. Only one weak hydrogen bond is formed between the deprotonated hydroxyl group at C3 (OH3′) and an active site water molecule (O3). In contrast, aminotransferases provide a multitude of polar interactions with OH3′. These interactions are believed to be of importance in lowering the pKa of the internal aldimine (Yano et al., 1993) and for binding an incoming water molecule for ketimine hydrolysis (John, 1995). These mechanistic features are not relevant for CGS. Furthermore, limited hydrogen bonding to OH3′ does not oppose reorientation of the PLP pyridine ring during transaldimination and should, in analogy to the Y225F mutant of aspartate aminotransferase (Goldberg et al., 1991), result in a decreased Km for substrates and inhibitors.
Comparison with CBL
The three‐dimensional fold of CGS is related to those of the members of the α‐family of PLP‐dependent enzymes (for an overview, see John, 1995) and, of course, to the evolutionarily related γ‐family enzyme CBL. For comparison of the respective active sites, residues Tyr101, Asp173, Lys198, Ser326, Arg361 (CGS nomenclature) and the PLP of CGS and CBL were used to guide a least squares superposition with the program O (Jones et al., 1991). The resulting overlays of the Cα traces and of selected active site residues are shown in Figure 4.
Like CGS, CBL is a homotetramer in solution (Belfaiza et al., 1986), with very similar shape, dimensions, relative orientation of subunits, and domain structure. The least squares superposition aligns 340 Cα atoms with an r.m.s. deviation of 1.482 Å. All secondary structural elements of the PLP‐binding and most of the C‐terminal domain superimpose almost perfectly. Interestingly, even the N‐terminal residues 1‐35 of CBL and CGS fold similarly, although they exhibit no sequence homology.
The two main regions of divergence in the overall structures of CGS and CBL (referred to as I and II) are indicated in Figure 4A and B. Region I (CGS 336‐358, CBL 349‐369) is situated in the C‐terminal domain forming an extended loop including helix 12. For CGS, this loop protrudes much further into the active site than for CBL, considerably narrowing the active site cleft at this border. The arrangement of this loop furthermore leads to a reorientation of helix 12 by 90°, thereby opening a second entrance to the active site. This stretch of the protein contains no catalytically important residues and no amino acids implicated directly in substrate binding. Rather, by influencing the overall shape and the electrostatic curvature of the active site channel, this region presumably plays a steric role for substrate binding or product release.
The second region (region II) is composed of residues 41‐44 in CGS and 40‐54 in CBL, and constitutes the area of predominant differences between the two enzymes. In CGS, it is again arranged in a short loop, whereas in CBL it folds in an additional helix of the N‐terminal domain which protrudes into the second subunit of the active dimer. Here it builds up the active site wall near the phosphate group of the cofactor and carries two arginines (Arg47* and Arg49* in CBL nomenclature) which were proposed to be important in CBL for binding the distal carboxylate group of cystathionine (Clausen et al., 1997b).
In the superposition of the active sites of CGS and CBL (Figure 4C), the catalytically important residues Lys198, Tyr101 and Asp173 align almost perfectly. Even the hydrogen‐bonding network around Asp173 is quite conserved. Similarly, the positions of residues involved in the binding of the PLP phosphate group (Tyr46*, Arg48*, Gly76, Met77 and Thr197) are closely related. Nevertheless, we observe some functionally significant differences between the two structures.
Probably the most interesting amino acid exchanges refer to the CBL Tyr338 and Phe55*, which are mutated to Glu325 and Asp45* in CGS. In the structure of the CBL‐AVG complex, the phenolic rings of Tyr338 and Phe55* engage in hydrophobic interactions with the inhibitor and enclose it in the active site. The corresponding regions of CBL previously have been proposed to determine the permissible chemical nature of the substrate and its orientation (Clausen et al., 1997a). The change from tyrosine/phenylalanine to glutamate/aspartate corresponds to an exchange of hydrophobic to acidic side chains, which can be correlated with the enzymes' substrate specificity. Accordingly, the CBL substrate should be bound in analogy to AVG via hydrophobic interactions, while the second CGS substrate, cysteine, is more appropriate for an interaction with the acidic side chains of glutamate and aspartate.
A shift in the locations of the side chains of Tyr46* (CGS) and Tyr58* (CBL) leads to a much closer approach of the PLP phosphate and the tyrosine side chain in CGS. By this rearrangement, the side chain of Tyr46* adopts an orientation different from that of the ϵ‐amino group of Lys198 which could be of catalytic importance (see below). A further minor difference between CGS and CBL is the change from Trp340 (CBL) to Leu327 (CGS), which should have little influence on catalysis or substrate binding. The position, which can be occupied by a phenylalanine, a leucine or a tryptophan, is not strictly conserved in the γ‐family of PLP enzymes and shows a variation which seems to be independent of the catalysed reaction type or the converted substrate.
Modelling of substrate‐binding modes and reaction mechanism
CGS represents the only case of a PLP‐dependent enzyme that catalyses a replacement or elimination at the γ‐carbon of an amino acid substrate. The present X‐ray structure in combination with molecular modelling of enzyme‐substrate complexes suggests a likely catalytic mechanism for CGS. This mechanism is based, with some slight modifications, on a mechanism proposed by Brzovic et al. (1990).
OSHS binding to the CGS enzyme has been modelled in the state of the external aldimine formed between the OSHS molecule and the PLP cofactor. The PLP moiety and the α‐carboxyl group were pre‐oriented according to the binding mode of AVG in the crystal structure of the inhibitor complex of the related CBL (Clausen et al., 1997a), with the α‐carboxyl group in hydrogen‐bonding contact with Arg361. The carboxyl group of the O‐succinyl part of the molecule was oriented in hydrogen‐bonding distances to Arg48* and Arg106. The situation before and after energy minimization is depicted in Figure 5. The external aldimine‐enzyme complex minimized to a total energy of −901 kcal/mol. The minimization brings the α‐carboxyl group of the external aldimine into hydrogen‐bonding distance to Asn148, maintaining the contact to Arg361. The distal carboxyl group of the O‐succinyl moiety remains hydrogen bonded to Arg48* and additionally acquires an H‐bond with Tyr101. A remarkable rearrangement of side chains in the active site occurs as a consequence of the formation of the external aldimine, which has also been observed in the CBL‐AVG complex structure (Clausen et al., 1997a). The side chain of Tyr101 undergoes a considerable rotation and comes to lie parallel to the pyridine ring of the external aldimine. The side chain of Arg106 moves away from the O‐succinyl‐carboxyl, breaking the initial hydrogen bond.
Figure 6A shows a solid surface/electrostatic potential representation of the active site, calculated without the OSHS external aldimine, plus its energy‐minimized model. The blue positive recognition regions for the two OSHS carboxyl groups are due mainly to the side chains of Arg361 and Arg48*. The negatively charged recognition site for the α‐amino group of OSHS is not visible on this surface because it arises mainly from the deprotonated OH3′ of the PLP which is buried in the cofactor‐binding pocket. The charge distribution in the right well at the entrance of the channel is determined by Arg49* and Asp45*, and that of the lower part of the entrance by Glu325. In the energy‐minimized complex (surface not shown), the side chain of Glu325 moves into hydrogen‐bonding contact with Arg361 (Figure 5). Thus, the carbonyl function of OSHS is no longer close to Glu325 and the negative surface potential of the latter residue is partly neutralized by the formation of the Arg361‐Glu325 interaction.
CGS catalyses the γ‐synthesis of CTH from OSHS and l‐cysteine and, therefore, has to accommodate two different substrate molecules. In accordance with substrate modelling, OSHS would occupy the main part of the binding pocket, extending from Arg361 to a very basic region lined with Arg106, Arg48* and Asn227* (Figure 6A). With OSHS fixed in this orientation, there is a second potential binding site which is electrostatically optimally designed for hosting a zwitterion and could, therefore, be responsible for binding l‐cysteine. In Figure 6B, the leaving group of OSHS has been omitted and the second substrate, l‐cysteine, has been docked manually with the α‐amino and α‐carboxy groups in hydrogen‐bonding contacts with their putative recognition functions Asp45*/Glu325 and Arg49*, respectively. In this model, the l‐cysteinate sulfur is arranged perfectly for a nucleophilic attack at the Cγ of the homoserine‐PLP derivative. In contrast to CGS, the active site entrance of CBL is very narrow owing to the location of helix 3, which is absent in CGS and which covers the proposed cysteine‐binding site like a lid (Figure 6B). In detail, the side chain of Arg59* of CBL is approximately in the region of Arg49* of CGS, Phe55* (CBL) in place of Asp45* (CGS), Tyr338 (CBL) in place of Glu325 (CGS) and the other visible residues of helix 3 of CBL cover empty space in CGS. Since all catalytically important residues of CBL are conserved in CGS, we propose that the different chemical reaction types supported by the enzymes are solely a consequence of their different substrate‐binding properties.
As may be expected for this unique and unusual reaction, the reaction mechanism followed by CGS is rather complicated. It has been much discussed in the literature (Brzovic et al., 1990; and references therein) and appears to involve three residues that act as acid‐base catalysts: (i) for the transaldimination reaction; (ii) for the 1,3‐prototropic shift; and (iii) for protonation of the OSHS leaving group. Based on the structure, we can propose specific residues for each of these functions (Figure 7).
(i) Transaldimination. A scenario similar to that proposed for CBL should hold true (Clausen et al., 1996). After productive binding of OSHS through residues Arg48*, Tyr101 and Arg361, the amino group of the substrate must be deprotonated for the nucleophilic attack on C4′ of the internal aldimine. Tyr101 should exist in the unliganded enzyme at pH 8.5 (optimal pH, Litzenberger Holbrook et al., 1990) as phenolate due to the two neighbouring positive charges (Arg48*, NH of the internal aldimine). Tyr101 then abstracts a proton from the incoming substrate and initiates transaldimination (Figure 7I).
(ii) Generation of the ketimine intermediate as identified in time‐resolved rapid scanning spectroscopy (Brzovic et al., 1990). After transaldimination, Lys198 is responsible for the proton transfer from Cα to C4′ of the cofactor. Due to the absence of the typical quinonoid intermediate, as deduced from the missing characteristic 490 nm absorption band during the course of the reaction (Karube and Matsushima, 1977; Kallen et al., 1985), the proton should be more or less directly transferred from Cα to C4′ as indicated in Figure 7II and III. As argued for aspartate aminotransferase (Kirsch et al., 1984), Tyr46* should have an effect in guiding the amino group of Lys198 during catalysis. In comparison with the corresponding CBL Tyr58*, which is important mainly for cofactor binding, Tyr46* in CGS has moved deeper into the protein interior (see above), thereby favouring a positioning of the protonated Lys198 amino group near C4′ after the initial α‐proton abstraction. From a structural point of view, the side chain of Lys198 can swing like a ‘liana’ to either Cα, C4′ or Cβ of the substrate. It is therefore also proposed to be the residue abstracting a proton from Cβ to initiate γ‐cleavage (Figure 7IV). In the PLP ketimine derivative, the latter atom should have a slightly acidic character (Brzovic et al., 1990), facilitating this proton abstraction process.
(iii) Release of succinate. To gain insight into this part of the reaction, it would be helpful to elucidate the structure of the CGS‐propargylglycine (PG) complex. Unfortunately, the CGS crystals suffer severely when treated with PG and lose much of their diffraction power. Inactivation by PG is the consequence of the covalent linking of the PG‐Cγ to an active site nucleophile other than Lys198. For the closely related cystathionine γ‐lyase, it has been reported that this residue should be either a cysteine or a tyrosine (Washtien and Abeles, 1977). In accordance with substrate modelling experiments and the structure of the CBL‐AVG inhibition complex (Clausen et al., 1997a), Tyr101 seems to be the corresponding residue in CGS. Accordingly, Tyr101 is proposed to facilitate release of succinate in the physiological reaction, as shown in Figure 7.
The resulting β‐γ unsaturated ketimine (Figure 7VI) exhibits a pronounced electron deficiency, caused by the protonated Schiff base, leading to activation of Cγ towards Michael nucleophilic addition by l‐cysteinate. The last steps of the catalytic cycle (VII‐IX) are the reverse of the previous steps including Cβ protonation, C4′ deprotonation and Cα protonation. Reverse transaldimination of the external product aldimine releases CTH into the active site and completes the net γ‐replacement reaction.
In summary, the CGS structure, in comparison with that previously determined for the related CBL, illustrates how versatility can be conferred upon a cofactor by the intricate design of the active site of the corresponding enzymes. Specificity in the present example presumably stems from the steric and electrostatic (in)accessibility of the active sites rather than the exchange of catalytically important residues. High resolution structures, as exemplified by the present work, should prove instrumental in the design of specific inhibitors of transsulfuration components, which may be an important lead to novel herbicides and antibiotics, acting through interference with microbial and plant methionine biosynthesis.
Materials and methods
Crystallization and data collection
Recombinant CGS was purified and crystallized as described previously (Wahl et al., 1997). For initial data collection, the largest crystals were soaked in cryoprotectant (reservoir buffer supplemented with 15% MPD) and mounted in a liquid nitrogen stream (KGV, Karlsruhe, Germany), which allowed data collection over prolonged times at 113K. Diffraction data initially were collected on our in‐house MAR‐Research (Hamburg, Germany) image plate with graphite‐monochromated CuKα radiation (λ = 1.5418 Å) at 50 kV/100 mA from a Rigaku (Tokyo, Japan) RU 200 rotating anode. A single specimen produced a data set complete to 2.4 Å resolution in a 180° φ‐scan with a 1.0° frame width and 1600 s exposure per frame. The data were processed as before (Wahl et al., 1997) with the program packages MOSFLM (Leslie, 1991) and CCP4 (Collaborative Computational Project, 1994).
Initially, only crystals which exhibited space group P1 with >25 000 non‐hydrogen atoms per asymmetric unit were characterized, necessitating the acquisition of a larger number of reflections to allow independent refinement of all atoms in the asymmetric unit. At times, crystals appeared under the same reservoir conditions which exhibited a distinct, compact morphology (∼0.3×0.3×0.3 mm³). One of these crystals yielded data beyond 1.5 Å resolution at an X‐ray wavelength of 1.1 Å at the Deutsches Elektronen‐Synchrotron (DESY), beamline BW6, Hamburg, Germany at 100 K (Oxford Cryosystems, Oxford, UK). The data were processed as before and indicated space group C2 with a = 160.04 Å, b = 61.30 Å, c = 153.84 Å, and β = 104.18°, suggesting a single CGS tetramer per asymmetric unit. The statistics of this data set are summarised in Table I.
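The inference of one tetramer per asymmetric unit can be checked with a Matthews coefficient estimate. The following Python sketch is illustrative only and is not part of the original analysis: it uses the C2 cell constants quoted above, while the subunit mass of about 41.5 kDa and the candidate oligomer counts are assumptions introduced here.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (in Å^3) of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(cell_volume, n_asym_units, mass_per_asu_da):
    """Matthews coefficient V_M in Å^3/Da."""
    return cell_volume / (n_asym_units * mass_per_asu_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from V_M, using 1.23 Å^3/Da for protein."""
    return 1.0 - 1.23 / vm

# Cell constants reported in the text for the C2 crystal form.
A, B, C, BETA = 160.04, 61.30, 153.84, 104.18
Z_C2 = 4                     # asymmetric units per unit cell in space group C2
SUBUNIT_MASS_DA = 41_500.0   # ASSUMPTION: approximate mass of one CGS subunit

volume = monoclinic_cell_volume(A, B, C, BETA)

# Compare hypothetical contents of the asymmetric unit (2, 4 or 8 subunits).
for n_subunits in (2, 4, 8):
    vm = matthews_coefficient(volume, Z_C2, n_subunits * SUBUNIT_MASS_DA)
    print(f"{n_subunits} subunits/ASU: V_M = {vm:.2f} Å^3/Da, "
          f"solvent = {solvent_fraction(vm):.0%}")

vm_tetramer = matthews_coefficient(volume, Z_C2, 4 * SUBUNIT_MASS_DA)
solvent = solvent_fraction(vm_tetramer)
```

Under these assumptions, only the tetramer gives a V_M in the range commonly observed for protein crystals; a dimer would leave the cell mostly solvent and an octamer would give a negative solvent fraction.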
Structure solution and refinement
The structure of CGS initially was solved by molecular replacement with a P1 data set and later transferred to the higher resolution C2 data. In all structure solution and refinement routines, only data with F ≥ 2σ(F) were used (Table I), 10% of which were set aside for the calculation of an Rfree. Best‐fit alignment (Genetics Computer Group, 1997) of the sequences of E. coli CGS and CBL revealed homologies between the enzymes, based on which a model for rotation‐translation searches was constructed: all identical residues were retained, the others were replaced with alanines, the extended N‐terminus of CGS (residues 1‐50) was omitted completely, and all B‐values were reset to 20 Å².
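The construction rule for the search model can be summarised in a short Python sketch. It is a hypothetical illustration, not the procedure actually scripted by the authors: the function name, the tuple layout and the toy aligned sequences are invented here, and unaligned positions are simply skipped.

```python
def build_search_model(template_aln, target_aln, b_value=20.0):
    """Derive a partial polyalanine search model from a pairwise alignment.

    template_aln: aligned sequence of the known structure, '-' marks gaps
    target_aln:   aligned sequence of the unknown structure, '-' marks gaps
    Returns one (template_residue_number, residue_code, b_value) tuple per
    residue kept in the model.
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have equal length")

    model = []
    template_number = 0
    for t_res, q_res in zip(template_aln, target_aln):
        if t_res == "-":
            # Insertion in the target: no template coordinates exist.
            continue
        template_number += 1
        if q_res == "-":
            # Template residue without a counterpart in the target: omit.
            continue
        # Identical residues are retained, all others become alanine ('A').
        code = t_res if t_res == q_res else "A"
        model.append((template_number, code, b_value))
    return model

# Toy alignment (hypothetical sequences, not those of CBL and CGS).
template = "MKTDG-LV"
target = "--TSGHLI"
search_model = build_search_model(template, target)
for number, code, b in search_model:
    print(number, code, b)
```

In this toy case the two unaligned N‐terminal template residues are dropped, the identical T, G and L are kept, and the two mismatched positions are truncated to alanine, all with a uniform B‐value of 20 Å².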
Using the P1 data up to 3.5 Å resolution, the partial polyalanine model, assembled into a tetramer guided by the CBL structure, yielded two solutions in a molecular replacement search in AMoRe (Navaza, 1994) well above the background, with a combined R‐factor and correlation coefficient of 48 and 35%, respectively. While the resulting 2Fo‐Fc electron density map was not easily interpretable, a map after 8‐fold cyclic averaging with RAVE (Kleywegt and Jones, 1994), using NCS operators extracted with LSQMAN (Kleywegt, 1996), allowed unequivocal main‐ and side‐chain tracing in the central parts of the molecule. Refinement, model rebuilding and water incorporation proceeded smoothly via rigid body, positional and later B‐factor optimizations in X‐PLOR (Brünger, 1992) using the parameters developed by Engh and Huber (1992), and employing weak NCS restraints. During this process, the R‐factor and the Rfree dropped from 48 to 28% and from 49 to 33%, respectively.
Resetting the B‐factors to 20 Å², one tetramer model at this stage of the refinement was used in rotation‐translation searches with the C2 data set truncated at 3.5 Å resolution, allowing the positioning of a CGS tetramer in the new space group (R‐factor = 30%, correlation coefficient = 55%). The unequivocal solution displayed clear electron density for the PLP cofactor which was not included in the initial search model. All data to 1.5 Å resolution were included stepwise in rigid body refinements (X‐PLOR). Further improvement of the model was achieved in positional and B‐factor optimizations (X‐PLOR). One round of simulated annealing (Weis et al., 1990) was included to minimize the model bias. Water molecules were placed into the electron density with the routine ARP (Lamzin and Wilson, 1993) and manually checked for their authenticity. In the last round of the refinement, an anisotropic overall B‐factor refinement was applied. The final model contained 12 993 non‐hydrogen atoms and converged at an R/Rfree of 20.0%/25.5%. The atomic coordinates and experimental intensity data have been submitted to the Brookhaven Protein Data Bank (accession No. 1CS1).
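For readers less familiar with the statistics quoted in this section, the following Python sketch illustrates how an R‐factor and an Rfree are obtained once reflections have been filtered with the F ≥ 2σ(F) criterion and 10% have been set aside as a test set. It is a minimal illustration with synthetic reflections; the dictionary keys, the random selection of the test set and the absence of scaling are simplifying assumptions and do not reproduce the X‐PLOR implementation.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional R = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def select_and_split(reflections, sigma_cutoff=2.0, free_fraction=0.10, seed=0):
    """Apply the sigma cutoff, then split into working and test (free) sets."""
    kept = [r for r in reflections if r["f_obs"] >= sigma_cutoff * r["sigma"]]
    shuffled = kept[:]
    random.Random(seed).shuffle(shuffled)
    n_free = round(free_fraction * len(shuffled))
    return shuffled[n_free:], shuffled[:n_free]

# Synthetic data: every tenth reflection is too weak to pass the cutoff,
# and all calculated amplitudes are 10% too large.
reflections = [
    {"f_obs": 10.0 + i,
     "sigma": 100.0 if i % 10 == 0 else 1.0,
     "f_calc": 1.1 * (10.0 + i)}
    for i in range(100)
]

work, free = select_and_split(reflections)
r_work = r_factor([r["f_obs"] for r in work], [r["f_calc"] for r in work])
r_free = r_factor([r["f_obs"] for r in free], [r["f_calc"] for r in free])
print(len(work), len(free), round(r_work, 3), round(r_free, 3))
```

The test reflections are never used in refinement, so in practice Rfree exceeds the working R‐factor, as in the values reported above; in this synthetic example both are equal by construction.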
Molecular model building
Molecular modelling was pursued with modules Viewer, Builder, Docking, Delphi and Discover3 of program Insight II (Version 97.0; Los Angeles, 1997). The dimer consisting of subunits A and C of the refined CGS X‐ray structure was chosen as the starting point for the model building and energy minimization of the modelled complex. OSHS, and subsequently its external aldimine with PLP, were generated and positioned into the active site following the binding of AVG in CBL. The Consistent Valence Forcefield (CVFF) of the Insight II program was used for the energy minimization with Discover3 which was run over 2000 minimization steps until convergence (0.1 kcal/mol tolerance). The protein dimer and the ligand molecule were minimized simultaneously.
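The stopping rule of such a minimization (a step limit combined with a convergence tolerance) can be illustrated with a generic steepest‐descent loop. This Python sketch is not the Discover3 algorithm and does not use the CVFF force field: the harmonic toy energy, the fixed step size and the reading of the tolerance as an energy change per step are assumptions made purely for illustration.

```python
def minimize(energy, gradient, x0, step=0.1, max_steps=2000, tol=0.1):
    """Steepest descent with a step limit and an energy-change tolerance.

    Returns (coordinates, energy, steps_taken, converged).
    """
    x = list(x0)
    e_prev = energy(x)
    for n in range(1, max_steps + 1):
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        e = energy(x)
        # ASSUMPTION: convergence is judged by the energy change per step.
        if abs(e_prev - e) < tol:
            return x, e, n, True
        e_prev = e
    return x, e_prev, max_steps, False

# Toy harmonic "force field" with its minimum at the origin (hypothetical).
def toy_energy(x):
    return 0.5 * sum(xi * xi for xi in x)

def toy_gradient(x):
    return list(x)

final_x, final_e, n_steps, converged = minimize(
    toy_energy, toy_gradient, x0=[10.0, -10.0]
)
print(n_steps, converged, round(final_e, 3))
```

With a real force field, the energy and gradient functions would act on all atomic coordinates of the protein dimer and the ligand together, in line with the simultaneous minimization described above.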
Generous financial support from a stipend of the ‘Peter und Traudl Engelhorn Stiftung zur Förderung der Biotechnologie und Gentechnik’ to M.C.W. is gratefully acknowledged.
- Copyright © 1998 European Molecular Biology Organization