The crystal structure of the P‐protein of the glycine cleavage system from Thermus thermophilus HB8 has been determined. This is the first reported crystal structure of a P‐protein, and it reveals that P‐proteins do not involve the α2‐type active dimer universally observed in the evolutionarily related pyridoxal 5′‐phosphate (PLP)‐dependent enzymes. Instead, novel αβ‐type dimers associate to form an α2β2 tetramer, where the α‐ and β‐subunits are structurally similar and appear to have arisen by gene duplication and subsequent divergence with a loss of one active site. The binding of PLP to the apoenzyme induces large open–closed conformational changes, with residues moving up to 13.5 Å. The structure of the complex formed by the holoenzyme bound to an inhibitor, (aminooxy)acetate, suggests residues that may be responsible for substrate recognition. The molecular surface around the lipoamide‐binding channel shows conservation of positively charged residues, which are possibly involved in complex formation with the H‐protein. These results provide insights into the molecular basis of nonketotic hyperglycinemia.
The glycine cleavage system (GCS) is a multienzyme complex composed of four different components (P‐, H‐, T‐ and L‐proteins). This system catalyzes the oxidative cleavage of glycine in a multistep reaction (Figure 1) (Motokawa et al, 1995). In almost all organisms, the GCS plays a crucial role in the degradation of glycine, and it has been studied extensively (Perham, 2000; Douce et al, 2001). In humans, a mutation in the GCS genes can lead to a dramatic accumulation of glycine in the blood, resulting in severe neurological diseases, termed nonketotic hyperglycinemia (NKH) (Tada and Kure, 1993; Applegarth and Toone, 2001). In plants, large amounts of GCS proteins are present in leaf mitochondria, where they are involved in the photorespiratory pathway (Douce et al, 2001). In vivo, the GCS acts as a stable complex with an approximate ratio of 2P:27H:9T:1L; in vitro, it easily dissociates into its component proteins, and the H‐protein acts as a mobile cosubstrate that commutes between the other three enzymes (Oliver et al, 1990). The three‐dimensional structures of H‐protein from pea (Pares et al, 1994; Cohen‐Addad et al, 1995; Faure et al, 2000) and Thermus thermophilus HB8 (Tth) (Nakai et al, 2003a), T‐protein from Thermotoga maritima (Lee et al, 2004), and L‐protein from pea and six other species (Faure et al, 2000, and references therein) have been published, and only the structure of the P‐protein has not yet been reported.
P‐protein, also known as glycine decarboxylase (glycine:lipoylprotein oxidoreductase (decarboxylating and acceptor‐aminomethylating), EC 1.4.4.2), is a pyridoxal 5′‐phosphate (PLP)‐dependent enzyme (PLP‐enzyme). PLP‐enzymes have been classified into fold‐types I–V (Grishin et al, 1995; Jansonius, 1998; Mehta and Christen, 2000; Schneider et al, 2000), and P‐protein belongs to fold‐type I based on sequence similarities (Grishin et al, 1995). Fold‐type I, of which aspartate aminotransferase (Ford et al, 1980) is the prototype, is the best‐characterized structurally among the five types. All of the fold‐type I enzyme structures determined so far have shown that the protein occurs as a homodimer (αI2‐type active dimer) with an internal two‐fold axis or as multimers (αI4, αI6 or αI12) (Supplementary Table I), where each αI‐subunit (αI, ∼50 kDa) contains one molecule of PLP. On the other hand, the subunit compositions of P‐proteins from many species have been classified into two types: those from eukaryotes (e.g. human (Kume et al, 1991) and pea (Bourguignon et al, 1988)) and some of the P‐proteins from prokaryotes (e.g. Escherichia coli (Okamura‐Ikeda et al, 1993)) are in the homodimeric form (αNC2), while the rest of those from prokaryotes (e.g. Clostridium acidiurici (Gariboldi and Drake, 1984), Eubacterium acidaminophilum (Freudenberg and Andreesen, 1989) and Tth (Nakai et al, 2003b)) are in the heterotetrameric form (αN2βC2), where the αN‐ and βC‐subunits (αN and βC, each ∼50 kDa) correspond respectively to the N‐ and C‐terminal halves of the ∼100 kDa αNC‐subunit (αNC‐N and αNC‐C, each ∼50 kDa) based on sequence similarities. Interestingly, while PLP is attached not to αNC‐N and αN but to αNC‐C and βC (Fujiwara et al, 1987), all of these proteins exhibit sequence similarities to αI. Thus, the roles of αNC‐C and βC appear to correspond to those of αI, but those of αNC‐N and αN have remained unknown.
The Tth P‐protein used in this study has a total molecular mass of 200 kDa (αN 47.1 kDa with 438 residues and βC 52.7 kDa with 474 residues), where αN and βC have 31 and 37% sequence identities with αNC‐N and αNC‐C of human P‐protein, respectively. This suggests that Tth and human P‐proteins have the same folding topology and similar three‐dimensional structures. In humans, more than 80% of NKH patients have a specific defect in P‐protein (Tada and Kure, 1993; Toone et al, 2000). Therefore, we expected the Tth P‐protein structure to be a suitable model for understanding the molecular basis of NKH. Moreover, determination of the P‐protein structure should greatly facilitate studies aimed at understanding the roles of active‐site residues, the reaction mechanisms and the structural architecture of this multienzyme complex.
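The quoted total mass follows from the subunit masses and the (αNβC)2 stoichiometry. A minimal Python sketch of that arithmetic is given below; the function name is illustrative, and the mass of the bound cofactor is ignored as a simplifying assumption.

```python
# Subunit masses (kDa) as quoted in the text.
ALPHA_N_KDA = 47.1
BETA_C_KDA = 52.7


def assembly_mass_kda(alpha_kda: float, beta_kda: float, copies: int = 2) -> float:
    """Mass of an (alpha-beta)n assembly; cofactors are ignored (assumption)."""
    return copies * (alpha_kda + beta_kda)


tetramer_kda = assembly_mass_kda(ALPHA_N_KDA, BETA_C_KDA)
print(f"(alphaN-betaC)2 tetramer: {tetramer_kda:.1f} kDa")
```

The result, 199.6 kDa, rounds to the 200 kDa stated above.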
Here we report the crystal structures of three forms of Tth P‐protein: the apoenzyme (Papo) at 2.4 Å resolution, the holoenzyme (Pholo) at 2.1 Å resolution and Pholo in complex with a substrate analog inhibitor, (aminooxy)acetate (AOA) (Pholo·AOA), as an external aldimine intermediate model at 2.4 Å resolution. These structures reveal a novel αNβC‐type active dimer and show that the enzyme undergoes large conformational changes with movements up to 13.5 Å upon binding of the cofactor. Neither of these characteristics has been observed in the other fold‐type I enzymes. We discuss the molecular mechanisms of these enzymes in light of the new structural information, and in particular how the mutations identified in NKH patients might lead to loss of enzymatic activity.
Description of the structure
The P‐protein has an αN2βC2‐tetrameric structure with an overall size of approximately 114 × 79 × 82 Å (Figure 2), where two pairs of αN and βC form intimate αNβC dimers that are related by a noncrystallographic two‐fold axis. Thus, the tetramer can be described as a dimer of heterodimers. Furthermore, αN and βC of the αNβC dimer exhibit a similar folding topology (root‐mean‐square deviation (r.m.s.d.) of 2.01 Å for 395 Cα atoms) and thus appear to be related by an approximate two‐fold axis (Figure 3A), resulting in the (αNβC)2 tetramer, which mimics a homotetramer with pseudo‐222 symmetry (Figure 2). The designation of the subunits as αN, αN′, βC and βC′ (a prime after a Greek letter indicates the other subunit related by a two‐fold axis) has been chosen such that the αN‐ and βC‐subunits (or αN′ and βC′) form intimate αNβC (or αN′βC′) dimers, that is, the active dimeric form, as shown in Figure 2. Each subunit in the tetramer interacts with the other subunits through six interfaces, and the accessible surface areas buried in the αN–αN′, αN–βC (αN′–βC′), αN–βC′ (αN′–βC) and βC–βC′ interfaces are 1512, 17 222 (17 125), 1433 (1418) and 2199 Å2, respectively. Notably, the αN–βC interface buries a large hydrophobic surface of 11 058 Å2, which would likely render each subunit insoluble in aqueous solution if αN and βC were expressed separately (Nakai et al, 2003b). This may be the reason why coexpression of the two subunits is essential for obtaining functional recombinant P‐protein (Nakai et al, 2003b).
The αN‐subunit is composed of three parts: an N‐terminal arm (N‐arm) (residues 1–65), a large α/β domain (residues 79–337) and a small α+β domain (residues 66–78 and 338–438) (Figures 3C and 4). The N‐arm, which contains two α‐helices, makes contact with the other three subunits. The N‐arm is connected to the large domain via a loop belonging to the small domain. The large domain consists of a central seven‐stranded β‐sheet (a, g, f, e, d, b and c) plus nine α‐helices, which surround the sheet on both the interior and surface of the protein, and a small two‐stranded antiparallel β‐sheet (h and i). The small domain consists of a four‐stranded antiparallel β‐sheet (a′, b′, c′ and d′) and three α‐helices that make up part of the protein surface. The βC‐subunit is composed of four parts: an N‐arm (residues 439–505), a large domain (residues 518–775), a small domain (residues 506–517 and 776–889) and a C‐terminal arm (residues 890–912) (note that the residue numbering starts from 439, as described in the legend to Figure 4). The former three domains correspond to those of αN and share common structural features described above except for a few short helices (Figures 3 and 4). Major structural features found only in βC are as follows. The cofactor PLP is covalently attached to the side chain of Lys704β. The C‐terminal arm, which is found only in βC and which contains an α‐helix, makes contact with αN.
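Because the residue numbering runs continuously through αN (1–438) and βC (439–912), a residue number alone identifies both the subunit and the structural part it belongs to. The lookup below is a small Python sketch built only from the ranges listed above; the table layout and the function name are illustrative choices, not part of the original analysis.

```python
# (subunit, part, first residue, last residue), copied from the ranges in the text.
DOMAINS = [
    ("alphaN", "N-arm", 1, 65),
    ("alphaN", "small domain", 66, 78),
    ("alphaN", "large domain", 79, 337),
    ("alphaN", "small domain", 338, 438),
    ("betaC", "N-arm", 439, 505),
    ("betaC", "small domain", 506, 517),
    ("betaC", "large domain", 518, 775),
    ("betaC", "small domain", 776, 889),
    ("betaC", "C-terminal arm", 890, 912),
]


def locate(residue):
    """Return (subunit, part) for a residue number in the continuous numbering."""
    for subunit, part, first, last in DOMAINS:
        if first <= residue <= last:
            return subunit, part
    raise ValueError(f"residue {residue} is outside 1-912")


print(locate(704))  # the PLP-binding lysine
print(locate(511))
```

For example, Lys704 maps to the βC large domain and Ser511 to the βC small domain, in agreement with the descriptions in the text.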
Structural comparison of P‐protein with other PLP‐enzymes
A comparison of P‐protein with structures in the Protein Data Bank using the program DALI (Holm and Sander, 1993) produced a list of 62 structures having Z‐scores (indicating the degree of structural similarity) better than 14.1 (Supplementary Table I; the highest Z‐score for excluded structures is 7.0). All of these structures belong to fold‐type I, indicating that P‐protein should be classified as a member of fold‐type I. However, since all the fold‐type I enzymes other than P‐protein occur as an αI2‐type active dimer or its multiple, the αNβC‐type active dimer observed in the P‐protein is a remarkable exception.
The highest Z‐scores for αN and βC of P‐protein were calculated to be 28.3 and 31.5, respectively, and both were from comparisons with αI of glutamate decarboxylase (GluDC) from E. coli (Capitani et al, 2003), indicating that GluDC is the closest homolog with a known three‐dimensional structure. While GluDC forms an αI6 hexamer (Figure 2D), unlike P‐protein, the overall polypeptide fold of P‐protein is indeed similar to that of GluDC (Figure 3E); the superimposition of αN and βC onto αI of GluDC yields r.m.s.d.s of 3.8 and 3.0 Å, and sequence identities of 12 and 12% for 354 and 369 Cα atoms, respectively. The most notable differences between these structures, other than subunit compositions, are found at the N‐terminal half of each N‐arm (Figure 3A and B). The αN‐ and βC‐subunits of P‐protein and αI of GluDC have an N‐arm that crosses over and embraces the adjacent subunit. In GluDC, the N‐terminal half (residues 1–30) of the N‐arm mostly participates in interactions with the neighboring dimer, leading to trimerization of the αI2 dimers; in P‐protein, residues 1–39α and 439–447β participate in interactions with the adjacent subunit, contributing to dimerization of αN and βC. These regions also participate in interactions with the other dimer (αN′βC′ dimer) and thus participate in dimerization of αNβC dimers. Therefore, P‐protein and GluDC are similar in that their N‐termini are involved in assembly of active dimers, in spite of differences in the final assembly of subunits (Figure 3A and B).
Open–closed conformational changes upon binding of the cofactor PLP
Binding of PLP to Papo induces an open–closed conformational change involving primarily two regions that are adjacent to the active site, with backbone shifts of up to 13.5 Å (Figure 5). The first region is a loop that includes an α‐helix (helix 10) from the αN large domain (residues 305–322α), which we will refer to as the ‘mobile loop’. The second region is composed of a subdomain from the βC large domain (residues 591–671β), which we will refer to as the ‘mobile subdomain’. The P‐protein Cα atoms, except for those of the mobile loop and mobile subdomain, plus a few other residues (described later), in the open and closed forms, were superimposed by least‐squares fitting, giving an r.m.s.d. of 0.37 Å, with a maximum displacement of 1.15 Å for 797 Cα atoms (Figure 5B). In contrast to the good agreement of the Cα atoms of these residues, larger differences were observed in the loop and subdomain Cα atoms, which gave r.m.s.d.s of 7.31 and 1.28 Å, with maximum displacements of 13.5 Å (Ile309α) and 2.5 Å (Arg636β), respectively.
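The two statistics quoted for each comparison, the r.m.s.d. and the maximum displacement, can be computed from paired Cα coordinates once the two structures have been superimposed. The Python sketch below assumes the least‐squares superposition has already been applied and performs no fitting itself; the function name and the toy coordinates are illustrative and are not taken from the deposited structures.

```python
import math


def rmsd_and_max_displacement(coords_a, coords_b):
    """Return (r.m.s.d., maximum displacement) for paired, pre-superimposed points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    return rmsd, max(distances)


# Toy coordinates (angstroms): two atoms coincide, one is displaced by 3 A.
open_form = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
closed_form = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 3.0)]
rmsd, max_disp = rmsd_and_max_displacement(open_form, closed_form)
print(f"r.m.s.d. = {rmsd:.2f} A, maximum displacement = {max_disp:.2f} A")
```

The toy case illustrates why both numbers are reported: a single large shift raises the maximum displacement much more than it raises the r.m.s.d.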
The mobile loop in Papo contains an α‐helix (helix 10). Binding of PLP induces conformational changes in the loop involving shortening of helix 10 (from Arg306α–Arg311α to Gln308α–Arg311α) and formation of two 310‐helices (helices η7 and η8; Figure 4A). The mobile subdomain consists of three α‐helices (helices 7, 8 and 9) and three β‐strands (strands b, c and d). The motion of the subdomain can be approximated by a 4.0° rotation as a rigid body to close the active site, because the subdomain Cα atoms of Papo are superimposable onto those of Pholo with an r.m.s.d. of 0.30 Å and a maximum displacement of 1.13 Å for 81 Cα atoms (Figure 5B). Besides the loop and the subdomain, four regions (Glu33α–Pro38α, Pro267α–His268α, Ala569β–Gly570β and Glu739β–Glu740β) exhibit prominent conformational changes (Figure 5B). The first three regions are located around the active site and interact with PLP directly or indirectly via other mobile regions (Figure 5A). Notably, the peptide bond preceding Pro267α undergoes trans–cis isomerization. Ala569β–Gly570β move up to 4.1 Å, allowing their main‐chain NH groups to form hydrogen bonds with the phosphate group of PLP. In contrast, the fourth region (Glu739β–Glu740β) lies at a molecular surface distant from the active site and is involved in crystal packing environments that differ between the Papo and Pholo forms, resulting in different conformations in this region.
Cofactor binding to proteins often results in conformational changes. Another PLP‐enzyme, 8‐amino‐7‐oxononanoate synthase, also shows similar conformational changes (Alexeev et al, 1998), including movement of two regions corresponding to the mobile loop and subdomain of the P‐protein. However, the conformational changes in 8‐amino‐7‐oxononanoate synthase involve smaller backbone shifts of up to 5.3 Å (except for the C‐terminus) as compared with 13.5 Å in the P‐protein. This is likely due to the absence of refolding of the helices, as observed in the P‐protein. The mobile loop of the P‐protein also corresponds to a disordered loop found in the crystal structure of DOPA decarboxylase (Burkhard et al, 2001). The disordered loop includes highly conserved residues that are essential for catalytic activity (Ishii et al, 1996). Bertoldi et al (2002) have reported that one of the loop residues is responsible for protonation of the decarboxylated reaction intermediates to form biological amines. It is also notable that in P‐protein, highly conserved residues are located in the mobile loop (Figure 4). These data imply that the mobile loop may also play a catalytic role in the later catalytic steps carried out by the P‐protein (Supplementary Figure 1C–F).
The active site
The (αNβC)2 P‐protein tetramer has two active‐site pockets at the αN–βC and αN′–βC′ interfaces (Figure 2). Each pocket is situated also at the domain interface of βC (Figure 3D). The residues constituting the pocket can be considered as two regions. The first region is the bottom of the active‐site pocket, which consists of residues at one end of the seven‐stranded β‐sheet from the βC large domain (Gly570β, Ala571β, Glu574β, His604β, Ser606β, Thr648β, Thr652β, Asp677β, Ala679β, Asn680β, His699β, Asn701β, His703β and Lys704β) (Figure 6A). The second region is the top of the pocket, which consists of residues from the βC small domain (Ser511β and Cys512β) and from the αN large domain (Tyr95α, Thr96α, Tyr98α, Gln308α, Tyr309α, Thr320α and Thr321α) (Figure 6B). Of the 23 residues comprising these two regions, 20 are strictly conserved (Figure 4), suggesting that structure and function of the active site are essentially conserved among the P‐proteins.
The cofactor PLP is bound to the active‐site pocket by extensive noncovalent interactions with all the bottom residues of the pocket and with some of the top residues (Ser511β, Tyr95α, Thr320α and Thr321α). PLP also forms an internal aldimine bond (Schiff base linkage) with the catalytic residue Lys704β (Figure 6B and Supplementary Figure 1A). The pyridine ring of PLP is sandwiched by the methyl group of Ala679β and the imidazole ring of His604β. The pyridine ring of the cofactor makes an angle of 14.9° with the imidazole ring of His604β. The phosphate group of PLP is involved in a number of hydrogen bonds and one ion pair, and acts as an anchor to fix the cofactor to the active site. The negative charge of the phosphate group is likely to be balanced by the positive charge of His703β and the dipole of helix 6 of βC, which is positioned with its N‐terminus close to the phosphate group. In addition, Asp677β forms an ion pair with the protonated nitrogen atom of the pyridine ring of PLP, as in other PLP‐enzymes, and Ser511β forms hydrogen bonds with the O3 atom of PLP.
Binding of substrate
The substrate‐binding site of the P‐protein consists mostly of the top residues of the active‐site pocket (Tyr95α, Thr96α, Tyr98α, Gln308α, Tyr309α, Thr320α, Ser511β, Cys512β and His604β) (Figure 6); in the Pholo·AOA complex, the AOA moiety makes van der Waals contacts with these residues. AOA is a substrate analog in which the amino (NH2) group of glycine is replaced by an aminooxy (ONH2) group.
In the substrate‐free form, seven water molecules (W1–W7) in the substrate‐binding site are involved in hydrogen‐bond networks, where W1–W4 together with the OH group of Tyr95α form a linear hydrogen‐bond network (Figure 6B). This network is connected to the phosphate group of PLP through W1 and to the side‐chain C=O of Gln308α and the main‐chain NH of Gly605β through W4.
Upon binding of AOA, two water molecules (W1 and W2) are liberated from the substrate‐binding site and the cofactor makes a new covalent bond with AOA in place of Lys704β (Figure 6C and Supplementary Figure 1A and B). This may be an aldimine bond, based on the good agreement between the substrate model and electron density (Figure 6C). In this case, this bond corresponds to the external aldimine shown in Supplementary Figure 1B. The released ε‐amino group of Lys704β makes an ion pair with the phosphate group of PLP. While a new hydrogen‐bond network is formed between AOA and the surrounding residues, the active‐site residues, except for Lys704β, do not change their positions, and the interactions among them are retained upon AOA binding; the corresponding active‐site residues of Pholo and Pholo·AOA were superimposed with an r.m.s.d. of 0.15 Å and with a maximum displacement of 0.55 Å (Figure 6D).
The carboxylate of AOA occupies the space formed by the liberated water molecules. One of the carboxylate oxygen atoms of AOA forms hydrogen bonds with the OH group of Tyr95α and with the OH group of Thr96α through W6, and the other forms hydrogen bonds with the OH group of Tyr98α, with the NH2 group of Gln308α and with W3. W3 is also involved in the hydrogen‐bond network described above. The N1 atom of AOA forms a hydrogen bond with the imidazole Nε2 atom of His604β. The pyridine ring of the cofactor and the imidazole ring of His604β rotate by 18.3 and 6.3° toward the solvent side, respectively, compared with their positions in the free enzyme (Figure 6D).
AOA is fixed at the substrate‐binding site by extensive interactions with the active‐site residues. The AOA and glycine structures are nearly identical except for the one extra oxygen atom proximate to the amino group, which suggests that the binding mode of AOA is similar to that of glycine but is not strictly identical, due to displacement (up to ∼1.5 Å) of the distal carboxylate. Nevertheless, the pocket residues surrounding AOA are likely to be involved in substrate recognition. This is supported by the facts that all residues involved in the above interactions with AOA are highly conserved (Figure 4) and that the orientation of the carboxylate is appropriate for decarboxylation. The latter fact is based on the model for control of the reaction specificity of PLP‐enzymes (Dunathan, 1966); the scissile bond of the external aldimine intermediate must be oriented perpendicular to the π system of the cofactor. In the Pholo·AOA structure, the carboxylate of the inhibitor is in an orientation approximately orthogonal to the plane of the cofactor ring (Figure 6C and Supplementary Figure 1B).
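The geometric criterion in Dunathan's model can be expressed as the angle between the scissile bond and the plane of the cofactor ring, with 90° corresponding to a bond exactly perpendicular to the ring. The following Python sketch computes that angle from one bond and three ring atoms; the function name and all coordinates are hypothetical placeholders, not values from the Pholo·AOA structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bond_angle_to_ring_plane(bond_start, bond_end, ring_a, ring_b, ring_c):
    """Angle (degrees) between a bond and the plane through three ring atoms."""
    bond = _sub(bond_end, bond_start)
    normal = _cross(_sub(ring_b, ring_a), _sub(ring_c, ring_a))
    denom = math.sqrt(_dot(bond, bond) * _dot(normal, normal))
    if denom == 0.0:
        raise ValueError("degenerate bond or collinear ring atoms")
    # The sine of the bond-to-plane angle is |cos| of the bond-to-normal angle.
    sin_angle = min(1.0, abs(_dot(bond, normal)) / denom)
    return math.degrees(math.asin(sin_angle))


# Hypothetical ring lying in the xy plane.
RING = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
perpendicular = bond_angle_to_ring_plane((0.0, 0.0, 0.0), (0.0, 0.0, 1.5), *RING)
tilted = bond_angle_to_ring_plane((0.0, 0.0, 0.0), (1.0, 0.0, 1.0), *RING)
print(perpendicular, tilted)
```

In practice the ring plane would be fitted to all ring atoms rather than defined by three of them; three points are used here only to keep the sketch short.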
The lipoamide‐binding channel and the surface properties neighboring its entrance
After formation of the external aldimine between PLP and glycine, P‐protein catalyzes a second reaction: the decarboxylation and reductive aminomethylation of lipoamide (Supplementary Figure 1). The lipoamide is presented to P‐protein in the form of a 16 Å long lipoyl‐lysine arm of H‐protein (Figure 7). Molecular recognition and interaction between H‐ and P‐proteins have been suggested by biochemical experiments, in which H‐apoprotein lacking lipoamide was found to behave as a competitive inhibitor relative to H‐protein in the second reaction (Neuburger et al, 2000). The active site of P‐protein is connected to the molecular surface by an ∼18 Å deep channel with a broad entrance facing the solvent (Figure 7). The channel is lined by residues from both αN and βC (Figure 7D), most of which are hydrophobic residues that could interact with the aliphatic chain of the lipoyl‐lysine. In the model structure (Figure 7D), the disulfide of the lipoamide lies close to the side chains of Tyr95α, Tyr98α, Gln308α and His604β. While these residues should be involved in the substrate recognition as described above, one of the residues may also act as a proton donor in the reduction of the disulfide bond (Supplementary Figure 1D). Note that Gln308α is located at the mobile loop, and its counterpart in DOPA decarboxylase is important for catalysis, as described above.
The molecular surface neighboring the entrance of the channel possesses two remarkable properties. As shown in Figure 7B and C, the molecular surface of the P‐protein reveals that conserved residues cluster around the entrance of the channel and that several of them are positively charged (Lys146α, Arg280α, Arg306α, Arg311α, Arg312α, Lys314α, Lys316α, Arg894β and Lys902β). In contrast, negatively charged residues are situated around the lipoyl‐lysine arm of the H‐protein, and most of them are highly conserved (Nakai et al, 2003a). The data imply that these residues play a critical role in the interaction between the P‐ and H‐proteins.
The structure of P‐protein presented here reveals that the enzyme is present as an (αNβC)2 tetramer containing a unique arrangement of αNβC‐type active dimers. The molecular mass of the tetramer is consistent with dynamic light scattering results in aqueous solution (Nakai et al, 2003b), indicating that the tetramer is the native oligomeric state. P‐protein has been classified as a member of fold‐type I PLP‐enzymes, based on sequence similarities (Grishin et al, 1995), and this is now confirmed experimentally. All other fold‐type I enzymes with known structures are intimate αI2 dimers, each with an internal two‐fold axis, or loose multiples thereof, where an αI2‐type dimer has two active sites at the αI–αI′ interface. In contrast, two intimate αNβC dimers in the P‐protein are related by a two‐fold axis to form an (αNβC)2 tetramer, in which each αNβC‐type dimer has only one active site at the αN–βC interface. The αN‐ and βC‐subunits show 24% sequence identity and have similar structures, and so the αNβC dimer appears to have an approximate two‐fold axis (Figure 3A) and to mimic an αI2 dimer such as those seen in GluDC (Figure 3B). These data suggest that the αNβC‐type active dimer of the P‐protein arose by gene duplication of a homodimeric ancestor, after which the ancestral P‐protein has structurally diverged such that the protein has been specifically adapted for use as a multienzyme complex component, even though this involved the loss of one active site. More specifically, in the other fold‐type I enzymes including the presumed homodimeric ancestor, the pocket size of the active site has remained fixed so that only small substrates have been recognized (Supplementary Table I). In the P‐protein, the pocket size had to expand to recognize macromolecular substrates such as H‐protein, and this may have required major rebuilding on a molecular scale, involving the loss of one active site, and thereby leading to an asymmetric active dimeric form for the enzyme.
In contrast to the Tth P‐protein, the human P‐protein occurs in a homodimeric form (αNC2). Our results suggest that the human P‐protein consists of ‘dimer‐like active monomers’, each of which corresponds to the αNβC‐type active dimer seen in the Tth P‐protein. Figure 8 shows a model structure of the human P‐protein. The structures of αNC‐N and αNC‐C of the human P‐protein model are essentially the same as those of αN and βC of the Tth P‐protein, respectively. The two halves of the model are connected by a 17‐residue stretch from Ser492 to Ile508. The corresponding 31‐residue linker region in the Tth P‐protein stretches from residues 437α to 467β (Figure 4). These regions in the human and Tth P‐proteins do not exhibit significant sequence identity, and the different lengths of the linker regions indicate that the structures are locally divergent here. From the viewpoint of molecular evolution, we suggest that a dimer‐like active monomer (αNC) has evolved by gene fusion of the two subunits of the αNβC‐type active dimer. The two subunits had arisen by gene duplication and evolutionary divergence, as described in the previous paragraph. This hypothesis is supported by the higher sequence identity between the (αNβC)2‐tetrameric and αNC2‐dimeric P‐proteins; αNC‐N and αNC‐C share ∼31 and ∼37% sequence identities with αN and βC, respectively, compared to the sequence identities of ∼24 and ∼13%, which resulted from alignments between αN and βC, and between αNC‐N and αNC‐C, respectively.
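The evolutionary argument rests on comparing percent sequence identities between pairs of aligned sequences. As a minimal Python sketch of how such a figure is obtained from a pairwise alignment, the function below counts identical residues over the aligned (non‐gap) columns. Conventions for the denominator vary, so this choice is an assumption, and the two short sequences are invented for illustration and are not P‐protein sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Invented toy alignment: 7 aligned columns, 5 identical.
toy_identity = percent_identity("MKTAYIAK", "MKSA-IGK")
print(f"{toy_identity:.1f}% identity")
```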
Many mutations in the human P‐protein gene have been identified in NKH patients: a three‐base deletion resulting in deletion of Phe756 (F756del) (Kure et al, 1991), 10 missense mutations (A283P, A313P, P329T, R410K, P700A, G762R (Toone et al, 2002), R515S (Toone et al, 2000), S564I (Kure et al, 1992), G761R (Kure et al, Am J Hum Genet 65, A425 (1999), meeting abstract) and A802V (Korman et al, 2004)) and two mutations (Takayanagi et al, 2000; Kure et al, 2002) resulting in highly altered polypeptides. Most of these 13 mutations have been found in single patients, but three are recurrent mutations (Applegarth and Toone, 2001): S564I and G761R in 78% (42/54) of alleles of Finnish patients and R515S in 5% of alleles of non‐Finnish patients. Meanwhile, our results support the hypothesis that the human and Tth P‐proteins share a similar three‐dimensional structure. They also allow us to identify functionally important residues involved in recognition of the cofactor, substrate and possibly the H‐protein, and show that most of these residues are well conserved among these P‐proteins. Hence, the catalytic mechanisms should be essentially the same for both the Tth and human P‐proteins.
We will now discuss the possible molecular mechanisms of the human P‐protein, with our arguments based on the Tth P‐protein structure, and we will also explore how pathogenic mutations (Figure 8) lead to loss of enzymatic activity. The Tth P‐protein residue Phe706β (Phe756 in the human P‐protein; note that the corresponding residue number in the human P‐protein is shown in parentheses) is fully conserved (Figure 4), and is situated close to the active site, where it is only two residues from the catalytic residue Lys704β; thus, F756del would disrupt the active site. Ala221α (Ala283) and Ala251α (Ala313) lie in the middle of helix 8 and strand f, respectively, and A283P or A313P would therefore break these secondary structures. Pro267α (Pro329) is fully conserved and is involved in the conformational changes upon binding of PLP, suggesting that P329T affects binding of the cofactor. Pro650β (Pro700) is a fully conserved residue in the conserved sequence motif MhTxPxT, where ‘x’ indicates any amino acid and ‘h’ is a hydrophobic residue. This proline residue adopts the cis‐conformation and lies adjacent to the active site, and therefore P700A is likely to affect the active‐site structure. Arg474β (Arg515) is a fully conserved residue located at the αN–βC interface. The side chain of Arg474β forms hydrogen bonds with the main‐chain C=O groups of Glu119α and Leu120α, both of which are situated at the C‐terminus of helix 5. The dipole of this helix is well balanced with the positive charge of Arg474β. Hence, R515S would destabilize the corresponding interface. Pro518β (Ser564) is located in a tightly packed structure proximate to the conserved sequence motif PLGSCTMKhN constituting the active site, suggesting that S564I affects the active‐site structure. Gly711β (Gly761) and Gly712β (Gly762) are fully conserved residues in the sequence motif PHGGGGPG. Both residues lie in a tightly packed structure at the αN–βC interface neighboring the active site. In G761R or G762R, the introduction of bulky side chains likely hampers the subunit association at the corresponding interface and disrupts the active site. Therefore, there are compelling structural arguments that can explain the observed loss of enzymatic P‐protein activity associated with each of these mutations. There is, however, no structural explanation for the effect of the other two mutations: A802V is situated at a region where the sequences of the two proteins are locally divergent, and the residue altered by the mutation R410K aligns with Lys348α in the Tth P‐protein. Nevertheless, in most of the cases, the Tth P‐protein structure provides a molecular basis for understanding how the NKH mutations lead to loss of enzymatic activity.
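The human-to-Tth residue correspondences used in this discussion can be collected into a simple lookup. The Python sketch below lists only the pairs stated in the text (A802V is omitted because no Tth equivalent is given) and computes the numbering offset for each pair; the variable and function names are illustrative.

```python
import re

# Human NKH mutation -> equivalent Tth residue, as paired in the text.
TTH_EQUIVALENT = {
    "F756del": "Phe706β",
    "A283P": "Ala221α",
    "A313P": "Ala251α",
    "P329T": "Pro267α",
    "R410K": "Lys348α",
    "P700A": "Pro650β",
    "R515S": "Arg474β",
    "S564I": "Pro518β",
    "G761R": "Gly711β",
    "G762R": "Gly712β",
}


def residue_number(label: str) -> int:
    """Extract the first run of digits from a mutation or residue label."""
    match = re.search(r"\d+", label)
    if match is None:
        raise ValueError(f"no residue number in {label!r}")
    return int(match.group())


# Human numbering minus Tth numbering for each pair.
OFFSETS = {
    mutation: residue_number(mutation) - residue_number(tth)
    for mutation, tth in TTH_EQUIVALENT.items()
}
for mutation, offset in OFFSETS.items():
    print(f"{mutation:8s} {TTH_EQUIVALENT[mutation]:8s} offset {offset:+d}")
```

The offsets are not constant (62 for the listed αN residues, and 41, 46 or 50 for the βC residues), which is why the correspondences have to be read from the sequence alignment (Figure 4) rather than derived from a fixed shift in numbering.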
In conclusion, the Tth P‐protein structure greatly aids in understanding the molecular pathology of NKH in humans. Plausible arguments have been presented for the molecular basis of point mutations associated with this disease, including three recurrent mutations (Applegarth and Toone, 2001). The structural information will be useful in interpreting other NKH mutations that may be identified in the future. Furthermore, our work on P‐protein provides a structural characterization of the final GCS component with an unknown structure. It gives new insights into the molecular evolution of P‐proteins, provides a structural basis for understanding the architecture and function of this fascinating complex, and should stimulate further structure/function studies.
Materials and methods
The expression of Tth P‐protein by E. coli, and purification and crystallization of the expressed protein have been reported elsewhere (Nakai et al, 2003b). Briefly, the purification was accomplished by heat treatment and four‐step column chromatography using 20 mM Tris–HCl buffer (pH 8.0). The purified protein was crystallized by vapor diffusion at 298 K using 30% (w/v) polyethylene glycol (PEG) 3350 and 300 mM KSCN as the precipitant (form I in Table I). However, since the crystallization rate of form I proved to be insufficient for structure determination by using heavy‐atom derivatives, we searched for new crystallization conditions to find a new crystal form more suitable for structure determination (form II in Table I).
The purification conditions for form II were essentially the same as those for form I except that 20 mM Tris–HCl buffer (pH 8.0) was replaced with 10 mM sodium phosphate buffer (pH 7.0). The purified protein was crystallized by vapor diffusion at 277 K using 22% (w/v) PEG 3350, 300 mM Li2SO4 and 100 mM MES–NaOH (pH 6.5) as the precipitant. A droplet containing 2 μl protein solution (10 mg ml−1 protein, 20 mM HEPES–NaOH (pH 8.0), 200 μM PLP and 1 mM DTT) was mixed with an equal volume of reservoir solution and equilibrated against 400 μl of reservoir solution to give crystals of form II. Finally, the structures of forms I and II were revealed to be Papo and Pholo, respectively.
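Mixing the protein solution with an equal volume of reservoir solution halves the concentration of every component in the fresh drop. The Python sketch below works through that volume-weighted arithmetic for the form II conditions; it describes only the drop immediately after mixing, before vapor diffusion changes the concentrations, and the function and dictionary names are illustrative.

```python
def mix_drop(solution_a, volume_a, solution_b, volume_b):
    """Volume-weighted concentrations right after mixing two solutions."""
    total = volume_a + volume_b
    if total <= 0:
        raise ValueError("total volume must be positive")
    names = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) * volume_a
               + solution_b.get(name, 0.0) * volume_b) / total
        for name in names
    }


# Concentrations taken from the text; units are carried in the keys.
protein_solution = {
    "protein (mg/ml)": 10.0,
    "HEPES-NaOH (mM)": 20.0,
    "PLP (uM)": 200.0,
    "DTT (mM)": 1.0,
}
reservoir = {
    "PEG 3350 (% w/v)": 22.0,
    "Li2SO4 (mM)": 300.0,
    "MES-NaOH (mM)": 100.0,
}

drop = mix_drop(protein_solution, 2.0, reservoir, 2.0)  # volumes in microlitres
for name, value in drop.items():
    print(f"{name}: {value:g}")
```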
Crystal soaking and data collection
A crystal of Pholo was soaked for 3 h in artificial mother liquor containing an inhibitor, AOA (26% (w/v) PEG 3350, 300 mM Li2SO4, 5 mM AOA, 100 mM MES–NaOH (pH 6.5)), generating a crystal of Pholo·AOA. A mercury derivative was prepared by soaking a crystal of Pholo in the same liquor containing 1 mM methylmercury chloride (CH3HgCl) for 3 h. All soaking experiments were performed at 277 K.
Prior to data collection, each crystal of Pholo, Pholo·AOA and the mercury derivative was soaked for a few seconds in a cryoprotectant solution containing 18% (w/v) PEG 400, 22% (w/v) PEG 3350, 300 mM Li2SO4 and 100 mM MES–NaOH (pH 6.5) and then transferred to a nylon loop and flash‐cooled in a nitrogen gas stream at 90 K. The data collection preparation for Papo was the same as above, except that its cryoprotectant solution contained 35% (w/v) PEG 3350 and 300 mM KSCN. X‐ray diffraction data were collected at 90 K using synchrotron radiation from BL45XU (Yamamoto et al, 1998) or BL44B2 (Adachi et al, 2001) at SPring‐8 (Table I). Data processing was completed using the program HKL2000 (Otwinowski and Minor, 1997) (Table I).
The structure of Pholo was solved by single isomorphous replacement with anomalous scattering (SIRAS) with a CH3HgCl derivative. A total of 11 mercury sites of the derivative were identified and refined at 3.38 Å resolution by the program SOLVE (Terwilliger, 2002), which gave an overall figure of merit of 0.32. The SOLVE phases were subsequently input to the program RESOLVE (Terwilliger, 2002) for density modification and automatic model building. RESOLVE gave an overall figure of merit of 0.48 and built 79% of the peptide backbone in the asymmetric unit including 62% of the side chains. The electron density map after density modification was of good quality, and the remaining model was built manually into the map using the program Xfit within the software package XtalView (McRee, 1992).
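For orientation, the figure of merit for a single reflection is the magnitude of the probability‐weighted mean phase vector, and the overall value is an average over reflections. The sketch below illustrates this definition with hypothetical phase probability distributions; it uses no data from this study and is not the SOLVE or RESOLVE implementation.

```python
import cmath
import math


def figure_of_merit(phases_deg, probabilities):
    """m = |sum P(phi) exp(i phi)| / sum P(phi) for one reflection."""
    total = sum(probabilities)
    centroid = sum(
        p * cmath.exp(1j * math.radians(phi))
        for phi, p in zip(phases_deg, probabilities)
    )
    return abs(centroid) / total


# Phase angles sampled every 10 degrees (an arbitrary grid for illustration).
phases = list(range(0, 360, 10))

# Hypothetical distributions, all centred on 90 degrees where applicable.
flat = [1.0] * len(phases)                                # no phase information
sharp = [1.0 if phi == 90 else 0.0 for phi in phases]     # phase known exactly
broad = [math.exp(2.0 * math.cos(math.radians(phi - 90))) for phi in phases]

m_flat = figure_of_merit(phases, flat)
m_sharp = figure_of_merit(phases, sharp)
m_broad = figure_of_merit(phases, broad)
print(m_flat, m_broad, m_sharp)
```

A flat distribution gives a value near 0 and a perfectly determined phase gives 1, so the increase from 0.32 to 0.48 reported above reflects sharper phase distributions after density modification.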
The structure of Papo was solved by molecular replacement with the program AMoRe (Navaza, 1994), using the refined coordinates of Pholo as the search model. The search model consisted of the (αNβC)2 tetramer from which all PLP and water molecules had been removed. A rotational search using data from 15 to 4 Å resolution gave a distinct solution, and the subsequent translational search placed one (αNβC)2 tetramer in the asymmetric unit with an R‐factor of 34.1%.
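The R‐factor quoted for the molecular replacement solution measures the disagreement between observed and calculated structure‐factor amplitudes. The sketch below uses the conventional definition R = Σ||Fobs| − k|Fcalc|| / Σ|Fobs| with a single overall scale factor k. The amplitudes are made‐up numbers for illustration, and the simple scaling scheme is an assumption rather than a description of what AMoRe does internally.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor with one overall scale factor."""
    # Assumed scaling: put the calculated amplitudes on the observed scale.
    k = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


# Hypothetical amplitudes for five reflections (not data from this study).
f_obs = [120.0, 85.0, 40.0, 230.0, 15.0]
f_calc = [100.0, 95.0, 55.0, 210.0, 30.0]

print(f"R = {100 * r_factor(f_obs, f_calc):.1f}%")
```

A perfect model gives R = 0, and the value is unchanged if the calculated amplitudes are multiplied by a constant, which is why the scale factor is applied before the comparison.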
The structure of Pholo was refined by simulated annealing and energy minimization with the program CNS (Brunger et al, 1998). After several rounds of refinement and manual rebuilding, the difference Fourier map clearly exhibited residual electron density corresponding to the bound PLP. The electron density of PLP was connected to the ε‐amino group of Lys704β (Figure 6B), indicating that PLP is covalently bound to Lys704β via an aldimine bond, like those observed for other PLP‐enzymes. Water‐molecule picking using the program CNS and further model building and refinement cycles produced the final model (Table I). The structure of Pholo·AOA was refined using the same procedure as that used for Pholo. The refined coordinates of Pholo, except for the PLP and water molecules, were used as the initial model. When the Rwork was below ∼25%, the difference Fourier map clearly exhibited residual electron density corresponding to the AOA and PLP molecules. The electron density of AOA was connected to the C4′ atom of PLP and the density corresponding to the linkage between PLP and Lys704β found in Pholo was absent (Figure 6C). This result indicates that the internal aldimine between PLP and Lys704β is cleaved and a new external aldimine is formed between PLP and AOA. The structure of Papo was refined using the same procedure as Pholo, except that the coordinates obtained by molecular replacement were used as the initial model. No electron density for PLP was observed, showing that the crystals represented the apo‐form of the enzyme.
Analysis of the stereochemistry showed that every model was of good quality, with more than 99.0% of the residues falling in the allowed regions (Table I). Eight proline residues of each heterodimer (205α, 267α, 373α, 650β, 715β, 733β, 845β and 887β), except for Pro267α in Papo, were found in the cis‐conformation.
Graphical representations of protein models were generated with MOLSCRIPT (Kraulis, 1991) (Figures 2, 3 and 5–8), Xfit (McRee, 1992) (Figure 6) and GRASP (Nicholls et al, 1991) (Figure 7), and rendered with Raster3D (Merritt and Bacon, 1997) (Figures 2, 3 and 5–8).
Supplementary data are available at The EMBO Journal Online.
We thank Dr H Naitow, Dr Y Kawano, Mr T Matsu and Mr H Nakajima (RIKEN Harima Institute) for their help with data collection at SPring‐8. This work was supported in part by the Special Postdoctoral Researchers Program at RIKEN to TN, by a JSPS Research Fellowship for Young Scientists (no. 15‐03803) to TN and by a Grant‐in‐Aid for Young Scientists (B) from the Ministry of Education, Culture, Sports, Science and Technology of Japan (13780495) to TN. Coordinates and structure factors have been deposited in the Protein Data Bank under accession codes 1WYT, 1WYU and 1WYV for Papo, Pholo and Pholo·AOA, respectively.
Copyright © 2005 European Molecular Biology Organization