Crystal structure of the bifunctional N‐acetylglucosamine 1‐phosphate uridyltransferase from Escherichia coli: a paradigm for the related pyrophosphorylase superfamily

Kieron Brown, Fredérique Pompeo, Suzanne Dixon, Dominique Mengin‐Lecreulx, Christian Cambillau, Yves Bourne

Author Affiliations

  1. AFMB‐CNRS, 31 chemin Joseph Aiguier, 13402, Marseille, Cedex 20, France (Kieron Brown, Suzanne Dixon, Christian Cambillau, Yves Bourne*)
  2. CNRS, Biochimie Structurale et Cellulaire, Université Paris‐Sud, 91405, Orsay, Cedex, France (Fredérique Pompeo, Dominique Mengin‐Lecreulx)

*Corresponding author. E-mail: yves{at}


N‐acetylglucosamine 1‐phosphate uridyltransferase (GlmU) is a cytoplasmic bifunctional enzyme involved in the biosynthesis of the nucleotide‐activated UDP‐GlcNAc, which is an essential precursor for the biosynthetic pathways of peptidoglycan and other components in bacteria. The crystal structure of a truncated form of GlmU has been solved at 2.25 Å resolution using the multiwavelength anomalous dispersion technique and its function tested with mutagenesis studies. The molecule is composed of two distinct domains connected by a long α‐helical arm: (i) an N‐terminal domain which resembles the dinucleotide‐binding Rossmann fold; and (ii) a C‐terminal domain which adopts a left‐handed parallel β‐helix structure (LβH) as found in homologous bacterial acetyltransferases. Three GlmU molecules assemble into a trimeric arrangement with tightly packed parallel LβH domains, the long α‐helical linkers being seated on top of the arrangement and the N‐terminal domains projected away from the 3‐fold axis. In addition, the 2.3 Å resolution structure of the GlmU–UDP‐GlcNAc complex reveals the structural bases required for the uridyltransferase activity. These structures exemplify a three‐dimensional template for the development of new antibacterial agents and for studying other members of the large family of XDP‐sugar bacterial pyrophosphorylases.


UDP‐GlcNAc, the cytoplasmic nucleotide‐activated form of N‐acetylglucosamine (GlcNAc), is an essential precursor for the peptidoglycan and lipopolysaccharide biosynthesis pathways in Gram‐positive and Gram‐negative bacteria, respectively (Raetz, 1996). In Gram‐negative bacteria, this precursor is found in the enterobacterial common antigen and some O antigen (Rick and Silver, 1996; van Heijenoort, 1996), whereas in Gram‐positive bacteria it is found in teichoic acids (Fisher, 1990). The assimilation of GlcNAc, N‐acetylmannosamine and N‐acetylneuraminic acid in bacteria all converge at the step of N‐acetylglucosamine‐6‐phosphate (GlcNAc‐6‐P) for the biosynthesis of UDP‐GlcNAc (Plumbridge and Vimr, 1999), suggesting that enzymes involved in its biosynthesis represent attractive targets for the development of new antibacterial agents. Conditional lethal mutants of Escherichia coli altered in the biosynthesis of UDP‐GlcNAc were characterized by a cell lysis phenotype under restrictive growth conditions (White, 1968; Sarvas, 1971; Wu and Wu, 1971). The four‐step formation from fructose‐6‐P to UDP‐GlcNAc is fully elucidated in E.coli and involves three enzymes: (i) a GlcN‐6‐P synthase (GlmS) (Dutka‐Malen et al., 1988); (ii) a phosphoglucosamine mutase (GlmM) (Mengin‐Lecreulx and van Heijenoort, 1996); and (iii) a GlcN‐1‐P acetyltransferase and GlcNAc‐1‐P uridyltransferase (GlmU) (Mengin‐Lecreulx and van Heijenoort, 1994) which catalyses the two‐step mechanism in the presence of MgCl2, according to the following reactions:

GlcN‐1‐P + acetyl‐CoA → GlcNAc‐1‐P + CoA
GlcNAc‐1‐P + UTP → UDP‐GlcNAc + PPi

Escherichia coli GlmU, a cytoplasmic 49 kDa protein, represents the first reported bifunctional enzyme in the cytoplasmic steps of the peptidoglycan and lipopolysaccharide synthesis pathways. The recombinant protein has been overexpressed and purified to homogeneity and shown to be dimeric/trimeric in solution (Mengin‐Lecreulx and van Heijenoort, 1993). Homologue sequences were reported for Haemophilus influenzae (Fleischmann et al., 1995), Neisseria gonorrhoeae (Ullrich and van Putten, 1995) and Bacillus subtilis (Hove‐Jensen, 1992) with sequence identities of 69, 54 and 42%, respectively (Figure 1). The enzyme activity is maximal in the presence of magnesium and phosphate ions; in addition, the acetyltransferase activity is inactivated by thiol‐specific reagents (Mengin‐Lecreulx and van Heijenoort, 1994; Pompeo et al., 1998), indicating that cysteine residues might be involved in the acetyl transfer reaction. Mutagenesis experiments, aimed at investigating the role of the four cysteine residues of the E.coli GlmU enzyme, have demonstrated the important role of Cys307 and Cys324 for acetyltransferase activity (Pompeo et al., 1998), but have not revealed the exact location of the active site or the catalytic residues involved in the uridyltransferase activity. GlmU has also been used to synthesize azido‐substituted nucleotide‐sugar analogues for the identification and characterization of glycosyltransferases (Sunthankar et al., 1998) and to produce large amounts of N‐acetyl‐labelled UDP‐GlcNAc for industrial purposes (Leiting et al., 1998).

Figure 1.

GlmU sequence conservation, secondary structure, fold and function. The sequence of the enzyme of E.coli is aligned with that of three bacterial species, H.influenzae (H.i.), N.gonorrhoeae (N.g.) and B.subtilis (B.s.). Invariant and conserved residues in all sequences are highlighted in white with a red background and white with a yellow background, respectively. Secondary structure elements of E.coli GlmU are indicated beneath the sequences with the consensus sequence motif (red bar). Residue functions implied by the GlmU structure are identified by triangles that are black for mutated residues, green for residues in the acetyltransferase domain, and red for those in the pyrophosphorylase domain. Residues buried at the trimer interface are displayed as filled black circles, and residues involved in UDP‐GlcNAc binding as filled orange and blue circles for those in contact with the nucleotide and the sugar moiety, respectively. The first and last residue well defined in the electron density map, Asn3 and Ile326, are indicated as filled magenta triangles. Deletions are indicated by dashes.

Biochemical experiments (Mengin‐Lecreulx and van Heijenoort, 1994; Gehring et al., 1996) have shown that GlmU is organized into two domains: (i) an N‐terminal pyrophosphorylase domain that shares homology with other pyrophosphorylase (PPase) enzymes over residues Met1–Ala120, with strict conservation of the motif G‐X‐G‐T‐(R/S)‐(X)4‐P‐K; and (ii) an acetyltransferase domain that contains 23 copies of the hexapeptide repeat (LIV)‐(GAED)‐X2‐(STAV)‐X typically found in other bacterial acetyltransferases (Vaara, 1992).
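The two sequence signatures quoted above can be written as regular expressions for a quick scan of a protein sequence. The following is a minimal sketch, not the procedure used in the cited studies: the function name `find_motif` is hypothetical, X is taken to mean any residue, and the toy sequence is invented for illustration and is not the GlmU sequence.

```python
import re

# PPase signature G-X-G-T-(R/S)-(X)4-P-K, with X read as "any residue".
PPASE_MOTIF = re.compile(r"G.GT[RS].{4}PK")

# Hexapeptide repeat (LIV)-(GAED)-X2-(STAV)-X.
HEXAPEPTIDE = re.compile(r"[LIV][GAED].{2}[STAV].")


def find_motif(pattern, sequence):
    """Return 1-based start positions of all matches, overlapping ones included."""
    # A zero-width lookahead lets finditer report matches that overlap.
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


# Toy sequence invented for this example only; it is NOT the GlmU sequence.
toy = "MAAGKGTRMYSDPKAAIGNNSVIAEGAT"

print("PPase motif starts at:", find_motif(PPASE_MOTIF, toy))
print("Hexapeptide repeats start at:", find_motif(HEXAPEPTIDE, toy))
```

On a real sequence, consecutive hexapeptide hits spaced six residues apart would indicate a run of repeats, whereas isolated hits are likely to be chance matches because the pattern is short and degenerate.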

In eukaryotes, the equivalent UDP‐GlcNAc biosynthesis pathway is accomplished by four distinct enzymes, two of which are involved in the reaction catalysed by GlmU: (i) a UDP‐N‐acetylglucosamine PPase (Uap1p) that preserves the consensus sequence motif found in all PPase amino acid sequences (Mio et al., 1998); and (ii) a glucosamine‐6‐phosphate acetyltransferase (GNA1) (Mio et al., 1999), which possesses little structural homology with other known bacterial acetyltransferases. The UDP‐GlcNAc nucleotide‐sugar is a substrate of chitin synthase whose product is essential for fungal cell wall, and the GlcNAc moiety is found in N‐linked glycosylation and the glycerophosphatidylinositol (GPI) anchor of cellular proteins (Cabib et al., 1982; Herscovics and Orlean, 1993). Human UDP‐GlcNAc PPase, AGX1, which causes human male infertility (Diekman and Goldberg, 1994), is able to synthesize both UDP‐GlcNAc and UDP‐GalNAc by alternative splicing of the AGX1 gene resulting in a 17 amino acid insertion in the C‐terminal region of the protein (Wang‐Gillam et al., 1998). Other nucleotide sugars, such as UDP‐Glc, play a crucial role in the quality control of newly synthesized glycoproteins (Hammond and Helenius, 1995). Deficiencies in the nucleotide‐sugar UDP‐Glc occur in insulin‐dependent tissues of diabetic organisms. A single mutation, Gly115 to Asp, dramatically impairs the enzymatic activity of Chinese hamster UDP‐Glc PPase (Flores‐Diaz et al., 1997); yet the consequences of this deficiency remain unclear.

The three‐dimensional structures of four bacterial acetyltransferases are known, but none contain a PPase domain. Moreover, the three‐dimensional structures of the sugar‐phosphate transferring enzymes, galactose 1‐phosphate uridyltransferase (Wedekind et al., 1995) and kanamycin nucleotidyltransferase (Pedersen et al., 1995), are known but these enzymes do not activate sugars and differ structurally from GlmU. To understand the catalytic mechanism of GlmU and provide a scaffold for the development of new bacterial antibiotics, we have determined the crystal structures of a truncated form of GlmU (GlmU‐Tr) at 2.25 Å resolution and of GlmU‐Tr bound to UDP‐GlcNAc at 2.3 Å resolution. Coupling of the structural data with mutagenesis studies reveals the structural and functional duality of the enzyme and the structural determinants responsible for the uridyltransferase activity.

Results and discussion

Structure determination

Conditions were found that yielded crystals of the recombinant entire 147 kDa trimeric GlmU, but these crystals did not diffract to a resolution sufficient for determination of a structure. Because GlmU gives rise to a stable spontaneous proteolytic fragment, GlmU‐Tr, consisting of residues Met1–Arg331 (Gehring et al., 1996), we next reproduced this truncated form of GlmU, which yielded an apparent molecular mass of 36 kDa on SDS–polyacrylamide gel and from which large, well‐ordered crystals diffracting up to 2.0 Å resolution were obtained (see Materials and methods). The crystals, which belonged to the rhombohedral space group R32 (a = b = 142.7 Å and c = 248.1 Å) with two GlmU‐Tr molecules per asymmetric unit, were used to solve the structure by the multiwavelength anomalous dispersion (MAD) method from a mercury derivative. A representative portion of the experimental electron density map is shown (Figure 2A). The GlmU‐Tr and GlmU–UDP‐GlcNAc complex structures have crystallographic R‐factors of 23.4 and 22.3% (Rfree = 27.4 and 26.3%) at 2.25 and 2.3 Å resolution, respectively, and have good stereochemistry (Tables I and II). In each of the two structures, the two independent molecules present in the asymmetric unit are very similar within the two domains, but show a different overall bend as a consequence of different packing forces in the crystal. The first two N‐terminal and last five C‐terminal residues are ill‐defined in the two molecules and were not incorporated into the models. In addition, side chains of the surface loop Gly140–His155, located in the N‐terminal domain of one molecule, are poorly defined in the electron density maps.
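The cell dimensions, the number of molecules per asymmetric unit and the apparent mass quoted above are enough for a rough estimate of crystal packing density. The sketch below is an illustration under stated assumptions rather than a calculation reported in the paper: it assumes the cell is given on hexagonal axes (γ = 120°), that R32 in that setting has 18 asymmetric units per cell, and that the conventional constant of 1.23 Å³/Da applies to the solvent estimate. The function names are hypothetical.

```python
import math

# Values quoted in the text.
A = 142.7              # Å, a = b
C = 248.1              # Å
MOLECULES_PER_ASU = 2
MASS_DA = 36_000       # apparent mass of GlmU-Tr on SDS-polyacrylamide gel

# Assumptions that are NOT stated in the text.
ASU_PER_CELL = 18      # general positions of R32 on hexagonal axes
PROTEIN_CONST = 1.23   # Å^3/Da, conventional constant for the solvent estimate


def hexagonal_cell_volume(a, c):
    """Volume of a cell with a = b, alpha = beta = 90 degrees, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume, n_asu, n_mol, mass):
    """Crystal volume per dalton of protein (Å^3/Da)."""
    return volume / (n_asu * n_mol * mass)


volume = hexagonal_cell_volume(A, C)
vm = matthews_coefficient(volume, ASU_PER_CELL, MOLECULES_PER_ASU, MASS_DA)
solvent = 1.0 - PROTEIN_CONST / vm

print(f"cell volume  = {volume:.0f} Å^3")
print(f"Vm           = {vm:.2f} Å^3/Da")
print(f"solvent      = {solvent:.0%}")
```

Under these assumptions the estimate comes out at roughly 3.4 Å³/Da, corresponding to a solvent content of a little over 60%; the result depends directly on the mass used, so the 36 kDa gel estimate limits its precision.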

Figure 2.

(A) Experimental electron density map for a GlmU LβH domain region at 2.8 Å resolution based on the solvent‐flattened, NCS‐averaged MAD phases (contoured at 1.2σ); the current model is superimposed and shows residues Gly264–Glu281 located in coil C2 (see B). (B) Sequence alignment of the LβH domain of GlmU identifying equivalent residues in each coil. The nomenclature PB1, PB2, PB3 and T1, T2, T3 denotes parallel β‐strands and turn residues, respectively (Yoder et al., 1993). The conserved, hydrophobic residues at position i are boxed. Residues in left‐handed conformation at position i + 3 are displayed in bold. This alignment is derived from the present structure for residues Val252–Pro328, and from a model for residues Pro328–Gly424. L1 denotes the loop region Arg333–Lys352 and L2 a second loop region, Asn375–Lys394, in which the hexapeptide repeat sequence is not found.

Table 1. Data collection statistics
Table 2. Refinement statistics

Overall structure of the molecule

GlmU‐Tr consists of two separate domains, formed from contiguous segments in the amino acid sequence and linked by a long α‐helical arm, and has overall dimensions of 45×58×66 Å (Figure 3). The N‐terminal domain comprises residues Asn3–Asn227 and consists of a central, seven‐stranded mixed β‐sheet (β1–β7), of which six strands are parallel, surrounded by six helices (α1–α6), a fold reminiscent of the dinucleotide‐binding Rossmann fold (Rossmann et al., 1975). At one of its ends, the central β‐sheet is topped by a two‐stranded β‐sheet (β5a and β5b) that participates in a 30 residue connection between strands β5 and β6. Comparison of this N‐terminal domain with the DALI database of protein structures (Holm and Sander, 1995) did not reveal any homologous structures. The C‐terminal domain comprises residues Gly251–Arg331 and adopts the left‐handed parallel β‐helix (LβH) domain fold found in the structures of other bacterial acetyltransferases, which include E.coli UDP‐N‐acetylglucosamine acyltransferase (LpxA) (Raetz and Roderick, 1995), Methanosarcina thermophila carbonic anhydrase (Cam) (Kisker et al., 1996), tetrahydrodipicolinate N‐succinyltransferase (DapD) (Beaman et al., 1997) and Pseudomonas aeruginosa hexapeptide xenobiotic acetyltransferase (PaXAT) (Beaman et al., 1998). The LβH domain resembles a prism, each turn or coil containing three short β‐strands (PB1–PB3) of nearly equal length and no insertion loop at the corners (Figure 2). The first coil is unusual in that a histidine residue (His268) is present at the i position of the hexapeptide repeat sequence, a position which ordinarily is occupied by aliphatic residues (Leu, Ile and Val), and the T2 turn of coil C1 contains two additional residues compared with homologous structures. In addition, an unprecedented disulfide bridge is formed between Cys307 and Cys324, both located at the i + 4 position of β‐strand PB2 in two adjacent coils (Figure 2).
The N‐ and C‐terminal domains are connected by a 21 residue α‐helical arm which is 33 Å long, lies perpendicular to the LβH domain axis and projects the N‐terminal domain away from the axis of the LβH domain. As a result, the only contacts between the N‐ and C‐terminal domains involve van der Waals interactions between the surface loop Ala31–Gly32 in the N‐terminal domain and the Arg263 side chain in the C‐terminal domain. In contrast, residues within the long α‐helical arm establish numerous interactions with residues in the two domains, owing to the protruding loop T2 that emanates from the LβH domain coil C1 and interacts with the helical linker (Figures 2 and 3).
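The length quoted for the α‐helical arm can be compared with the length expected for an ideal α‐helix. This is a back-of-the-envelope sketch that assumes the textbook rise of 1.5 Å per residue, a value not given in the text; the function names are hypothetical.

```python
# Canonical rise per residue of an ideal α-helix (textbook value, an assumption here).
RISE_PER_RESIDUE = 1.5  # Å


def helix_length(n_residues, rise=RISE_PER_RESIDUE):
    """Axial length (Å) of an ideal helix with n_residues residues."""
    return n_residues * rise


def residues_for_length(length, rise=RISE_PER_RESIDUE):
    """Number of residues an ideal helix needs to span a given length (Å)."""
    return length / rise


print("ideal length of a 21-residue helix (Å):", helix_length(21))
print("residues needed to span 33 Å:", residues_for_length(33))
```

The ideal length for 21 residues falls within a couple of ångströms of the 33 Å quoted, which is the level of agreement one would expect given that the measured length depends on which atoms define the ends of the arm.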

Figure 3.

Ribbon diagram of the overall view of GlmU‐Tr showing the N‐terminal domain (Asn3–Asn227) with the central β‐sheet displayed in magenta, the additional two‐stranded β‐sheet in green, and α‐helices in blue; the α‐helical arm (Asn228–Ala250) is displayed in green and the C‐terminal LβH domain (Gly251–Ala330) in magenta.

Biochemical experiments, showing that the present truncated form of GlmU displays normal uridyltransferase activity but has lost acetyltransferase activity (Table III), suggest that GlmU has two separate active sites which can function independently of each other. Moreover, complementation with GlmU‐Tr using an E.coli strain containing a chromosomal disruption of the glmU gene reveals that the bacteria are not viable (Table III); hence the two distinct GlmU activities are required for bacterial viability.

Table 3. Enzymatic activities of wild‐type and mutant GlmU

The pyrophosphorylase domain and UDP‐GlcNAc‐binding site

The N‐terminal domain of GlmU‐Tr consists of a mixed seven‐stranded twisted β‐sheet with strand order 7564123, surrounded by three α‐helices on each side and an additional two‐stranded β‐sheet on one edge (Figure 3). The uridyltransferase active site, identified by the presence of soaked UDP‐GlcNAc, is a large open pocket, 21 Å long and 13 Å deep, which delineates two lobes of approximately equal length and separated by only 7 Å at the pocket entrance. The first lobe consists of residues Asn3–Val111 and surface loop His216–Asn227, and encompasses residues interacting with the nucleotide. The second lobe contains the remaining residues of the N‐terminal domain and encompasses residues interacting with the sugar moiety (Figure 4). The consensus sequence motif G‐X‐G‐T‐(R/S)‐(X)4‐P‐K, which is located in surface loop β1–α1 and is a signature of PPases, exposes Leu11, Gly14, Gly16, Arg18 and Lys25, along with the region Gln76–Thr82, towards the pocket, while the region Gly138–Asp157 and the sequence‐conserved motif Gln193–Tyr197 are located in the second lobe. Two distinct regions, Tyr103–Asp105 and Val223–Asn227, form the floor of the pocket. The importance of the pocket for substrate binding is also apparent from the complete exposure of the sequence‐conserved residues in or at its rim in addition to its electrostatic properties (Figure 4B).

Figure 4.

(A) Ribbon diagram of the N‐terminal domain showing the first lobe (Asn3–Val111) in blue and the second lobe (Glu112–Gln231) in magenta, the bound UDP‐GlcNAc as orange bonds, and the consensus sequence motif Gly14–Lys25 (with Arg18 and Lys25 side chains) in red. (B) Electrostatic potential mapped onto the molecular surface of the N‐terminal domain of GlmU (residues Asn3–Asn227) from −7 kT (red) to +7 kT (blue) with bound UDP‐GlcNAc within the surface pocket. The bottom of the pocket has the most electronegative surface potential (centre), with two discrete patches of electropositive potential [Lys156 and Arg153 (centre left) and Lys15 and Arg18 (centre right)] surrounding the pocket entrance. The UDP‐GlcNAc molecule is displayed as white bonds with red oxygen, blue nitrogen and magenta phosphorus atoms as spheres, together with a sulfate molecule (green bonds). (C) A close‐up stereo view of the N‐terminal domain, viewed in the same orientation as (A), with UDP‐GlcNAc (orange bonds), sulfate (green bonds) and side chain residues Arg18, Lys25 (red), Leu11, Ala13, Gln76 and Asp105 (nucleotide‐binding site, pink), Gly81, Asp105, Tyr139, Glu154, Asn169, Tyr197 and Thr199 (sugar‐binding site, green), and Gln193, Glu195, Asn227 and Gln231 (yellow). (D) Superposition of the N‐terminal domain of apo GlmU and the GlmU–UDP‐GlcNAc complex. Backbone regions that deviate significantly are highlighted with their associated side chains (yellow for apo and green for the complex). A rigid‐body motion of regions Val131–Gln166 and Val187–Tyr197 is observed, along with significant movements of the Tyr103, Tyr139 and Gln193 side chains within the active site.

GlmU binds UTP with high specificity, utilizing neither ATP, CTP nor TTP as substrates, although CTP is a weak inhibitor (data not shown). The exocyclic N3 and O4 ring atoms of uracil are anchored against the sequence‐conserved Gln76 side chain, while the exocyclic O4 position involves an additional hydrogen bond with the nitrogen backbone atom of Gly81 (Figures 4C and 5A). In the second lobe, all the hydroxyl groups, O2, O3, O4 and O5, of the GlcNAc moiety are directly hydrogen‐bonded to protein residues, with the exception of the O6 atom which is bound to the protein via a water molecule. The N‐acetyl arm, which distinguishes GlcNAc from Glc, is hydrogen‐bonded to both the Glu154 and Thr82 side chains, with the methyl group in van der Waals contacts with the Tyr197 side chain. Unlike the nucleotide and sugar moieties, which are bound to the protein, the phosphate groups are totally solvent accessible (Figures 4 and 5).

Figure 5.

(A) Schematic figure showing the main interactions between GlmU and UDP‐GlcNAc. (B) Amino acid alignment of residues in the regions Ala13–Lys25, Gln76–Thr82 and Met101–Val106 (nucleotide binding) and regions Gly138–Arg141, Val153–Lys156 and Ile168–Gly171 (sugar binding) of E.coli GlmU (accession No. p17114) compared with eight XDP‐sugar PPases: UDP‐Glc (p25520), dTDP‐Glc (p55253), GDP‐Man (p37741), ADP‐Glc (p00584), CMP‐NeuAc (p13266) and CMP‐KDO (p42216). (+) denotes the residues critical for UDP‐GlcNAc binding and (*) those mutated in the present study. A dash means that no significant homology was found in the corresponding region.

Site‐directed mutagenesis shows that substitution of an Ala residue for Arg18 within the consensus sequence motif dramatically impairs the uridyltransferase activity, whereas substitution of Ala residues for Gly14 and Lys25 induces only an 8‐fold decrease in this activity (Table III). In contrast, none of these three mutations alters the acetyltransferase activity, a result consistent with the two distinct activities of GlmU belonging to separate domains (Mengin‐Lecreulx and van Heijenoort, 1994; Gehring et al., 1996). Arg18 is located at the tip of loop β1–α1 near the pocket entrance and is within hydrogen bonding distance of a sulfate molecule, present in both the free and complex structures and separated by only 9 Å from the Pα of UDP‐GlcNAc; this suggests that this sulfate ion mimics the Pγ position in UTP. Most importantly, Arg18 is located on top of the positively charged N‐cap of the long α‐helical arm, which provides a remarkable charge complementarity for phosphate binding. In contrast, Lys25 is located deep in the bottom of the pocket and is salt‐bridged to Asp105 and hydrogen‐bonded to Asn227, suggesting that Lys25 could stabilize the correct orientation of the Asp105 and Asn227 side chains for UTP recognition, whilst it may not be directly involved in the catalytic reaction. Based on our structure, the Gly14 to Ala mutation would create a steric conflict with the ribose moiety, associated with a higher Km value for UTP, a hypothesis which, however, is not supported by biochemical data (Table III); further study will therefore be needed to confirm the role of this residue unambiguously.

Superimposition of all Cα carbons of the UDP‐GlcNAc‐bound GlmU‐Tr onto those of the free structure results in an average r.m.s. deviation of 0.6 and 0.8 Å for the two molecules, respectively, indicating that significant structural changes occur upon UDP‐GlcNAc binding (Figure 4D). The side chain of the sequence‐conserved Gln193 is moved by 8 Å even though this residue is not directly involved in UDP‐GlcNAc binding. The Gln193 and Glu195 side chains are located in the second lobe, 5 Å away from the Pα phosphate group of UDP‐GlcNAc, and face Arg18, suggesting that they are not necessary for UTP recognition but could be involved in magnesium binding, which is known to be critical for the uridyltransferase activity of GlmU and other XDP‐sugar PPases (Fukui et al., 1993; Mengin‐Lecreulx and van Heijenoort, 1993). We conclude from these data that Lys25 and Asn227 stabilize the UTP substrate through hydrogen bonding, and that Arg18 could be the catalytic residue. Thus, GlmU appears to utilize the α‐helical arm as an active site cap, as a helix dipole and as an oligomerization domain with a probable role for sequence‐conserved Asn227 and possibly Gln231 for recognition of the Pβ and Pγ phosphate groups of UTP. The topology of the active site is consistent with the proposed mechanism of stereochemical inversion at the Pα position for the homologous UDP‐Glc PPase, based on NMR spectra of thio‐substituted nucleotides (Sheu and Frey, 1978; Sheu et al., 1979), indicating that a similar mechanism may occur for GlmU. Based on our crystal structure, GlmU contains two close binding sites allowing the ternary GlmU–GlcNAc‐1‐P‐UTP complex to form, resulting in the formation of a P–O bond and departure of a pyrophosphate from each side of the central Pα phosphate group.
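The r.m.s. deviation used above to compare the bound and free structures is an average over all paired atoms, so a large local movement contributes only modestly to the overall value. The sketch below shows the calculation for coordinate sets that are assumed to be already superimposed; it does not perform the superposition itself, the function name `rmsd` is hypothetical, and the coordinates are invented for illustration rather than taken from the GlmU models.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired, already-superimposed points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented Cα-like coordinates (Å): only the third atom moves, by 3 Å.
free_form = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound_form = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]

print(f"r.m.s. deviation = {rmsd(free_form, bound_form):.2f} Å")
```

In this toy case a single 3 Å shift among three atoms gives a value below 3 Å; over several hundred Cα atoms the same dilution explains how a sub‐ångström overall deviation can coexist with side chain movements of several ångströms.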

Structural conservation in other XDP‐sugar pyrophosphorylases

Sequence comparison and hydrophobic cluster analysis of bacterial XDP‐sugar PPases, such as dTDP‐Glc, GDP‐Man, UDP‐Glc, ADP‐Glc and CDP‐Glc, and to some extent CMP‐KDO and CMP‐NeuAc PPases, reveal close similarities with the N‐terminal domain of GlmU, with sequence identity ranging from 24 to 12%, which suggests that these proteins may share a similar fold (B.Henrissat and Y.Bourne, unpublished results). Large differences are observed in loops connecting secondary structure elements, indicating that subtle amino acid changes within the active site pocket are responsible for differences in substrate specificity. The functionally important residues, such as GlmU Gly81, Asp105, Glu154, Gly140 and Gly171, are remarkably conserved in most related bacterial XDP‐sugar PPases, as shown for both the nucleotide‐ and sugar‐binding sites (Figure 5); hence, the structure of the N‐terminal domain of GlmU represents a prototypic structure for studying the large family of bacterial XDP‐sugar PPases. A single amino acid substitution at position 82, Thr→Leu, would be predicted to alter the specificity of UDP‐GlcNAc PPase from GlcNAc into Glc, while a Gln76→Glu substitution in GlmU would be required for GTP binding. The recently determined crystal structure of E.coli CMP‐KDO synthetase shows a similar phosphate‐binding site, also located in surface loop β1–α1 (Jelakovic et al., 1996). However, in this class of PPases, which includes CMP‐NeuAc synthase, Arg10 and Lys19 are catalytically important residues as shown by chemical modification and site‐directed mutagenesis. CMP‐KDO Lys19 corresponds to GlmU Lys25 for UTP recognition. CMP‐KDO Arg10, however, is directed towards the pocket, whereas the equivalent residue in GlmU, Ala13, points toward the protein and is involved in pyrimidine recognition through its backbone. Therefore, slight differences occur in the catalytic site of these two distinct classes of enzymes.

Saccharomyces cerevisiae and human UDP‐GlcNAc PPases are larger than GlmU (up to 500 residues) but they have conserved most of these functionally important residues, indicating that evolution has favoured the conservation of this fold. Mutagenesis data on Uap1p, which show Gly112, Arg116 and Lys123 (GlmU Gly14, Arg18 and Lys25) as possible catalytic residues (Mio et al., 1998), are consistent with this observation. The structure of the consensus sequence motif is probably conserved in human UDP‐Glc PPase, where a Gly115→Asp (GlmU Gly14) mutation dramatically impairs its activity. This mutation may affect UDP‐Glc binding, but the consensus sequence motif of GlmU does not share the structural homology with the short active site peptide NGGLG of glycogen phosphorylase that was proposed for human UDP‐Glc PPase (Flores‐Diaz et al., 1997). We can anticipate that the eukaryotic enzymes possess similar nucleotide‐ and sugar‐binding sites located in distinct protein regions. The fact that a small insertion in the C‐terminal region of human UDP‐GlcNAc PPase alters the sugar specificity is consistent with the present GlmU structure. However, the bifunctionality of the bacterial GlmU enzyme is not conserved in eukaryotes, which evolved with two distinct enzymes: a PPase enzyme, and an acetyltransferase resembling the GCN5‐related N‐acetyltransferase fold (Neuwald and Landsman, 1997) and not the LβH domain. Clearly, further mutagenesis, kinetic and structural studies will be necessary to assess the roles of the functionally important residues in the catalytic mechanism and the specificity of XDP‐sugar PPases.

The LβH trimeric domain fold and comparison with homologous structures

The oligomeric structure of most enzymes containing the LβH domain fold is trimeric; the elution volume for either full‐length GlmU or GlmU‐Tr from a gel filtration column is consistent with this structural assignment (data not shown). The crystal structure of GlmU‐Tr also shows a trimeric arrangement of subunits situated around a crystallographic 3‐fold rotation axis (Figure 6A). The overall dimensions of the trimer are 91×94×68 Å, with the shortest dimension being parallel to the 3‐fold axis. An extensive area of the surface of each monomer participates in monomer–monomer interactions, with 1800 Å² of each monomer, or 13% of the entire GlmU‐Tr molecular surface per subunit, buried to a 1.6 Å radius probe upon trimer formation. Critical to the trimer assembly of GlmU are the protruding α‐helical arms, which cement the trimer by forming a tight interface with the three C1 coils and thus, overall, form an equilateral triangle seated on top of the LβH trimeric structure (Figure 6A). The long α‐helical arms project each N‐terminal domain of the trimer away from the 3‐fold axis, with their centres of gravity 55 Å apart, thus preventing any interaction between the three N‐terminal domains.
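Two quantities follow directly from the figures quoted above: the total molecular surface per subunit implied by the buried area and its percentage, and the distance of each N‐terminal domain from the 3‐fold axis implied by the 55 Å separation. The sketch below is illustrative arithmetic only; it assumes the three centres of gravity are related by the exact 3‐fold axis and therefore form an equilateral triangle, and the function names are hypothetical.

```python
import math

# Values quoted in the text.
BURIED_PER_MONOMER = 1800.0   # Å^2 buried per monomer on trimer formation
BURIED_FRACTION = 0.13        # fraction of the monomer surface that is buried
CENTROID_SEPARATION = 55.0    # Å between centres of gravity of N-terminal domains


def total_surface(buried, fraction):
    """Total molecular surface (Å^2) implied by a buried area and its fraction."""
    return buried / fraction


def distance_from_axis(side):
    """Circumradius of an equilateral triangle: vertex-to-centre distance (Å)."""
    return side / math.sqrt(3.0)


surface = total_surface(BURIED_PER_MONOMER, BURIED_FRACTION)
radius = distance_from_axis(CENTROID_SEPARATION)

print(f"implied surface per subunit ≈ {surface:.0f} Å^2")
print(f"N-terminal domain centre to 3-fold axis ≈ {radius:.1f} Å")
```

Because the 13% figure is rounded, the implied surface area is only good to a few hundred square ångströms; the axis distance of a little over 30 Å is consistent with N‐terminal domains that are held well clear of one another.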

Figure 6.

(A) Ribbon diagram of the GlmU trimer viewed along (left) and perpendicular to (right) the LβH axis. (B) Comparison of GlmU trimeric structure with the four known structures containing a trimeric LβH domain viewed parallel to the LβH axis: DapD with bound CoA (2tdt) (Beaman et al., 1997), Cam with bound Zn ion (1thj) (Kisker et al., 1996), LpxA (1lxa) (Raetz and Roderick, 1995) and PaXAT with bound desulfo CoA (2xat) (Beaman et al., 1998) (clockwise from top left). The LβH domain is coloured in magenta, the linker in green and the insertion loops in blue.

The trimer interface involves 24 residues, three of which are in the N‐terminal domain, seven in the α‐helical arm and the remainder in the C‐terminal LβH domain, and is stabilized by hydrogen bonds, salt bridges and hydrophobic interactions. A key salt bridge forms between the side chains of the sequence‐conserved Arg229 and Asp271, located at the tip of the α‐helical arm, and holds the GlmU trimer assembly. The side chain of LβH domain Arg259, located near the 3‐fold axis, forms van der Waals interactions with its symmetry‐related homologues within the trimer, and is salt‐bridged to Asp261 and hydrogen‐bonded to Asn277 of a neighbouring subunit. Similarly, the Tyr312 residues, located at the tips of loops T2, pack on each other within the trimer and contribute to the trimer interface. Residues that participate in stabilizing the LβH trimer are conserved or conservatively substituted within known amino acid sequences of GlmU, suggesting that the present trimeric arrangement may be conserved for other GlmU proteins (Figure 1).

The trimeric association of this particular LβH domain is highly conserved between GlmU‐Tr and LpxA, PaXAT, DapD and Cam, with r.m.s. deviation values of 1.28, 1.65, 2.08 and 2.23 Å for 143, 107, 167 and 118 Cα carbon atoms, respectively. Compared with these structures, the LβH domains of GlmU‐Tr adopt a nearly parallel arrangement (within 1–2°), as observed in LpxA and DapD structures, and, in the truncated version, do not contain any insertion loops. These five structures contain a similar number of coils (from seven to 10), with the exception of Cam which contains only four coils, resulting in slight differences in the parallel packing of the three subunits (Figure 6B). Sequence analysis of GlmU reveals that the hexapeptide repeat motif could span residues 251–424 with two insertion loops that disobey the hexapeptide repeat sequence rule. We constructed a model of a complete LβH domain of GlmU based on the existing structure. In this model, 2100 Å² of each monomer are buried to a 1.6 Å radius probe, a value in agreement with that found for other acetyltransferase trimeric structures. The two insertion loops, which were not modelled, are made up of segments Leu332–Ala353 and Asp374–Lys394 and are inserted in turn T3 and T1 of coils C5 and C7, respectively; each contains a sequence‐conserved motif, Gly345–Asn–Phe–Val–Glu349 and Asn386–Tyr–Asp–Gly389, respectively, which could be important for acetyltransferase activity. Comparison of these homologous structures also reveals that the substrate acetyl‐CoA is positioned similarly in DapD (Beaman et al., 1997) and PaXAT (Beaman et al., 1998), where the substrate is located between two subunits on the exterior face of the trimeric LβH domains, a positioning that could also be adopted for GlmU. In these structures, acetyl‐CoA binds in a tunnel created by the presence of Gly residues at the i + 1 position of the β‐strand PB2, a residue also found in the LβH domain of GlmU. 
In addition, the two insertion loops could complete the acetyl‐CoA‐binding site of GlmU, with loop I emerging from one subunit and loop II from a second subunit, as observed in DapD (Beaman et al., 1997). Together with the fusion of an N‐terminal domain with an unrelated fold, these loops may represent an important means by which hexapeptide proteins have attained their structural and functional diversity. The acetyl‐CoA molecule is oriented with the long acyl chain roughly parallel to the LβH axis, with the nucleotide moiety directed towards the C‐terminal end while the N‐acetyl group is directed towards the top, 30 Å away from the N‐terminal domain, an orientation that could allow GlcNAc‐1‐P to reach the pocket within a small channel. The location of the putative acetyl‐CoA‐binding site in GlmU is consistent with recent mutagenesis data showing that mutations of the Cys307 and Cys324 residues, which are located only 12 Å away from the acetyl‐CoA‐binding site, dramatically affect the acetyltransferase activity (Pompeo et al., 1998). Yet these two cysteine residues are directed inwards, towards the axis of the prism, and are disulfide‐bridged; a direct explanation of their role in the acetyltransferase activity of GlmU will only be achieved using full‐length protein from other bacterial species, or engineered domains designed from the present structure.


The structure of a truncated form of GlmU, a bifunctional enzyme, shows how the enzyme binds UDP‐GlcNAc and reveals precisely the nature of the two domains which are responsible for the acetyltransferase and uridyltransferase activities. Importantly, the structure provides a prototypic template for structure–function analysis of the catalytic domains of a superfamily of pyrophosphorylase enzymes and suggests an explanation for the mutagenesis data on other XDP‐sugar enzymes, whose catalytic domains are likely to be structurally related to that of GlmU. Finally, the structure of GlmU opens the way for future studies on its function and the development of new antibiotics.

Materials and methods

Site‐directed mutagenesis

The pFP3 plasmid allowing high‐level overexpression of the E.coli glmU gene product (N‐terminal His6‐tagged form) under the control of the isopropyl‐β‐d‐galactopyranoside (IPTG)‐inducible trc promoter has been described previously (Pompeo et al., 1998). pFP3 derivative plasmids for the expression of mutant G14A, R18A and K25A GlmU proteins were constructed using the Transformer™ site‐directed mutagenesis kit (Clontech), based on the method of Deng and Nickoloff (Deng and Nickoloff, 1992; Pompeo et al., 1998). The sequences of the oligonucleotides chosen were 5′‐GATCCTTGCCGCGGCCAAAGGCACGCGCATG‐3′ (G14A), 5′‐GGCAAAGGCACGGCAATGTATTCCGATCTT‐3′ (R18A) and 5′‐ATGTATTCCGATCTTCCTGCAGTGCTGCATACCCTTGCCGGG‐3′ (K25A), for the replacement of amino acids by alanine residues (codons in bold) at the positions indicated, and 5′‐TGGTTGAGTATTCACCAGTCAC‐3′ for suppression of the unique ScaI site lying within the ampicillin resistance gene of the target plasmid pFP3. Plasmid pFP3‐Tr331, for expression of the truncated GlmU protein carrying only the first 331 amino acids, was constructed as follows: two oligonucleotide primers, 5′‐GGACGGGATCCTTGAATAATGCTATGAGCGTAGTGA‐3′ and 5′‐CTCAGCTGCAGGACGCAATCAGGCAAACCG‐3′, were used to amplify by PCR the truncated form of the gene from the chromosome (new stop codon underlined), and the resulting DNA was cut with BamHI and PstI (sites in bold) and inserted between the corresponding sites of vector pTrc His30 (Pompeo et al., 1998).
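The oligonucleotide designs above can be checked with a short script. The following is a minimal sketch, not part of the original protocol: it translates each mutagenic oligonucleotide with the standard genetic code and confirms that the BamHI (GGATCC) and PstI (CTGCAG) recognition sequences occur in the two PCR primers. The reading‐frame offsets are an assumption inferred from the residue numbering and are not stated in the text; the names `translate`, `MUTAGENIC_OLIGOS` and `PCR_PRIMERS` are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna from a 0-based offset, dropping any trailing partial codon."""
    end = len(dna) - (len(dna) - frame) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, end, 3))


# Sequences copied from the text; the frame offsets are an assumption.
MUTAGENIC_OLIGOS = {
    "G14A": ("GATCCTTGCCGCGGCCAAAGGCACGCGCATG", 1),
    "R18A": ("GGCAAAGGCACGGCAATGTATTCCGATCTT", 0),
    "K25A": ("ATGTATTCCGATCTTCCTGCAGTGCTGCATACCCTTGCCGGG", 0),
}

# Restriction enzyme -> (recognition sequence, primer expected to carry it).
PCR_PRIMERS = {
    "BamHI": ("GGATCC", "GGACGGGATCCTTGAATAATGCTATGAGCGTAGTGA"),
    "PstI": ("CTGCAG", "CTCAGCTGCAGGACGCAATCAGGCAAACCG"),
}

if __name__ == "__main__":
    for name, (oligo, frame) in MUTAGENIC_OLIGOS.items():
        print(name, translate(oligo, frame))
    for enzyme, (site, primer) in PCR_PRIMERS.items():
        print(enzyme, "site present:", site in primer)
```

Under these assumed frames the three oligonucleotides translate to ILAAAKGTRM, GKGTAMYSDL and MYSDLPAVLHTLAG, so that an alanine codon sits where Gly14, Arg18 and Lys25 would otherwise be encoded.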

Expression and purification

Full‐length GlmU was purified from the overproducing strain [BL21(DE3)/pET22b‐glmU] according to Gehring et al. (1996) with modifications. After centrifugation of the cell debris, the supernatant was chromatographed on a Pharmacia Resource Q column (6 ml) using a 0–1 M KCl gradient in 50 mM Tris–HCl pH 8.0, 2 mM EDTA, 1 mM dithiothreitol (DTT). The GlmU‐containing fractions (as attested by SDS–PAGE) were pooled, concentrated on an Amicon YM10 membrane and loaded on a Pharmacia Hiload 26/60 Superdex 200 column equilibrated in 10 mM Tris–HCl pH 8.0, 150 mM KCl. The fractions were analysed by SDS–PAGE, N‐terminal sequencing and mass spectrometry. Purification typically yielded 12.5 mg/500 ml culture medium. A truncated form of GlmU (Δ332–456) was obtained by allowing proteolysis to occur at 20°C for 2 weeks; it was purified by gel filtration only.

For the mutants, E.coli cells [DH5α or JM83 glmU::kan (Pompeo et al., 1998)] carrying the plasmids described in the present work were grown exponentially at 37°C in 2YT‐ampicillin medium up to OD600 = 0.1. IPTG was added to a final concentration of 1 mM, and growth was continued for 3 h. Cells were harvested on ice and washed with 40 ml of ice‐cold 20 mM potassium phosphate pH 7.2, 0.5 mM MgCl2, 0.1% β‐mercaptoethanol. The cell pellet was suspended in 5 ml of the same buffer supplemented with 1 μM leupeptin, 1 mM benzamidine, 1 mM phenylmethylsulfonyl fluoride and 20 μg/ml of trypsin inhibitor. Cells were disrupted by sonication on ice and the resulting suspension was centrifuged at 4°C for 30 min at 200 000 g. The supernatant was finally dialysed against 100 vol of the same buffer. Proteins in these crude extracts were analysed by SDS–PAGE and quantified using bovine serum albumin (BSA) as a standard (Bradford, 1976). The wild‐type and mutant His6‐tagged enzymes were purified in a single step and under native conditions according to the manufacturer's recommendation (Qiagen). The His6‐tagged wild‐type and mutant GlmU enzymes were in all cases 90% pure, as estimated by SDS–PAGE.

Enzymatic assays

Glucosamine‐1‐phosphate acetyltransferase activity. The standard assay mixture (100 μl) contained 50 mM Tris–HCl pH 8.2, 1.2 mM GlcN‐1‐P, 0.5 mM [14C]acetyl‐CoA (1 kBq), 3 mM MgCl2 and enzyme (0.5 ng to 1 μg of protein, depending on overexpression or purification yields).

N‐Acetylglucosamine‐1‐phosphate uridyltransferase activity. The standard assay mixture (100 μl) contained 50 mM Tris–HCl pH 8.2, 2 mM UTP, 0.2 mM [14C]GlcNAc‐1‐P (500 Bq), 3 mM MgCl2 and enzyme (0.01–0.1 μg of protein). The [14C]GlcNAc‐1‐P (1.9 GBq/mmol) used in this assay was synthesized from [14C]acetyl‐CoA using the acetyltransferase activity of pure GlmU. In both cases, appropriate dilutions of the enzyme were performed in 20 mM potassium phosphate pH 7.2, 0.5 mM MgCl2, 1 mg/ml BSA, 0.1% β‐mercaptoethanol. The samples were incubated at 37°C for 30 min and the reaction was stopped by the addition of 10 μl of glacial acetic acid. Reaction products were separated by high‐voltage electrophoresis on Whatman 3MM filter paper in 2% formic acid pH 1.9 for 2 h at 40 V/cm with an LT36 apparatus (Savant Instruments, Hicksville, NY). The radioactive spots were located by overnight autoradiography using type R2 films (3M, St Paul, MN) or with a radioactivity scanner (Multi‐Tracemaster LB285); they were cut out and counted in a Betamatic IV liquid scintillation spectrometer (Kontron Instruments) with a solvent system consisting of 2 ml of water and 13 ml of Aqualyte mixture (J.T.Baker Chemicals). One unit of enzyme activity was defined as the amount which catalysed the synthesis of 1 μmol of product in 1 min.
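The unit definition above translates into a simple calculation once the fraction of radiolabelled substrate converted to product is known. The sketch below is illustrative only: the function names and the counts used in the example are hypothetical, while the substrate concentration (0.2 mM), assay volume (100 μl) and incubation time (30 min) are those of the uridyltransferase assay.

```python
def fraction_converted(product_counts, substrate_counts):
    """Fraction of label found in the product spot after electrophoresis."""
    return product_counts / (product_counts + substrate_counts)


def enzyme_units(substrate_mM, volume_ul, fraction, minutes):
    """Enzyme units in the assay, i.e. umol of product formed per min."""
    # 1 mM equals 1 nmol/ul, so mM * ul gives nmol; divide by 1000 for umol.
    substrate_umol = substrate_mM * volume_ul / 1000.0
    return substrate_umol * fraction / minutes


if __name__ == "__main__":
    # Hypothetical counts (arbitrary units) for the product and substrate spots.
    frac = fraction_converted(product_counts=2500.0, substrate_counts=7500.0)
    units = enzyme_units(substrate_mM=0.2, volume_ul=100.0, fraction=frac, minutes=30.0)
    # Hypothetical protein amount: 0.1 ug, i.e. 1e-4 mg.
    print(f"fraction converted: {frac:.2f}")
    print(f"activity: {units:.2e} units, {units / 1e-4:.2f} units/mg")
```

The calculation assumes that product formation is linear over the incubation period, which holds only when a modest fraction of the substrate is consumed.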

Complementation of the glmU mutation

The thermosensitive mutant UGS83 (Mengin‐Lecreulx and van Heijenoort, 1993) was transformed by the plasmids to be tested. Competent cells (300 μl) mixed with plasmid DNA were kept on ice for 3 h before heating for 3 min at 42°C. 2YT medium (400 μl) was then added and cells were incubated at 30°C for >2 h to allow expression of plasmid‐coded genes. Aliquots (100 μl) of the cell suspensions were plated onto two pre‐warmed 2YT‐ampicillin plates, one incubated at 30°C and the other at 42°C. Growth was observed after 24 h of incubation.

Crystallization

Crystals of the spontaneously proteolysed GlmU‐Tr were obtained at 20°C using the vapour diffusion technique. Typically, 2 μl of the protein solution (25 mg/ml) were mixed with 2 μl of the reservoir solution made up of 1.3–1.5 M ammonium sulfate, 0.1 M MES pH 6.0, 4% acetone (v/v). Crystals usually appeared within 3–7 days with typical dimensions of 0.3×0.3×0.5 μm3. The GlmU‐Tr‐mercury derivative was obtained by adding mercury(II) acetate to the protein solution to a final concentration of 2 mM. The complex was incubated overnight on ice before crystallization, and crystals were observed after 7 days. The GlmU–UDP‐GlcNAc complex was obtained by soaking apo GlmU‐Tr crystals in the reservoir solution supplemented with 10 mM UDP‐GlcNAc.

Data collection and processing

The GlmU‐Tr crystals belong to the space group R32 with unit cell dimensions a = b = 142.68 Å and c = 248.13 Å. They contain two GlmU‐Tr molecules per asymmetric unit, giving a Vm value of 3.38 Å3/Da and a solvent content of ∼63% (Matthews, 1968). Crystals selected for data collection were transferred to reservoir solution containing increasing amounts of ethylene glycol, flash‐cooled at 100 K in the nitrogen gas stream and stored in liquid nitrogen. The data collection statistics are shown in Table I. The X‐ray fluorescence and transmission from the GlmU‐Tr‐Hg co‐crystal were measured as functions of incident X‐ray energy in the vicinity of the mercury LIII edge. Two energies were chosen near the absorption edge: 12 288 eV (λ = 1.009 Å) and 12 335 eV (λ = 1.005 Å), corresponding to the minimum f′ and maximum f″, respectively. A third, remote energy (remote II) was selected at 15 400 eV (λ = 0.805 Å). Data were indexed and integrated by DENZO (Otwinowski, 1997). A fourth data set (referred to as remote I), collected on beamline ID14‐EH3 (ESRF, Grenoble) at an energy of 13 240 eV (λ = 0.932 Å), was included in the scaling procedure. The two near‐edge and the remote I data sets were scaled to the remote II data set using SCALA and reduced by TRUNCATE. Estimates of FM, the optimized value of the normalized anomalous scattering, were calculated by the program REVISE; two mercury sites were obtained by SHELX (Sheldrick, 1990) from the corresponding Patterson.
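The crystal parameters quoted above can be related to one another with a few lines of arithmetic. The sketch below is a hedged illustration rather than the authors' procedure: it converts the three MAD energies to wavelengths, computes the unit cell volume for R32 in the hexagonal setting (18 asymmetric units per cell) and derives the Matthews coefficient and solvent fraction. The molecular mass of 36 000 Da for GlmU‐Tr is an assumption made for this example and is not given in the text; all function names are hypothetical.

```python
import math

HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV * Angstrom


def wavelength_angstrom(energy_ev):
    """Convert a photon energy in eV to a wavelength in Angstrom."""
    return HC_EV_ANGSTROM / energy_ev


def hexagonal_cell_volume(a, c):
    """Volume (cubic Angstrom) of a cell with a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, asu_per_cell, molecules_per_asu, mass_da):
    """Crystal volume per unit of protein mass (cubic Angstrom per Da)."""
    return cell_volume / (asu_per_cell * molecules_per_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    for energy in (12288.0, 12335.0, 15400.0):
        print(f"{energy:.0f} eV -> {wavelength_angstrom(energy):.3f} A")

    volume = hexagonal_cell_volume(a=142.68, c=248.13)
    # 18 asymmetric units for R32 (hexagonal axes); 36 000 Da is an assumed mass.
    vm = matthews_coefficient(volume, asu_per_cell=18, molecules_per_asu=2, mass_da=36000.0)
    print(f"Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent_fraction(vm):.0f}%")
```

With the assumed mass, the computed Vm rounds to the quoted 3.38 Å3/Da; the solvent fraction follows from the usual approximation 1 − 1.23/Vm.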

Phasing, model building and refinement

Phases were calculated from these two mercury sites by MLPHARE (CCP4, 1994) and improved by the techniques of solvent flattening as implemented in the program DM. The correct hand was identified by inspection of the electron density maps, obtained within 15 min of completing the data collection. A preliminary Cα model was constructed in the solvent‐flattened maps using the graphics program TURBO‐FRODO (Roussel and Cambillau, 1991). Multi‐domain, 2‐fold non‐crystallographic symmetry (NCS) averaging was employed by DM, taking only the two N‐ and C‐terminal subunits, separately, allowing residues Asn4–Gly138 and Ala155–Pro328 to be built. Preliminary refinement was performed by X‐PLOR (Brünger et al., 1987) against data collected from the native crystal at 2.25 Å resolution using NCS restraints. After several iterative cycles of refinement and model building, water molecules were placed automatically using the REFMAC/ARP procedure (CCP4, 1994; Perrakis et al., 1997). X‐PLOR omit maps from the final model were used to check the N‐terminal domain; 10 residues were deleted systematically in each calculation, simulated annealing then being used to reduce model bias. The model was then refined using CNS version 0.5 (Brünger et al., 1998), giving final Rwork and Rfree values of 23 and 27.5%, respectively. This model comprises residues Asn3–Ile326 and Asn3–Phe330 for the two molecules, respectively, a MES molecule, two sulfate ions and 360 solvent molecules. The two molecules in the asymmetric unit are essentially identical, with an r.m.s. deviation of 0.47, 0.08 and 0.85 Å for Cα atoms of residues 3–227, 251–326 and 3–326, respectively. High temperature factors and weak electron density are associated with residues Tyr139–Gln166 in both molecules. The apo GlmU‐Tr structure, without solvent and cofactors, was used as a starting model for the GlmU‐Tr‐UDP‐GlcNAc complex structure. Rigid‐body refinement decreased the R‐factor to 28.7% in the 10–3.5 Å resolution range. 
Fourier difference maps clearly revealed the location of the bound UDP‐GlcNAc in the two molecules. Subsequent refinement, alternated with graphic inspection, gave R‐factor and Rfree values of 22.3 and 26.1%, respectively. This model comprises Asn3–Pro328 and Asn3–Ile326 for the two molecules, respectively, four sulfate ions, two ethylene glycol molecules and 233 solvent molecules. The stereochemistry of the two models was analysed with PROCHECK (Laskowski et al., 1993) and WHATIF (Hooft et al., 1996); 90% of the polypeptide backbone dihedral angles were found to lie in the most favourable regions of the Ramachandran plot, with the remainder in allowed regions. The coordinates of the free GlmU and GlmU–UDP‐GlcNAc complex have been deposited with the Protein Data Bank. Figure 1 was generated by Alscript (Barton, 1993), Figure 2A with TURBO‐FRODO (Roussel and Cambillau, 1991), Figures 3, 4 and 5 with Molscript (Kraulis, 1991) and Raster3D (Merritt and Murphy, 1994), Figure 4B with GRASP (Nicholls et al., 1991) and Figure 5A with Ligplot (Wallace et al., 1995).
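For readers less familiar with the refinement statistics quoted above, the crystallographic R‐factor measures the relative disagreement between observed and calculated structure factor amplitudes; Rfree is the same quantity computed over reflections set aside from refinement. The sketch below is a simplified illustration: the amplitudes are invented, the single overall scale factor is a simplification, and this is not how X‐PLOR or CNS implement the calculation.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - k |Fcalc| |) / sum(|Fobs|), with one overall scale k."""
    k = sum(f_obs) / sum(f_calc)  # simplistic scale: match the summed amplitudes
    disagreement = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return disagreement / sum(f_obs)


if __name__ == "__main__":
    # Hypothetical amplitudes, for illustration only.
    work_obs, work_calc = [100.0, 80.0, 60.0, 40.0], [90.0, 85.0, 50.0, 55.0]
    free_obs, free_calc = [70.0, 30.0], [55.0, 45.0]
    print(f"Rwork = {100 * r_factor(work_obs, work_calc):.1f}%")
    print(f"Rfree = {100 * r_factor(free_obs, free_calc):.1f}%")
```

Because the free reflections are never used to adjust the model, Rfree is normally somewhat higher than the working R‐factor, as in the values reported for both structures.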

Acknowledgements

We are grateful to Eric Brown for providing us with the pET22b‐glmU plasmid and Ana Gonzalez for expert assistance during data collection at the BW7A beamline of DESY. We thank Veronique Zamboni, Sabine Leydier, Nicolas Maguet and Sonia Longhi who initiated this work, and Louis Gastinel, Anne Belaich and Bernard Henrissat for helpful discussions. We thank Pascale Marchot for assistance in data collection and critical reading of the manuscript, and Pascal Arnoux for assistance in preparation of figures. This work was supported in part by the Centre National de la Recherche Scientifique to the UPR–9039 (Marseille) and the EP–1088 (Orsay).

