Structure of the foot‐and‐mouth disease virus leader protease: a papain‐like fold adapted for self‐processing and eIF4G recognition

Alba Guarné, José Tormo, Regina Kirchweger, Doris Pfistermueller, Ignasi Fita, Tim Skern

Author Affiliations

Alba Guarné1, José Tormo1, Regina Kirchweger2,3, Doris Pfistermueller2, Ignasi Fita1 and Tim Skern*,2

1 Centre d'Investigació i Desenvolupament (CSIC), Jordi Girona Salgado 18–26, E‐08034, Barcelona, Spain
2 Institute of Biochemistry, Medical Faculty, University of Vienna, Dr Bohr‐Gasse 9/3, A‐1030, Vienna, Austria
3 Present address: Department of Cell Biology and Genetics, Sloan Kettering Memorial Cancer Center, East 68th Street, New York, NY, USA
* Corresponding author. E-mail: timothy.skern{at}


The leader protease of foot‐and‐mouth disease virus, as well as cleaving itself from the nascent viral polyprotein, disables host cell protein synthesis by specific proteolysis of a cellular protein: the eukaryotic initiation factor 4G (eIF4G). The crystal structure of the leader protease presented here comprises a globular catalytic domain reminiscent of that of cysteine proteases of the papain superfamily, and a flexible C‐terminal extension found intruding into the substrate‐binding site of an adjacent molecule. Nevertheless, the relative disposition of this extension and the globular domain to each other supports intramolecular self‐processing. The different sequences of the two substrates cleaved during viral replication, the viral polyprotein (at LysLeuLys↓GlyAlaGly) and eIF4G (at AsnLeuGly↓ArgThrThr), appear to be recognized by distinct features in a narrow, negatively charged groove traversing the active centre. The structure illustrates how the prototype papain fold has been adapted to the requirements of an RNA virus. Thus, the protein scaffold has been reduced to a minimum core domain, with the active site being modified to increase specificity. Furthermore, surface features have been developed which enable C‐terminal self‐processing from the viral polyprotein.


The structures of a number of viral proteases have been determined recently (Babé and Craik, 1997). Comparison of these viral structures with those of prototypes of the four classes of proteases reveals similarities, but also distinct differences in mechanism and structure. Thus, the picornaviral 3C cysteine protease and the NS3 serine protease of hepatitis C virus have a similar fold to the serine protease trypsin (Allaire et al., 1994; Matthews et al., 1994; Kim et al., 1996; Love et al., 1996). In contrast, the cytomegalovirus protease possesses a single β‐barrel structure, instead of the classic two β‐barrel motif of trypsin, and a novel catalytic triad of His/His/Ser (Chen et al., 1996; Tong et al., 1996). Furthermore, the adenovirus protease, a viral cysteine protease, also has a novel fold, although it employs a catalytic triad Asp/His/Cys reminiscent of that found in the cysteine protease papain (Ding et al., 1996). Papain‐like cysteine proteases are characterized by a tryptophan residue following the nucleophilic cysteine, and two hydrophobic residues following a conserved histidine which increases the nucleophilic character of the catalytically active cysteine (Berti and Storer, 1995). Indeed, a number of viral proteolytic enzymes have been proposed to be papain‐like cysteine proteases, based mostly on the presence of such conserved patterns (Gorbalenya et al., 1991). However, the lack of direct structural information on these viral cysteine proteases has prevented the establishment of further relationships between papain and its putative viral relatives.

The leader protease (Lpro) of foot‐and‐mouth disease virus (FMDV), an animal pathogen of global importance, is a proposed papain‐like viral cysteine protease. Even though sequence identity between the FMDV Lpro and papain is no higher than 15% (Gorbalenya et al., 1991; Skern et al., 1998), the characteristic residues of papain‐like proteases are conserved in the primary sequences of the Lpro from both FMDV and the related equine rhinoviruses (ERVs). Lpro is the first protein encoded on the FMDV polyprotein (Figure 1A). Its sole role in viral maturation is to free itself from the polyprotein by cleavage between its own C‐terminus and the N‐terminus of VP4 (Figure 1B) at the sequence ArgLysLeuLys↓GlyAlaGlySer. Theoretically, this self‐processing event can occur either intra‐ or intermolecularly (Figure 1B); expression of the Lpro in various in vivo and in vitro systems has provided evidence for both types of reaction (Belsham et al., 1990; Medina et al., 1993; Piccione et al., 1995b). Nevertheless, it has not yet been established which of the two mechanisms for self‐processing is preferred in vivo. As initiation of protein synthesis on the FMDV genome occurs at one of two AUG codons lying 84 nucleotides (28 codons) apart, two species of Lpro (known as Labpro and Lbpro, depending on whether protein synthesis initiates at the first or second AUG codon, respectively) have been identified in infected cells and have been shown to possess the same enzymatic properties (Medina et al., 1993). In addition to Lpro, FMDV encodes two other proteolytic activities: the 2A peptide, which has been proposed to cleave autocatalytically between its own C‐terminus and the N‐terminus of 2B, and the 3C protease (Ryan and Flint, 1997). The 3C protease carries out all remaining cleavages, with the exception of that between VP4 and VP2, which occurs in the maturing capsid by an as yet unknown mechanism.

Figure 1.

Schematic drawing of the biological activities of Lpro. (A) The RNA genome of FMDV. The single open reading frame is shown as an open box (with the mature viral proteins indicated), non‐coding regions by a line and the IRES with a closed box. Lpro forms are stippled. (B) Protein synthesis on FMDV mRNA; Lpro self‐processing is indicated as an intra‐ or intermolecular event. (C) Role of eIF4 proteins in initiation of protein synthesis and cleavage of eIF4G by Lbpro. eIF4 proteins involved in protein synthesis and the 40S ribosomal subunit are indicated. The m7GDP 5′ cap structure of cellular mRNAs (open circle) and the Lbpro cleavage site (arrow) are indicated. The eIF4G C‐terminal domain still forms an initiation complex with IRES‐containing mRNAs.

After the single self‐processing event, Lpro then plays a central role in FMDV replication by specifically cleaving the host cell protein eukaryotic initiation factor eIF4G at the sequence AlaAsnLeuGly↓ArgThrThrLeu (Figure 1C; Kirchweger et al., 1994). As a result, the domain of eIF4G which binds the cap‐binding protein eIF4E is separated from the domain of eIF4G which binds eIF3, so that the infected cell is unable to recruit its own capped mRNA to the 40S ribosome. The viral mRNA is unaffected as it initiates protein synthesis internally via an IRES (Figure 1C). This shuts off cellular mRNA translation in favour of the viral counterpart.

We report here the three‐dimensional structure of two variants of the FMDV Lbpro, present their relationship to the papain superfamily and discuss mechanisms of self‐processing and the basis of substrate recognition for specific cleavage on two different proteins.

Results and discussion

Structure determination

Wild‐type Lbpro crystals could not be obtained. However, substitution of the active site nucleophile Cys51 (Piccione et al., 1995a; Roberts and Belsham, 1995; Ziegler et al., 1995) with Ala allowed crystals to be grown as reported (Guarné et al., 1996). These contained eight Lbpro molecules in the asymmetric unit but were difficult both to reproduce and to manipulate. Thus, although a native diffraction data set from these crystals was obtained up to 3.0 Å resolution, the corresponding phases could not be determined. We speculated that the properties of these crystals resulted from interactions between the C‐terminus of one molecule and the active site of a neighbour, and therefore prepared a truncated form of the inactive Lbpro lacking six C‐terminal amino acids (termed sLbpro). New crystals were obtained containing two sLbpro molecules in the asymmetric unit, and the structure was solved by a combination of isomorphous replacement and density modification techniques. The structure of the Lbpro Cys51Ala mutant was then determined by molecular replacement using the sLbpro structure as a search model (Table I; see Materials and methods).

Table 1. X‐ray parameters and refinement statistics

Description of the overall Lbpro structure

The Lbpro structure (Figure 2A–C) presents a compact, globular region, ranging from Met29 to Tyr183 with an overall cubic shape of approximate edge dimensions of 30 Å, from which a flexible C‐terminal extension (CTE) ranging from Asp184 to Lys201 extrudes. The globular region is divided into two subdomains, with the catalytically essential residues Cys51 (replaced by Ala in the Lbpro structure) and His148 located at the interface. The first, N‐terminal subdomain contains four α‐helices (α1, α2, α3 and α4) and two short antiparallel β‐strands (β1 and β2) comprising only residues Glu30–Thr32 and Lys38–Thr40, respectively. The longest α‐helices α1 and α3 (comprising residues Asn50–Glu64 and Leu78–Gly91, respectively) run perpendicular to each other, with the catalytic Cys51 being located towards the N‐terminal side of helix α1. The shortest helix α2 spans only six residues (Phe68–Ser73) and runs almost parallel to α3. The second Lbpro subdomain displays a fold belonging to the all β‐family of proteins, as the only regular secondary structure elements are contained in a mixed β‐sheet formed by one parallel (β3 with β4) and six antiparallel (β4–β9) β‐strands (Figure 2A–C). The essential His148 is located on the turn connecting the longest strands β5 and β6 (residues Phe137–Leu143 and Ala149–Thr155) which occupy a central position in the sheet.

Figure 2.

Overall view of the FMDV Lpro and its relationship to papain. (A) Structure‐based amino acid alignment from the indicated FMDV serotypes and ERV1. The alignments of FMDV serotypes are based on those of Ryan and Flint (1997), that of ERV1 is from Skern et al. (1998). Structurally equivalent residues in papain are shown. The N‐terminal positions of Labpro and Lbpro, secondary structure elements and the start of the CTE are also indicated. Asterisks mark Asn46, Cys51, His148 and Asp163. (B and C) Views of the structure of the FMDV Lbpro rotated 90°. α‐helices and β‐strands are coloured green and magenta, respectively. The active site Cys51 and His148 are shown as balls and sticks. The ordered CTE is in orange (referred to as CTE1), that containing the disordered amino acids (dashed lines) is in blue (referred to as CTE2). (D) Stereo drawing of superimposed Cα traces of Lbpro (blue) and papain (yellow), using the standard view of papain (Kamphuis et al., 1985). The location of the prosegment‐binding loop (PBL) of papain is indicated (see text). [Figure 2B–D as well as all parts of Figures 3 and 4 were drawn using the programs MOLSCRIPT (Kraulis, 1991), with modifications by R.Esnouf (Esnouf, 1997), and RASTER3D (Merritt and Murphy, 1994).]

The main difference between the structures of Lbpro and sLbpro is that, in the latter, the 12 residues remaining in the CTE are disordered in both molecules present in the asymmetric unit (from residue Asp184 onwards). The superimposition of the globular regions of Lbpro and sLbpro models gives an r.m.s. deviation of 0.4 Å (performed using the program SHP; Stuart et al., 1979), which can be considered as an upper limit for the coordinate errors in the compact region of the two structures.

Several groups proposed a papain‐like fold for the picornaviral Lpro structure (Gorbalenya et al., 1991; Piccione et al., 1995a; Skern et al., 1998). The papain‐fold consists of left‐ (L) and right‐hand (R) domains [standard papain‐fold view (Kamphuis et al., 1985)] which are structurally equivalent to the two subdomains described for Lbpro. Superimposition of main‐chain Cα atoms from Lbpro with papain (Figure 2A and D) gives an averaged r.m.s. deviation of 1.3 Å for 76 equivalent residues. As the globular region of Lbpro represents the smallest polypeptide fragment with a papain‐like topology, it lacks most of the decoration found in papain, including the prosegment‐binding loop (PBL; Coulombe et al., 1996; Figure 2D). Regions most conserved between the papain‐like proteases and Lbpro structures are located around the active centre, especially secondary elements α1 and β5–β6, containing the catalytic cysteine and histidine residues (Figure 2B and C). Papain‐like proteases have, however, no equivalent to the Lbpro CTE.
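The r.m.s. deviations quoted above summarize how far paired Cα atoms lie from each other once two models have been superimposed. The following is a minimal sketch of that final step only, assuming the coordinates have already been superimposed (the superposition itself was performed with SHP and is not reproduced here); the function name and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superimposed atoms.

    Each argument is a list of (x, y, z) tuples in Å; atom i of one list
    is taken to be structurally equivalent to atom i of the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy data (hypothetical): three atoms, each displaced by exactly 1 Å along y.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(round(rmsd(reference, shifted), 2))  # 1.0
```

Because the value is an average over the chosen atom pairs, it depends on which residues are treated as equivalent; the 1.3 Å figure for papain refers only to the 76 structurally equivalent residues.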

The extended conformations of the Lbpro CTEs are stabilized by a network of intermolecular interactions originating from the exchange of the CTEs between neighbouring molecules. In four out of the eight Lbpro molecules in the asymmetric unit, all residues in the CTE are visible (Figures 2C and 3) while, in the other four molecules, there is some disorder, and residues Glu186–Glu191 have not been traced (Figures 2C and 3). In the four molecules containing the disorder, the conformation of the visible residues is closely related to, though subtly different from that found in the other four (Figures 2B, C and 3). The 10 C‐terminal amino acids, residues Trp192–Lys201, are nevertheless well defined in all eight crystallographically independent subunits and present identical interactions with the substrate‐binding pocket of adjacent molecules in the crystal.

Figure 3.

Stereo view of the disposition of the Lbpro molecules in the crystal. The eight Lbpro molecules contained in the asymmetric unit are coloured. The orientation of the unit cell axes is also indicated. Four molecules, with the ordered CTEs, are shown in yellow, while the remaining four, with the disordered CTE, are shown in green. The corresponding CTEs are shown in orange and blue, respectively. Six intermolecular disulfide bridges, represented as thick balls, can also be seen, four of them with molecules of adjacent asymmetric units (in grey). The quasipolymeric character of the packing, with extensive non‐covalent interactions alternating with disulfide bridges, is apparent. Two types of non‐crystallographic symmetries are present. The first (dashed red lines) relates pairs of neighbouring molecules, whilst the second (red spot) relates four molecules of the asymmetric unit with the other four.

Besides the exchange of the CTEs between neighbouring molecules, the crystal packing of both Lbpro and sLbpro forms shows a polymeric character brought about by two noteworthy peculiarities. First, there is a covalent intermolecular disulfide bridge between adjacent molecules (Figure 3) in both the sLbpro and Lbpro forms, although the spatial relationship between disulfide‐linked molecules differs in the two crystal forms. Secondly, there is a large contact area of 1088 Å² between the helical domains of the neighbouring molecules which are related by a local 2‐fold axis. Again, the dimer defined by the two subunits in contact is present in both the sLbpro and the Lbpro crystal forms. However, as Lbpro functions enzymatically as a monomer, the biological relevance of both possible dimers is not clear.

The active site cleft

The active site containing the catalytic residues Cys51 and His148 is located on top of a deep cleft in the interdomain region, as observed for other members of the papain superfamily. Both the location of the active site and the spatial arrangement of the catalytic residues are well preserved (Figures 2B–D and 4A). In papain superfamily members, the active site histidine (P‐159; papain numbering is used throughout when describing papain‐like proteases) is maintained in the correct orientation with respect to the nucleophilic cysteine (P‐25) by a hydrogen bond to the side‐chain oxygen of a conserved asparagine residue (P‐175). Asp163 carries out this task in Lbpro (Figure 4A and B), and this residue is strictly conserved in leader proteases (Figure 2A). In all members of the papain superfamily examined so far, a tryptophan residue (P‐177) covers the hydrogen bond formed between the Asn–His pair (Figure 4C); substitution of Trp P‐177 reduces papain activity (Berti and Storer, 1995). Neither this aromatic residue nor the 11 residue loop (P‐175– P‐185) that anchors it in the papain‐like enzymes is found in Lbpro. In its place is a β‐turn containing a cluster of four acidic residues (Asp163, Asp164, Glu165 and Asp166; Figure 4B); these residues confer a strong local negative charge, so that the environment is quite different from those in most papain‐like enzymes (Figure 5). In the absence of the tryptophan residue, the carboxylate group of Asp163 may be required to form a stronger hydrogen bond with His148. Despite these differences, Lbpro represents a papain‐like enzyme without this fully conserved tryptophan residue.

Figure 4.

Active site of Lbpro. (A) Arrangement of amino acid side chains around the active site (viewed down through the central helix α1) after superimposition of papain (grey) and Lbpro (green) using the program SHP (Stuart et al., 1979). Catalytic residues of Lbpro (Asn46, Cys51, His148 and Asp163) are in yellow, those of papain (Gln P‐19, Cys P‐25, His P‐159 and Asn P‐175) in grey. (B) Network of hydrogen bonds in the Lbpro active centre. The acidic cluster (Asp163–Asp166) in the S′‐binding region is also shown. The orientation of the amide group of Asn46 is maintained by hydrogen bond interactions with the main‐chain nitrogen of Asp49 and the side chains of Asn54 and Asp164. (C) Hydrogen bonds in the papain active centre. The fully conserved Trp P‐177, in papain‐like enzymes, covering the hydrogen bond between essential catalytic residues Asn P‐175 and His P‐159 is also shown.

Figure 5.

Electrostatic potential surfaces of, left to right, papain (Kamphuis et al., 1984; 1papM PDB code), Lbpro and cathepsin L (Coulombe et al., 1996; 1aec PDB code). The yellow arrow indicates the Lbpro active centre. The standard papain view is used in the top panel; that in the lower panel, obtained by a simple 90° rotation of the upper one, is down the α1 helix. The apparent differences in charge distribution reflect the indicated differences in isoelectric points, but also the different specificity of S and S′ subsites. [Figures 5 and 7 were generated by GRASP (Nicholls et al., 1991)].

Another important catalytic residue in papain superfamily members is a conserved glutamine (P‐19) whose side‐chain amide, together with the main‐chain nitrogen of the catalytic cysteine (P‐25), stabilizes the negative charge developing on the scissile carbonyl oxygen during nucleophilic attack. This structural feature, termed the oxyanion hole, is also present in Lbpro. However, Asn46 replaces the conserved glutamine residue (Figure 4B and C). In Lbpro, the arrangement of the turn positioning Asn46 and the orientation of its side‐chain amide [flipped by 180° compared with the equivalent glutamine (P‐19) in all other papain‐like proteases] differ from those of other members of the papain superfamily (compare Figure 4B and C). This is probably due to the fact that, in Lbpro, only four residues separate Asn46 from Cys51, whereas five residues are found in other papain‐like proteases. The shorter side chain of Asn46 and its orientation are required because the tighter turn of the Lbpro brings the main chain closer to the catalytic residues than in other papain‐like enzymes. The orientation of the side chain of Asn46 is fixed by hydrogen bonds to the main‐chain nitrogen of Asp49, forming an Asn‐pseudoturn, and the side‐chain carboxylate of Asp164 (Figure 4B). This aspartate residue, conserved in all Lbpro sequences analysed so far (Figure 2A), is located immediately after the catalytic Asp163 in the acidic loop described above, and participates in an intricate network of hydrogen bonds involving residues Asn46, Asp49, Asn54 and Asp164 (Figure 4B). This network might contribute to the stability of the active site structure and catalytic activity.

The interaction between Lpro and its C‐terminus

FMDV Lpro frees itself from the growing polypeptide chain by specific cleavage at its own C‐terminus (Figures 1A,B and 2A). Thus, the presence of CTE residues inside the substrate‐binding pockets of adjacent molecules illustrates substrate recognition during self‐processing and represents, in fact, the P side of the substrate in the self‐processing reaction. The peptide backbone of the final residues of the CTE is in an extended conformation (Figure 3) similar to that observed in complexes of enzymes of the papain superfamily with peptide‐like inhibitors (Yamamoto et al., 1991, 1992).

The main interactions between the CTE and the substrate‐binding site (Figure 6) are provided by Lys201′ and Leu200′, with minor contributions from residues Lys199′–Val196′ (residues in the CTE of a symmetry‐related molecule are labelled with primed numbers). The final CTE residue, Lys201′, is positioned close to the active site Cys51 (replaced by Ala in this structure) as if catalysis had been completed. One of its carboxylate oxygens, located in the oxyanion hole, is hydrogen bonded to the side chain of Asn46 and the main‐chain nitrogen of Cys51, whereas the second carboxylate oxygen accepts a hydrogen bond from the imidazole ring of the catalytic His148 (Figure 6), expected to be protonated as described for papain (Yamamoto et al., 1991; Brocklehurst et al., 1998). The CTE establishes additional interactions with the substrate‐binding site through hydrogen bonds between main‐chain atoms. Thus, the main‐chain nitrogen of Lys201′ forms a hydrogen bond with the main‐chain carbonyl of Glu147; the main‐chain oxygen and nitrogen atoms of Leu200′ are hydrogen bonded to the main‐chain nitrogen and oxygen, respectively, of Gly98, building a short antiparallel β‐sheet. These hydrogen bond interactions are a conserved feature in substrate binding by papain superfamily members.

Figure 6.

Stereo view of the interactions between the CTE and the S site. (A) Residues involved in the formation of subsites S1–S6 are shown as balls and sticks, with nitrogen atoms coloured in blue and oxygen atoms in red. Residues of the CTE are also shown and labelled Val196′–Lys201′. (B and C) Residues involved in the formation of the S1 and S2 subsites, respectively, are shown as balls and sticks with their corresponding electron density in light blue. The P1 (Lys201′) and P2 (Leu200′) residues of the CTE are also shown.

The S1 subsite in Lbpro is a narrow cleft bounded by the loop preceding the central helix α1 on one side, the β‐turn connecting strands β5 and β6 on the other side and the active site at the bottom (Figures 2B and 6A,B). In the Lbpro structure, the aliphatic portion of the side chain of Lys201′, which occupies the S1 subsite, is sandwiched between the main chain of residues His95–Glu96 and the side chain of Glu147, while its amino group establishes electrostatic interactions with the carboxylates of Glu96 and Glu147 (Figure 6A). The S1 subsite in papain and other family members is a wide, unrestricted pocket which exerts relatively little influence on the substrate specificity (Figure 5). In Lbpro, which clearly prefers lysine at P1 in the self‐processing reaction (Figure 2A), this subsite has become narrower and deeper due to a rearrangement of the loop connecting strands β5 and β6 on the R domain. Amino acid sequence alignments and modelling of the ERV1 Lpro imply a correlation between the side chain at P1 and that of residue 147. Thus, FMDV enzymes, with lysine in P1, have a negatively charged glutamate at position 147; in contrast, the corresponding residue in ERV1, with serine at P1, is Gly149, shorter and not charged (Figure 2A).

The side chain of Leu200′ (P2) is completely buried in a hydrophobic S2 pocket formed by Trp52, Gly97–Pro100, Leu143, Glu147–Ala149 and Leu178 (Figure 6A and C). The architecture of the Lbpro S2 subsite is very similar to that of other papain superfamily proteases; in fact, the residues defining the pocket in papain are identical to those in Lbpro, with the exception of Leu143 which is equivalent to Val P‐133.

CTE residues Lys199′ (P3) and Arg198′ (P4) occupy loose pockets on opposite faces of the cleft, in the S3 and S4 subsites, respectively. The aliphatic portion of the side chain of Lys199′ makes van der Waals contacts with main‐chain atoms of residues Gly97–Gly98, and its amino group interacts through a hydrogen bond with the main‐chain carbonyl group of Glu93 and through weak ionic interactions with the side‐chain carboxylates of Glu93 and Glu96. Arg198′ makes van der Waals contacts with Gly98, Pro99, Leu143 and extensively with Gln146. Its guanidinium group also hydrogen‐bonds to the amide side chain of Gln146. Residue Gln197′ (P5) has its side chain exposed to the solvent, but still contacts through its main chain Pro99, thus making a very open subsite S5. Finally, Val196′ is buried in a hydrophobic cavity (subsite S6 formed by residues Pro99, Ala101, Val127 and Leu178), located on the interdomain cleft, just underneath subsite S2.
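The correspondence between CTE residues and substrate positions described above follows the usual convention in which residues are counted backwards from the scissile bond, so that the C‐terminal residue is P1. A small sketch of that bookkeeping, using only the residue labels given in the text; the function name is hypothetical.

```python
def assign_p_positions(residues):
    """Label residues on the N-terminal side of the scissile bond as Pn ... P1.

    `residues` is ordered N- to C-terminus; the last entry sits directly
    before the cleaved bond and is therefore P1.
    """
    n = len(residues)
    return {f"P{n - i}": residue for i, residue in enumerate(residues)}

# The six C-terminal CTE residues seen in the substrate-binding cleft.
cte = ["Val196", "Gln197", "Arg198", "Lys199", "Leu200", "Lys201"]
positions = assign_p_positions(cte)

for label in ("P1", "P2", "P3", "P4", "P5", "P6"):
    # Residue Pn is accommodated by the enzyme subsite Sn.
    print(label, positions[label], "-> subsite S" + label[1:])
```

Read this way, Lys201′ occupies S1 and Val196′ occupies S6, in agreement with the assignments given for Figure 6.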

Biological implications of the structure

Self‐cleavage at the C‐terminus. The presence of the CTE in the active site of adjacent molecules argues for intermolecular self‐processing. However, although in the crystal structure of Lbpro the CTE projects away towards neighbouring molecules, instead of folding back into its own substrate‐binding cleft, several structural features suggest that self‐processing in cis is possible and might even be favoured. First, the interface between the globular domains exchanging their CTEs is composed of weak interactions, indicating that this region is not designed to promote an intermolecular reaction. Secondly, residues located immediately after Tyr183, at which point the polypeptide chain leaves the globular region to begin the CTE, favour a turn. Notably, Asp184 and Glu186 are conserved in all serotypes of FMDV and in ERV1, enabling the placement of a conserved hydrophobic residue (Leu188) into a shallow hydrophobic pocket formed by residues Ala118, Pro121, Thr130, Met132 and Cβ of Asp136. This interaction leads the CTE polypeptide in a direction compatible with both cis and trans self‐processing. A polar, highly flexible stretch (residues Asn189–Glu191; modelled in Figure 7) should overcome the distance between this pocket and the above‐described subsites in the substrate‐binding cleft. A tryptophan residue at position 192 (or the aromatic residue found in ERV1 Tyr201) would enhance self‐processing in cis by stacking its aromatic ring with the exposed and conserved Trp105 (ERV1 Tyr103) of the globular domain. Thus, the CTE would reach subsite S6 (interaction with Val196) with minor rearrangements of the main chain (Figure 7); the electrostatic and van der Waals interactions of the CTE with the substrate‐binding cleft would maintain the correct orientation for the self‐processing in cis.

Figure 7.

Model for self‐processing of the Lbpro in cis. The electrostatic potential surface of Lbpro with CTE residues Asp184–Asn189 and Lys195–Lys201 as sticks. Coordinates from Lys195 to Lys201 correspond to a neighbouring molecule in the crystal. However, the orientation and the proximity of this fragment to Asn189 points towards a simple, direct connection within the same molecule of Lbpro. This joining is indicated by dashes in the left panel and modelled by explicitly including residues Gly189–Ala194 in the right one. Polar amino acids would be exposed to the solvent, and only Trp192, pointing towards Trp105, appears to require minor rearrangements of the flexible and highly exposed Arg109. The green arrow indicates the position of the side chain of Cys133.

Why are intramolecular CTE interactions not observed in the Lbpro crystal structure? First, the path of the CTE to its ‘own’ active site is partially blocked by the intermolecular disulfide bridge between neighbouring molecules. This bond is not believed to be present in the reducing environment inside the cell. Secondly, crystal packing requirements may have also favoured the observed CTE interactions. Finally, the ability to cleave eIF4G requires the CTE to be flexible and not remain in the active site. Evidence for the flexible nature of the CTE is provided by the lack of density in the sLbpro form and the differences in the positions of certain residues in the disordered form of the CTE. The disorder in the CTE also appears to be energetically favoured, as freezing the polar CTE in a fixed conformation would impose a high entropic penalty.

Cleavage of eIF4G. The tendency of the CTE to leave the active site of the same polypeptide chain after cis cleavage is of significance for the recognition and cleavage of eIF4G by Lbpro. Thus, the enzyme cannot be inhibited by binding to its own C‐terminus, which served as the recognition site for the cleavage on the viral polyprotein. However, as indicated above, this implies that the S site does not provide sufficient interactions to maintain the substrate in the active site; indeed, the absence of significant intramolecular product inhibition is probably a direct result of this inability. Thus, to bind to its cleavage site on eIF4G, which lacks a basic P1 residue, the enzyme appears to employ the acidic patch of the S′ site (Figure 5) to provide an ionic interaction with the P′1 Arg residue of the cleavage site on eIF4G. Indeed, it is noteworthy that all intermolecular substrates of Lbpro identified so far in vitro contain basic residues at P′1 or P′2 or both, even when the P1 residue is basic (data not shown). Taken together, these observations suggest that a substrate containing basic residues at both P1 and P′1 should be an optimized substrate for the Lbpro. As, however, no data are available on the sequence preference of Lbpro on peptide substrates, experiments are underway to investigate this notion.
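The contrast between the two cleavage sites can be made explicit by labelling each residue relative to the scissile bond and asking where the basic residues fall. The sketch below uses the two sequences quoted in this paper; the helper names are hypothetical, and treating only Lys and Arg as basic is an assumption made for illustration.

```python
BASIC = {"Lys", "Arg"}  # assumption: His is not counted as basic here

def split_three_letter(sequence):
    """Split a run of three-letter residue codes, e.g. 'LeuLys' -> ['Leu', 'Lys']."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

def label_site(p_side, p_prime_side):
    """Return {'P1': ..., "P1'": ...} for the residues flanking the scissile bond."""
    labels = {}
    # Unprimed positions count backwards from the cleaved bond.
    for i, residue in enumerate(reversed(split_three_letter(p_side)), start=1):
        labels[f"P{i}"] = residue
    # Primed positions count forwards from the cleaved bond.
    for i, residue in enumerate(split_three_letter(p_prime_side), start=1):
        labels[f"P{i}'"] = residue
    return labels

def basic_positions(labels):
    """Sorted list of positions occupied by a basic residue."""
    return sorted(pos for pos, residue in labels.items() if residue in BASIC)

polyprotein = label_site("ArgLysLeuLys", "GlyAlaGlySer")
eif4g = label_site("AlaAsnLeuGly", "ArgThrThrLeu")

print("polyprotein:", basic_positions(polyprotein))  # ['P1', 'P3', 'P4']
print("eIF4G:", basic_positions(eif4g))              # ["P1'"]
```

The polyprotein site carries its basic residues on the unprimed side, including P1, whereas the only basic residue at the eIF4G site is the P′1 arginine, which is the residue proposed above to interact with the acidic patch of the S′ site.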

The acidic patch at the S′ site of Lbpro and the narrow cleft traversing the active site (Figures 5 and 6A) also appear to be the reasons why Lbpro is clearly much more specific than papain, although the two enzymes possess an S2 pocket almost identical in composition and topology. Thus, Lbpro does not cleave an immunoglobulin molecule, a classic substrate of papain. Furthermore, although eIF4G is an efficient substrate for papain, the cleavage products are not the same as those of Lbpro (B.Hampoelz and T.Skern, unpublished).


The FMDV Lbpro is the first crystal structure determined of a viral papain‐like cysteine protease. The structure shows the interactions involved in C‐terminal processing which enable cleavage between a lysine and a glycine residue to take place. In addition, examination of the substrate‐binding cleft indicates how the enzyme can carry out the specific cleavage of the host protein eIF4G which, in contrast, requires cleavage between a glycine and an arginine residue.

Materials and methods

Protein expression, purification and crystallization

The Cys51Ala mutant of the FMDV serotype O1k Lbpro was expressed in Escherichia coli BL21 (DE3) pLysS and purified as described (Kirchweger et al., 1994). A variant of the Cys51Ala mutant lacking six amino acids at the C‐terminus (sLbpro) was expressed and purified similarly. Lbpro crystals, belonging to space group P2₁2₁2₁ with cell dimensions a = 65.4 Å, b = 101.6 Å and c = 277.0 Å, were obtained by vapour diffusion against solutions containing 10% PEG 6000, 0.8 M MgCl₂, 0.1 M Tris–HCl, pH 8.5. There are eight molecules in the asymmetric unit and an estimated solvent content of 59%. sLbpro crystals, belonging to space group C222₁ with unit cell dimensions of a = 51.0 Å, b = 130.0 Å and c = 126.2 Å, were grown from 8% PEG 4000, 0.2 M MgCl₂, 0.1 M Tris–HCl, pH 8.5. This crystal form has two molecules in the asymmetric unit and an estimated solvent content of 55%. sLbpro crystals were harvested in solutions containing 10% PEG 4000, 0.2 M MgCl₂, 0.1 M Tris–HCl, pH 8.5, prior to data collection or heavy atom screening.
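Solvent contents of this kind are conventionally estimated from the Matthews coefficient, i.e. the crystal volume per unit of protein mass. The sketch below shows the arithmetic for the two orthorhombic cells given above. Several inputs are assumptions rather than values from this paper: the molecular masses are approximated as 110 Da per residue, with 173 residues taken for Lbpro (Met29–Lys201) and 167 for sLbpro, and the solvent fraction uses the common approximation 1 − 1.23/V_M. The function name is hypothetical.

```python
def matthews(a, b, c, asu_per_cell, mols_per_asu, mol_mass_da):
    """Return (V_M in Å^3/Da, estimated solvent fraction) for an orthorhombic cell."""
    volume = a * b * c  # valid only because all cell angles are 90°
    v_m = volume / (asu_per_cell * mols_per_asu * mol_mass_da)
    # Common approximation, assuming a typical protein partial specific volume.
    solvent_fraction = 1.0 - 1.23 / v_m
    return v_m, solvent_fraction

AVERAGE_RESIDUE_MASS = 110.0  # Da; rough average, an assumption

# P2(1)2(1)2(1) has 4 asymmetric units per cell; C222(1) has 8.
lbpro = matthews(65.4, 101.6, 277.0, asu_per_cell=4, mols_per_asu=8,
                 mol_mass_da=173 * AVERAGE_RESIDUE_MASS)
slbpro = matthews(51.0, 130.0, 126.2, asu_per_cell=8, mols_per_asu=2,
                  mol_mass_da=167 * AVERAGE_RESIDUE_MASS)

for name, (v_m, solvent) in (("Lbpro", lbpro), ("sLbpro", slbpro)):
    print(f"{name}: V_M = {v_m:.2f} Å^3/Da, solvent = {solvent:.0%}")
```

With these crude mass estimates the calculation gives solvent fractions within a few percentage points of the reported 59% and 55%; the exact values depend on the molecular mass used.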

The two heavy atom derivatives of the sLbpro crystal used for phase determination were prepared by soaking at room temperature for 48 h in the harvesting solution containing 0.1 mM HgCl2 or for 24 h in the harvesting solution which had been adjusted to 10 mM K2PtCl4.

Data collection

For cryogenic X‐ray data collection, native and derivative crystals were soaked in harvesting solutions made up to 25% ethylene glycol and flash‐frozen under a stream of boiled‐off nitrogen at 100 K (Oxford CryoSystems). X‐ray diffraction data sets were collected on MarResearch image plate detector systems using a Rigaku RU‐200B rotating anode and synchrotron facilities at the X11 EMBL outstation (DESY, Hamburg). Data were indexed, reduced, scaled and merged with DENZO and SCALEPACK (Otwinowski and Minor, 1997) (Table I). Most subsequent calculations were performed with the CCP4 Program Suite (Collaborative Computational Project Number 4, 1994).

Structure determination

The structure of the sLbpro crystal form was determined by multiple isomorphous replacement. Heavy atom sites were located by Patterson methods and confirmed using cross‐phased difference maps. Refinement of heavy atom parameters and phase calculation were performed with SHARP (de la Fortelle and Bricogne, 1997). Initial phases were calculated at 3.0 Å (Table I), and were significantly improved using the density modification procedures in SOLOMON. An electron density map calculated using these phases showed clear molecular boundaries and allowed the identification of several secondary structure elements to which polyalanine chains were fitted. These elements and the positions of the heavy atom substitutions were used to determine the position of the local 2‐fold axis relating the two molecules in the asymmetric unit. Masks covering the monomer were created from skeletonized electron density and partial models using MAMA (Kleywegt and Jones, 1994), edited interactively with program O (Jones et al., 1991), and used for local averaging and solvent flattening in further cycles of density modification performed with DM (Cowtan, 1994). The final figure of merit for data in the range 25.0–3.0 Å was 0.68. The resulting electron density map was used for model building with program O. The initial model comprised 138 amino acid residues for each of the two copies in the asymmetric unit and had a crystallographic R‐factor of 0.47 for all reflections in the resolution range 10.0–3.0 Å.

Initial phases for the Lbpro (residues 29–201) crystal form were obtained by molecular replacement with the sLbpro coordinates as a model. Rotation/translation parameters were calculated with 95% of data between 15 and 3.5 Å using the AMoRe package (Navaza, 1994), together with the information on translational symmetry derived from the strong peak found in the native Patterson at position u = 0.5, v = 0.5, w = 0.225 (Navaza et al., 1998). Four independent solutions were found, and the remaining four copies are generated from them by this translation. The molecular replacement solution gave a final correlation factor of 0.69 and an R‐factor of 0.40 for data in the same resolution range.
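To illustrate what the native Patterson peak implies, the fractional vector can be converted into a displacement in Å and applied to a fractional coordinate. This is only a sketch under the assumption that the peak vector is used directly as the translation relating the two sets of four molecules; the helper names and the example coordinate are hypothetical.

```python
import math

# Cell edges of the Lbpro crystal form (orthorhombic, in Angstrom).
CELL = (65.4, 101.6, 277.0)
# Native Patterson peak (u, v, w) in fractional coordinates.
PEAK = (0.5, 0.5, 0.225)

def frac_to_cart(frac, cell):
    """Fractional to Cartesian; valid because all cell angles are 90 degrees."""
    return tuple(f * edge for f, edge in zip(frac, cell))

def translate(frac_xyz, shift):
    """Shift a fractional coordinate and wrap it back into the unit cell."""
    return tuple((x + s) % 1.0 for x, s in zip(frac_xyz, shift))

shift_cart = frac_to_cart(PEAK, CELL)
shift_len = math.sqrt(sum(d * d for d in shift_cart))

# Hypothetical fractional position of one molecule's centre of mass.
centre = (0.9, 0.1, 0.8)
partner = translate(centre, PEAK)

print("translation (A):", shift_cart, "length:", round(shift_len, 1))
print("translated centre:", partner)
```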


Refinement followed standard protocols, iterating between the program X‐PLOR (Brünger, 1992) and manual rebuilding in the interactive graphics program O. For both crystal forms, bulk solvent, overall anisotropic B‐factor corrections and tight non‐crystallographic restraints were introduced based on the behaviour of the Rfree index (Table I). The non‐crystallographic restraints for the Lbpro crystal form were applied considering two groups of 4‐fold symmetrically related molecules.
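For readers unfamiliar with the indices used to monitor refinement, the sketch below computes the conventional crystallographic R‐factor, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, separately for a working set and for a test set of reflections excluded from refinement (Rfree). The amplitudes and test-set flags are invented toy values, not data from this study, and the amplitudes are assumed to be already on a common scale.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor over paired observed/calculated amplitudes."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den

def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by flag; True marks the test (free) set."""
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    r_work = r_factor(*zip(*work))
    r_free = r_factor(*zip(*free))
    return r_work, r_free

# Toy amplitudes (hypothetical, for illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
free_flags = [False, False, False, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the test reflections never enter refinement, an Rfree that improves when a restraint or correction is introduced suggests that the change is not simply overfitting the working data.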

The refined atomic model for the sLbpro form comprises residues 29–187 for both copies in the asymmetric unit and 20 ordered solvent molecules, and has an R‐factor of 23.5% (Rfree = 28.8%) for all data between 20.0 and 3.0 Å. Of the non‐glycine residues, 82.1% fall within the ‘most favoured regions’ of the Ramachandran plot, on the higher side of acceptable for a 3.0 Å structure as defined by the program PROCHECK (Laskowski et al., 1994). The rest are inside the ‘additional allowed regions’, except for Asp164 which presents good electron density and is located at position i + 1 of a type II′ β turn. No electron density is observed for residues 184–187, which form part of the CTE and project away from the globular catalytic domain. The remainder of the residues, apart from a few solvent‐exposed side chains, present well‐defined electron density for both molecules in the asymmetric unit.

The refined atomic model for the Lbpro form comprises residues 29–201 for four copies in the asymmetric unit while the other four do not include residues 186–192. The N‐terminal residue appears to be modified by a well‐defined acetyl group in the eight independent subunits. The present model has an R‐factor of 25.8% (Rfree = 29.6%) for all data between 20.0 and 3.0 Å. No electron density is observed for residues 186–192 in copies with the disordered conformation of the CTE; however, the remaining residues, apart from a few solvent‐exposed side chains, present well‐defined electron density for all molecules in the asymmetric unit.


We thank F.Torrents and A.Marina for assistance, J.Navaza for his valuable advice, and J.Bravo, F.X.Gomis‐Rüth and J.Seipelt for critical reading of the manuscript. This work was supported by the Austrian Science Foundation (P‐11222 to T.S.) and DGICYT [PB95‐0218 (to I.F.) and P96‐0271 (to J.T.)]. Data collection in Hamburg was supported by the Human Capital Mobility Project, contract CHGE‐CT93‐0040. A.G. is the recipient of a fellowship from the Ministerio de Educacion y Cultura (Spain).

