The crystal structure of chicken egg white riboflavin‐binding protein, determined to a resolution of 2.5 Å, is the prototype of a family that includes other riboflavin‐ and folate‐binding proteins. An unusual characteristic of these molecules is their high degree of cross‐linking by disulfide bridges and, in the case of the avian proteins, the presence of stretches of highly phosphorylated polypeptide chain. The structure of chicken egg white riboflavin‐binding protein is characterized by a ligand‐binding domain and a phosphorylated motif. The ligand‐binding domain has a fold that appears to be strongly conditioned by the presence of the disulfide bridges. The phosphorylated motif, essential for vitamin uptake, is made up of two helices found before and after the flexible phosphorylated region. The riboflavin molecule binds to the protein with the isoalloxazine ring stacked in between the rings of Tyr75 and Trp156. This geometry and the proximity of other tryptophans explain the fluorescence quenching observed when riboflavin binds to the protein.
Transport of riboflavin in mammalian plasma appears to be non‐specific under normal conditions. Both albumin (Jusko and Levy, 1975) and riboflavin‐binding immunoglobulins (Innis et al., 1985; McCormick et al., 1987) are riboflavin carriers, but in recent years evidence has emerged indicating that normal transport mechanisms are no longer sufficient during pregnancy. Since adequate amounts of riboflavin are essential for normal fetal development, it is not surprising that under these more demanding conditions a specific carrier system has evolved with the special task of vitamin delivery to the developing embryo. Thus, pregnancy‐specific riboflavin‐binding proteins have been found in rat (Muniyappa and Adiga, 1980), mouse (Natraj et al., 1987), bovine (Merrill et al., 1979), simian (Visweswariah and Adiga, 1987) and human (Murthy and Adiga, 1982b) plasma. Although these carriers have not been characterized thoroughly, they all appear to share a common feature: their similarity to the well‐known chicken riboflavin‐binding proteins (RfBPs). Antibodies against chicken RfBP have been found to terminate pregnancy in rats (Murthy and Adiga, 1982a; Krishnamurty et al., 1984), mice (Natraj et al., 1987) and the bonnet monkey (Visweswariah and Adiga, 1987), thus showing that there is structural similarity between the avian RfBP and the mammalian pregnancy‐specific riboflavin‐binding proteins and that these proteins play an essential role in the survival of the fetus.
The family of the chicken RfBPs includes three well‐known proteins that are the product of the same gene but have undergone different post‐translational modifications (White and Merrill, 1988). The proteins can be purified from egg white (Rhodes et al., 1959), egg yolk (Ostrowski et al., 1962) and from the plasma of laying hens (Miller et al., 1982a). Egg white RfBP is synthesized by the oviduct cells (Mandeles and Ducay, 1962), plasma RfBP is produced in the liver under estrogen control and yolk RfBP is the result of a proteolytic cleavage of the last 11‐13 amino acids of plasma RfBP when the molecule crosses the oocyte membrane (Norioka et al., 1985). The three proteins have a single binding site for riboflavin with a dissociation constant of 1.3 nM in the pH range 6‐9; removal of the vitamin from the holoprotein is accomplished at lower pH where the affinity for the specific ligand is substantially reduced (Müller and van Berkel, 1991). Plasma and egg white RfBP share a sequence which is 219 amino acids long (Hamazume et al., 1984) but they have different carbohydrates attached in the same positions: Asn36 and Asn147. The primary structures of the carbohydrates attached to plasma (Rohrer and White, 1992) and yolk (Tarutani et al., 1993) RfBP have been determined and shown to be identical. Although the carbohydrate composition of the egg white RfBP is known, the primary structure of its oligosaccharide chains has not yet been determined.
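To give a feeling for what a dissociation constant of 1.3 nM implies, the following minimal sketch evaluates the fractional occupancy of a single binding site with the standard single‐site isotherm, θ = [L]/(Kd + [L]). It assumes simple equilibrium binding at one site and takes [L] as the free riboflavin concentration; the function name and the concentrations in the loop are illustrative choices, not values from the experiments cited above.

```python
def fraction_bound(free_riboflavin_nM: float, kd_nM: float = 1.3) -> float:
    """Fractional occupancy of a single site: theta = [L] / (Kd + [L]).

    free_riboflavin_nM is the FREE ligand concentration (assumption:
    simple one-site equilibrium, no cooperativity, no competing ligands).
    """
    if free_riboflavin_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_riboflavin_nM / (kd_nM + free_riboflavin_nM)


# Illustrative free-ligand concentrations spanning three decades around Kd.
for conc in (0.13, 1.3, 13.0, 130.0):
    print(f"[L] = {conc:7.2f} nM -> fraction bound = {fraction_bound(conc):.3f}")
```

By construction the site is half occupied when the free ligand concentration equals Kd, which is why a nanomolar Kd corresponds to essentially complete saturation at micromolar riboflavin.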
It was shown recently that all three RfBPs can associate with the lipid carrier vitellogenin and that the macromolecular complex of the two carriers recognizes a multifunctional oocyte‐specific lipoprotein receptor (Mac Lachlan et al., 1994). A peculiar characteristic of the three chicken RfBPs is the presence of a highly phosphorylated region that extends from amino acids 186 to 197, where there are up to eight phosphates bound to serines (Hamazume et al., 1984). This region is involved in oocyte uptake of the plasma protein (Miller et al., 1982b), possibly by playing an essential role in the interaction of plasma RfBP with vitellogenin (Mac Lachlan et al., 1994). Another unique characteristic of the three chicken RfBPs is the presence of nine disulfide bridges, whose assignment was made chemically in the egg white protein (Hamazume et al., 1987). These bridges play an important structural role, since refolding of the molecule is fast and efficient when denaturation is carried out leaving them intact (McClelland et al., 1995). The interactions of the chicken RfBPs with different flavin derivatives are probably the best studied of any flavoprotein (Lubas et al., 1977; Walsh et al., 1978; Nishina et al., 1980; Becvar and Palmer, 1982; Matsui et al., 1982; Wessiak et al., 1984). Upon binding to the chicken RfBPs, riboflavin loses its characteristic fluorescence, a fact attributed to stacking of the isoalloxazine ring of the vitamin with aromatic side chains of the protein molecule (Blankenhorn, 1978). A search for sequence homology reveals that the RfBPs are structurally related to the folate‐binding proteins (FBPs) (Zheng et al., 1988) but not to any protein of known three‐dimensional structure.
The crystal structure of hen egg white RfBP, the first member of this new structural family, is presented here. It is expected that the model described will serve as a starting point for structural studies of other RfBPs and also for the FBPs.
Results and discussion
Overall structure of chicken RfBP
Chicken RfBP is a globular monomeric protein of approximate dimensions 50×40×35 Å (Figure 1). The most distinctive feature of this new fold is the presence of a ligand‐binding domain which runs from the N‐terminus up to about Cys169, and a phosphorylated motif which runs from there to the C‐terminus. In the ligand‐binding domain, there is a cleft ∼20 Å wide and 15 Å deep that accommodates the bound vitamin. Figure 2 shows the electron density of the isoalloxazine ring of riboflavin stacked in between the side chains of Tyr75 and Trp156. About 30% of the residues in RfBP are found in α‐helices and a little less than 15% in β structure. There are a total of six α‐helices, designated A‐F, and four series of discontinuous areas of β structure (a, b, c and d). Helices B and D each have a three residue interruption in a position where there is a pronounced bend that completely changes their direction. The first three areas of β structure, a, b and c, present one, two and three gaps filled by coil, respectively. The longest continuous portion of β structure in the molecule is about five residues long. The ligand‐binding domain includes helices A‐D and the four areas of β structure; the phosphorylated motif is made up of helices E‐F and, in between, the phosphorylated region which is not well ordered in the maps. Other regions of the molecule that are not ordered in the crystals are the first two and the last seven residues. The structure of the ligand‐binding domain appears to be strongly influenced by the presence of eight out of a total of nine disulfide bridges. The ninth bridge, which links Cys167 to Cys202, anchors helix F to the ligand‐binding domain and is probably the reason why this portion of the map is ordered, although it follows the flexible phosphorylated area in the sequence. Characteristic of the topology of the ligand‐binding domain is an alternation of the α‐helices and the areas of discontinuous β structure (see Figure 1B).
In the topological diagram, the arrows do not represent two pairs of antiparallel β‐strands but areas of β structure that have twists, bends and gaps and which are found in between the helices represented in the figure. The areas of β structure are very complex, as can be appreciated by inspection of Figure 1A.
Table I lists the secondary structure assignments of the 18 cysteines present in the molecule, all of them participating in disulfide bridges. Nine of these residues are assigned to coils, seven to helices and two to the areas of β structure. Helix C is disulfide‐bridged to both helices B and D. The first 35 amino acids in the sequence, the longest portion of the chain that is virtually without elements of repetitive secondary structure, include Cys5, Cys24, Cys32 and Cys33 participating in three disulfide bridges.
The position of the two oligosaccharide chains of the molecule is indicated in Figure 1A. The first chain, bound to Asn36, is found at the beginning of helix A, and the second, linked to Asn147, is in the middle of the coil present in the bend of helix D. Although clear electron density is observed in both cases for the proximal sugar residues, no attempt was made to include them in the model since their primary structure is unknown.
It has been pointed out (Hamazume et al., 1984) that there exists genetic variability at position 14 that can be either a lysine or an asparagine. Since this position is on the surface of the molecule and the electron density of its side chain is not well defined, it has not been possible to decide which of the two residues is present in these crystals.
A search of the EMBL database using the program DALI (Holm and Sander, 1993) confirmed that the fold of chicken RfBP is not similar to that of any of the 680 proteins that were included in that file.
The ligand‐binding site
The electron density for riboflavin and the amino acid side chains that are in contact with it is very clear in the DM multiple isomorphous replacement (MIR) map (see Figure 2). Binding of riboflavin occurs in a cleft with the vitamin isoalloxazine ring stacked between the parallel planes of Tyr75 and Trp156. A question which was less simple to resolve was whether binding takes place with the pyrimidine or the xylene moieties buried most deeply in the protein. The isoalloxazine ring of flavins is amphipathic since the xylene portion is hydrophobic and the pyrimidine moiety hydrophilic. When flavins bind to a protein with the hydrophilic end buried, as for example in the case of flavodoxins (Watenpaugh et al., 1973; Rao et al., 1992), a series of strong hydrogen bonds to the hydrophilic portion of the molecule characterizes the protein‐ligand interactions. The electron density for the vitamin and the position of the protein side chains indicated quite clearly that in RfBP this is not the case, i.e. that in RfBP it is the xylene moiety of the triple ring that is buried most deeply in the protein. This mode of binding is in agreement with the results of many studies of flavin derivative binding to chicken RfBP (Merrill and McCormick, 1978; Choi and McCormick, 1980; Mifflin and Langerman, 1983; Ghisla and Massey, 1986). Table II lists the main inter‐atom distances measured between the protein side chains in the ligand‐binding site and the vitamin groups near them and in between atoms of the ribityl moiety. Figure 3A is a stereo diagram of the main side chains present in the ligand‐binding site, and Figure 3B is a schematic representation of their interactions. As expected, the major interactions of the isoalloxazine ring with the protein are hydrophobic.
A characteristic feature of ligand binding to RfBP is the almost complete fluorescence quenching observed upon binding of not only the natural ligand, riboflavin, but of flavin analogs as well. Figure 4 is a stereo diagram that shows Tyr75 and all the tryptophan side chains present in the protein molecule. With the exception of Trp84, all the tryptophans are close to the ligand‐binding site. Trp120 is ∼4 Å distant from the crucial Trp156 stacked onto the vitamin plane and roughly at the same distance from Trp124. Trp106 is only slightly further from Trp156, ∼4.5 Å. The same distance separates Trp120 from Trp54. Thus, five out of the six tryptophans present in the protein molecule cluster in the vicinity of the ligand‐binding site, and the plane of one of them (Trp156) is in direct contact with the plane of the ligand.
Bound riboflavin was found to protect an essential carboxyl group in the ligand‐binding site from inactivation by carbodiimide (Kozik, 1982a). The position of the residue in the sequence has not been determined, but inspection of Table II reveals that the most likely candidate is Glu72, which is hydrogen‐bonded to the O3* group of the ribityl moiety of the vitamin.
Cleavage of a single disulfide bond results in a loss of the binding capacity of RfBP, but all the disulfide bridges in the holoprotein show the same reactivity as in the apoprotein, indicating that the critical bridge is not located in the ligand‐binding site (Kozik, 1982b). Figure 5 shows the position of the bridges relative to the bound riboflavin. None of them is at the binding site. The two that are closest are Cys24‐Cys73 and Cys103‐Cys152, which are at a distance of ∼9‐10 Å from the methyl groups of the xylene moiety of the isoalloxazine ring.
The phosphorylated motif
The phosphorylated motif is found after Cys169, a member of the last bridge which is located exclusively in the ligand‐binding domain, and it continues to the C‐terminus of the protein molecule (see Figure 1A). A remarkable feature is the highly anionic region that runs from residue 186 to 199 (E‐PS‐PS‐E‐E‐PS‐PS‐PS‐M‐PS‐PS‐PS‐E‐E, PS = phosphoserine). If the phosphates are removed with acid phosphatase, plasma protein uptake by the oocyte decreases dramatically (Miller et al., 1982b). After it was shown that RfBP binds to an oocyte‐specific receptor in association with vitellogenin, it was suggested that this interaction involves bridges with calcium ions between the phosphoserines of the two proteins (Mac Lachlan et al., 1994). Unlike most of the rest of the protein molecule, this anionic region is not ordered in the electron density maps, but the polypeptide chain immediately before and after it folds clearly into the last two helices of the molecule, E and F. An attempt to render it more ordered was made by soaking a crystal in CaCl2, but the two electron density maps, in the presence and in the absence of calcium, did not differ significantly. The phosphorylated motif can thus be described as made up of a flexible anionic region which is inserted in between two antiparallel helices. It has been noticed that the β‐caseins have phosphorylated regions preceded and followed by chains that are similar in sequence to that of chicken RfBP (Holt and Sawyer, 1988). Therefore, structurally analogous motifs may be present in the similarly phosphorylated caseins. The second helix of the motif, F, is anchored to the ligand‐binding domain through the Cys167‐Cys202 bridge. The two helices are found on the surface of the molecule and the anionic region protrudes into the solvent, which is compatible with the function which has been assigned to the motif (see Figure 1).
The chicken egg white RfBP crystals were grown using ammonium sulfate as precipitant (Zanette et al., 1984). When crystals grown in this mother liquor were transferred to solutions that contained 3.8 M ammonium succinate instead of sulfate, it was found that the crystals not only retained their integrity but they even diffracted to slightly better resolution. The unit cell parameters changed by <1% but the mean fractional isomorphous difference was found to be >40% (see below).
Inspection of the model of the protein in this second crystal form showed that the main effect of succinate binding was a rigid body rotation about an axis that is close to that of the first portion of helix D. An Fobs−Fc map of the succinate form, calculated with the refined model, revealed that there was residual density that could be associated with four or five bound succinate molecules. The most interesting was present in a cleft found in between the ligand‐binding domain and the phosphorylated motif (see Figure 6). The Fobs−Fc map of the equivalent sulfate form showed that this density was not present there. A succinate molecule that explained the density found was therefore added to the protein model of the first crystal form. Further refinement with the added succinate showed that the new moiety behaved quite reasonably, and in the end its temperature factors converged to values that were ∼50 for all its atoms (the temperature factors of the other regions of density present in the succinate and absent in the sulfate maps refined all to higher values).
The carboxyl end of the liganded succinate which is buried deepest in the cleft appears to be hydrogen‐bonded to the side chain of His80. The other end is hydrogen‐bonded to Lys174 and the peptide carbonyl of Leu26. The residues which are in closest contact with the succinate moiety are Leu26 and Tyr27, Pro79 and His80 in the ligand‐binding domain and Lys174, Asp176, Met177 and Leu180 in the phosphorylated motif. The residues of the ligand‐binding domain are found in a helical turn of the coiled region that precedes helix A and in the bend of helix B. The amino acids of the phosphorylated motif are part of helix E.
The fact that this succinate‐binding cleft is found at the interface between domain and motif suggests that it may not be a mere artifact but it may have some as yet unidentified function. Both egg white (Mano et al., 1992) and yolk (Massolini et al., 1995) RfBPs have been used as stationary phases for the separation of chiral compounds that in some cases have structures that are quite different from that of riboflavin; some of those compounds may bind in this cleft.
Structural homology with folate‐binding proteins
Figure 7 compares the amino acid sequence of chicken RfBP (Zheng et al., 1988) with those of bovine milk FBP (Svendsen et al., 1984) and FBP from human malignant tissue culture (KB) and placenta cells (Elwood, 1989). The boxed residues are those that are identical in RfBP and at least one of the FBPs. The sequence similarity of the residues that in RfBP fold into the ligand‐binding domain to equivalent portions in the two FBPs justifies the proposal that a structurally similar binding domain is present in the FBPs. The FBPs play an essential role in the distribution and assimilation of folic acid (for a review, see Henderson, 1990). They can be either water soluble or membrane bound; bovine milk FBP is a prototype of the first class while human FBP has a C‐terminal portion which is hydrophobic enough to make it compatible with a membrane‐anchoring motif (Elwood, 1989).
It is worth noticing that the 16 cysteines that in RfBP form the eight bridges present in the ligand‐binding domain are conserved in all the three proteins. The C‐terminal portion of the molecules, on the other hand, shows very little similarity, which is what would be anticipated for motifs with different functions.
The structure established here is thus related more closely to the FBPs than to the known structures of other flavoproteins which bind ligands that contain the flavin moiety.
Materials and methods
Protein purification and crystallization
The purification and crystallization of chicken egg white RfBP have been described previously (Zanette et al., 1984). The crystals were stable for a very long time if kept in the dialysis microcells in 43% (w/v) ammonium sulfate, 0.05 M Tris pH 8.5. When the crystals were transferred to a solution of 3.8 M ammonium succinate, 0.05 M Tris pH 8.5, the unit cell parameters changed by <1% but the mean fractional isomorphous difference between the two native data sets was 43.2%. The structure was solved using crystals transferred to the succinate mother liquor because resolution and data quality were slightly better in this medium.
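The text does not spell out how the mean fractional isomorphous difference was calculated. A commonly used definition is Σ||F_A| − |F_B|| / Σ|F_A|, summed over the reflections common to both data sets after they have been placed on a common scale. The sketch below implements that definition; the function name, the choice of the first data set in the denominator and the toy amplitudes are assumptions made for illustration and are not the measured RfBP data.

```python
def mean_fractional_isomorphous_difference(f_a, f_b):
    """Return sum(| |F_A| - |F_B| |) / sum(|F_A|) over common reflections.

    f_a, f_b: dicts mapping Miller indices (h, k, l) to structure factor
    amplitudes. Assumption: both sets are already on a common scale.
    """
    common = f_a.keys() & f_b.keys()  # only reflections measured in both sets
    if not common:
        raise ValueError("the two data sets share no reflections")
    numerator = sum(abs(f_a[hkl] - f_b[hkl]) for hkl in common)
    denominator = sum(f_a[hkl] for hkl in common)
    if denominator <= 0:
        raise ValueError("sum of reference amplitudes must be positive")
    return numerator / denominator


# Toy amplitudes (hypothetical, NOT the RfBP measurements).
f_sulfate = {(1, 0, 0): 100.0, (1, 1, 0): 50.0, (0, 0, 3): 80.0, (2, 1, 1): 20.0}
f_succinate = {(1, 0, 0): 60.0, (1, 1, 0): 70.0, (0, 0, 3): 80.0, (3, 0, 0): 10.0}

diff = mean_fractional_isomorphous_difference(f_sulfate, f_succinate)
print(f"mean fractional isomorphous difference = {100 * diff:.1f}%")
```

A value above 40% with unit cell changes below 1% indicates that the two crystal forms differ substantially in their contents even though the lattice is nearly unchanged, which is consistent with the rigid body rotation described for the succinate form.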
Data collection and reduction
All data were collected at room temperature from crystals with typical dimensions 1.5×0.4×0.4 mm (Table III). The native data sets in succinate and sulfate and the Pt and 2‐(acetoxy‐mercuri)‐4‐nitrophenol (MNP) derivatives were collected on a Rigaku R‐axis II imaging plate area detector mounted on a Rigaku RU‐200 rotating anode X‐ray generator. The source was operated at 50 kV and 160 mA using a focal spot size of 0.3×3 mm. Monochromatic copper Kα radiation was obtained using a graphite crystal monochromator. These data were processed using the R‐axis program package. The mercury derivative data set was collected on a Mar Research imaging plate area detector at the X11 beamline of the EMBL outstation of the DESY synchrotron (Hamburg, Germany). The wavelength was 0.9 Å and the data were processed using the DENZO package (Otwinowski, 1993). Friedel pairs were kept separate to be used in the phasing.
The Patterson difference functions of the Pt and the Hg derivatives collected in the laboratory were interpreted using programs of the CCP4 (Collaborative Computational Project Number 4, 1994) and SHELX‐90 (Sheldrick, 1991) packages. From the preliminary phasing, it became apparent that the second derivative potentially could be exploited for anomalous dispersion phasing, and it was decided to collect Bijvoet pairs under more favorable conditions. The mercury derivative data collected in Hamburg, which included 81% of the pairs up to a resolution of 2.5 Å, were used initially to resolve the ambiguity concerning the crystal space group. The crystals of chicken egg white RfBP are trigonal, with a = b = 112.5 Å and c = 72.0 Å. The diffraction pattern is compatible with the two enantiomeric space groups P3121 and P3221. The ambiguity was resolved using the classical method (Blundell and Johnson, 1976). The single major site of the Hg derivative was used to calculate single isomorphous replacement (SIR) plus anomalous dispersion phases which were then used to calculate the intensity of the Pt peaks in difference Fourier syntheses calculated for the two possible space groups. The intensity of the Pt peak corresponding to space group P3221 was approximately three times as high as that of the space group P3121. The heavy atom positions of the three derivatives were refined and MIR phases were calculated using the program MLPHARE (Otwinowski, 1991). The overall figure of merit for the reflections in the resolution interval from 10 to 2.8 Å was 0.69. The calculated map was then subjected to solvent flattening and histogram matching using the program DM (Cowtan, 1994). The map thus produced (see Figure 2) was of very good quality and readily allowed tracing of the polypeptide chain.
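As a small worked example of the cell geometry quoted above, the following sketch computes the unit cell volume from the general triclinic formula, using α = β = 90° and γ = 120° for the hexagonal axes of a trigonal cell. The division by six reflects the six general positions of space groups P3121 and P3221. The function name is hypothetical, and no solvent content is derived because the molecular mass is not given in this section.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic angstroms) of a general unit cell.

    V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2 cos(alpha) cos(beta) cos(gamma))
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Cell parameters from the text; hexagonal axes assumed for the trigonal cell.
A_AXIS, C_AXIS = 112.5, 72.0
volume = unit_cell_volume(A_AXIS, A_AXIS, C_AXIS, 90.0, 90.0, 120.0)

# P3(1)21 and P3(2)21 each have six general positions per unit cell.
GENERAL_POSITIONS = 6
asymmetric_unit_volume = volume / GENERAL_POSITIONS

print(f"unit cell volume       = {volume:,.0f} cubic angstroms")
print(f"asymmetric unit volume = {asymmetric_unit_volume:,.0f} cubic angstroms")
```

For this cell the general formula reduces to V = a²c·sin 120°, so the result can be checked by hand.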
Model building and refinement
The RfBP model was built using the program O (Jones et al., 1991). The presence of good density for the disulfide bridges facilitated the assignment of the primary sequence. Since the primary structure of the oligosaccharide chains is unknown, no attempt was made to include them in the model. In the region between Leu184 and Glu199 no clear density was present in the maps and, therefore, those amino acids are missing from the model. The R‐factor for the initial model was 43.8% for the data in the resolution interval between 10 and 2.8 Å. After conventional conjugate gradient least squares refinement using the program XPLOR (Brünger, 1992a), the R‐factor decreased to 31.5% and the free R‐factor (Brünger, 1992b), calculated for 5% of the reflections, to 34.2%. Subsequent cycles of model building alternated with refinement brought the R‐factor to 26.2% and the free R‐factor to 31.3%. During the process of refinement and model building, the quality of the model was controlled using the program PROCHECK (Laskowski et al., 1993). Additional refinement was done using the program TNT (Tronrud et al., 1987). When the model had an R‐factor of ∼21%, its coordinates were used to calculate an R‐factor using the data collected from the native crystals in the sulfate mother liquor. This initial R‐factor was 43.2% for data in the resolution interval between 10 and 2.8 Å. After rigid body refinement, this R‐factor decreased to 31.7% and, after some minor corrections and further cycles of refinement, to 23.0%. Table IV lists the final refinement statistics of the models of RfBP in the succinate and sulfate mother liquors. All the residues of the two models are in the energetically favored regions of the Ramachandran plot (Laskowski et al., 1993).
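The R‐factors quoted above follow the conventional definition R = Σ||Fobs| − k|Fcalc|| / Σ|Fobs|. The free R‐factor uses the same expression, evaluated only on a test set of reflections (here 5%) that is withheld from refinement. The sketch below illustrates both ideas on toy numbers; it is not the XPLOR or TNT implementation, and the function names, the random selection scheme, the fixed scale factor and the amplitudes are all assumptions made for illustration.

```python
import random


def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a test set; returns (work_indices, free_indices).

    The free set must be chosen once, before refinement, and never be
    used to adjust the model (the selection scheme here is hypothetical).
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(n_reflections * free_fraction))
    free = set(indices[:n_free])
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Toy amplitudes (hypothetical, NOT the RfBP data).
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
print(f"R = {100 * r_factor(f_obs, f_calc):.1f}%")

work, free = split_work_free(200)
print(f"{len(work)} working reflections, {len(free)} test reflections")
```

In practice the working R‐factor is computed over the `work` indices and the free R‐factor over the `free` indices; a free R‐factor that stays several points above the working value, as reported here, is the expected behavior.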
I am grateful to Alessandro Coda, Menico Rizzi, Andrea Carfì, Sandro Ghisla, Andrea Mattevi, Elena Giulotto and Silvia Onesti for many fruitful discussions. This work was supported by grants from the Italian Ministry of the Universities, the Italian National Research Council (P.F. Structural Biology) and the Italian Space Agency (ASI). I thank the European Union for support of the work done at the EMBL outstation in Hamburg through the HCMP to Large Installations Project, contract No. CHGE‐CT93‐0040.
Copyright © 1997 European Molecular Biology Organization