POU domain transcription factors have two separate helix–turn–helix DNA‐binding subdomains, the POU homeodomain (POUhd) and the POU‐specific domain (POUs). Each subdomain recognizes a specific subsite of 4 or 5 bp in the octamer recognition sequence. The Oct‐1 POU subdomains are connected by a 23 amino acid unstructured linker region. To investigate the requirements for the linker and its role in DNA recognition, we constructed POU domains in which the subdomains are connected with linkers varying in length between 2 and 37 amino acids. Binding to the natural octamer site required a minimal linker length of between 10 and 14 amino acids. A POU domain with an eight amino acid linker, however, had a high affinity for a site in which the POUs recognition sequence was inverted. Computer modelling shows that inversion of the POUs subdomain shortens the distance between the subdomains sufficiently to enable an eight amino acid linker to bridge the distance. DNase I footprinting as well as mutation of the POUs‐binding site confirms the inverted orientation of the POUs domain. Switching of the POUs and POUhd subdomains and separation by 3 bp leads to a large distance which could only be bridged effectively by a long 37 amino acid linker. In addition to linker length, mutation of a conserved glutamate residue in the linker affected binding. As shown by surface plasmon resonance measurements, this was caused by a decrease in the on‐rate. Our data indicate that there are both length and sequence requirements in the linker region which allow flexibility leading to selective binding to differently spaced and oriented subsites.
High sequence specificity of cis‐acting proteins is essential for correct target site selection. Several strategies are utilized to achieve accurate binding. Most of these require two or more proteins which either hetero‐ or homodimerize or bind DNA cooperatively. Increased specificity is also achieved when independent DNA‐binding domains are joined via covalent linkage. Connecting two or more domains can create a novel DNA recognition protein with combined specificity and higher affinity. Binding of one subdomain tethers the other, thereby creating a high local concentration. Because of this chelating effect, a stable DNA‐binding domain is formed. Examples of such connected DNA‐binding modules are zinc finger proteins (Pavletich and Pabo, 1993), the myb gene family (Ogata et al., 1994), Cut repeat homeoproteins (Andres et al., 1994) and POU domain proteins (Herr and Cleary, 1995). In the latter family of transcription factors, a homeodomain (POUhd) is covalently attached at its amino‐terminus to another helix–turn–helix DNA‐binding domain, the POU‐specific domain (POUs). The isolated POUhd makes specific contacts to the sequence (A/T)AAT and the POUs subdomain recognizes TATGC (Verrijzer et al., 1992a). The recognition sequence of the bipartite Oct‐1 POU protein consists of the consecutive joining of the two separate subsites to form the optimal octamer sequence, TATGC(A/T)AAT.
Extensive structural studies of the isolated subdomains and the intact Oct‐1 POU domain, employing both NMR spectroscopy and X‐ray diffraction, have established the folding topology of this DNA‐binding domain (Assa‐Munt et al., 1993; Dekker et al., 1993; Klemm et al., 1994; Cox et al., 1995). In Oct‐1, both subdomains are coupled by a stretch of 23 amino acids which does not have a defined structure in the Oct‐1 co‐crystal (Klemm et al., 1994). Proteolysis experiments show that this linker sequence is accessible to proteases both when bound to DNA (Aurora and Herr, 1992) and in solution (Botfield et al., 1992), suggesting that this region is a disordered and possibly flexible part of the protein. This agrees well with the lack of sequence conservation and length observed for the >40 POU domain family members where the linker length varies from only 15 amino acids in Pit‐1 (Ingraham et al., 1988) to as many as 57 amino acids in the Caenorhabditis elegans Ceh‐18 gene product (Greenstein et al., 1994).
Almost all residues which make DNA contacts in the Oct‐1 co‐crystal structure are conserved. Nevertheless, the optimal binding site differs considerably amongst the various POU domain family members (Aurora and Herr, 1992; Verrijzer et al., 1992c). This may indicate that the linker sequence plays a role in site selection. Earlier experiments showed that when linkers are exchanged between Pit‐1 and Oct‐1, the DNA binding specificity of the POU domains is influenced but only in particular POU domain contexts (Aurora and Herr, 1992). For the Brn‐2 POU protein, it was shown that the orientation of POUs relative to the POUhd can be inverted for optimal binding to a site in which the POUs recognition sequence is inverted (Li et al., 1993).
Here we show that the linker length and composition can influence both binding specificity and affinity, independently of the DNA‐binding subdomains. An important determinant of the configuration and orientation of the bound protein appeared to be the distance to be bridged between the two subdomains.
To investigate the requirements of the Oct‐1 linker in POU DNA binding we made a set of deletions by introducing a second EcoRI site next to the unique EcoRI site in the linker sequence and removed the EcoRI–EcoRI fragment. Longer linker constructs were obtained by cloning double‐stranded oligonucleotides into the EcoRI site. Proteins were bacterially expressed with a histidine tag at the amino‐terminus. The various POU proteins were purified by successive application of anion exchange, nickel‐NTA affinity and cation exchange chromatography (see Materials and methods). Purification was verified by SDS–PAGE and Coomassie Blue staining (Figure 1A).
The minimal linker length for binding to the octamer sequence is between 10 and 14 amino acids
Specific binding of the various mutant proteins to the optimal octamer site, TATGCAAAT, was tested in a bandshift assay (Figure 1C). Equilibrium dissociation constants were calculated from the amount of protein required to obtain 50% DNA binding (Figure 1B). Deletion of the middle eight amino acids, leaving a 15 amino acid linker, had no effect on the binding affinity, but reducing the linker length further to nine or eight amino acids resulted in a 3‐ to 4‐fold lower affinity. This indicates that the minimal length for optimal binding to the octamer lies between 10 and 14 amino acids.
In addition, there are also compositional requirements, since the 16 and 19 amino acid linkers, although longer than the 15 amino acid linker, have lower binding affinities (discussed below).
The 4‐fold lower affinity of the eight amino acid linker protein indicates that POUs still contributes to DNA binding, since deletion of the whole POUs subdomain results in a 600‐fold lower affinity (Verrijzer et al., 1992a). When the linker region is almost completely deleted, leaving only two amino acids, DNA binding can no longer be detected (Figure 1C).
Extending the linker to 28 or 37 amino acids did not result in higher affinity for the octamer site.
Separation of the POUs and POUhd recognition sites cannot be compensated for by a longer linker
Although no protein–protein interactions exist between POUs and POUhd in the co‐crystal structure (Klemm et al., 1994), separation of both DNA‐binding sites by introducing one or two C:G base pairs between the subsites (TATGCCAAAT or TATGCCCAAAT) resulted in lower affinity for the wild‐type POU domain (Figure 2 and Klemm and Pabo, 1996). This loss in affinity might be caused by restrictions imposed by the length of the linker and, therefore, we tested various lengths (Table I, Figure 2). The binding characteristics of the 15, 23 or 37 amino acid linker proteins were almost identical with all three differently spaced sites (Table I). Thus the lower affinity cannot be compensated for by an extended linker. The eight amino acid linker showed a 3‐ to 4‐fold reduced affinity with all three binding sites, indicating that the distance constraint on all three sequences is comparable.
Modelling of the spaced subsites
To obtain information on the actual restrictions imposed by the linker, we modelled the subdomains on the separated sites using the coordinates of the co‐crystal structure of the complete Oct‐1 POU domain on the octamer site (Klemm et al., 1994). Employing the program InsightII (Biosym Technologies), both subdomains connected to their recognition sites were defined as separate objects and disconnected. After insertion of one or two C:G base pairs, the objects were reconnected (Figure 3A). From these three modelled structures, we estimated the minimal connecting distance in space between the carboxy‐terminus of POUs and the amino‐terminus of POUhd (Figure 3A). In all cases, this straight line runs through part of the protein. The actual path of the linker will therefore be longer. For the contiguous octamer site, the straight distance is 28 Å and the shortest distance over the surface of POUs is 32 Å. Assuming that one amino acid spans 3 Å, the 23 amino acid wild‐type linker (up to 69 Å) could easily bridge this distance but the eight amino acid linker (up to 24 Å) could not. Because POUs binding was detected for the eight amino acid linker protein (Figures 1 and 2), some conformational change must allow POUs binding. It is known that the POU protein bends the DNA slightly (Verrijzer et al., 1992b; Klemm et al., 1994). One could envisage that an increased DNA bend could bring these ends closer together. In a circular permutation assay, however, we could not detect any difference in DNA bending between the eight and 37 amino acid linker proteins (Figure 4).
Upon insertion of two C:G base pairs (site referred to as +2), the straight distance in the modelled complex did not increase much (31 Å), whereas upon insertion of one C:G base pair (+1) this distance became slightly shorter (25 Å, Figure 3A). The insertions led to a rotation of POUs towards POUhd, thus bringing the subdomains closer together. Since 15 amino acids are sufficient to bridge the ‘wild‐type’ distance between the subdomains, this explains why we did not observe differences in binding affinity between the 15 and 37 amino acid linker proteins.
The positional rotation of POUs towards the POUhd might also explain the lower affinity of the POU proteins for the spaced sites. When bound to the +1 and +2 sites, POUs residues come into close proximity with the POUhd minor groove residues which will most likely lead to steric hindrance.
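The distance argument used in this section can be written out as a short calculation. The sketch below is illustrative only: it uses the 3 Å per residue figure and the straight-line distances quoted in the text, and it treats a fully extended linker as the upper bound on reach, which is a simplifying assumption. All function and variable names are hypothetical.

```python
# Illustrative sketch: compare the maximal span of a linker with the
# straight-line distances estimated from the models. Names are hypothetical.

ANGSTROM_PER_RESIDUE = 3.0  # span of one amino acid, as assumed in the text

# Straight-line distances (in Å) between the C-terminus of POUs and the
# N-terminus of POUhd, as quoted in the text for the three spaced sites.
STRAIGHT_DISTANCE = {"octamer": 28.0, "+1": 25.0, "+2": 31.0}


def max_span(linker_length: int) -> float:
    """Upper bound on the distance a fully extended linker can bridge."""
    return linker_length * ANGSTROM_PER_RESIDUE


def can_bridge(linker_length: int, distance: float) -> bool:
    """True if the fully extended linker is at least as long as the distance."""
    return max_span(linker_length) >= distance


if __name__ == "__main__":
    for length in (8, 15, 23, 37):
        verdicts = {site: can_bridge(length, d)
                    for site, d in STRAIGHT_DISTANCE.items()}
        print(length, max_span(length), verdicts)
```

Because the straight line runs through the protein in all three models, passing this check is a necessary condition for bridging rather than a sufficient one; the true linker path is longer.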
Binding to inverted POUs sites
To see if the linker length could influence the flexibility and freedom of movement of the individual subdomains, we inverted the POUs recognition sequence and varied the intervening nucleotides, GCAT–n–TAAT (n = 0, 1 or 2).
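Inverting the POUs recognition sequence amounts to taking the reverse complement of the POUs core, so the test sites can be generated mechanically. The following is a minimal sketch, not the procedure used to design the oligonucleotides: the helper names are hypothetical, and the choice of "c" as the spacer base is an assumption made for illustration.

```python
# Illustrative sketch: build inverted-POUs test sites of the form GCAT-n-TAAT.
# Helper names are hypothetical; the spacer base "c" is an assumption.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

POUS_CORE = "ATGC"   # POUs core as it appears within the octamer ATGCAAAT
POUHD_SITE = "TAAT"  # POUhd subsite used in the inverted sites


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def inverted_site(n_spacer: int, spacer_base: str = "c") -> str:
    """Inverted POUs core, n spacer bases (lower case), then the POUhd subsite."""
    return reverse_complement(POUS_CORE) + spacer_base * n_spacer + POUHD_SITE


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, inverted_site(n))
```

The same helper shows why extending the inverted site to TGCATA restores additional octamer contacts: its reverse complement is the first six bases of the octamer site.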
Model building (Figure 3B) showed that inversion of the POUs–DNA complex without introducing nucleotides hardly changed the minimal distance connecting both subdomains when compared with the normal octamer orientation (26 versus 28 Å). Further separation decreased the connecting distance to 18 (+1) and 19 Å (+2). No obvious interference of POUs and POUhd residues occurs. The +2 inverted site represents the only configuration in which the linker does not have to pass over the POUs surface.
The binding affinity of wild‐type POU protein for the three inverted sites varies slightly and is 28‐ to 45‐fold lower compared with the octamer site (Table I). In these cases, the distance between the two subdomains can be bridged by the 23 amino acid linker. The binding characteristics of the 15 and 37 amino acid linker proteins are comparable with the 23 amino acid protein (Table I). For the eight amino acid linker a different pattern was seen (Table I). Weak binding (190‐fold lower) was observed to the GCATTAAT (n = 0) site and no binding to the +1 site was detected. However, a surprisingly high affinity was observed for the +2 site despite the almost equal minimal connecting distance (18 versus 19 Å). This difference could be explained by the fact that the straight linker line in the +1 complex runs through the protein, suggesting steric interference, while on the +2 site the linker does not encounter any part of the protein (Figure 3B).
To investigate whether POUs is really inverted, we used two approaches: mutagenesis and DNase I footprinting. An essential contact of the POUs subdomain is the third G:C base pair which is contacted by two amino acid residues in the recognition helix (Klemm et al., 1994). We mutated this base pair in the inverted +2 site to an A:T base pair, leading to GTATccTAAT (+2 m1, Table I). Changing this base pair should lead to a lower affinity in case the POUs subdomain prefers to bind to the inverted (GCAT) sequence. However, if POUs binds in the normal orientation sequence, this should lead to a higher affinity since a T at the −1 position in the octamer orientation is a preferred contact via a hydrophobic T–methyl interaction (Botfield et al., 1994; van Leeuwen et al., 1995). Binding of the eight amino acid linker to this mutated site appeared to be strongly reduced (Table I), indicating that POUs indeed binds in the inverted orientation. Binding of the longer linker length proteins which bind preferentially in the normal orientation is hardly affected by this mutation. The 23 amino acid linker protein even binds slightly better due to the extra T −1 contact.
Another consequence of inverted POUs binding is that extension of this site to TGCATA should lead to a higher affinity site since these bases are additional contacts in the octamer site (Verrijzer et al., 1992a; Klemm et al., 1994). Indeed, this site became more tightly bound by the eight amino acid linker (+2 optimal, Table I) and was even 3‐fold higher than the wild‐type protein. From these mutation studies, we conclude that the POUs subdomain can bind in the inverted orientation on a reversed site.
DNase I footprinting confirmed the inverted POUs orientation. We compared the octamer sequence with the inverted +2 optimal site since both are high affinity sites. If POUs is inverted this should produce a larger protected region on the inverted +2 optimal site. This is indeed what we observe (Figure 5). On the octamer site the region protected by the eight amino acid protein is 3 bp shorter than on the inverted +2, indicating that POUs binds in an inverted orientation rather than tolerating the imperfect ATAC in the octamer arrangement. The same result was obtained for the wild‐type protein, showing that this also binds in the inverted orientation to the +2 optimal site. The homeodomain border was not visible on these footprints because the smallest DNA fragments do not precipitate efficiently. When the bottom strand was footprinted, the homeodomain was positioned normally over the TAAT sequence both on the octamer and on the inverted site (data not shown).
Switching POUs and POUhd recognition sequences
The observed flexibility of the POUs subdomain seems not to be restricted to its ability to invert. The POUs subdomain can also be positioned on the other side of the POUhd, leading to a POUhd–POUs order rather than POUs–POUhd. Such a POUhd–POUs configuration was suggested for the Oct‐1–VP16 complex (Cleary and Herr, 1995) and was observed recently for the Drosophila Drifter (DFR) protein (Certel et al., 1996).
To study whether this was possible for Oct‐1, we first switched the POUs–DNA complex and the POUhd complex in the computer models. Contiguous binding to TAAT‐ATGC, as suggested for DFR, or TAAT‐GCAT, seems very unlikely in view of the large overlap of POUs and POUhd in the major groove (Figure 3C and D). Further spacing of the elements, however, prevented overlapping protein contacts and, therefore, binding to these sites seems plausible.
We tested the three nucleotide spaced sites because no overlapping DNA contacts are made in these configurations (Figure 3C and D). Binding of the wild‐type protein to the switched +3 site with a 34 Å connecting distance between POUhd and POUs shows a 23‐fold lower affinity. An almost equal affinity is observed for the 37 amino acid linker protein (Figure 6A, Table I). The 15 amino acid linker which has wild‐type affinity on all sites tested so far now shows a 2‐fold lower affinity compared with wild‐type. The eight amino acid linker has a much lower affinity, presumably due to the long distance to be bridged (Table I). The contribution of POUs binding on this distant site was confirmed by mutating two POUs contacts (+3 m2, Table I), which resulted in a lower affinity of the 37 amino acid linker protein.
The switched, inverted +3 site TAATcccGCAT has a minimal spanning distance of 44 Å, but the actual linker path is probably longer since the straight line runs through the protein (Figure 3D). At this large connecting distance, a different pattern is observed in which the longer the linker length, the higher is the DNA binding affinity (Figure 6B and Table I). The wild‐type protein binds with low affinity to this site, but extension to 37 amino acids allows more effective binding.
Since the linker length in the POU class 2 family (to which Oct‐1 belongs) varies from 19 to 29 amino acids (Figure 7), it seems plausible that different spacing preferences are possible within the members of this group.
A glutamate residue in the linker is required for optimal binding
In addition to the linker length, the linker composition is also important for site‐specific binding. As shown in Figure 1B, deletion of eight amino acids in the middle part of the linker does not impair binding to the octamer, whereas deletion of seven amino acids of the linker adjacent to the homeodomain resulted in a 5‐fold lower affinity (Figure 1B). A smaller deletion in the same region, leaving as many as 19 amino acids resulted in a 2.5‐fold weaker binding. Alignment of all known class 2 POU linkers (Figure 7) shows low homology throughout divergent species. A glutamate residue is almost completely conserved and is removed in the 16 and 19 amino acid proteins. To study the role of this glutamate residue in DNA binding, we mutated it to a lysine (E95K). This led to a 2.5‐fold reduced affinity (data not shown).
We also tested both the wild‐type and the E95K protein in an IBIS surface plasmon resonance biosensor which allows real time analysis of protein–DNA complex formation. In the cuvette, a 5′‐biotinylated octamer‐binding site was bound to a streptavidin sensor chip (Pharmacia). Association was measured at three protein concentrations. Curve fitting of the association phase yielded the ks values [−ks = ka*C + kd (O'Shannessy et al., 1993)]. Plotting of −ks values versus the concentration (Figure 8) showed that the association rate constant for E95K is 3‐fold lower compared with wild‐type, while the dissociation rate constant is almost equal.
These measurements show that the lower affinity of the linker E95K mutant is caused by a lower on‐rate of the protein and that there is little difference in off‐rate.
In this study, we show that both the length and the composition of the linker region connecting the two subdomains of Oct‐1 can influence the specificity and affinity of POU domain DNA binding. We tested the affinity of various linker length POU proteins on recognition sites which have differently arranged and spaced subsites. Using computer‐built models, we estimated the length that the linker has to bridge on these variably spaced and oriented subsites.
The linker length influences DNA recognition
The minimal linker length required for optimal binding to the octamer site lies between 10 and 14 amino acids. This fits well with the shortest natural linkers (15 amino acids) found in the POU family. Such a linker could span 45 Å, allowing some flexibility since the smallest measured distance connecting POUs and POUhd in the Oct‐1 co‐crystal structure is 32 Å.
A large deletion leaving an eight amino acid linker resulted in a 4‐fold lower affinity for the octamer site because this linker is too short to bridge the distance between the subdomains. This eight amino acid linker protein, however, has a high affinity for a site in which the optimal POUs recognition site is inverted and separated from the POUhd site by 2 bp, tGCATacTAAT. Model building shows that on this site the distance to be bridged between POUs and POUhd is only 19 Å, explaining why the eight amino acid linker protein can now bind efficiently. This affinity is even 3‐fold higher than that of the longer wild‐type protein, possibly due to the higher flexibility of the wild‐type protein in solution.
While DNA binding of a short linker can be restored on a site which brings the two subdomains closer together, a large linker can (partially) restore binding to two subsites which are distant. The site TAATcccGCAT creates a connecting distance over the surface of at least 50 Å. This site is bound six times more efficiently by a 37 amino acid linker protein than by the wild‐type protein.
These data show that the linker length can play an active role in site selection. The sites efficiently bound by the eight and 37 amino acid proteins do not resemble the octamer‐binding site and therefore would not be detected easily in a computer search for possible target sites.
The linker composition influences binding affinity
Two observations show that the linker composition influences binding to DNA. First, the 16 amino acid linker protein, which has nearly the same linker length as the 15 amino acid linker protein but differs in amino acid sequence, has a 5‐fold lower affinity for the octamer site. Secondly, the linker mutant E95K has a 3‐fold lower on‐rate on the octamer site but an off‐rate comparable with that of the wild‐type protein. Apparently, the linker influences docking of the protein onto DNA but not the stability of the complex once formed. This seems to exclude a direct DNA contact by the glutamate residue. Another observation which seems to rule out DNA contacts by the linker is that the linker region attached to POUhd by itself does not increase the affinity (data not shown). An explanation for the observed lower affinity of the E95K protein could be that there is a structural constraint in the linker region which determines the flexibility of the overall POU structure in solution. Introducing a positively charged residue could affect this flexibility.
Earlier experiments have shown that swapping Oct‐1 and Pit‐1 linkers only influences specificity depending on the POU DNA‐binding domain context (Aurora and Herr, 1992). This context dependency could indicate that the linker makes protein–protein contacts on the POU surface. The linker glutamate could form a salt bridge with a lysine or arginine residue on the POUs surface. This salt bridge would be disrupted in the E95K mutant. We found two possible candidates on the POUs surface: Lys36 and Lys69. However, changing these to Glu in the linker E95K mutant context, and thereby possibly restoring a reversed salt bridge, did not restore affinity to wild‐type levels (data not shown).
A comparable E→K mutation in the linker region was found in a random mutagenesis screen of the Pit‐1 DNA‐binding domain fused to the GCN4 transactivation domain with LacZ as an indicator gene, but the precise effect of this mutation in this screen is unknown (Liang et al., 1995).
The octamer site remains the optimal binding site
Insertions of up to 2 bp between the consecutive POUs‐ and POUhd‐binding sites do not change the connecting distance much. Nevertheless, binding to these sites is reduced. Several explanations are possible. First, computer modelling shows the possible interference of POUs residues and part of the POUhd in the centre. Another explanation could be that preferential DNA contacts of POUs and POUhd to the same base pair(s) are lost upon spacing (Klemm and Pabo, 1996) or that new overlaps are disadvantageous. Klemm and Pabo (1996) showed that the isolated POUs and POUhd bind cooperatively even in the absence of the linker. Since no protein–protein contacts were detected between the domains in the crystal structure, they suggest that overlapping DNA contacts near the centre of the octamer site may mediate this cooperativity and explain why the non‐spaced octamer site is the preferred site. Several of such joined contacts have indeed been observed. For example, the fifth A/T base pair of the octamer site (ATGCAAAT) seems to be contacted by the POUs subdomain, via a major groove contact (Gstaiger et al., 1996), and by the POUhd in the minor groove (Klemm et al., 1994). However, when we mutated this POUs residue (Leu55) or the POUhd contacts (Lys103 and Arg105), this did not influence its binding pattern to spaced sites (data not shown). Also, mutation of a POUs residue which makes a phosphate backbone contact to the fifth A:T base pair (Asn59) did not change the spacing preference. Thus, no evidence has been obtained so far that a single residue with overlapping contacts is responsible for keeping the two POU subdomains in the octamer arrangement.
Finally, it could be that an overall DNA configuration causes the preference for the contiguous octamer site. Verrijzer et al. (1992b) showed by biochemical means that the POU domain bends the DNA slightly, and a DNA bend was also seen in the co‐crystal structure (Klemm et al., 1994). Indirect evidence for structural changes in the DNA comes from the DNase I footprints displaying hypersensitive sites, higher up in the gel, when POUs binds in the inverted orientation (Figure 5). Such a change in DNA bending angle was observed when the tail connecting the MATα2 and MATa1 homeodomains was extended (Jin et al., 1995; Li et al., 1995). The normal spacing of 6 bp between the α2 and a1 half‐sites can be increased to 7 bp when three glycine residues are inserted within this linker.
Differences in binding flexibility might influence gene activation
Differences in sequence specificity between the many members of the POU domain family can be achieved by preferential binding to particular spaced and orientated subsites (Li et al., 1993; Certel et al., 1996). As a consequence, interactions with other proteins might be influenced.
An example of an interacting protein which requires a particular recognition site is the herpes simplex virus co‐activator VP16 which associates with the Oct‐1 homeodomain (Lai et al., 1992; Pomerantz et al., 1992). This multiprotein–DNA complex is formed on the sequence TAATGAGATAC but not on the optimal octamer site (Walker et al., 1994). Oct‐1 by itself binds weakly to this site but is stabilized by its association with VP16. POUs binding to this sequence has been suggested 3′ of the POUhd TAAT sequence, contacting either ATAC or the opposite strand tATCT (Cleary and Herr, 1995). The first arrangement is comparable with the site TAATcctATGC which we tested but is more stably bound by Oct‐1 due to the optimal POUs sequence (Verrijzer et al., 1990). In contrast to VP16, a B‐cell‐specific transcriptional co‐activator of Oct‐1 and Oct‐2, variously termed Bob1, OBF‐1 or OCA‐B (Gstaiger et al., 1995; Luo and Roeder, 1995; Strubin et al., 1995), only forms complexes with Oct‐1 on a contiguous octamer arrangement but not on the TAATGARAT sequence (Gstaiger et al., 1996). Thus, these two factors require different recognition sequences of Oct‐1. Co‐factors, rather than being dependent on the sequence, can also dictate the sequence arrangement. One could envisage that factors bound to the surface of the POU domain influence the path the linker takes and thus the arrangement of the subdomains.
Most POU domain transcription factors are expressed in different cell types where they are implicated in developmental regulation through specific activation of their target genes (Schöler, 1991). It is evident that different preferences in orientation and spacing of the subsites may influence gene activation either directly or via interacting proteins.
In many POU family members, the coding sequences for the two subdomains, POUs and POUhd, are separated by an intron in the genome. This could be a potential determinant of linker length if differential splicing of the intron occurs. The Oct‐1 linker intron is out‐of‐frame with the coding sequence (Sturm et al., 1993); thus, only alternative splicing of this intron could generate a different functional protein. To our knowledge, no such Oct‐1 mRNAs have been reported. In the case of Oct‐2, lack of removal of this particular intron (93 bp) might indeed lead to lengthening of the linker (an extra 31 amino acids) since this intron does not contain a stop codon (Matsuo et al., 1994). However, such a protein has not been observed so far.
Using models to analyse putative conformations
The Oct‐1 co‐crystal structure enabled us to rearrange the two subdomains on differently spaced and oriented sites and subsequently measure either the straight distance or estimate the distance over the surface that the linker has to bridge in order to connect the two subdomains. These models cannot, by definition, take into account changes in the protein or DNA conformation and this may limit our interpretation of the data. However, such modelling can be effective as shown here and in a comparable computer modelling strategy, where Pomerantz and co‐workers reasoned that a four amino acid linker connecting a fusion between the Zif268 zinc finger motif and the Oct‐1 homeodomain only allowed one orientation in which the carboxy‐terminal end of the zinc finger was within 8.8 Å of the connecting amino‐terminal arm of the homeodomain (Pomerantz et al., 1995).
Combining two DNA‐binding domains with a flexible linker into a single structure has provided the cell with a new site‐specific DNA‐binding protein. Further divergence of the linker length and composition has created even more complexity in sequence recognition. Different arrangements of the subdomains directed by linker composition may be accompanied by other protein–DNA and protein–protein interactions in various POU family members, leading to further stabilization. Such a fine tuning can only be unravelled if other POU–DNA complex structures on differently spaced sites are solved.
Materials and methods
Construction of linker mutants
The construction of POU linker mutants was facilitated by the presence of a unique EcoRI site in the linker region. Using oligonucleotide‐directed in vitro mutagenesis (Promega), nucleotides were added and/or changed to introduce a second EcoRI site. The EcoRI–EcoRI fragment subsequently was removed and the ends religated, thereby creating small deletions. The mutations are given below: the newly introduced nucleotides are in bold, the EcoRI sites are underlined. The numbers indicate the amino acid linker length after removal of the EcoRI fragment. 9, gaattcctctcatctgattcgt ccctctccagcccaagtgccctgaattctccaggaattgagggcttgag; 15, gaacctctca tctgattcgaattccctctccag cccaagtgccctgaattctccaggaattgagggcttgag; 23, gaacctctcatctgattcgtccctctccag cccaagtgccctgaattctccaggaattg agggcttgag; 19, gaacctctcatctgattcgtccctc tccagcccaagtgccctgaattctccaggaattcgggcttgag; 16, gaacctctcatctgattcgtccctctccagc ccaagtgccctgaattctccaggaattgagggcttgaattc.
Ligation of the nine amino acid mutant EcoRI site to the 16 amino acid mutant EcoRI site resulted in the two amino acid linker construct. Combination of the 15 amino acid mutant EcoRI site with the 16 amino acid mutant EcoRI site resulted in the eight amino acid linker construct.
The 28 amino acid linker was created by hybridization of two oligonucleotides, 5′AATTCTACCGCCTCC3′ and 5′AATTGGAGGCGGTAG3′, which were cloned into the EcoRI site of the wild‐type linker. The 37 amino acid linker was created by introducing the 27 bp EcoRI–EcoRI fragment which was removed from the 15 amino acid linker mutant in the unique EcoRI site of the 28 amino acid linker construct. For amino acid sequences of the deletion constructs, see Figure 1B.
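The linker lengths of the extended constructs follow from simple codon arithmetic, which the sketch below makes explicit. It checks that the two oligonucleotides are complementary outside their AATT overhangs and that each insertion is a multiple of 3 bp. It assumes that the inserted length equals the full length of one oligonucleotide (overhang included) and says nothing about insert orientation. The helper names are hypothetical.

```python
# Illustrative sketch: check the annealed oligonucleotides and the
# insertion arithmetic for the 28 and 37 amino acid linkers.
# Helper names are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

OLIGO_TOP = "AATTCTACCGCCTCC"     # sequences as given in the text
OLIGO_BOTTOM = "AATTGGAGGCGGTAG"
OVERHANG = "AATT"                 # EcoRI-compatible 5' overhang
WILD_TYPE_LINKER = 23             # amino acids


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anneals(top: str, bottom: str, overhang: str = OVERHANG) -> bool:
    """True if the oligos pair everywhere except their 5' overhangs."""
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return False
    return reverse_complement(top[len(overhang):]) == bottom[len(overhang):]


def residues_added(inserted_bp: int) -> int:
    """Number of amino acids encoded by an in-frame insertion."""
    if inserted_bp % 3 != 0:
        raise ValueError("insertion would shift the reading frame")
    return inserted_bp // 3


# 15 bp oligonucleotide insert into the wild-type linker
linker_28 = WILD_TYPE_LINKER + residues_added(len(OLIGO_TOP))
# 27 bp EcoRI-EcoRI fragment inserted into the 28 amino acid construct
linker_37 = linker_28 + residues_added(27)

if __name__ == "__main__":
    print(anneals(OLIGO_TOP, OLIGO_BOTTOM), linker_28, linker_37)
```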
Expression and purification of wild‐type and mutant POU proteins
Oct‐1 POU constructs were cloned in pET15b (Novagen) as described in van Leeuwen et al. (1995). Proteins were expressed in Escherichia coli BL21(DE3) pLysS cells using the T7 expression system (Studier et al., 1990). The His‐tagged fusion proteins were purified on DEAE–Sepharose (Pharmacia) and Ni²⁺‐nitrilotriacetic acid (Qiagen) columns, essentially as described by the manufacturers, and on an SP Sepharose (Pharmacia) column as described in van Leeuwen et al. (1995). Purification was checked on a 15% SDS–polyacrylamide gel.
The probes used for bandshift assays were double‐stranded oligonucleotides, end‐labelled with T4 polynucleotide kinase and purified by preparative polyacrylamide gel electrophoresis. The sequences are as indicated in Table I. DNA concentrations were determined by absorption at 260 nm. The concentration of input DNA was 0.5 nM. Binding reactions were carried out for 60 min on ice in 20 μl of binding buffer [20 mM HEPES–KOH pH 7.5, 1 mM EDTA, 1 mM dithiothreitol (DTT), 0.025% NP‐40, 4% Ficoll, 100 mM NaCl]. Free DNA and protein–DNA complexes were separated on a 15% polyacrylamide gel (37.5:1) run in 0.5× TBE at 4°C for 4 h at 100 V. Dried gels were exposed and quantified by phosphoimaging (Molecular Dynamics). The equilibrium dissociation constant (Kd) was calculated at half saturation from the equation Kd = Pt − Db (Pt = total protein, Db = DNA bound) (Verrijzer et al., 1992a). The total protein concentration (Pt) was calculated using a deduced Mr of 20 kDa for the wild‐type His6‐POU protein. Appropriate corrections were made for insertions and deletions.
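At half saturation the free and bound DNA concentrations are equal, so Kd equals the free protein concentration, which is the total protein minus the protein held in the complex (Pt − Db). The sketch below illustrates that calculation. It assumes a 1:1 complex and uses linear interpolation between the two titration points that bracket half saturation, which is an illustrative choice and not necessarily how the original data were analysed. The function name and the titration values are hypothetical.

```python
def kd_at_half_saturation(protein_nM, fraction_bound, dna_total_nM=0.5):
    """Estimate Kd (nM) as Pt - Db at the point where half the DNA is bound.

    protein_nM     -- total protein concentrations, in increasing order
    fraction_bound -- fraction of DNA in complex at each concentration
    dna_total_nM   -- input DNA concentration (0.5 nM in the text)
    """
    points = list(zip(protein_nM, fraction_bound))
    for (p0, f0), (p1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            # Linear interpolation of Pt at a bound fraction of 0.5
            pt_half = p0 + (0.5 - f0) * (p1 - p0) / (f1 - f0)
            db = 0.5 * dna_total_nM  # bound DNA at half saturation
            return pt_half - db
    raise ValueError("titration does not bracket half saturation")


# Hypothetical titration, for illustration only
kd = kd_at_half_saturation([1.0, 3.0], [0.25, 0.75])
print(f"Kd = {kd:.2f} nM")
```

With 0.5 nM input DNA the correction Db is 0.25 nM, so it only matters when Kd is of the same order as the DNA concentration.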
Circular permutation assay
DNA fragments were generated by digestion of plasmid pBend2Ad4 (Verrijzer et al., 1992b) with restriction enzymes MluI (A), EcoRV (D) and BamHI (H). Fragments were dephosphorylated and end‐labelled with [γ‐32P]ATP and polynucleotide kinase. Binding conditions were as described above. Free DNA and protein–DNA complexes were separated on an 8% polyacrylamide gel (37.5:1) run in 0.5× TBE at 4°C for 16 h at 5 V/cm.
DNase I footprinting
The DNA used in the footprinting assays was a 40 bp EcoRI–XbaI fragment from the plasmids WT38, 45.02 and 55.07 described before (van Leeuwen et al., 1995). The EcoRI site was end‐labelled by a partial fill‐in reaction with Klenow polymerase and [α‐32P]dATP. Footprint reactions were performed as described previously (Verrijzer et al., 1990). Amounts of protein are indicated in the figure legends.
Surface plasmon resonance measurements
Real‐time analyses of POU–DNA interactions were performed using an optical IBIS biosensor device (Intersense Instruments, Amersfoort, The Netherlands) based on surface plasmon resonance signals, which are related to the number of molecules bound to the sensor surface. 5′‐Biotinylated DNA‐binding sites were immobilized on a streptavidin sensor chip (Pharmacia) at ∼1×10⁻¹³ mol/mm². Association rates were measured at various protein concentrations in 60 μl of phosphate‐buffered saline/0.02% NP‐40. Biphasic curve fitting (O'Shannessy et al., 1993) of the binding curves yields the ks value (−ks = ka·C + kd, where C is the protein concentration). When the −ks values are plotted versus the protein concentrations, the slope equals the association rate constant (ka) and the y‐intercept equals the dissociation rate constant (kd).
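The final step, reading ka and kd off the plot of −ks against protein concentration, amounts to an ordinary least‐squares line fit. A minimal sketch follows. It covers only this linear step, not the biphasic curve fitting that produces the ks values, and the concentrations and −ks values are hypothetical numbers chosen for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: protein concentrations C (M) and fitted -ks values (1/s)
conc_M = [1e-8, 2e-8, 4e-8, 8e-8]
minus_ks = [0.02, 0.03, 0.05, 0.09]

# Slope is the association rate constant, intercept the dissociation rate constant
ka, kd = fit_line(conc_M, minus_ks)
print(f"ka = {ka:.3g} /M/s, kd = {kd:.3g} /s")
```

Because the intercept is an extrapolation to zero protein concentration, it is typically less well determined than the slope when the measured concentrations lie far from zero.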
Models were constructed by excising the POU subdomains plus their respective recognition sequences, 5′ATGC3′ and 5′AAAT3′, as separate objects from the original co‐crystal structure (ID = 1oct; Klemm et al., 1994). The POUs complex was reconnected with the POUhd in the two possible orientations and, in some cases, 1–3 B‐DNA C:G base pair(s) were first connected to the appropriate end of the POUs recognition sequence. Construction of these models was performed with the program InsightII (Biosym Technologies) on an Indigo XZ machine (Silicon Graphics) and depicted with the program Molscript (Kraulis, 1991).
We would like to thank Job Dekker, Frank Holstege, Marian Walhout and Bas Werten for stimulating discussions and Marc Timmers for critical reading of the manuscript. This work was supported in part by the Netherlands Foundation for Chemical Research (SON) with financial support from the Netherlands Organization for Scientific Research (NWO).
- Copyright © 1997 European Molecular Biology Organization