The LIM‐only protein Lmo2, activated by chromosomal translocations in T‐cell leukaemias, is normally expressed in haematopoiesis. It interacts with TAL1 and GATA‐1 proteins, but the function of the interaction is unexplained. We now show that in erythroid cells Lmo2 forms a novel DNA‐binding complex, with GATA‐1, TAL1 and E2A, and the recently identified LIM‐binding protein Ldb1/NLI. This oligomeric complex binds to a unique, bipartite DNA motif comprising an E‐box, CAGGTG, followed ∼9 bp downstream by a GATA site. In vivo assembly of the DNA‐binding complex requires interaction of all five proteins and establishes a transcriptional transactivating complex. These data demonstrate one function for the LIM‐binding protein Ldb1 and establish a function for the LIM‐only protein Lmo2 as an obligatory component of an oligomeric, DNA‐binding complex which may play a role in haematopoiesis.
The LMO2 gene (previously known as RBTN2 or TTG2) and the TAL1 gene (previously known as SCL or TCL5) are both involved in the development of T‐cell acute lymphoblastic leukaemia (T‐ALL) (reviewed in Rabbitts, 1994). LMO2 and TAL1 were identified via chromosomal translocations in T‐cell acute leukaemia involving chromosome 11p13 (Boehm et al., 1991; Royer‐Pokora et al., 1991) and chromosome 1p32 (Begley et al., 1989; Finger et al., 1989; Carroll et al., 1990; Chen et al., 1990), respectively. A normal function of the LMO2 and TAL1 genes is the regulation of haematopoiesis. Gene targeting experiments have shown that both the mouse Lmo2 (Warren et al., 1994) and Tal1 genes (Robb et al., 1995; Shivdasani et al., 1995) are essential during embryogenesis, as homozygous mutant mice die due to failure of yolk sac erythropoiesis. These results are very similar to the effect of a null mutation in the GATA‐1 gene in embryonal stem (ES) cells, which are also unable to differentiate into mature erythrocytes (Pevny et al., 1991; Weiss et al., 1994). The phenotypes of the Lmo2, Tal1 and GATA‐1 null mutations suggest that these three genes have closely related, perhaps synergistic roles in erythroid differentiation. In support of this, we discovered that Lmo2 binds to both Tal1 and GATA‐1 in erythroid cell lines, and that these two interactions may occur simultaneously with Lmo2 bridging between Tal1 and GATA‐1 (Osada et al., 1995). A more general role for Tal1 in haematopoiesis has been suggested recently from studies of Tal1 null mutant ES cells which fail to develop into any haematopoietic lineages in chimeric mice (Porcher et al., 1996; Robb et al., 1996).
The LMO2 and GATA‐1 genes both encode proteins belonging to the zinc finger family (reviewed in Sanchez‐Garcia and Rabbitts, 1993; Weiss and Orkin, 1995). Lmo2 is a member of the LIM‐only class of LIM zinc finger proteins, which also includes the related proteins Lmo1 and Lmo3 (previously called RBTN1 or TTG1, and RBTN3 respectively) (Foroni et al., 1992). In addition to its two LIM domains, Lmo2 also has a short amino‐terminal domain which has transcriptional transactivation activity (Sanchez‐Garcia et al., 1995). GATA‐1 was the first member of the GATA family of zinc finger proteins to be cloned (Tsai et al., 1989), and these proteins all have two zinc fingers and bind the DNA sequence GATA (Weiss and Orkin, 1995). The structures of the LIM and GATA‐1 zinc fingers are similar (Omichinski et al., 1993; Perez‐Alvarado et al., 1994). Although the global fold of the LIM domain is unique, one part of the LIM zinc finger adopts an α‐helical structure similar to the DNA recognition helix of GATA‐1 (Perez‐Alvarado et al., 1994). Therefore, the LIM domain may function in a similar way to GATA‐1 zinc fingers, which mediate both DNA binding and protein dimerization (Perez‐Alvarado et al., 1994; Crossley et al., 1995; Merika and Orkin, 1995; Osada et al., 1995; Yang and Evans, 1995). Although there is no evidence so far that LIM domains bind DNA, they do mediate protein dimerization. LIM–LIM interactions facilitate the formation of Lmo2 homodimers (Sanchez‐Garcia et al., 1995), as well as heterodimerization between zyxin and CRP (Feuerstein et al., 1994; Schmeichel and Beckerle, 1994). Furthermore, interactions between LIM zinc fingers and other protein dimerization domains are known to occur, e.g. Lmo2 binds to the basic helix–loop–helix (bHLH) domain of TAL1 (Valge‐Archer et al., 1994; Wadman et al., 1994).
In addition to the interaction found between Lmo2, TAL1 and GATA‐1, a new protein has been discovered, called Ldb1 (Agulnick et al., 1996) or NLI (Jurata et al., 1996), which binds to the LIM domains of LIM homeodomain and LIM‐only proteins, including Lmo2 and the related protein Lmo1.
In contrast to the Lmo2 and GATA‐1 zinc finger proteins, TAL1 belongs to the bHLH class of transcription factors (Baer, 1993). The bHLH domain facilitates both the formation of protein dimers and sequence‐specific DNA recognition, and bHLH dimers bind to E‐box motifs which have the general sequence CANNTG (Murre et al., 1989). bHLH proteins have been divided into different categories. The class A factors are broadly expressed, they form homo‐ and heterodimers, and include the E12 and E47 proteins encoded by the E2A gene (Murre et al., 1989; Sun and Baltimore, 1991; Roberts et al., 1993). TAL1 is a member of the B class of bHLH proteins which display tissue‐specific expression (Baer, 1993) and interact with class A proteins to form DNA‐binding heterodimers (Hsu et al., 1991, 1994b).
Although both TAL1–E2A heterodimers and GATA‐1 proteins bind to DNA, it was not known whether the protein complexes containing Lmo2 also associate with DNA. This question was addressed by initially performing CASTing experiments with crude nuclear extracts from erythroid cells which express Lmo2, TAL1, E2A and GATA‐1 (Valge‐Archer et al., 1994; Osada et al., 1995) and Ldb1 (our unpublished data). We identified a consensus binding site containing an E‐box–GATA motif, consisting of the E‐box, CAGGTG, 9 bp upstream of a GATA site. Electrophoretic mobility shift assays with this sequence and the crude nuclear extracts demonstrated that a specific complex of the LIM‐only protein Lmo2, with TAL1, E2A and GATA‐1, binds to this motif, and that the newly identified Ldb1 protein is part of this complex.
DNA sequences obtained from CASTing with anti‐Lmo2 antiserum
The possible involvement of Lmo2 in a DNA‐binding complex was assessed with CASTing experiments (Wright et al., 1991) using crude nuclear extracts from MEL cells, which express Lmo2 (Valge‐Archer et al., 1994; Wadman et al., 1994). A pool of double‐stranded oligonucleotides, containing a central core of 26 random nucleotides with conserved flanking regions (Pollock and Treisman, 1990), was incubated with the nuclear extract. Oligonucleotides bound by Lmo2 protein in the extract were immunoprecipitated with anti‐Lmo2 antibody, purified and amplified by PCR. After a further four rounds of CASTing, the final PCR products were subcloned and sequenced.
Figure 1A shows the sequence of 31 clones obtained after CASTing with anti‐Lmo2 antibody. All of these contained an E‐box upstream of a GATA site (herein called the E‐box–GATA sequence); furthermore, 18 out of the 31 clones (58%) encoded one particular E‐box: CAGGTG. There was some variation in the number of base pairs of DNA separating the E‐box and GATA sites, as eight clones (26%) had a spacing of 8 bp between the two sites, 20 clones (65%) had a 9 bp spacing, two clones (6%) had a 10 bp spacing and one clone (3%) had an 11 bp spacing. A number of additional clones contained sequences closely related to the E‐box–GATA motif (see Figure 1B). Seven clones had GATT instead of GATA, and six encoded a non‐canonical E‐box motif (CANNNTG) in combination with either GATA or GATT. Three extra clones only contained GATA sites. The consensus binding site (Figure 1A) shows that neither the residues flanking the E‐box and GATA sites, nor the sequence of the DNA between the two sites, was conserved. It is noteworthy that the E‐box and GATA sites have been identified previously as binding sites for non‐LIM domain proteins, since bHLH proteins recognize E‐box sequences (Murre et al., 1989), and members of the GATA family recognize the GATA sequence (Weiss and Orkin, 1995).
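The classification of CASTing clones described above (which E‐box is present, and how many base pairs separate it from the GATA site) can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure used in the study: the function name classify_clone, the 1 to 20 bp search window and the example sequences in the usage note are all assumptions introduced here, and none of the sequences is taken from Figure 1A.

```python
import re

# Canonical E-box (CANNTG), then a spacer, then GATA, read 5' to 3' on one strand.
# The 1-20 bp spacer window is an illustrative assumption; it is simply wide
# enough to cover the 8-11 bp spacings reported in the text.
EBOX_GATA = re.compile(r"(CA[ACGT]{2}TG)([ACGT]{1,20}?)GATA")


def classify_clone(seq):
    """Return (e_box, spacing_bp) for the leftmost E-box-GATA match, or None.

    The spacer is matched lazily, so the GATA site nearest to the E-box
    (within the window) defines the reported spacing.
    """
    match = EBOX_GATA.search(seq.upper())
    if match is None:
        return None
    return match.group(1), len(match.group(2))
```

For example, classify_clone("TTCAGGTGACGTTCAGTGATAAC") reports the CAGGTG E‐box with a 9 bp spacing for this made‐up sequence, whereas a sequence carrying only a GATA site returns None. Clones with GATT in place of GATA, or with the non‐canonical CANNNTG E‐box, would need a relaxed pattern.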
DNA sequences bound by Tal1 and E2A proteins in MEL cells
Lmo2 complexes with the bHLH protein Tal1 in MEL cells (Valge‐Archer et al., 1994; Wadman et al., 1994), and therefore it seemed likely that Tal1 itself might be contributing to the recognition of the E‐box–GATA sequence obtained by anti‐Lmo2 CASTing. The consensus DNA‐binding site of TAL1 was compared with that of Lmo2 by conducting CASTing experiments with anti‐TAL1 antiserum and nuclear extracts. Twenty‐nine clones were obtained with an E‐box upstream of a GATA site (15 of these sequences are shown in Figure 2A). As observed in the Lmo2 CASTing data, a majority of the E‐box–GATA clones contained the E‐box CAGGTG (17 clones, i.e. 59%). Furthermore, the number of base pairs of DNA separating the E‐box and GATA sites also varied, with four clones (14%) having a spacing of 8 bp between the two sites, 22 clones (76%) having a 9 bp spacing, two clones (7%) having a 10 bp spacing and one clone (3%) having an 11 bp spacing. Another 15 CASTing clones were obtained (in addition to those shown in Figure 2A). Four of these sequences were similar to the E‐box–GATA consensus and encoded E‐box‐like or GATT motifs, another four sequences contained isolated E‐boxes, and seven contained single GATA sites. The consensus DNA‐binding site (Figure 2A) was very similar to that derived from the anti‐Lmo2 CASTing (Figure 1A), suggesting that a complex consisting of at least TAL1 and Lmo2 binds to the E‐box–GATA motif.
DNA binding by TAL1 has been observed previously, but only in the presence of class A bHLH proteins (Hsu et al., 1991, 1994a). The E2A gene is widely expressed, and it encodes two class A bHLH proteins, E12 and E47, which dimerize with TAL1 in leukaemic T cells (Hsu et al., 1994a, b). Furthermore, mammalian two‐hybrid experiments showed that Lmo2 can interact with TAL1–E2A complexes (Wadman et al., 1994; Osada et al., 1995). It is therefore possible that the Lmo2 in MEL cells binds to a complex of TAL1 and E2A, and the three proteins together contribute to recognition of the E‐box–GATA motif. Accordingly, we repeated the CASTing experiments using anti‐E2A antiserum (Figure 2B) and an E‐box–GATA motif was observed, similar to that found after the anti‐Lmo2 and anti‐TAL1 CASTing. Three clones (22%) had an 8 bp spacing, nine clones (64%) had a 9 bp spacing and two clones (14%) had a 10 bp spacing. The consensus sequence (Figure 2B) was different from that previously observed, as only four clones had the CAGGTG E‐box; the other features of the consensus were very similar to the results of the anti‐Lmo2 and anti‐TAL1 CASTing (Figures 1A and 2A). A large number of sequences closely related to the E‐box–GATA consensus were also obtained (17 clones), which encoded E‐box‐like GATA and E‐box–GATT motifs. Additional sequences contained either isolated E‐boxes (11 clones) or GATA sites (nine clones). The high degree of similarity in the consensus DNA‐binding sites of Lmo2, TAL1 and E2A suggests that a complex of these three proteins binds specifically to an E‐box–GATA motif in MEL nuclei.
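The clone counts quoted for the three CASTing experiments can be pooled to give an overall view of the preferred spacing and E‐box. The sketch below only re‐tabulates the numbers stated in the text; the dictionary layout and the function names pooled_spacing and fractions are assumptions made for illustration.

```python
from collections import Counter

# Number of E-box-GATA clones at each spacing (bp), as quoted in the text
# for each antiserum used in CASTing.
SPACING_COUNTS = {
    "anti-Lmo2": {8: 8, 9: 20, 10: 2, 11: 1},
    "anti-TAL1": {8: 4, 9: 22, 10: 2, 11: 1},
    "anti-E2A": {8: 3, 9: 9, 10: 2},
}

# Number of E-box-GATA clones carrying the CAGGTG E-box, as quoted in the text.
CAGGTG_COUNTS = {"anti-Lmo2": 18, "anti-TAL1": 17, "anti-E2A": 4}


def pooled_spacing(counts_by_serum):
    """Sum the per-antiserum spacing counts into one {spacing: clones} dict."""
    pooled = Counter()
    for counts in counts_by_serum.values():
        pooled.update(counts)
    return dict(pooled)


def fractions(counts):
    """Convert a {key: count} dict into {key: fraction of the total}."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


if __name__ == "__main__":
    pooled = pooled_spacing(SPACING_COUNTS)
    for spacing, frac in sorted(fractions(pooled).items()):
        print(f"{spacing} bp: {pooled[spacing]} clones ({frac:.1%})")
    print("CAGGTG clones:", sum(CAGGTG_COUNTS.values()), "of", sum(pooled.values()))
```

Pooling in this way gives 74 E‐box–GATA clones in total, of which 51 have the 9 bp spacing and 39 carry the CAGGTG E‐box.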
Sequence specificity of DNA binding by GATA‐1 protein in MEL cells
The GATA‐1 protein is highly expressed in MEL cells, and a proportion of it is found complexed with Lmo2, suggesting that an oligomeric complex occurs which comprises Lmo2, Tal1, E47 and GATA‐1 (Osada et al., 1995). Because GATA‐1 protein binds to the DNA sequence GATA, we considered whether GATA‐1 might also be part of the Lmo2–Tal1–E2A complex which recognizes the E‐box–GATA motif. We therefore repeated the CASTing procedure with MEL nuclear extract using anti‐GATA‐1 antiserum. The sequences of 15 clones derived from five rounds of CASTing are shown in Figure 2C. In contrast to the sequences shown in Figures 1A and 2A and B, only six clones out of a total of 54 contained an E‐box as well as a GATA site, but the spacing between the two, as well as their relative orientation, was variable. The majority of clones (85%) encoded only one GATA site (Figure 2C).
Specific DNA recognition of the E‐box–GATA motif by proteins in MEL nuclear extract
The sequences derived from the CASTing experiments probably corresponded to high affinity binding sites for the proteins analysed in MEL cells. This possibility was tested by performing electrophoretic mobility shift assays (EMSAs) with oligonucleotides corresponding to the consensus E‐box–GATA motif, or mutant sequences where either the E‐box or GATA site, or both sites, were mutated (see Materials and methods). The EMSA illustrated in Figure 3A shows that incubation of MEL nuclear extract with each of the four 32P‐labelled oligonucleotides resulted in the formation of a number of distinct protein–DNA complexes. The DNA sequence specificity of four of these complexes was defined by comparing the mobility shift characteristics of the nuclear extract with wild‐type and mutant oligonucleotides. These specificities are indicated on the left hand side of Figure 3A. Two of the complexes migrated a short distance into the gel and were formed only when both the E‐box and the GATA sites were present on the same oligonucleotide (Figure 3A, lanes 1 and 2, bands designated E‐box + GATA). These particular low mobility complexes were not observed when the E‐box (Figure 3A, lanes 3 and 4), the GATA site (lanes 5 and 6) or both sites (lanes 7 and 8) were mutated in the oligonucleotides. Other DNA‐binding complexes were present in the MEL nuclear extracts which bound to the E‐box sequence alone or the GATA sequence alone (designated E‐box and GATA respectively in Figure 3A). The band indicated as corresponding to the E‐box‐binding complex was only observed with the E‐box–GATA consensus oligonucleotide (Figure 3A, lanes 1 and 2), or the mutant GATA oligonucleotide (lanes 5 and 6), but not with mutant E‐box or double mutant oligonucleotides (lanes 3, 4, 7 and 8). Conversely, the band designated GATA only appeared in the presence of the consensus oligonucleotide (lanes 1 and 2) or the mutant E‐box oligonucleotide in which the GATA site was retained (lanes 3 and 4).
In the two low mobility DNA‐binding complexes observed in the band shift assays (Figure 3A, labelled E‐box + GATA), there appeared to be simultaneous recognition of both E‐box and GATA motifs. It was possible, however, that two independent protein complexes were bound simultaneously to the E‐box and GATA sites. EMSAs were therefore used to examine the sequence requirements of DNA recognition by proteins in the MEL nuclear extract. These were incubated with the radiolabelled E‐box–GATA consensus oligonucleotide (CAGGTG–9 bp–GATA), in the presence of increasing concentrations of unlabelled consensus or mutant oligonucleotides, to determine the competition characteristics of the complexes (Figure 3B). Non‐radiolabelled, E‐box–GATA consensus oligonucleotide competed very effectively with the radiolabelled, consensus oligonucleotide, and inhibited the formation of all the protein–DNA complexes including the E‐box–GATA and E‐box‐binding proteins, and to a lesser extent the GATA‐binding activity (Figure 3B, compare lane 1 with lanes 2, 3 and 4). The apparent resistance of the GATA‐binding protein to the competitor oligonucleotide was most likely due to the high concentration of this protein (GATA‐1, see below) in the MEL nuclear extract.
By contrast, none of the oligonucleotides which had mutant E‐boxes (Figure 3B, lanes 5, 6 and 7), mutant GATA sites (lanes 8, 9 and 10) or both sites mutated (lanes 11, 12 and 13) inhibited the binding of the two low mobility protein complexes to the E‐box–GATA motif. The sole exception to this was a 50‐fold excess of the oligonucleotide with a mutated E‐box and an intact GATA site. At this level, the mutant oligonucleotide was able to inhibit, to some extent, the recognition of the E‐box–GATA motif (lane 7). The data therefore show that MEL nuclei contain DNA‐binding proteins which specifically recognize an E‐box–GATA motif consisting of the E‐box, CAGGTG, 9 bp upstream of a GATA site. These data also indicate that recognition of the E‐box–GATA motif by the low mobility complex requires complementary binding to both sites, as oligonucleotides carrying the individual sites failed to compete for binding.
A complex of Lmo2, Ldb1/NLI, TAL1, GATA‐1 and E2A proteins specifically recognizes the E‐box–GATA motif
The possible presence of Lmo2, TAL1, E2A and GATA‐1 in each of the protein–DNA complexes was assessed using antibody‐mediated supershifts, in which an antibody binds to a pre‐formed protein–DNA complex thereby increasing its molecular size, and decreasing its electrophoretic mobility in EMSAs. A number of the protein–DNA complexes in the MEL F4N nuclear extract were supershifted by the various antibodies. Most importantly, the two E‐box–GATA‐dependent low mobility complexes were supershifted by anti‐GATA‐1, anti‐Lmo2, anti‐TAL1 and anti‐E2A polyclonal antisera (Figure 4A, lanes 5, 9, 13 and 17), whereas pre‐immune serum (lane 1), or an antiserum against GATA‐3, had no effect (lane 2). In addition, antiserum binding to GATA‐1 protein supershifted the band previously designated as the ‘GATA’ band (Figure 4A, lanes 5 and 6) and anti‐E2A antiserum supershifted the E2A band (Figure 4A, lanes 17 and 18).
The addition of antibody resulted in only a small retardation of electrophoretic mobility of the two bands corresponding to the E‐box–GATA‐binding complexes, reflecting the high molecular weight of the multi‐protein complexes. Improved resolution of antibody‐supershifted bands was achieved by inclusion of the immunoglobulin‐binding protein, protein A, to increase the size of the complex further, thereby reducing the mobility of bands within the gel. In some cases, the addition of protein A actually prevented the entry of the supershifted complexes into the gel, but its presence did not alter the mobility of the protein–DNA complexes from samples incubated without added antibody (data not shown), nor with an antiserum against GATA‐3 (Figure 4A, lane 2). However the addition of protein A did reduce the mobility of the complexes supershifted by antisera against Lmo2, TAL1, E2A and GATA‐1 (lanes 6, 10, 14 and 18), while having no effect with the corresponding pre‐immune sera (lanes 3, 4, 7, 8, 11, 12, 15 and 16). It is also noteworthy that antibodies recognizing TAL1 only affected the mobility of the two low mobility complexes. In summary, the two high molecular weight DNA‐binding complexes in MEL nuclei, which specifically recognize the E‐box 9 bp upstream of a GATA site, contain Lmo2, TAL1, E2A and GATA‐1 proteins, previously demonstrated to form an interactive complex in MEL cells (Osada et al., 1995). In addition, analogous Lmo2‐containing complexes were observed in two other erythroid lines, 707 and HEL, but not in two non‐erythroid cell lines, C3H10T1/2, and the mouse myeloma line NS0 (data not shown).
A novel protein has recently been identified which binds specifically to LIM domains of LIM‐only and LIM homeobox proteins (Agulnick et al., 1996; Jurata et al., 1996). This novel LIM domain‐binding protein, Ldb1 or NLI, can bind to both Lmo1 and Lmo2, and is expressed in a wide range of tissues. Thus it seemed a potential component of the E‐box–GATA‐binding complex, particularly in view of the large size of this complex. Therefore, we added anti‐Ldb1 antiserum to MEL F4N nuclear extract in the presence and absence of protein A, and analysed the samples by EMSA with radiolabelled E‐box–GATA oligonucleotide (Figure 4B). The anti‐Ldb1 antiserum specifically supershifted the two high molecular weight Lmo2–GATA‐1–TAL1–E47 complex bands. The presence of protein A reduced the mobility of the two Lmo2–GATA‐1–TAL1–E47 complexes further (Figure 4B, lanes 4 and 5) whilst pre‐immune serum had no effect on the mobility of any of the DNA‐binding complexes (Figure 4B, lanes 2 and 3). Therefore, in MEL nuclei, the Ldb1 protein is part of a complex with Lmo2, TAL1, E2A and GATA‐1, and together these five proteins constitute an oligomeric complex which binds to the E‐box–GATA motif.
The E‐box–GATA clones obtained after CASTing with the anti‐Lmo2, anti‐TAL1 and anti‐E2A antisera displayed a restricted variation in the spacing between the E‐box and GATA sites (Figures 1A and 2A and B). The effect of varying the distance between the E‐box and GATA sites on DNA binding to the motif by the E‐box–GATA‐binding complexes was investigated by performing EMSAs with oligonucleotides in which the spacing between the E‐box and GATA sites varied between 7 and 12 bp of DNA. Although the binding of E2A and GATA‐1 to their individual binding sites was unaffected by the spacing between the E‐box and GATA sequences (Figure 5, lanes 1–6), DNA binding by the low mobility complexes of Lmo2, Ldb1, Tal1, E2A and GATA‐1 was dependent on the distance between these sites. The low mobility complexes were detected only when the spacing between E‐box and GATA sites was 8, 9 or 10 bp (lanes 2–4), and not when reduced to 7 bp, or increased to 11 or 12 bp. It is interesting that this separation corresponds to approximately one turn of the DNA helix, and would place the E‐box and GATA sites on the same face of the DNA. However, it should be noted that the design of the CASTing oligonucleotides restricts bound products to the 26 internal random nucleotides.
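The binding requirements established by the mutant and spacing EMSAs can be summarised as a simple rule: the E‐box band needs an intact E‐box, the GATA band needs an intact GATA site, and the low mobility bands need both sites separated by 8, 9 or 10 bp. The following Python sketch encodes that rule as a qualitative summary of the band patterns described in the text; the function name expected_bands and the band labels are assumptions made for illustration, and the sketch makes no attempt to model band intensity.

```python
# Spacings (bp between the E-box and GATA sites) at which the low mobility
# complexes were detected in the spacing EMSA described in the text.
PERMISSIVE_SPACINGS = {8, 9, 10}


def expected_bands(ebox_intact, gata_intact, spacing_bp=9):
    """Return the set of band labels expected for one probe design.

    Binding to the individual sites is treated as independent of spacing,
    whereas the low mobility ("E-box + GATA") bands require both sites
    at a permissive spacing.
    """
    bands = set()
    if ebox_intact:
        bands.add("E-box")
    if gata_intact:
        bands.add("GATA")
    if ebox_intact and gata_intact and spacing_bp in PERMISSIVE_SPACINGS:
        bands.add("E-box + GATA")
    return bands
```

Under this rule a consensus probe with a 7 bp spacing is expected to show the E‐box and GATA bands but not the low mobility bands, and a double mutant probe is expected to show none of the three.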
Specificity of TAL1–E2A complexed with Lmo2 and GATA‐1
The E‐box most frequently found associated with the GATA site in the CASTing data sequences was CAGGTG (39 out of 74 clones), which differs from the previously published TAL1–E2A consensus E‐box (Hsu et al., 1994a). The sequence specificity of DNA recognition by Tal1–E2A‐containing complexes in MEL cells was tested directly using EMSAs with MEL nuclear extract and three different oligonucleotides in the presence of either anti‐Tal1 or anti‐E2A antisera. The data in Figure 6 (lane 1) show the pattern of bands obtained with the consensus E‐box–GATA oligonucleotide, and the bands corresponding to Lmo2, Ldb1, Tal1, E2A and GATA‐1 are indicated. Using the consensus E‐box–GATA oligonucleotide, specific Tal1–E2A complexes were only found in combination with Lmo2 and GATA‐1, since the two low mobility bands were supershifted by both anti‐TAL1 (lanes 3 and 4) and anti‐E2A antisera (lanes 5 and 6). An E2A homodimer band which recognized the CAGGTG E‐box was supershifted by anti‐E2A antiserum (Figure 6, lanes 5, 6, 11 and 12).
The low mobility Lmo2, Ldb1, Tal1, E2A and GATA‐1 oligomeric complexes were not observed by EMSA when the GATA site was absent (Figure 6, lanes 7–12), while Tal1–E2A heterodimers did bind to the CAGATG E‐box in the absence of Lmo2 or GATA‐1, since a new band appeared in the EMSA with this oligonucleotide (Figure 6, lanes 13 and 14). This complex was supershifted by anti‐TAL1 (lanes 15 and 16) and by anti‐E2A antisera (lanes 17 and 18), but not by anti‐GATA‐1 or anti‐Lmo2 antisera (data not shown). In addition, anti‐GATA‐2 antiserum had no effect on the EMSA pattern of the nuclear extract (lanes 2, 8 and 14). Therefore, two distinct complexes involving Tal1 exist in MEL nuclei, an oligomeric complex binding to the CAGGTG E‐box (only when adjacent to a GATA site) and one binding to the CAGATG E‐box (in the absence of the GATA site).
Assembly of the oligomeric complex in vivo establishes a function in transcriptional transactivation
The transcriptional activation potential of the constituent proteins, except Ldb1, has been investigated and, while GATA‐1 is a potent activator, TAL1 may act as a transcriptional repressor (Hsu et al., 1994c). No direct DNA binding has so far been demonstrated for Lmo2 or Ldb1. The possibility of the oligomeric complex activating transcription was tested by reconstituting the complex in vivo together with an E‐box–GATA reporter in COS cells. The transcriptional activation of an E‐box–GATA–luciferase reporter was related to the co‐transfection of expression constructs (Figure 7A), the protein expression levels were assessed by Western blotting (Figure 7B) and the DNA‐binding activity of the extracts was ascertained by EMSA using the consensus E‐box–GATA oligonucleotide (Figure 7C).
COS cells were transfected with combinations of expression plasmids (Figure 7A) together with the reporter. The low mobility bands in the EMSA, due to the oligomeric complex, were absent if any one of the five expression plasmids was omitted (unpublished results). Reporter construct activation was observed in four transfection combinations (3, 4, 7 and 8). GATA‐1 alone is a potent transactivator (transfection 3), and when GATA‐1 was produced in cells together with Lmo2 and Ldb1, activation was also observed, albeit at a consistently reduced level (Figure 7A, transfection 4). No transactivation was observed when TAL1 and E47 were co‐transfected with or without Lmo2 and Ldb1 (transfections 6 and 5 respectively), presumably due to the weak transactivation by TAL1–E47 dimers (Hsu et al., 1994a). Accordingly, when GATA‐1 was co‐transfected with TAL1 and E47, similar levels of activation were observed as with GATA‐1 alone (transfections 7 and 3 respectively).
When the five protein components of the complex were co‐expressed, assembly of a complex occurred which gave the most efficient transactivation (transfection 8), yielding ∼2.5 times greater luciferase activity than GATA‐1 expression alone. The sum of transactivation observed after transfection of combinations of GATA‐1 with other components only marginally increases that found with GATA‐1 alone, except in the case of expression of the five components together. These data indicate that full assembly of the oligomeric complex (i.e. Lmo2, Ldb1, Tal1, E47 and GATA‐1) is necessary to facilitate efficient binding across the E‐box and GATA motifs, and establish a function for this complex in transcriptional transactivation.
Since the in vivo assembly of the complex required the simultaneous transfection of many components, each of the cDNAs was cloned into the same expression vector, and the levels of protein were monitored by Western blotting of the cellular extracts from the eight transfections (Figure 7B). These data confirm the presence of each protein in the relevant transfection, and that the amounts of each synthesized are approximately equivalent between transfections.
The same transfected cells were also used as a source of nuclear extract for EMSAs with an E‐box–GATA probe to correlate the transactivation with DNA binding to the E‐box–GATA motif (Figure 7C). No protein complexes were observed when Lmo2 and Ldb1 were expressed together (transfection 2) or alone (data not shown), indicating that the LIM‐only protein Lmo2 does not bind to the GATA site, at least in this context. Cells transfected with the GATA‐1 clone alone (transfection 3) or together with Lmo2 and Ldb1 clones (transfection 4) yielded one band in the EMSA which could be supershifted with the anti‐GATA‐1 serum. Combined transfection with E47 and TAL1 expression vectors (transfection 5) yielded three main bands, one of which corresponds to an E47–TAL1 heterodimer, since it was supershifted by antisera recognizing either of the proteins, and a doublet of bands corresponding to E47 homodimers. Co‐expression of Lmo2 and Ldb1 with E47 and TAL1 (transfection 6) had no effect on the binding of the E47 homodimer, but it is interesting that this combination ablated the binding of the E47–TAL1 heterodimer to the DNA. Conversely, expression of GATA‐1 had no apparent effect on DNA binding by E47–E47 or E47–TAL1 dimers (transfection 7).
The transfected COS cells which showed the greatest transactivation of the E‐box–GATA–luciferase reporter (Figure 7A, transfection 8) show a number of bands in the EMSA which can be attributed to various complexes by supershift assays. These bands include those attributable to E47 homodimer and GATA‐1 bands (observed in transfections 3, 4, 5 and 7) and, in addition, two low mobility bands were found, due to oligomeric complexes comprising Lmo2, Ldb1, GATA‐1, TAL1 and E2A as judged by supershifts. The oligomeric complex was only found in transfection 8, consistent with the need for the five components of the DNA‐binding complex in activation of transcription from the E‐box–GATA sequence.
Lmo2 forms part of an oligomer binding to the E‐box–GATA motif
Our previous data, using reporter and two‐hybrid assays, indicated that a complex involving Lmo2, TAL1, E47 and GATA‐1 could form in haematopoietic cells (Osada et al., 1995). We have now shown that such a complex exists in erythroid cells and that this complex binds to DNA. The oligomeric complex, which also involves Ldb1, specifically recognizes a unique bipartite E‐box–GATA motif, consisting of an E‐box followed 9 bp downstream by a GATA site. In addition, the oligomeric complex can function in transcriptional activation. The data we present here demonstrate for the first time that a LIM‐only protein can be part of a sequence‐specific DNA‐binding complex. Therefore, two classes of LIM protein interact with DNA; the LIM homeodomain proteins bind directly to DNA via their homeodomains (Karlsson et al., 1990) and the LIM‐only protein Lmo2 is associated with DNA because it is a component of a DNA‐binding complex. However, there is still no evidence that the LIM domains of either class of LIM protein specifically contact DNA.
The CASTing procedure, when performed with nuclear extracts, preferentially selects high affinity binding sites for complexes of proteins, as opposed to lower affinity binding sites for the individual protein components (Funk and Wright, 1992; Wright and Funk, 1993). Since the majority of clones from the anti‐Lmo2 and anti‐TAL1 CASTing experiments encoded an E‐box in combination with a GATA site, this E‐box–GATA motif probably corresponds to a high affinity binding site. Any lower affinity binding sites for Lmo2 or Tal1 would have been under‐represented in the final PCR products. Although the DNA recognition motif is composed of two previously identified binding sites (an E‐box and a GATA site) several features demonstrate that it is a novel recognition site for a distinct transcription factor complex. The relative orientation of the E‐box and GATA sites was conserved in the CASTing sequences, as the E‐box always occurred upstream of the GATA site. In addition, the spacing between the two sites appeared important because protein binding was only observed when the E‐box and GATA sites were separated by 8, 9 or 10 bp of DNA, indicative of binding on one surface of the double helix. Taken together, these data suggest that the E‐box–GATA‐binding complex not only recognizes the spatial separation of the E‐box and GATA sites in the motif, but also binds to DNA in a sequence‐specific manner.
A simple model for the molecular role of Lmo2 is that this protein forms a bridge, probably in conjunction with Ldb1/NLI, between DNA‐binding elements, in this case TAL1–E47 heterodimers and GATA‐1 (Figure 8). The complex of Lmo2, Ldb1, TAL1, E2A and GATA‐1 bound to the E‐box–GATA motif may therefore contain multimers of Lmo2 and Ldb1. An Lmo2 molecule may be in contact with TAL1 on one side and with a dimer of Ldb1 on the other, which in turn touches a second Lmo2 molecule which contacts GATA‐1. The formation of the oligomeric complex may occur in the presence of DNA such that GATA‐1 and/or TAL1–E47 may contact DNA recognition sites on the chromosome and form a ‘nucleation site’ for the assembly of the oligomeric complex. Thus the large quantity of free GATA‐1, not bound to DNA, would provide a pool from which this process could be started.
The newly identified LIM‐binding protein Ldb1/NLI plays a role in the E‐box–GATA‐binding complex presumably by binding to the LIM protein Lmo2. Unless Ldb1 has distinct DNA‐binding or enhanced transcriptional transactivation functions, its function in vivo is probably to mediate protein–protein binding activity (Agulnick et al., 1996; Jurata et al., 1996). We consistently observed two low mobility protein–DNA complexes bound to the E‐box–GATA consensus oligonucleotide which may be due to one or more of the components of the complex being present at a variable copy number. In addition, TAL1 and GATA‐1 are both subject to post‐translational modification (Cheng et al., 1993a, b; Crossley and Orkin, 1994), which might also affect the electrophoretic mobility of the proteins. Alternatively, there may be other proteins in the complex which we have not yet identified.
The role of Lmo2 protein‐associated complexes in haematopoiesis
Lmo2, TAL1 and GATA‐1 are essential for stages of haematopoiesis as shown by gene targeting experiments (Pevny et al., 1991; Warren et al., 1994; Weiss et al., 1994; Robb et al., 1995, 1996; Shivdasani et al., 1995; Porcher et al., 1996). Our molecular data link these three gene products together into a DNA‐binding complex (Figure 8), suggesting that the similar phenotypes result from a failure to recognize common target genes that carry the E‐box–GATA motif in their regulatory regions. The complex might regulate such target genes by activating or repressing their transcription during haematopoiesis.
A search of DNA databases revealed a number of genes with consensus E‐box–GATA motifs (with 8, 9 or 10 bp separation between E‐box and GATA sequences) located adjacent to their promoters, including the genes for glycophorin A and B and porphobilinogen deaminase. Other unidentified genes may also possess these regulatory elements. The sequences adjacent to the identified promoters, however, contain many other potential regulatory elements. While the oligomeric complex was found to bind to these promoters (unpublished data), reporter constructs built with these complex sequences did not allow the effects of the E‐box–GATA motif to be distinguished from those of other control elements (unpublished results).
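The kind of motif search described above can be illustrated with a minimal Python sketch. It is not the procedure used for the database searches, which is not described in the text. It assumes that the E‐box is the exact hexamer CAGGTG, that a GATA site can be approximated by the 4 bp core GATA, and that "spacing" means the number of base pairs between the end of the E‐box and the start of the GATA core. The function names are hypothetical.

```python
# Sketch only: the 4 bp GATA core and the spacing definition are assumptions.
EBOX = "CAGGTG"        # E-box sequence named in the text
GATA_CORE = "GATA"     # assumed minimal core of a GATA site
SPACINGS = (8, 9, 10)  # spacer lengths (bp) reported to permit binding

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ebox_gata(seq, spacings=SPACINGS):
    """Return (ebox_start, spacer_length) for each E-box that is followed,
    after one of the allowed spacer lengths, by a GATA core.
    Only the strand given is scanned; positions are 0-based."""
    seq = seq.upper()
    hits = []
    start = seq.find(EBOX)
    while start != -1:
        ebox_end = start + len(EBOX)
        for gap in spacings:
            # str.startswith with an offset is False past the end of seq
            if seq.startswith(GATA_CORE, ebox_end + gap):
                hits.append((start, gap))
        start = seq.find(EBOX, start + 1)
    return hits


# Sense strand of the E-box-GATA consensus oligonucleotide from the
# reporter construct described in Materials and methods.
sense = "TCGACCGCCAGGTGCTGCGTCCCGATAGGGGCC"
print(find_ebox_gata(sense))  # [(8, 9)]
```

Because only one strand is scanned, a genomic sequence would also need to be passed through `reverse_complement` to find motifs on the opposite strand. A stricter GATA consensus would reduce chance matches to the short GATA core.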
During the early events of haematopoiesis, GATA‐2 predominates over GATA‐1, and targeting experiments of the GATA‐2 gene have shown that null GATA‐2 mutations have an effect on early haematopoiesis (Tsai et al., 1994). Therefore, complexes involving Lmo2 in earlier haematopoietic lineages might utilize GATA‐2 rather than GATA‐1 since both can bind Lmo2 (Osada et al., 1995). In addition, Lmo2, GATA‐1 and E2A can form homodimers (Murre et al., 1989; Sun and Baltimore, 1991; Crossley et al., 1995; Merika and Orkin, 1995; Sanchez‐Garcia et al., 1995), and heterodimers form between TAL1 and E2A (Hsu et al., 1991, 1994b) and between Lmo2 and GATA‐1 (Osada et al., 1995). Thus it is conceivable that various Lmo2‐containing complexes, with or without GATA‐1 or GATA‐2, bind to specific target genes, and presumably activate (or repress) their transcription during various stages of haematopoiesis (Figure 8). This would permit variations in both DNA‐binding site recognition and in composition of the oligomers, thereby yielding precise control of target gene expression patterns during the differentiation process. In the situation of LMO2 and TAL1 protein co‐expression in leukaemic T cells, analogous oligomeric complexes are conceivable. When LMO2 is expressed in the absence of TAL1, it may bind different molecules, thereby either activating, or repressing, different target gene populations during leukaemogenesis.
Materials and methods
MEL and subclone F4N, HEL and 707 cell lines were maintained in RPMI growth medium supplemented with 10% fetal bovine serum. C3H10T1/2, COS‐7 and NS0 cell lines were cultured in Dulbecco's modified Eagle's medium plus 10% fetal bovine serum.
Antisera and Western blotting
Rabbit anti‐peptide antisera recognizing Lmo2 (residues 2–17) and mouse GATA‐1 (residues 5–20 or 376–391) have been described previously (Warren et al., 1994; Osada et al., 1995). Antisera against human GATA‐2 (residues 5–20) and human GATA‐3 (residues 413–428) were made as described (Warren et al., 1994). The anti‐TAL1 No. 1080 (residues 1–121), anti‐TAL1 No. 370 (residues 238–331) and anti‐E2A No. 526 (residues 217–371) rabbit antisera have been described elsewhere (Cheng et al., 1993b; Hsu et al., 1994a, b). Rabbit anti‐peptide antiserum recognizing Ldb1 (A.D.Agulnick and H.Westphal, unpublished) was raised against a peptide corresponding to residues 256–270 of Ldb1 (Agulnick et al., 1996). Western blotting was performed with 16 μg of nuclear extract as described in Larson et al. (1996) using the rabbit anti‐Lmo2 and anti‐Ldb1 antisera. TAL1 was detected with mouse monoclonal antiserum BTL73 (the kind gift of K.Pulford), E47 with mouse monoclonal antiserum YAE (Santa Cruz Biotechnology Inc.) and GATA‐1 with rat monoclonal antiserum N6 (also from Santa Cruz Biotechnology, Inc.).
Preparation of nuclear extract for CASTing
Crude nuclear extracts were prepared from MEL, or MEL subclone F4N cells, essentially as described (Lee et al., 1988). MEL cells (3×10⁷) were harvested, washed twice with phosphate‐buffered saline (PBS) and resuspended in 5 ml of hypotonic buffer [10 mM Tris–HCl, 1.5 mM MgCl2, 10 mM KCl, 1 mM dithiothreitol (DTT), 50 μM ZnAc, pH 7.9] with protease inhibitors [1 mM phenylmethylsulfonyl fluoride (PMSF), 1 μg/ml pepstatin A, 1 μg/ml aprotinin, 1 μg/ml leupeptin]. After a 10 min incubation at 4°C, the cells were homogenized with a Dounce homogenizer, centrifuged at 1000 g for 5 min, and the pellet of nuclei resuspended in 500 μl of KCl‐free buffer (20 mM Tris–HCl, 20% glycerol, 1.5 mM MgCl2, 1 mM DTT, pH 7.9) with protease inhibitors. KCl was added to a final concentration of 0.3 M, the lysate was incubated for 30 min, and centrifuged at 13 000 g for 15 min. Dilution buffer containing protease inhibitors was added to the supernatant (750 μl of 20 mM Tris–HCl, 20% glycerol, 1 mM DTT, 50 μM ZnAc, pH 7.9), and the sample was centrifuged again at 13 000 g for 10 min. The protein concentration of the nuclear extract was estimated by the Bradford assay (BioRad) and aliquots were stored at −70°C.
Enrichment for specific DNA‐binding sites from a pool of random oligonucleotides was performed approximately as described (Blackwell et al., 1990; Pollock and Treisman, 1990). MEL nuclear extract (200 μg) was mixed with 300 ng of double‐stranded random oligonucleotide R76 (Pollock and Treisman, 1990) and 20 μg of poly(dI–dC), in 600 μl of binding buffer (20 mM HEPES, 100 mM NaCl, 10% glycerol, 0.1% NP‐40, 50 μM ZnAc, 0.5% bovine serum albumin, 1 mM DTT, pH 7.4) at 4°C for 60 min. Five μl of the appropriate antiserum (anti‐Lmo2, anti‐TAL1 No. 1080, anti‐E2A No. 526 and anti‐NH2‐terminal GATA‐1 peptide) and 0.6 μl of 1 M iodoacetamide were added, and the incubation was continued for a further 30 min. Fifty μl of a 50% protein A–Sepharose slurry was added, and after 30 min the beads were washed four times with binding buffer, incubated with 200 μg/ml proteinase K in digestion buffer (50 mM Tris–HCl, 20 mM EDTA, 1% SDS, 10 mM NaCl, pH 7.4) for 1 h at 50°C, and the sample was extracted with phenol. Oligonucleotides were purified by non‐denaturing PAGE, precipitated with ethanol in the presence of 10 μg of glycogen, resuspended in 10 mM Tris–HCl, 0.1 mM EDTA, pH 7.5, and amplified by PCR. PCR products from five (Figures 1A and 2A and C) or seven (Figure 2B) rounds of selection were subcloned and sequenced. Consensus binding sites for CASTing performed with anti‐Lmo2 and anti‐TAL1 antisera were calculated from data from two separate experiments.
Preparation of nuclear extract for EMSA
Cells (5×10⁷) were washed twice with 10 ml of cold PBS, resuspended in 500 μl of buffer A (10 mM HEPES, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT, pH 7.9) with protease inhibitors (100 μg/ml aprotinin, 5 μg/ml leupeptin, 1 μg/ml pepstatin A, 0.5 mM PMSF) (Lee et al., 1988) and incubated on ice for 15 min. NP‐40 was added to a final concentration of 0.5%, and the cells were vortexed for 10 s. The nuclei were pelleted by centrifugation at 6500 g for 20 s, and resuspended in 150 μl of buffer C (20 mM HEPES, 1.5 mM MgCl2, 420 mM NaCl, 0.2 mM EDTA, 25% v/v glycerol, pH 7.9) with protease inhibitors. The nuclear suspension was stirred vigorously on ice with a small, magnetic stirrer bar for 30 min. The sample was centrifuged at 13 000 g for 10 min, and aliquots of the nuclear extract were frozen immediately on dry ice. Samples were stored under liquid nitrogen. The protein concentration of the nuclear extract was determined by the Bradford assay (BioRad).
EMSAs were performed with ³²P‐labelled double‐stranded oligonucleotides, essentially as described previously (Hsu et al., 1991). The sequence of the sense strands of individual oligonucleotides is shown below. MEL subclone F4N nuclear extract (5 μg) was incubated for 20 min at 25°C in binding buffer (20 mM HEPES, 50 mM KCl, 1 mM EDTA, 25% v/v glycerol, 1 mM DTT, pH 7.6) in the presence of 40 μg/ml of sonicated Escherichia coli DNA and 2 nM ³²P‐labelled oligonucleotide. Where indicated, 1 μl of the appropriate rabbit antiserum [anti‐Lmo2, anti‐TAL1 Nos 370 (Figure 6) and 1080 (Figure 4), anti‐E2A No. 526, anti‐GATA‐1 residues 376–391 and anti‐Ldb1] and 1 μg of Staphylococcus aureus protein A (Sigma Cat. No. P‐6650) were added. However, anti‐Ldb1 supershifts were performed using 2 μg of protein A. Nuclear extracts from transfected COS‐7 cells were treated identically, except that each sample contained 3 μg of protein and 3 nM oligonucleotide, and the anti‐Lmo2 supershift was performed with 2 μl of antibody and 2 μg of protein A. Samples were loaded onto 4% native polyacrylamide gels, which were run for 4 h at 13 mA constant current in 0.5× TBE (45 mM Tris, 45 mM boric acid, 1.25 mM EDTA). After drying the gel, the DNA–protein complexes were visualized by autoradiography. For clarity, the band corresponding to the free probe has been cut from the autoradiograph; however, all EMSA experiments were carried out with an excess of free probe.
Complementary oligonucleotides encoding the E1b TATA box (GATCACTAGTAGGGTATATAATGGAATTCA and GATCTGAATTCCATTATATACCCTACTAGT) were phosphorylated, annealed and ligated into the BglII site of the luciferase reporter vector pGL3‐Basic (Promega), to make the plasmid E1bLUC. Two copies of a pair of complementary oligonucleotides encoding the E‐box–GATA consensus (TCGACCGCCAGGTGCTGCGTCCCGATAGGGGCC and TCGAGGCCCCTATCGGGACGCAGCACCTGGCGG) were then inserted in the same orientation into the XhoI site of E1bLUC to create the luciferase reporter (E‐box GATA)2‐E1bLUC. The pEF‐BOS‐β‐galactosidase control plasmid has been described before (Osada et al., 1995). Expression plasmids encoding Lmo2, TAL1, murine GATA‐1 and human E47 were constructed by subcloning the relevant cDNA sequences into the vector pEF‐BOS, as described previously (Warren et al., 1994; Osada et al., 1995). A pEF‐BOS‐Ldb1 expression plasmid was constructed by excising a HindIII–XbaII fragment from the plasmid Ldb1‐pcDNA I/Amp, and blunt cloning it into pEF‐BOS.
COS‐7 transfections, luciferase and β‐galactosidase assays
COS‐7 cells (1.4×10⁶) were seeded onto 100 mm plates in 10 ml of medium, 30 h prior to calcium phosphate‐mediated transfection (Gibco‐BRL). Each sample was transfected with 10 μg of (E‐box GATA)2‐E1bLUC reporter plasmid, 0.25 μg of pEF‐BOS‐βGAL control plasmid and 17 μg of pEF‐BOS expression plasmids, and the transfections were set up in triplicate. The amount of each expression plasmid used was 5 μg of pEF‐BOS‐Lmo2, 5 μg of pEF‐BOS‐TAL1, 3 μg of pEF‐BOS‐GATA‐1, 3 μg of pEF‐BOS‐E47 and 1 μg of pEF‐BOS‐Ldb1. In those samples where the total amount of pEF‐BOS expression plasmids was <17 μg, pEF‐BOS vector was added to provide a constant mass of DNA. The cells were harvested from each plate after 36 h and were resuspended in 150 μl of buffer A, lysed with 0.5% NP‐40 and the nuclei were pelleted as described above. The luciferase activity of each post‐nuclear supernatant was normalized with respect to the β‐galactosidase activity of the sample, as described previously (Hsu et al., 1994b). Nuclear extracts were prepared by pooling the nuclear pellets from the triplicate transfections in 200 μl of buffer C.
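The DNA‐mass bookkeeping and normalization described above can be sketched in a few lines of Python. The plasmid masses and the 17 μg total are taken from the text. The function names are hypothetical, and the normalization is assumed to be a simple ratio of luciferase to β‐galactosidase activity averaged over the triplicate plates; the exact procedure of Hsu et al. (1994b) is not reproduced here.

```python
# Masses (micrograms) of each pEF-BOS expression plasmid, as listed in the text.
EXPRESSION_PLASMID_UG = {
    "Lmo2": 5.0,
    "TAL1": 5.0,
    "GATA-1": 3.0,
    "E47": 3.0,
    "Ldb1": 1.0,
}
TOTAL_EXPRESSION_UG = 17.0  # constant mass of pEF-BOS DNA per transfection


def empty_vector_ug(included):
    """Mass of empty pEF-BOS vector needed to bring a transfection that
    contains only the `included` expression plasmids up to 17 micrograms."""
    used = sum(EXPRESSION_PLASMID_UG[name] for name in included)
    return TOTAL_EXPRESSION_UG - used


def normalized_luciferase(luciferase, beta_gal):
    """Assumed normalization: luciferase activity divided by the
    beta-galactosidase activity of the same sample."""
    return luciferase / beta_gal


def mean_normalized(replicates):
    """Average the normalized activity over (luciferase, beta_gal) pairs,
    for example the three plates of one triplicate transfection."""
    ratios = [normalized_luciferase(luc, gal) for luc, gal in replicates]
    return sum(ratios) / len(ratios)


# A transfection lacking only the Ldb1 plasmid needs 1 microgram of empty vector.
print(empty_vector_ug(["Lmo2", "TAL1", "GATA-1", "E47"]))  # 1.0
```

With all five expression plasmids present the filler mass is zero, which matches the stated total of 17 μg.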
Sequence of oligonucleotides used in EMSAs
1. 9 bp spacing, consensus E‐box–GATA oligonucleotide:
2. 9 bp spacing, mutant E‐box oligonucleotide:
3. 9 bp spacing, mutant GATA oligonucleotide:
4. 9 bp spacing, double mutant oligonucleotide:
5. TAL1–E2A consensus E‐box oligonucleotide:
6. 7 bp spacing, consensus E‐box–GATA oligonucleotide:
7. 8 bp spacing, consensus E‐box–GATA oligonucleotide:
8. 10 bp spacing, consensus E‐box–GATA oligonucleotide:
9. 11 bp spacing, consensus E‐box–GATA oligonucleotide:
10. 12 bp spacing, consensus E‐box–GATA oligonucleotide:
We would like to thank Dr H.Axelson for carrying out database searches and Dr G.Smith for helpful advice and discussions. The 707 and MEL cell lines were a gift from Dr M.Greaves, and the MEL subclone F4N was a gift from Dr A.Green. We are especially grateful to Dr R.Baer for providing the anti‐TAL1 and anti‐E2A antisera, and Dr C.Murre for anti‐E2A antiserum. I.W. and H.O. were supported by the Leukaemia Research Fund, and G.G. was funded by an EMBO fellowship.
† I.A.Wadman and H.Osada contributed equally to this work
- Copyright © 1997 European Molecular Biology Organization