TAL1/SCL is a master regulator of haematopoiesis whose expression promotes opposite outcomes depending on the cell type: differentiation in the erythroid lineage or oncogenesis in the T‐cell lineage. Here, we used a combination of ChIP sequencing and gene expression profiling to compare the function of TAL1 in normal erythroid and leukaemic T cells. Analysis of the genome‐wide binding properties of TAL1 in these two haematopoietic lineages revealed new insight into the mechanism by which transcription factors select their binding sites in alternate lineages. Our study shows limited overlap in the TAL1‐binding profile between the two cell types with an unexpected preference for ETS and RUNX motifs adjacent to E‐boxes in the T‐cell lineage. Furthermore, we show that TAL1 interacts with RUNX1 and ETS1, and that these transcription factors are critically required for TAL1 binding to genes that modulate T‐cell differentiation. Thus, our findings highlight a critical role of the cellular environment in modulating transcription factor binding, and provide insight into the mechanism by which TAL1 inhibits differentiation leading to oncogenesis in the T‐cell lineage.
Cell differentiation is regulated by finely tuned mechanisms directed by cell‐specific and ubiquitous transcription factors. Mutations (e.g. deletions, fusions) that affect the integrity of transcription factors by altering their DNA‐binding specificity and/or capacity to interact with cofactors can transform these proteins into potent oncogenes. At the same time, wild‐type (WT) (non‐mutated) transcription factors can also become oncogenic when aberrantly expressed in an inappropriate cell type (Tenen, 2003; O'Neil and Look, 2007). While this argues for an important role of the cellular context in modifying transcription factors’ ability to control cell fate, the extent to which the cellular environment affects the function of transcription factors is unclear (Pan et al, 2009).
The basic helix‐loop‐helix (bHLH) protein TAL1 (also called SCL) displays distinct, sometimes opposite, functions in different cell types (Begley and Green, 1999; Lecuyer and Hoang, 2004). Indeed, TAL1 expression is necessary for the specification, survival and competence of haematopoietic stem cells and for the differentiation of megakaryocytes and erythrocytes (Lecuyer and Hoang, 2004; Reynaud et al, 2005; Brunet de la Grange et al, 2006; Souroullas et al, 2009; Lacombe et al, 2010). Yet TAL1, which is normally turned off early in the lymphoid lineage, exhibits oncogenic properties when aberrantly expressed in lymphoid tissue (Condorelli et al, 1996; Kelliher et al, 1996). Importantly, wild‐type TAL1 is aberrantly expressed in over 60% of T‐cell acute lymphoblastic leukaemia (T‐ALL) patients and is considered a major factor in initiating leukaemic transformation via perturbation of the transcriptional regulatory network (Aifantis et al, 2008). TAL1‐mediated leukaemogenesis has been linked to both an early arrest in the T‐cell differentiation program and elevated levels of anti‐apoptotic genes (Ferrando et al, 2002). While the mechanism of TAL1‐mediated leukaemogenesis is unclear, it has been proposed that TAL1 interferes with the function of bHLH E‐proteins (i.e. E2A, HEB or E2‐2), which are important regulators of T‐cell differentiation and whose inactivation leads to T‐cell tumours in mice (Quong et al, 2002). Indeed, TAL1 binding to E‐box DNA motifs (CANNTG) requires heterodimerization with an E‐protein and in vitro binding selection experiments have identified a TAL1/E‐protein heterodimer's preferred E‐box (CAGATG), which differs from the E‐protein homodimers’ preferred E‐box (CAGGTG) (Hsu et al, 1994). 
Interestingly, E‐box recognition is not always an important determinant of TAL1 binding as it has been proposed to be tethered to genes via other DNA‐binding transcription factors, including GATA3 in leukaemic T cells (Ono et al, 1998), and SP1 (Lecuyer et al, 2002) or GATA1 (Wadman et al, 1997) in erythroid cells. Recent ChIP‐seq experiments in erythroid cells have revealed a strong correlation between GATA and TAL1 recognition motifs, with genomic sites bound by TAL1 being frequently associated to GATA motifs while GATA1‐bound sites are enriched in E‐boxes (Cheng et al, 2009; Fujiwara et al, 2009; Kassouf et al, 2010; Soler et al, 2010). In addition, GATA1 and TAL1 cooccupancy appears to correlate with active genes in erythroid cells, although these two transcription factors can be cobound to genes that are repressed (Cheng et al, 2009; Tripic et al, 2009; Soler et al, 2010). Interestingly, degenerate selection experiments for TAL1 binding in vitro have identified a composite E‐box/Gata motif where the two DNA‐binding sites are separated by 8–10 bp (Wadman et al, 1997). This particular distance is thought to be important for binding of a pentameric protein complex in which a TAL1/E2A heterodimer and a GATA factor are bridged by LMO2 and LDB1 proteins (Wadman et al, 1997). While this composite E‐box/Gata motif was recently shown to be enriched under TAL1 peaks identified in erythroid cells (Kassouf et al, 2010; Soler et al, 2010), it has not been identified in ChIP‐microarray studies performed in T‐ALL cells (Palomero et al, 2006). As such, our lack of knowledge regarding the mechanism of how TAL1 recognizes binding sites in vivo represents one of the major limitations to our understanding of the role of this bHLH protein in promoting different cell fates depending on the lineage.
TAL1 promotes erythroid differentiation while it blocks T‐cell differentiation
To identify features that distinguish the role of TAL1 in different cell types, we employed a comparative strategy whereby the transcriptional network of TAL1 is contrasted between an erythroid environment in which TAL1 promotes cellular differentiation and a T‐cell context in which TAL1 promotes oncogenic transformation. Our strategy combines phenotypic analysis and gene expression profiling after TAL1 knockdown (KD) with chromatin immunoprecipitation and deep sequencing (ChIP‐seq).
To study TAL1 in the erythroid lineage we used primary erythroid cells differentiated ex vivo from human haematopoietic multipotential progenitors, a system that mimics the differentiation of erythroid cells in vivo (Giarratana et al, 2005) (Supplementary Figure S1 and data not shown). TAL1 KD was induced in pro‐erythroblasts using lentivirus‐delivered shRNA (Figure 1A). Following TAL1 KD (Figure 1B and C), we observed a strong diminution in cell growth (Figure 1D), which is due to both a decrease in cell proliferation (Figure 1E), and an increase in apoptosis (Figure 1F). Cell cycle analysis demonstrates accumulation of cells in the G0/G1 phases, suggesting a block at the G1/S transition (Figure 1G). To determine whether TAL1 KD also affects erythroid differentiation, we analysed accumulation of haemoglobin (Figure 1H; Supplementary Figure S2B), CD36, CD71 and GPA cell surface markers (Supplementary Figure S2C) as well as Gpa (Figure 1I) and β‐globin (Figure 4C) transcripts. We found that these erythroid markers are all decreased in TAL1 KD cells confirming the importance of TAL1 for terminal erythroid differentiation.
To study TAL1 in a T‐cell environment, we first used the TAL1‐expressing Jurkat cell line, which was originally derived from a T‐ALL patient and represents a prototypical immature transformed T cell (Schneider et al, 1977). To knock down TAL1 in Jurkat cells, clonal lines expressing a Dox‐inducible shRNA against Tal1 were generated (Figure 2A and B). As in erythroid cells, we observed a dramatic decrease in the growth of Jurkat cells upon TAL1 KD (Figure 2C). This is mostly due to apoptosis, as shown by a 10‐fold increase in Annexin V‐positive cells (Figure 2E), as well as a decrease in mitochondrial transmembrane potential and an increase in caspase 3 and 8 activities (data not shown). While TAL1 KD also led to a limited decrease in BrdU incorporation (Figure 2D), progression through the cell cycle was not affected (Figure 2F). Comparable phenotypic effects were observed in a second TAL1 KD clone stably expressing a distinct anti‐Tal1 shRNA sequence (data not shown).
Gene expression analysis upon TAL1 KD
To gain further insight into the genes affected by TAL1 KD in erythroid and Jurkat cells, expression profiling by microarray was performed on WT and KD cells (Supplementary Figure S3). These experiments used Jurkat cells (WT) and their counterparts treated with Dox for 72 h (KD), as well as pro‐erythroblasts at day 12 of differentiation that were infected with lentiviruses expressing either scrambled shRNA (WT) or anti‐Tal1 shRNA (KD), as indicated in Figure 1A. We found that in erythroid cells, the majority of differentially expressed transcripts are downregulated upon TAL1 KD (442 transcripts downregulated versus 148 transcripts upregulated; Supplementary Figure S3). In contrast, the majority of differentially expressed transcripts in Jurkat cells are upregulated upon TAL1 KD (370 transcripts upregulated versus 249 transcripts downregulated). Microarray results were confirmed by RT–qPCR for all five tested TAL1‐dependent genes in erythroid cells (Supplementary Figure S4A) and for 44 of the 45 genes tested in Jurkat cells (Supplementary Figure S4B and data not shown). In agreement with the observed phenotypic effects, Gene Ontology (GO) analysis of genes that are downregulated upon TAL1 KD in erythroid cells identified biological process categories related to cell cycle control, DNA replication and erythroid‐related functions (Supplementary Figure S3C; Supplementary Table I). The same analysis of genes that are upregulated upon TAL1 KD in Jurkat cells identified categories related to apoptosis, negative regulation of growth and T‐cell differentiation (Supplementary Figure S3B; Supplementary Table II). This last GO category suggested to us that upon TAL1 KD, Jurkat cells might have partially re‐entered the T‐cell differentiation transcriptional program. For example, four transcription factors that act as master regulators of T‐cell differentiation (i.e. Gata3 (Ting et al, 1996), Sox4 (Schilham et al, 1997), Ikzf3 (coding for the Ikaros homolog Aiolos (Morgan et al, 1997)) and the thymocyte selection‐associated gene Tox (Aliahmad and Kaye, 2008)) are among the genes identified by microarray (and confirmed by RT–qPCR (Supplementary Figure S4)) as being upregulated upon TAL1 KD. Furthermore, the increased expression of GATA3 and Aiolos upon TAL1 KD was confirmed at the protein level (Figure 3D).
Genes that are differentially expressed at particular stages during human CD4+ T‐cell differentiation were previously identified from purified human cells, and sorted into signature sets (Lee et al, 2004). To further examine the possibility that a decrease of TAL1 in CD4+ Jurkat cells leads to partial re‐entry into the T‐cell transcriptional program, we used gene set enrichment analysis (GSEA) to compare genes that are differentially expressed upon TAL1 KD in Jurkat cells to these signature sets (Figure 3). We found that gene sets corresponding to early CD4+ intrathymic T‐cell progenitors (sets 1 and 2) are enriched in untreated Jurkat cells, while gene sets upregulated in further differentiated T‐cell states (sets 3–6) are enriched in the TAL1 KD condition (Figure 3). Notably, an increase in Tox, Sox4 and Gata3 transcript levels upon TAL1 KD was again identified by GSEA (Figure 3A). Combined results from all six gene set enrichments indicate that Jurkat cells resemble T cells blocked at the CD4+ immature single‐positive stage (CD4ISP), and that upon TAL1 KD these cells partially resume the transcriptional program of T‐cell differentiation before dying by apoptosis.
Identification of TAL1 genomic binding sites by ChIP sequencing
To understand how TAL1 might regulate the expression of genes identified by microarray, we performed genome‐wide binding analysis. For each cell type (i.e. Jurkat cells and pro‐erythroblasts at day 12 of differentiation, see Figure 1A), two biologically independent TAL1 ChIPs were performed, and the resulting DNA was amplified, subjected to high‐throughput sequencing and aligned to the human reference genome (Supplementary Figure S5A). Examination of chromosome 2 indicates that TAL1 peaks are more abundant in erythroid than in Jurkat cells (Supplementary Figure S5C). Genome wide, we counted 6315 TAL1 peaks in erythroid cells but only 2547 in Jurkat cells (Figure 4A). Considering the high frequency of E‐boxes in the human genome, these numbers of TAL1‐binding sites reflect a restricted genomic binding. An analysis based on the rate at which background signal is converted to foreground signal shows that we have attained sufficient sequencing depth for genome coverage (data not shown). Confirming the quality of our data set, ChIP‐seq analysis identified a number of previously characterized TAL1 targets, including Epb42 (Xu et al, 2003), Gypa (Lahlil et al, 2004), DNase I hypersensitive site (HS) 2 of the β‐globin locus (Song et al, 2007), c‐kit (Lecuyer et al, 2002), Runx1 (Wilson et al, 2009) and Lmo2 (Landry et al, 2009) in erythroid cells, and Aldh1a2 (Ono et al, 1998), Tcra (Bernard et al, 1998), Chrna5 (Palomero et al, 2006) and Nfkb1 (Chang et al, 2006) in Jurkat cells (Figure 4; Supplementary Figures S6 and S7 and data not shown). In addition, we identified TAL1 peaks in pericentromeric regions (Supplementary Figure S5C and data not shown), which is consistent with reported TAL1 binding to satellite DNA (Wen et al, 2005).
Validating the quality of our ChIP‐seq results, all 23 known and novel TAL1 genomic targets that we tested (with peak heights ranging from 8 to 178 reads) were confirmed by independent ChIP‐qPCR experiments (Figure 4C; Supplementary Figures S6 and S7 and data not shown).
In agreement with the different functions of TAL1 in erythroid versus T‐ALL cells, the proportion of overlapping TAL1 peaks between the two cell types is relatively small, representing 6% of total peaks in erythroid cells and 15% of total peaks in Jurkat cells (Figure 4A). Functional annotation analyses of genes associated with the nearest TAL1 peak in erythroid and Jurkat cells identified overrepresented GO terms related to erythroid and T‐cell differentiation, respectively (Figure 4A; Supplementary Tables III and IV). We also observed that while there is a higher local density of TAL1 peaks near TSSs in both erythroid and Jurkat cells (Figure 4B), the majority of TAL1 peaks are located away from promoter regions of known genes, mostly within introns and intergenic regions (Figure 4A). Binding at intergenic regions could in some cases represent binding to distal regulatory elements such as enhancers. For example, in erythroid cells TAL1 is bound to the four erythroid‐specific DNase I HSs that comprise the distal β‐globin locus control region (LCR) (Figure 4C). TAL1 is also frequently bound to introns, as shown for the Cdk6 regulator of T‐cell differentiation (Grossel and Hinds, 2006), whose expression is upregulated by TAL1 in T‐ALL cells (Figure 4C). An example of TAL1 binding to a promoter region is shown for the Cd69 gene (Figure 4C), which is expressed transiently in immature thymocytes undergoing positive selection (Bendelac et al, 1992) and is also one of the earliest inducible cell surface glycoproteins acquired during lymphoid activation (Sancho et al, 2005). In contrast to Cdk6, the Cd69 gene is downregulated by TAL1 in Jurkat cells. Interestingly, the KD of TAL1 in an erythroid environment does not affect Cd69 expression despite TAL1 binding to this gene's promoter. Together, these ChIP‐seq data provide us with a number of novel TAL1 genomic targets in both erythroid and T‐ALL cellular environments.
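The overlap fractions quoted above depend on which cell type supplies the denominator. The following is a minimal sketch of such a comparison, assuming peaks are represented as `(chrom, start, end)` intervals and that two peaks count as shared when they overlap by at least one base; the source does not state the overlap criterion, and the function name and toy coordinates are hypothetical.

```python
def count_overlapping(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open [start, end) coordinates.
    """
    b_by_chrom = {}
    for chrom, start, end in peaks_b:
        b_by_chrom.setdefault(chrom, []).append((start, end))
    shared = 0
    for chrom, start, end in peaks_a:
        candidates = b_by_chrom.get(chrom, [])
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end for b_start, b_end in candidates):
            shared += 1
    return shared


# Toy coordinates (hypothetical, not taken from the data set).
erythroid = [("chr1", 100, 300), ("chr1", 1000, 1200),
             ("chr2", 50, 250), ("chr2", 900, 1100)]
jurkat = [("chr1", 250, 450), ("chr2", 2000, 2200)]

frac_erythroid = count_overlapping(erythroid, jurkat) / len(erythroid)
frac_jurkat = count_overlapping(jurkat, erythroid) / len(jurkat)
print(f"{frac_erythroid:.0%} of erythroid peaks, {frac_jurkat:.0%} of Jurkat peaks overlap")
```

The same set of shared sites gives two different percentages because the totals differ: 6% of 6315 erythroid peaks and 15% of 2547 Jurkat peaks both correspond to roughly 380 peaks.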
Identification of TAL1‐target genes functionally regulated by TAL1
A major question arising in genome‐wide studies of transcription factor binding is that of the association between binding events and regulated genes. As a first approximation, we associated TAL1 peaks to their closest genes. Functional annotation of these genes led to the identification of biological categories that are consistent with TAL1 function (Figure 4A), providing confidence that TAL1 indeed regulates some genes by binding to their promoter (e.g. Cd69; Figure 4C). However, in many cases, TAL1 is bound away from promoters. Therefore, a simple association of TAL1 peaks to their closest genes may miss target genes regulated by TAL1 via binding to a distal regulatory element.
To identify TAL1 targets that are also functionally regulated by this factor, we took advantage of our identification of TAL1‐dependent genes by microarray and restricted our analysis to genes that are differentially expressed upon TAL1 KD. To remain permissive to distal regulatory elements, differentially expressed genes were associated with TAL1 peaks located within 50 kb upstream or downstream of their TSS. Using this approach, 289 genes were identified in erythroid cells, including 246 that are downregulated upon TAL1 KD (hereafter called ‘TAL1‐activated genes’) and 43 that are upregulated (hereafter called ‘TAL1‐repressed genes’) (Supplementary Table V). This result is consistent with previous studies, which have predominantly associated TAL1 with active genes in erythroid cells (Cheng et al, 2009; Tripic et al, 2009; Soler et al, 2010). In agreement with our phenotypic analyses (Figure 1), many TAL1‐activated genes in erythroid cells are involved in the regulation of DNA replication, cell cycle and erythroid differentiation. In Jurkat cells, we identified 73 TAL1‐repressed genes and 44 TAL1‐activated genes (Supplementary Table VI). Among them, we note the presence of genes that are upregulated upon TAL1 KD and code for important T‐cell‐specific transcription factors such as TOX (Aliahmad and Kaye, 2008) and Aiolos/Ikzf3 (Morgan et al, 1997). Conversely, the TAL1‐target gene Cdk6, which is normally downregulated during T‐cell differentiation (Grossel and Hinds, 2006), is decreased upon TAL1 KD (Figure 4C). In addition, five genes with previously described functions in apoptosis were identified as direct targets of TAL1, including four pro‐apoptotic genes (i.e. Pmaip1/Noxa (Oda et al, 2000); Stk17a/Drp1 (Inbal et al, 2000); Isg20l1/Aen (Kawase et al, 2008) and Plk3 (Xie et al, 2001)) that are repressed by TAL1.
Interestingly, TAL1 may also decrease apoptosis via upregulation of Cd226, which is involved in the resistance of thymocytes to apoptosis (Fang et al, 2009). We also identified three tumour suppressor genes (i.e. Tnfaip3 (Malynn and Ma, 2009), Pten (Li et al, 1997; Gutierrez et al, 2009) and Lrp12/St7 (Zenklusen et al, 2001)) that are repressed by TAL1. Finally, we observed a significant enrichment (2.8‐fold; P‐value=5.488e–13, proportions test with continuity correction) of TAL1 peaks near the signature genes associated with TAL1 expression in T‐ALL patients (Ferrando et al, 2002), suggesting that many of these signature genes are direct targets of TAL1.
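The enrichment statistic reported above compares two proportions: the fraction of signature genes with a nearby TAL1 peak versus the fraction among background genes. The sketch below illustrates one common form of such a test, a two‐sample chi‐square test with Yates continuity correction. It assumes this is the kind of test meant; the source does not name the software or the exact procedure, and the counts in the example are hypothetical, chosen only for illustration.

```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """Two-sided test of equal proportions with Yates continuity correction.

    Compares x1/n1 with x2/n2 and returns (fold_enrichment, chi2, p_value).
    """
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_sums = [n1, n2]
    col_sums = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            # Yates correction: shrink each |observed - expected| by 0.5,
            # never letting the corrected deviation go below zero.
            deviation = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            chi2 += deviation * deviation / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    fold = (x1 / n1) / (x2 / n2)
    return fold, chi2, p_value


# Hypothetical counts: 60 of 100 signature genes versus 2000 of 10000
# background genes have a peak nearby.
fold, chi2, p_value = two_proportion_test(60, 100, 2000, 10000)
print(f"fold enrichment = {fold:.1f}, chi2 = {chi2:.1f}, P = {p_value:.3g}")
```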
Overall, our results, which were confirmed by ChIP–qPCR (Figure 4; Supplementary Figures S6 and S7 and data not shown), RT–qPCR (Figure 4; Supplementary Figure S4 and data not shown) and western blot (Figure 3D), identified several possible mechanisms through which TAL1 may contribute to leukaemogenesis: repression of pro‐apoptotic genes, tumour suppressor genes and T‐cell differentiation genes, as well as activation of anti‐apoptotic and anti‐T‐cell differentiation genes.
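The association rule used in this section (a differentially expressed gene is retained when a TAL1 peak lies within 50 kb upstream or downstream of its TSS) can be sketched as follows. This is an illustration under stated assumptions: each peak is reduced to a single summit coordinate, each gene to a single TSS, and the gene names and positions are hypothetical.

```python
from bisect import bisect_left


def genes_with_nearby_peaks(genes, peaks, window=50_000):
    """Return names of genes with at least one peak summit within `window` bp of the TSS.

    genes: iterable of (name, chrom, tss); peaks: iterable of (chrom, summit).
    """
    summits = {}
    for chrom, summit in peaks:
        summits.setdefault(chrom, []).append(summit)
    for positions in summits.values():
        positions.sort()

    hits = []
    for name, chrom, tss in genes:
        positions = summits.get(chrom, [])
        # Index of the first summit at or beyond the left edge of the window.
        i = bisect_left(positions, tss - window)
        if i < len(positions) and positions[i] <= tss + window:
            hits.append(name)
    return hits


# Hypothetical differentially expressed genes and peak summits.
de_genes = [("GeneA", "chr1", 100_000), ("GeneB", "chr1", 500_000),
            ("GeneC", "chr2", 100_000), ("GeneD", "chr1", 250_000)]
tal1_summits = [("chr1", 140_000), ("chr1", 300_000), ("chr3", 100_000)]

print(genes_with_nearby_peaks(de_genes, tal1_summits))
```

Restricting the input to differentially expressed genes, rather than assigning every peak to its nearest gene, is what allows distal elements to be counted without treating every bound gene as regulated.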
Validation of the role of TAL1 in additional T‐ALL cell lines and in primary blasts from T‐ALL patients
We next sought to confirm our findings in four additional TAL1‐expressing T‐ALL cell lines (i.e. CEM/C1, MOLT4, PF382 and RPMI8402) and in primary blasts from two TAL1‐expressing T‐ALL patients. We first used ChIP to verify the binding of TAL1 to 17 previously identified sites associated to TAL1‐regulated genes, including T‐cell marker genes (e.g. Aiolos/Ikzf3, Tox, Ccr9), pro‐ and anti‐apoptotic genes (e.g. Pmaip1/Noxa, Plk3 and Cd226) and other genes (Figure 5A; Supplementary Figure S7 and data not shown). TAL1 binding was confirmed at all the sites we have tested, not only in the four T‐ALL cell lines, but also in the primary blasts from both T‐ALL patients. Next, the KD of TAL1 was induced in these cells by infection with lentivirus expressing anti‐TAL1 or scrambled (scr) shRNA (Figure 5B and C). Similarly to the phenotype observed in Jurkat cells, the KD of TAL1 leads to a decrease of Cdk6 transcription and a reactivation of Cd69 in all T‐ALL cell lines and in the primary blasts from both T‐ALL patients (Figure 5D). Finally, the apoptotic phenotype induced upon TAL1 KD is also conserved in the T‐ALL cell lines and primary blasts (Figure 5E and F and data not shown). Therefore, the results we have obtained in four additional T‐ALL cell lines that express TAL1 and in primary blasts from two TAL1‐expressing T‐ALL patients confirm our findings from Jurkat cells.
TAL1 displays distinct genomic binding profiles in erythroid and Jurkat cells
To determine what restricts TAL1 binding in vivo, motif analyses were performed on DNA sequences underlying TAL1 peaks (Figure 6). As TAL1 is a bHLH transcription factor known to bind to E‐boxes, we first measured the proportion of E‐box variants centred at TAL1 peak summits (Figure 6B). We found that the in vitro preferred TAL1/E‐protein E‐box (CAGATG) (Hsu et al, 1994) ranks first among TAL1 peak‐centred E‐boxes in both erythroid and Jurkat cells (Figure 6B), confirming that it is also the TAL1 preferred E‐box in vivo. The frequency ranking of E‐box variants is identical between Jurkat and erythroid cells, except for the E‐protein homodimer's preferred E‐box (CAGGTG), whose proportion doubles in Jurkat cells (Figure 6B, red square). Consistent with this, a de novo search for motifs overrepresented in Jurkat compared with erythroid cells also identified this same E‐protein homodimer's preferred E‐box (Figure 6A, right panel). The finding that TAL1 binds to the E‐protein homodimer's preferred E‐box more frequently in Jurkat than erythroid cells is consistent with the long‐standing model in which TAL1 deregulates E‐protein homodimers’ function in T‐ALL (Begley and Green, 1999; O'Neil and Look, 2007).
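Ranking E‐box variants requires deciding how to treat the two DNA strands, because a CANNTG site and its reverse complement describe the same binding site. The sketch below shows one way to tally the E‐box variant nearest each peak summit. The strand convention (reporting the orientation whose central dinucleotide has more purines, so that labels match CAGATG and CAGGTG as written in the text), the "nearest to summit" rule and the toy sequences are all assumptions made for illustration, not the procedure used in the study.

```python
import re
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Lookahead so that overlapping CANNTG sites are all reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def canonical(ebox):
    """Collapse an E-box and its reverse complement into one label.

    Arbitrary convention: prefer the strand whose central dinucleotide has
    more purines (A/G); break ties alphabetically.
    """
    candidates = (ebox, reverse_complement(ebox))
    return min(candidates, key=lambda s: (-sum(b in "AG" for b in s[2:4]), s))


def ebox_nearest_summit(seq, summit):
    """Return the canonical E-box variant closest to the summit index, or None."""
    best = None
    for match in EBOX.finditer(seq.upper()):
        centre = match.start(1) + 3  # position just past the middle of the 6-bp site
        distance = abs(centre - summit)
        if best is None or distance < best[0]:
            best = (distance, canonical(match.group(1)))
    return None if best is None else best[1]


def variant_counts(peaks):
    """peaks: iterable of (sequence, summit_index). Peaks without an E-box are skipped."""
    found = (ebox_nearest_summit(seq, summit) for seq, summit in peaks)
    return Counter(v for v in found if v is not None)


# Toy peak sequences (hypothetical).
toy_peaks = [("TTTTCAGATGTTTT", 7), ("AAACACCTGAAA", 6),
             ("AAAACATCTGAAAA", 7), ("AAAAAAAAAA", 5)]
print(variant_counts(toy_peaks).most_common())
```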
On the other hand, E‐boxes are absent from an unexpectedly high proportion of TAL1 peaks (14% in Jurkat and 24% in erythroid cells) (Figure 6C). Consistent with this, the de novo motif search did not identify E‐boxes as the top overrepresented motif in erythroid or Jurkat cells (Figure 6A, left and middle panels). Instead, in erythroid cells, a Gata motif ranks first among overrepresented sites, and two variants of this motif are also among the top 7 overrepresented motifs (Figure 6A, middle panel). Furthermore, virtually all TAL1 peaks in erythroid cells (96%) contain a Gata motif while only 76% contain an E‐box within a 100‐bp radius of the peak summit (Figure 6C). In erythroid cells, Gata motifs are also on average closer to TAL1 peak summits than E‐boxes, with 80% of TAL1 peaks containing a Gata site within 35 bp of the peak summit versus only 50% containing an E‐box within this distance (Figure 6C). This is consistent with previous studies showing cooccupation of TAL1 and GATA1 at many genomic sites in erythroid cells (Cheng et al, 2009; Tripic et al, 2009; Soler et al, 2010). In Jurkat cells, Gata sites are also highly prevalent, being present in 79% of TAL1 peaks (Figure 6A and C). These results suggest that GATA factors are important for targeting TAL1 to specific regions of the genome in both erythroid and Jurkat cells.
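Prevalence figures of this kind (the fraction of peaks carrying a motif within a given radius of the summit) can be computed along the following lines. This is a sketch only: the Gata core pattern (`GATA` on either strand), the distance convention (from the summit to the nearest base of the motif) and the toy sequences are assumptions, since the exact motif definitions used in the analysis are not given here.

```python
import re

GATA_CORE = "GATA|TATC"      # assumed core motif, either strand
EBOX = "CA[ACGT]{2}TG"       # generic E-box, CANNTG


def fraction_with_motif(peaks, pattern, radius):
    """Fraction of peaks with a match to `pattern` within `radius` bp of the summit.

    peaks: list of (sequence, summit_index).
    """
    if not peaks:
        return 0.0
    regex = re.compile(f"(?=({pattern}))")  # lookahead keeps overlapping matches
    hits = 0
    for seq, summit in peaks:
        for match in regex.finditer(seq.upper()):
            start, end = match.start(1), match.end(1)
            # Distance from the summit to the nearest motif base (0 if inside it).
            distance = max(start - summit, summit - (end - 1), 0)
            if distance <= radius:
                hits += 1
                break
    return hits / len(peaks)


# Toy peak sequences (hypothetical).
toy_peaks = [
    ("A" * 10 + "GATA" + "C" * 40, 20),  # Gata core 7 bp from the summit
    ("C" * 30 + "TATC" + "C" * 10, 5),   # Gata core 25 bp from the summit
    ("C" * 40, 20),                      # no motif
]
for radius in (10, 35):
    print(radius, round(fraction_with_motif(toy_peaks, GATA_CORE, radius), 2))
```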
Strikingly, a Runx‐binding site was identified as the top‐ranking overrepresented motif in Jurkat cells when compared with either control regions that share the same GC content distribution (Figure 6A, left panel), or regions specifically bound in Jurkat cells (Jurkat sp) versus those specifically bound in erythroid cells (erythroid sp) (Figure 6A, right panel). Furthermore, Runx motifs occur in 56% of TAL1 peaks in Jurkat cells (versus only 18% in erythroid cells) (Figure 6C). An Ets‐binding site is the second most overrepresented motif when comparing Jurkat versus erythroid cells (Figure 6A, right panel) and is also overrepresented in Jurkat peaks versus control sequences (Figure 6A, left panel). In agreement with this, Ets motifs are present in 39% of TAL1 peaks in Jurkat cells (versus 21% in erythroid cells) (Figure 6C). Interestingly, the Ets motif identified under TAL1 peaks in Jurkat cells by our de novo search (CAGGAA(A/G); Figure 6A) resembles the recently identified ETS1‐specific motif variant (CAGGA(A/T)GT) that is predominantly associated with T‐cell‐specific enhancers, as opposed to the ‘redundant’ variant (CCGGAAGT) that can also be bound by other ETS‐family members (e.g. GABPA) and is mostly located in promoter regions of housekeeping genes (Hollenhorst et al, 2009).
To gain further insight into a potential binding pattern of TAL1 relative to the other transcription factors for which we detected overrepresented binding sites (i.e. GATA, RUNX and ETS), we searched within TAL1 peaks for preferred distances between E‐boxes and the DNA motifs underlying binding of these factors (Figure 6D; Supplementary Figure S5D). About 10% of TAL1 peaks in erythroid cells and 4% in Jurkat cells display a preferred distance of 8–10 bp between E‐box and Gata sites (Figure 6D; Supplementary Figure S5D). Remarkably, this corresponds to the composite E‐box/Gata motif previously identified in vitro (Wadman et al, 1997) and in vivo in erythroid cells (Fujiwara et al, 2009; Kassouf et al, 2010; Soler et al, 2010). Specifically, in erythroid cells, we counted 218/397/116 occurrences of the E‐box‐Gata motif with spacing of 8/9/10 bp, respectively. In Jurkat cells, we counted 94/99/50 occurrences of the E‐box‐Gata motif with spacing of 8/9/10 bp, respectively. We also found a significant preference (occurring in about 1% of E‐box‐containing TAL1 peaks; 99 occurrences) for a novel composite juxtaposed Ets/E‐box motif specific to Jurkat cells (Figure 6D; Supplementary Figure S5D). Finally, no preferred distance was detected for Runx motifs with respect to E‐boxes.
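A search for preferred spacing amounts to tabulating the gap between every E‐box and every neighbouring motif within a peak, then looking for gap sizes that occur more often than others. The sketch below does this for E‐box/Gata pairs. The gap definition (number of bases between the end of one motif and the start of the other), the Gata core pattern and the toy sequences are assumptions for illustration; the source does not specify how spacing was measured.

```python
import re
from collections import Counter

# Lookaheads so that overlapping matches are all reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
GATA = re.compile(r"(?=(GATA|TATC))")  # assumed Gata core, either strand


def ebox_gata_gaps(seq, max_gap=20):
    """Count gap sizes (bp between motifs) for every E-box/Gata pair in `seq`."""
    seq = seq.upper()
    eboxes = [(m.start(1), m.end(1)) for m in EBOX.finditer(seq)]
    gatas = [(m.start(1), m.end(1)) for m in GATA.finditer(seq)]
    gaps = Counter()
    for e_start, e_end in eboxes:
        for g_start, g_end in gatas:
            if g_start >= e_end:      # Gata site downstream of the E-box
                gap = g_start - e_end
            elif e_start >= g_end:    # Gata site upstream of the E-box
                gap = e_start - g_end
            else:                     # motifs overlap; no spacing defined
                continue
            if gap <= max_gap:
                gaps[gap] += 1
    return gaps


# Toy peak sequences (hypothetical).
toy_sequences = ["CAGGTG" + "C" * 9 + "GATA", "TATCAAACAGATG"]
histogram = Counter()
for sequence in toy_sequences:
    histogram.update(ebox_gata_gaps(sequence))
print(sorted(histogram.items()))
```

Summed over all peaks, an excess of counts at particular gap sizes relative to neighbouring values would indicate a composite motif.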
The identification of Runx‐ and Ets‐binding sites within TAL1 peaks suggests that, similarly to GATA factors, RUNX and ETS proteins might be involved in guiding TAL1 to the proximity of their binding sites in T‐ALL cells. In agreement with this possibility, reciprocal co‐IP experiments show that TAL1 interacts with RUNX1 and ETS1 factors in Jurkat nuclear extracts (Figure 6E). In contrast, no interaction was detected between TAL1 and GABPA, another ETS‐family member that is highly expressed in Jurkat cells (Hollenhorst et al, 2004) and predominantly targets promoters of housekeeping genes, as opposed to ETS1, which targets T‐cell‐specific enhancers (Hollenhorst et al, 2009). The fact that TAL1 interacts specifically with ETS1 is highly consistent with our identification of the ETS1‐specific motif (but not the redundant Ets motif that also binds GABPA) as being enriched under TAL1 peaks (Figure 6A). Interestingly, the Runx1 and Ets1 genes are direct targets of TAL1 (Supplementary Figure S7) and both genes are downregulated upon TAL1 KD at the transcript (Supplementary Figure S4B) and protein (Figure 3D) levels.
RUNX1/3 and ETS1 are required for TAL1 binding to genomic loci in Jurkat cells
Since genome‐wide localization of RUNX1/3 (using an antibody that recognizes both RUNX1 and RUNX3) and ETS1 factors has recently been determined in Jurkat cells (Hollenhorst et al, 2009), we looked for genomic overlap of these factors with TAL1. Notably, we found colocalization of TAL1, RUNX1/3 and ETS1 on the representative TAL1 peaks of the Cdk6 and Cd69 genes (Figure 7A). Furthermore, genome‐wide comparisons indicate that 50% of TAL1 peaks containing a Runx motif are bound by RUNX1/3 in Jurkat cells while 65% of TAL1 peaks containing an Ets motif are bound by ETS1. While we did find the previously described ETS1‐RUNX‐cobound composite motif (Hollenhorst et al, 2009) in a subset of our TAL1/RUNX/ETS1 shared peaks, this motif was also present in our TAL1/ETS1 (no RUNX) cobound sites.
To assess the importance of the interplay between TAL1, RUNX1/3 and ETS1, we used lentiviral‐mediated KD of RUNX1/3 or ETS1 in Jurkat cells. We found that decreased levels of either RUNX1/3 or ETS1 (Figure 7B and C) lead to a decrease of TAL1 genomic binding at all sites tested (Figure 7D; Supplementary Figure S8), which confirms an important role for these transcription factors in targeting TAL1 to specific loci. Notably, the binding of TAL1 to the important T‐cell‐specific genes (i.e. Tox, Ikzf3/Aiolos, Ccr9, Cd69, Tcrbv), apoptotic genes (i.e. Cd226, Pmaip1/Noxa, Plk3) and other genes requires RUNX1/3 and/or ETS1 (Figure 7; Supplementary Figure S8), suggesting that these additional transcription factors may be important contributors to TAL1‐mediated leukaemogenesis. Consistent with this possibility, an apoptotic phenotype is observed in Jurkat cells upon KD of RUNX1/3 and ETS1 (Figure 7E).
Composite DNA motifs have an important function in selecting TAL1‐binding sites in vivo
The question of how a transcription factor selects its binding sites in vivo has recently received considerable attention (Pan et al, 2009). To address this question, we have examined the genomic binding of TAL1 in two cellular contexts. Importantly, in our study, the cellular context reflects both the lineage (erythroid versus T cell) and the state (normal versus malignant) of the cell. While these contributing factors are not distinguished, both are physiologically relevant since TAL1 is oncogenic in the T‐cell lineage (Condorelli et al, 1996; Kelliher et al, 1996). Strikingly, we found that the binding profile of TAL1 and its preference for particular composite DNA motifs vary depending on the cellular context. Common properties of TAL1‐binding sites in erythroid and T‐ALL cells include a high prevalence of E‐boxes and Gata motifs, and a preference for the same E‐box variant (CAGATG). DNA motifs that are recognized by TAL1 preferentially in a T‐ALL environment include Runx‐ and Ets‐binding sites as well as another E‐box variant (CAGGTG), which in vitro is preferentially bound by E‐protein homodimers (Hsu et al, 1994). Our finding that the same transcription factor displays distinct DNA‐binding profiles in two cell types was unexpected and reveals an additional level of complexity in the mechanism by which transcription factors select their binding sites in vivo. Specifically, we found that TAL1 recognizes a combination of DNA motifs composed of TAL1's direct binding site (E‐box) and those of other transcription factors with which it interacts, including GATA factors as well as RUNX and ETS proteins (Figure 8A).
An important role for GATA factors in targeting TAL1 has been suggested previously. For example, Gata motifs have an important function in mediating TAL1 binding to specific sites (Ono et al, 1998; Vyas et al, 1999; Song et al, 2007). In addition, recent genome‐wide studies in erythroid cells have shown that a large proportion of GATA1‐bound genomic sites are also occupied by TAL1 (Cheng et al, 2009; Soler et al, 2010). Furthermore, the E‐box/Gata composite motif, which underlies the binding of a GATA1–TAL1 pentameric complex (Wadman et al, 1997), is enriched under TAL1 peaks (Kassouf et al, 2010) and under GATA1 peaks (Fujiwara et al, 2009) in erythroid cells. Finally, a recent ChIP‐seq study in mouse primary fetal erythroid cells showed that a TAL1 mutant that has lost its E‐box‐mediated DNA‐binding activity can still be recruited to one fifth of TAL1 targets, and that TAL1 and GATA1 cooperate to stabilize each other at these sites (Kassouf et al, 2010). Our results are consistent with these findings and provide evidence that a similar mechanism likely occurs in T‐ALL cells. Indeed, Gata motifs are largely overrepresented within the 200‐bp region underlying TAL1 binding in both cell types studied. Furthermore, between 50 and 80% of Gata sites (depending on cell type) are centred within 35 bp of the TAL1 peak summit. Finally, our ChIP‐seq study detected an overrepresentation of the E‐box/Gata composite motif in both erythroid and T‐ALL cells. Therefore, TAL1‐interacting GATA factors appear to have an important function in mediating TAL1 DNA binding in vivo in both erythroid and T‐ALL cells.
Specificity of TAL1 binding in T‐ALL
In addition to the Gata motif, we found an overrepresentation of Ets and Runx motifs within TAL1‐binding sites in T‐ALL cells, and comparative bioinformatic analysis shows extensive binding of these TAL1‐associated sites by RUNX1/3 and ETS1 factors in Jurkat cells. Furthermore, our co‐IP experiments show that TAL1 interacts with RUNX1 and ETS1 proteins, and that knocking down RUNX1/3 or ETS1 factors leads to a disruption of TAL1 genomic binding at specific genes. Spatial analysis of TAL1 sites containing an Ets motif identified a novel T‐ALL‐specific composite E‐box/Ets motif where the two DNA elements are juxtaposed. These findings suggest that RUNX1 and ETS1 may be involved in guiding TAL1 to specific genomic sites in the T‐cell lineage.
Analysis of the Ets motifs found under TAL1 peaks in Jurkat cells revealed an enrichment for the recently identified ETS1‐specific motif but not for the ‘redundant’ Ets motif, which can be bound by either ETS1 or the other ETS‐family member GABPA (Hollenhorst et al, 2009). This is particularly interesting since the ‘redundant’ Ets motif is found within housekeeping gene promoters while the ETS1‐specific motif is specifically enriched within enhancers of T‐cell genes (Hollenhorst et al, 2009). Furthermore, our co‐IP studies show that TAL1 does not interact with GABPA. These key findings are consistent with the idea that TAL1 acts on T‐cell‐specific genes in T‐ALL. Furthermore, this specificity might at least partially be mediated through TAL1 interaction with ETS1. Taken together, our results reveal that RUNX1/3 and ETS1 are important mediators of TAL1 genomic binding in leukaemic T cells, notably at genes involved in T‐cell differentiation (e.g. Tox, Ikzf3/Aiolos) and apoptosis (e.g. Cd226, Pmaip1, Plk3). Combined with the apoptotic phenotype observed upon their KD in T‐ALL, these findings provide strong support for a model in which RUNX1/3 and ETS1 transcription factors are important contributors to the mechanism of TAL1‐mediated leukaemogenesis (Figure 8A).
Interplay between TAL1, ETS1 and RUNX factors in T‐ALL leukaemogenesis
RUNX and ETS proteins have essential roles in regulating T‐cell differentiation (Anderson, 2006). In addition, both families of transcription factors often become oncogenic when mutated or overexpressed (Blyth et al, 2005; Gallant and Gilkeson, 2006). In light of our finding that RUNX1/3 and ETS1 are important for mediating TAL1 genomic binding, it is interesting to note that TAL1 also appears to directly regulate the expression and function of RUNX and ETS proteins, which suggests a multipronged mechanism for TAL1‐mediated leukaemogenesis. Indeed, TAL1 directly targets the Runx1 gene whose expression is upregulated by TAL1 in both Jurkat cells and blasts from T‐ALL patients. In addition, TAL1 represses the Pim1 kinase gene (Supplementary Table VI), whose product enhances RUNX1 activity via phosphorylation (Aho et al, 2006). Therefore, TAL1 acts at multiple levels to disrupt the transcriptional network regulated by RUNX1. This is of particular interest since RUNX1 is part of the ‘gene expression signature’ of TAL1‐positive T‐ALL (Ferrando et al, 2002).
Similarly, TAL1 appears to target the ETS1 transcriptional network. Indeed, the Ets1 gene is also a direct target of TAL1, its expression is upregulated by TAL1 in both Jurkat cells and primary T‐ALL blasts, and the ETS1 protein interacts with TAL1. Since ETS1 promotes the survival of T cells (Muthusamy et al, 1995) and is overexpressed in T‐ALL (Sacchi et al, 1988), deregulation of its transcriptional network by TAL1 might also have a major role in TAL1‐mediated leukaemogenesis. Interestingly, Runx‐ and Ets‐bound motifs have also been identified as TAL1 targets in a murine haematopoietic progenitor cell line (Wilson et al, 2010), suggesting that aberrant expression of TAL1 in a T‐cell precursor environment could cause these cells to acquire/retain some properties of multipotent progenitors.
Target genes as potential mediators of TAL1‐mediated leukaemogenesis
A number of our newly identified TAL1‐regulated target genes code for proteins that are involved in T‐cell differentiation and apoptosis. In addition, tumour suppressors that are targeted by TAL1 are upregulated upon TAL1 KD, suggesting that TAL1 represses these genes in T‐ALL (Figure 8B). While all TAL1‐regulated direct targets (Supplementary Table VI) are candidate effectors of TAL1‐mediated leukaemogenesis, Cdk6, which is activated by TAL1, is of particular interest. Indeed, this gene must be downregulated for T‐cell differentiation to proceed (Grossel and Hinds, 2006) and forced expression of CDK6 contributes to the development of lymphoid malignancies in mice (Schwartz et al, 2006). Importantly, CDK6 is overexpressed in human T‐ALL (Chilosi et al, 1998) and mice lacking CDK6 are completely resistant to T‐cell malignancies induced by constitutively active AKT1 (Hu et al, 2009). Since constitutive activation of AKT has been proposed to be involved in resistance to NOTCH1 inhibition therapy with small molecule γ‐secretase inhibitors (Palomero et al, 2008), CDK6 (which acts downstream of the AKT pathway) may represent a valuable alternative therapeutic target for treating T‐ALL.
Here, we have performed the first genome‐wide comparison of TAL1 function in an erythroid versus T‐cell environment. This study revealed an unexpected contribution of the cellular environment in selecting particular composite DNA motifs for TAL1 binding in vivo. T‐ALL‐specific genomic targeting provides the TAL1 protein with access to the transcriptional regulatory networks of key T‐cell regulators, and leads to inhibition of differentiation in this lineage. Validation in T‐ALL patients of key functional targets of TAL1 identified by this approach demonstrates the utility of such a comparative strategy in identifying mediators/effectors of oncogenesis. Taken together, our findings underscore the cellular environment as an important factor in the genomic binding selectivity of transcription factors and suggest how changing the cellular context can render a transcription factor oncogenic.
Materials and methods
Light‐density mononuclear cells were isolated by Ficoll density gradient centrifugation (GE Healthcare) from G‐CSF‐mobilized adult blood from donors without haematological malignancies (Ottawa Hospital Research Ethics Board #2007804‐01H). CD34+ cells were enriched through positive immunomagnetic selection using the CD34 MicroBead Kit (Miltenyi Biotec Inc.) (purity >95±3%) and differentiated ex vivo as described in Giarratana et al (2005) except that we used 20% BIT (StemCell Technologies). Jurkat cells stably expressing the Tet repressor (Invitrogen) were cultured in RPMI 1640 supplemented with 10% (v/v) Tet‐free FBS, 100 U/ml penicillin, 100 μg/ml streptomycin and 20 μg/ml blasticidin. CEM/C1 and MOLT4 (obtained from ATCC) and PF382 and RPMI8402 (obtained from DSMZ) were cultured in RPMI 1640 supplemented with 10% (v/v) FBS, 100 U/ml penicillin and 100 μg/ml streptomycin.
Expression profiling on Affymetrix microarray
Total RNA was purified using the RNeasy Mini Kit (Qiagen). Labelling and hybridization to the Affymetrix Human Gene 1.0 ST gene expression microarray were performed following standard Affymetrix procedures. See Supplementary data. The microarray raw data are deposited into GEO (accession number GSE20546).
Gene set enrichment analysis
GSEA (Subramanian et al, 2005) with gene sets from Lee et al (2004) was performed on the Jurkat array set with the default parameters. For a description of the gene sets used, see Supplementary data.
ChIP experiments using anti‐TAL1 Ab or normal IgG as a negative control were carried out as described in Chaturvedi et al (2009) except that chromatin was fragmented to a size range of 100–300 bp. For Solexa sequencing, ChIPed DNA was prepared according to the Illumina protocol with two modifications: (1) DNA fragments ranging from 150 to 300 bp were selected at the gel selection step and (2) 21 instead of 18 cycles of PCR were done at the amplification step, as previously described (Cao et al, 2010). The samples were sequenced using the Illumina Genome Analyzer II. For qPCR, DNA was quantified using SYBR Green. Immunoprecipitated DNA was quantified using a standard curve generated with genomic DNA and was normalized by dividing by the amount of the corresponding target in the input fraction. The enrichment at each DNA target is expressed as the fold enrichment relative to a mock ChIP performed with normal IgG.
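The ChIP‐qPCR quantification described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the exact calculation used in this study: the standard curve is assumed to be a straight line relating Ct to log10 of the genomic DNA amount, and the function names and example Ct values are hypothetical.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amount_from_ct(ct, slope, intercept):
    """Convert a Ct value to a DNA amount by inverting the standard curve."""
    return 10 ** ((ct - intercept) / slope)

def fold_enrichment(ct_chip, ct_chip_input, ct_igg, ct_igg_input,
                    slope, intercept):
    """Input-normalized ChIP signal divided by input-normalized IgG signal."""
    chip = (amount_from_ct(ct_chip, slope, intercept)
            / amount_from_ct(ct_chip_input, slope, intercept))
    igg = (amount_from_ct(ct_igg, slope, intercept)
           / amount_from_ct(ct_igg_input, slope, intercept))
    return chip / igg

# Hypothetical genomic DNA dilution series (arbitrary units) and Ct values.
slope, intercept = fit_standard_curve([0, 1, 2, 3], [30, 27, 24, 21])
enrichment = fold_enrichment(24, 21, 27, 21, slope, intercept)
```

With these made‐up values, the anti‐TAL1 ChIP recovers ten times more target per unit of input than the IgG mock ChIP, which is the quantity reported as fold enrichment.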
High‐throughput DNA sequencing data analysis
Sequences were extracted using the Firecrest and Bustard programs from package GApipeline‐0.3.0. Reads were aligned using MAQ (version 0.6.6) to the human reference genome (hg18) as described in Cao et al (2010). For a complete description of data analysis including sequencing depth coverage, see Supplementary data. ChIP‐seq data are deposited in GEO (accession number GSE25000).
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary Table I
Supplementary Table II
Supplementary Table III
Supplementary Table IV
Supplementary Table V
Supplementary Table VI
We thank L Douay and MC Giarratana (Université Paris VI, France) and R Pasha (OHRI) for advice on the ex vivo erythroid differentiation, D Trono (Ecole Polytechnique Federale de Lausanne) for lentiviral vectors, L Filion (University of Ottawa) for advice on FACS, J Wu (OHRI) for technical help, S Huang (University of Florida), B Gottgens (University of Cambridge), L Megeney, Y Kawabe and M Rudnicki (OHRI) for discussion, shared data and reagents. This project was funded by a CIHR grant MOP‐82813 to MB and an NIH NIAMS R01AR045113 grant to SJT. CGP is a recipient of an Ontario Research Fund Computational Regulomics Training Postdoctoral Fellowship. MB holds the Canada Research Chair in the Regulation of Gene Expression. FJD holds the Canada Research Chair in the Epigenetic Regulation of Transcription.
This is an open‐access article distributed under the terms of the Creative Commons Attribution License, which permits distribution and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.
Copyright © 2011 European Molecular Biology Organization