Proneural basic helix–loop–helix proteins are key regulators of neurogenesis but their ‘proneural’ function is not well understood, partly because primary targets have not been systematically defined. Here, we identified direct transcriptional targets of the bHLH proteins Neurogenin and NeuroD and found that primary roles of these transcription factors are to induce regulators of transcription, signal transduction, and cytoskeletal rearrangement for neuronal differentiation and migration. We determined targets induced in both Xenopus and mouse, which represent evolutionarily conserved core mediators of Neurogenin and NeuroD activities. We defined consensus sequences for Neurogenin and NeuroD binding and identified responsive enhancers in seven shared target genes. These enhancers commonly contained clustered, conserved consensus‐binding sites and drove neural‐restricted transgene expression in Xenopus embryos. We then used this enhancer signature in a genome‐wide computational approach to predict additional Neurogenin/NeuroD target genes involved in neurogenesis. Taken together, these data demonstrate that Neurogenin and NeuroD preferentially recognize neurogenesis‐related targets through an enhancer signature of clustered consensus‐binding sites and regulate neurogenesis by activating a core set of transcription factors, which build a robust network controlling neurogenesis.
Transcription factors regulate many biological processes, including cell‐fate determination and differentiation during embryonic development. With the completion of genome sequencing in many organisms, a major remaining challenge is to globally define transcriptional regulatory networks underlying complex biological processes. This requires identifying primary targets directly controlled by each transcription factor and then defining how expression of these targets is regulated with the proper specificity in a particular biological context.
Transcription factors of the neural basic helix–loop–helix (bHLH) class, including the Neurogenins (Ngn1/2/3) and NeuroD, are key regulators of vertebrate neurogenesis. The Ngns are required for neuronal commitment, as evidenced by loss of cranial and spinal sensory ganglia and ventral spinal cord neurons in Ngn1 or Ngn2 single‐ or Ngn1/2 double‐mutant mice (Fode et al, 1998; Ma et al, 1998, 1999; Scardigli et al, 2001). Among the downstream mediators of Ngn activity is NeuroD, which is expressed in essentially all areas of the brain, spinal cord, peripheral ganglia, and sense organs that express Ngn1, 2, or 3 (Sommer et al, 1996). NeuroD expression diminishes as neurons mature except in the cerebellum and hippocampus, where its expression is maintained throughout adulthood (Miyata et al, 1999). In contrast to NeuroD's extensive expression within the nervous system, NeuroD‐null mice display defects in restricted areas. The NeuroD‐null phenotype includes loss of cerebellar and hippocampal granule cells, inner ear sensory neurons, and retinal photoreceptor cells but other regions do not exhibit gross cellular deficits (Miyata et al, 1999; Kim et al, 2001; Pennesi et al, 2003). Taken together, these features suggest that NeuroD is a major mediator of Ngn activities that may act redundantly with other molecules to regulate neuronal differentiation.
bHLH transcription factors act as heterodimers with ubiquitously expressed E proteins and bind sequences with the general consensus CANNTG, termed E‐boxes, in their target genes (Bertrand et al, 2002 and references therein). bHLH proteins have different preferences for the two central nucleotides (CANNTG) and so can recognize distinct consensus sequences. Additional nucleotides flanking the E‐box on either side also contribute to the specificity of bHLH–E‐box interactions. However, consensus sequences for most neural bHLH proteins have not been systematically determined.
While the expression and activities of the neural bHLH transcription factors have been extensively characterized in many organisms, molecular mechanisms underlying their ability to regulate neurogenesis are not well understood. This is in large part because primary target genes and transcriptional programs that are directly regulated by neural bHLH proteins have not been systematically defined. It is also not clear which regulatory sequence features enable neural bHLH proteins to distinguish among many potential targets in the genome to specifically activate targets relevant to neurogenesis. This is significant as E‐box sequences are present in regulatory regions of many non‐neural target genes and also occur frequently in the genome by chance.
Here, we addressed these questions by identifying direct transcriptional targets of the Atonal‐related neural bHLH proteins Ngnr1 and NeuroD. We used Xenopus animal cap explants, which are multi‐potent naive ectoderm that can differentiate into ectodermal, endodermal, or mesodermal cell derivatives. To control transcription factor activity, we used Ngnr1 and NeuroD proteins fused to the human glucocorticoid receptor (GR) hormone‐binding domain. In the absence of the ligand dexamethasone (DEX), these fusion proteins are sequestered in the cytoplasm and remain transcriptionally inactive (Kolm and Sive, 1995). Addition of DEX allows GR‐fused transcription factors to translocate into the nucleus and regulate target genes. We induced Ngnr1‐GR and NeuroD‐GR activities in the presence of the protein synthesis inhibitor cycloheximide, which blocks translation and prevents induction of secondary targets, whose expression is regulated by primary target proteins. This strategy effectively defines primary target genes activated directly by transcription factors (for example, Pozzoli et al, 2001; Logan et al, 2005).
Using this approach, we identified direct transcriptional targets of Ngnr1 and NeuroD in neurogenesis. We also determined evolutionarily conserved target genes from Xenopus to mammals, which represent a core machinery to generate Ngn and NeuroD‐regulated neurons. We then analyzed these targets, enabling us to (i) understand primary roles of these key transcription factors in neurogenesis, (ii) define a transcriptional network controlling Ngn‐ and NeuroD‐dependent neurogenesis, and (iii) define and characterize enhancers regulated by Ngn and NeuroD in their target genes. Finally, we used the sequence features of these enhancers for genome‐wide identification of additional Ngn/NeuroD target genes relevant to neurogenesis.
Identification of direct targets of Ngnr1 and NeuroD in Xenopus ectoderm
To identify Ngnr1 and NeuroD direct targets, we expressed Ngnr1‐GR and NeuroD‐GR fusion proteins in Xenopus laevis ectodermal explants. In the presence of a protein synthesis inhibitor, Ngnr1 and NeuroD activity was induced for 2.5 h from the early gastrula stage and gene expression was analyzed on Affymetrix Xenopus laevis Genome Arrays. Ngnr1 or NeuroD target genes were determined by comparing DEX‐treated versus ‐untreated Ngnr1‐GR‐ or NeuroD‐GR‐injected samples. β‐Galactosidase‐injected, DEX‐treated samples (control) were also compared to exclude genes induced by DEX (see Materials and methods and Supplementary data). We validated the microarray results using quantitative real‐time RT–PCR (qRT–PCR) to define 57 and 62 genes as direct targets of Ngnr1 and NeuroD, respectively (Table I).
Of these, 26 genes were shared direct targets of both Ngnr1 and NeuroD. In total, 52 of the 93 target genes were described previously, while 41 were novel or uncharacterized. Most known genes (41 of 52; 79%) were previously reported to either have a role in or be expressed in neural tissues (including this study; Figure 1). Therefore, as predicted, neural genes are highly enriched in our list. We also obtained genes previously shown to respond to Ngn or NeuroD in various species. For example, Math3, NeuroD, HEN1 (Nhlh1), Dll1 (Delta‐1), and Hes‐6 expression was reduced or lost in Ngn1 and Ngn2 single‐mutant mice (Fode et al, 1998; Ma et al, 1998; Koyano‐Nakagawa et al, 2000) and Ebf2, MTGR1, Elavl3, Gadd45g, MyT1, Hes‐6, and ESR1 were previously defined as either direct or indirect targets of Ngnr1 and/or NeuroD in Xenopus (Bellefroid et al, 1996; Koyano‐Nakagawa et al, 2000; Lamar and Kintner, 2005; Logan et al, 2005). Thus our approach effectively determined direct transcriptional targets of Ngn and NeuroD and demonstrated that several genes previously shown to act downstream of Ngn and/or NeuroD are under their direct transcriptional control.
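The selection logic described above amounts to a filter followed by set arithmetic. The Python sketch below is illustrative only: the gene names, fold‐change values, and two‐fold cutoff are hypothetical placeholders rather than values from this study, and the function names are introduced here for illustration. It keeps genes induced by DEX in Ngnr1‐GR or NeuroD‐GR samples but not in DEX‐treated control samples, and then counts shared and distinct targets.

```python
def direct_targets(fold_tf, fold_control, cutoff=2.0):
    """Genes induced by DEX in TF-GR samples but not in the DEX-treated control.

    fold_tf / fold_control map gene -> fold change (DEX-treated / untreated).
    The cutoff is an assumed placeholder, not the threshold used in the study.
    """
    return {
        gene
        for gene, fc in fold_tf.items()
        if fc >= cutoff and fold_control.get(gene, 1.0) < cutoff
    }


def shared_and_total(targets_a, targets_b):
    """Return (number of shared targets, number of distinct targets)."""
    return len(targets_a & targets_b), len(targets_a | targets_b)


# Hypothetical fold changes for three made-up genes.
ngnr1_fold = {"geneA": 5.0, "geneB": 1.2, "geneC": 4.0}
control_fold = {"geneA": 1.0, "geneB": 1.1, "geneC": 3.5}  # geneC responds to DEX alone
print(direct_targets(ngnr1_fold, control_fold))

# Placeholder sets sized to match the reported counts: 57 and 62 targets, 26 shared.
ngnr1 = set(range(57))
neurod = set(range(31, 93))
print(shared_and_total(ngnr1, neurod))
```

With 57 Ngnr1 targets and 62 NeuroD targets, of which 26 are shared, the number of distinct targets is 57 + 62 − 26 = 93, which matches the total given above.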
Analysis of direct targets of Ngnr1 and NeuroD
In total, 33 of 57 Ngnr1 targets were either characterized previously or had strong sequence homology to known genes. Fifteen of these (45%) were transcription factors, including NeuroD, NeuroD4 (Ath3), Ebf2, MTGR1, and Hes6, which were previously shown to regulate neurogenesis (Lee et al, 1995; Bellefroid et al, 1996; Dubois et al, 1998; Perron et al, 1999; Koyano‐Nakagawa et al, 2000; Pozzoli et al, 2001; Garcia‐Dominguez et al, 2003; Koyano‐Nakagawa and Kintner, 2005). Ngnr1 also induced seven direct targets (21%) involved in signal transduction. These data indicate that the major role of Ngnr1 in neurogenesis is to induce regulators of gene expression and signaling, consistent with Ngnr1's role as a proneural gene that acts at the commitment/determination step.
For NeuroD, 37 of 62 target genes had known functions and, similar to results for Ngnr1, 18 genes (49%) were transcription factors, while five genes (14%) were involved in signal transduction. In contrast, we did not find genes directly involved in neuronal function such as neuropeptides, neurotransmitter receptors, or channels. These data suggest that NeuroD may regulate neuronal differentiation by inducing transcription factor targets, such as Xath3, Ebf3, HEN1, and MTGR1, which then subsequently induce genes controlling neuronal function. Alternatively, the cell context used for our screen may have enriched for targets regulated by NeuroD during its early expression in committed and differentiating primary neuronal progenitors, while being unfavorable to detect NeuroD's later targets in neurons (see also Discussion).
We additionally identified 24 Ngnr1 and 25 NeuroD direct targets that were either novel or had not been previously characterized as having a role in neurogenesis. These included transcriptional regulators, signaling molecules, regulators of cell migration and/or the actin cytoskeleton, and genes with no attributed homology or function. Interestingly, we defined Calponin2, Frizzled homolog 7, Prickle, Gas‐6, Lims1, Myosin 10, Dbn1, Plk3, and Amotl2 as direct targets of Ngnr1 and/or NeuroD. All of these molecules or closely related proteins have defined roles in regulating the actin cytoskeleton and/or mediating morphological and migratory events in cells including neurons (Supplementary Figure S3). During neurogenesis, neuronal cell migration and differentiation temporally coincide but it was unclear whether these processes were regulated independently or coordinately. Our finding of direct targets that regulate both neuronal cell fate and morphology/migration suggests that Ngn and NeuroD directly coordinate these processes at the transcriptional level. This is consistent with the results of Ge et al (2006), which suggest that Ngn can control the expression of several cell migration regulators.
We next examined expression patterns of novel or uncharacterized Ngnr1 and NeuroD targets. As expected, expression of the majority of genes tested (16 of 20) was predominantly in the embryonic nervous system (Figure 1). Ngnr1 and NeuroD target genes were commonly expressed in the primary neurons at neurula stages (Figure 1A, E, F, H, J, M, O, and Q) and in the central nervous system (e.g. brain, spinal cord, and eye) at tailbud stages (Figure 1B–D, G, I, K, L, N, and T). Expression was also frequently found in the neural plate at neurula stages (Figure 1A, Q–S) and, at tailbud stages, in the cranial placodes and branchial arches, which contain placodal ectoderm or neural crest derivatives (Figure 1G, I, K, L, N, P, and T). In an embryonic context, NeuroD strongly induced all 10 target genes that we tested (Figure 2A). Conversely, lowering Ngnr1 activity in embryos reduced or eliminated the neural‐specific expression of 9 of the 11 target genes we tested (Figure 2B). These data demonstrate that Ngnr1 and NeuroD induce or are required to activate the expression of these target genes in vivo.
Taken together, these results demonstrate that the approach of using hormone‐inducible forms of Ngnr1 and NeuroD and naive ectoderm effectively identified direct targets of these transcription factors and that primary roles of Ngn and NeuroD include inducing regulators of gene transcription, signal transduction, and cytoskeletal rearrangement for neuronal differentiation and migration. Additionally, we found many novel target genes, which are expressed in the nervous system and may play roles in neuronal differentiation, migration, and/or function.
Induction of Ngn and NeuroD targets in mammalian cells
We examined whether Ngnr1 and NeuroD direct target genes identified in Xenopus ectoderm were also regulated by their mouse orthologs in mammalian cells. For this purpose, we used P19 embryonic carcinoma cells, which are induced to differentiate into neurons by overexpressing proneural bHLH proteins (Farah et al, 2000). We transfected P19 cells with mouse Ngn2 or NeuroD and examined the expression of mouse orthologs or closest homologs of 16 Ngn and 30 NeuroD Xenopus target genes by qRT–PCR (Table II). Mouse Ngn2 induced 7 of 16 (44%) and NeuroD 14 of 30 (47%) genes tested by >2‐fold in P19 cells. Four genes (25%) for Ngn2 and six genes (20%) for NeuroD were also induced 1.5‐ to 2.0‐fold. Most Xenopus target genes that were not induced, or were only weakly induced, were already highly expressed in P19 cells before transfection (see Table II, Ct values). Thus, Ngn2 and NeuroD overexpression may not further increase their expression in P19 cells, but those genes may be induced in other cellular contexts. In this regard, we compared NCBI expression profiles for 12 targets that were strongly induced by NeuroD and eight targets that were not induced (Supplementary Figure S4). Most strongly induced genes were highly enriched in brain, dorsal root ganglion, eye, and/or spinal cord in vivo and were expressed in few other tissue types. In comparison, many non‐induced target genes were more broadly expressed. This suggests that the P19 cell context may be more conducive to Ngn2 and NeuroD induction of targets with neural‐restricted expression in vivo relative to those with broader expression profiles.
Therefore, approximately half (and probably more) of the target genes identified in Xenopus showed conserved induction in mammalian cells, suggesting that Ngn and NeuroD regulate similar target genes in multiple vertebrates and that these targets are evolutionarily conserved core mediators of Ngn and NeuroD activities in neurogenesis.
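Fold induction in qRT–PCR experiments is commonly derived from Ct values. As a minimal sketch, assuming the standard 2^(−ΔΔCt) calculation with a reference gene (the normalization actually used in this study is not specified in this section), the following Python example computes a fold change and assigns it to the categories used above (>2‐fold, 1.5‐ to 2.0‐fold, or not induced). All Ct values and function names are hypothetical.

```python
def fold_change(ct_target_tf, ct_ref_tf, ct_target_ctrl, ct_ref_ctrl):
    """Fold induction by the 2^(-ddCt) method (assumed, not taken from the study).

    *_tf   : Ct values from cells transfected with the transcription factor
    *_ctrl : Ct values from control-transfected cells
    """
    delta_tf = ct_target_tf - ct_ref_tf        # target normalized to reference gene
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_tf - delta_ctrl)


def classify(fc):
    """Bin a fold change using the thresholds quoted in the text."""
    if fc > 2.0:
        return "induced"
    if fc >= 1.5:
        return "weakly induced"
    return "not induced"


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# after transfection, while the reference gene is unchanged.
fc = fold_change(24.0, 18.0, 26.0, 18.0)
print(fc, classify(fc))
```

Because a low Ct indicates high starting abundance, a gene that is already highly expressed before transfection has little room for a further shift in Ct, which is one way to read the weak induction of such genes in P19 cells.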
Determining consensus sequences for Ngn and NeuroD binding
To identify Ngn/NeuroD regulatory enhancers in the target loci defined above, we focused on genes induced in both Xenopus and mouse and initially used the criteria that potential enhancers should show some cross‐species conservation and contain more than one E‐box, since transcription factors including bHLH proteins often utilize multiple, clustered binding sites (Weintraub et al, 1990). We used the evolutionarily conserved region (ECR) browser to compare genomic sequences for Dll1, Ebf3, Elavl3, Gadd45g, NeuroD4 (Ath3), and HEN1 (Nhlh1) between multiple vertebrates and detect E‐box‐containing ECRs, which represent potential enhancers (Ovcharenko et al, 2004). We then tested 2–3 candidate enhancers per locus (16 total ECRs) for Ngn2 and/or NeuroD responsiveness by luciferase assay in P19 cells (Supplementary Figure S5, A and B). We set a three‐fold cutoff for responsiveness because enhancers showing >3‐fold induction were consistently enriched for Ngn2 and NeuroD binding in chromatin immunoprecipitation (ChIP) assays, while those with lower induction fold‐change values were not (Supplementary Figure S8 and data not shown). By these criteria, Ngn and NeuroD induced expression of the same reporter constructs (Ebf3 ECR1, Ebf3 ECR2, and HEN1 promo), whereas Dll1 ECR1mini2 was induced only by Ngn. While Ngn regulation of NeuroD could indirectly contribute to Ngn's ability to induce these enhancers, we found that Ngn bound these enhancers by ChIP (Supplementary Figure S8), indicating that Ngn can directly regulate these enhancers. These data suggested that Ngn and NeuroD may use similar binding sites and enhancers to regulate their target genes.
We next analyzed which E‐boxes in these ECRs were required for Ngn and NeuroD‐mediated expression. There are 10 possible E‐box sequences, considering four nucleotides for each N in the E‐box (CANNTG) and disregarding orientation (Supplementary Figure S2C). The 16 ECRs tested above contained in total 42 E‐boxes, 29 in non‐responsive ECRs and 13 in the four Ngn2/NeuroD‐responsive ECRs. However, only 3 of 10 possible E‐box sequence types (CAGCTG, CAGATG, or CAAATG) were present in the responsive ECRs, suggesting Ngn and NeuroD binding preferences (Figure 3A). Other types of E‐boxes were found only in non‐functional ECRs (Supplementary Figure S2C). We next disrupted individual E‐boxes: disrupting CAGATG, CAGCTG, and CAAATG E‐boxes abolished or reduced induction by Ngn2, while induction by NeuroD was affected by mutating CAGCTG and CAGATG but not CAAATG E‐boxes (Figure 3B–E).
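The count of 10 possible E‐box types follows from treating each CANNTG sequence and its reverse complement as the same site. A short Python sketch of this enumeration is given below; the function names are ours, and choosing the alphabetically smaller strand as the representative is simply a convenient convention.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(ebox):
    """Represent an E-box by the alphabetically smaller of its two strands."""
    return min(ebox, reverse_complement(ebox))


def ebox_types():
    """Enumerate CANNTG sequences, merging each with its reverse complement."""
    all_eboxes = ("CA" + a + b + "TG" for a, b in product("ACGT", repeat=2))
    return sorted({canonical(e) for e in all_eboxes})


types = ebox_types()
print(len(types))  # 10
print(types)
```

Of the 16 central dinucleotides, four (AT, TA, CG, and GC) are their own reverse complement and the remaining 12 collapse into six pairs, giving 4 + 6 = 10 types. For example, CATCTG read on the opposite strand is CAGATG.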
As a parallel, independent approach to assess Ngn‐ and NeuroD‐binding sequences, we built 10 artificial enhancer‐luciferase reporter constructs. Each contained a different E‐box sequence, present as three copies with the same intervening sequences for all constructs. In luciferase assays, Ngn2 and NeuroD robustly induced the three E‐box types defined experimentally above (CAG[A/C]TG and CAAATG). In addition to these three E‐box types, CATATG was also responsive to Ngn and NeuroD while the six other E‐box types were unresponsive (Supplementary Figure S6). Taken together with the results from 16 ECRs, these data suggest that Ngn and NeuroD recognize similar E‐box sequences.
Finally, we used CompareProspector (Liu et al, 2004) to computationally define conserved, enriched sequence motifs within 25 ECRs from 14 Ngn and NeuroD target genes in human and mouse (Ascl1, Dll1, Ebf2, Ebf3, Elavl3, Gadd45g, HEN1, MTGR1, Myo10, MyT1, NeuroD4, PDK2, Pou3f1, and Zfp238; for details, see Supplementary data). In this search, the E‐box sequence CAG[A/C]TG (degenerate consensus considering bases with more than 25% abundance) ranked second. The Ngn/NeuroD E‐box consensus sequence logo and position weight matrix (PWM) are shown in Figure 3G. These sites were also frequently found in a clustered pattern (Supplementary Figure S7B). The top‐ranking motif was GATTTGCA (Figure 3G) (G[A/C]TT[G/T]GC[A/T]: degenerate consensus), which resembles a consensus‐binding sequence for class 2 POU transcription factors (e.g. Oct‐1 and Oct‐2) (TESS: Transcription Element Search System, http://www.cbil.upenn.edu/tess, weight matrix: M00210). CAAATG and CATATG E‐boxes, which were able to respond to Ngn2 and NeuroD in the context of an artificial enhancer (Supplementary Figure S6), were also found in this analysis but they were under‐represented relative to CAG[A/C]TG E‐boxes. When we manually searched eight Ngn/NeuroD target loci (Dll1, Ebf2, Ebf3, MTGR1, MyT1, NeuroD4, HEN1 (Nhlh1), and Zfp238) for conserved E‐boxes of these four types, we found that CAG[A/C]TG accounted for 69%, while CAAATG and CATATG comprised 22 and 9% of the E‐boxes present in these loci, respectively (Figure 3F and Supplementary Figure S7). These data suggest that while CATATG E‐boxes can mediate Ngn‐ and NeuroD‐transcriptional responses, this sequence is not frequently used in the genome for Ngn‐ and NeuroD‐mediated target activation. In summary, E‐boxes with the consensus CAG[A/C]TG were over‐represented in our target loci and were required for Ngn and NeuroD activation of these loci, with minor contributions from CA[A/T]ATG E‐boxes.
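The degenerate consensus notation used above (bases with more than 25% abundance at each position) can be derived mechanically from a position frequency matrix. The sketch below illustrates the rule in Python. The matrix is a hypothetical example chosen to resemble the reported motif; it is not the matrix shown in Figure 3G, and the function name is ours.

```python
# Hypothetical position frequency matrix, one dict per E-box position.
# These values are invented for illustration and are not from Figure 3G.
EXAMPLE_PFM = [
    {"C": 1.0},
    {"A": 1.0},
    {"G": 0.7, "A": 0.2, "T": 0.1},
    {"A": 0.55, "C": 0.45},
    {"T": 1.0},
    {"G": 1.0},
]


def degenerate_consensus(pfm, threshold=0.25):
    """Write each position as the base(s) whose frequency exceeds the threshold."""
    symbols = []
    for column in pfm:
        bases = sorted(base for base, freq in column.items() if freq > threshold)
        if not bases:
            symbols.append("N")  # no base exceeds the threshold
        elif len(bases) == 1:
            symbols.append(bases[0])
        else:
            symbols.append("[" + "/".join(bases) + "]")
    return "".join(symbols)


print(degenerate_consensus(EXAMPLE_PFM))  # CAG[A/C]TG
```

In this toy matrix the third position still contains A at 20%, so CAAATG sites contribute to the matrix without appearing in the degenerate consensus, which mirrors how under‐represented E‐box types can be present in a PWM yet absent from its summary string.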
Identifying Ngn/NeuroD regulatory enhancers in target genes
Previously, defining ECRs with clusters of generic E‐boxes (CANNTG) had successfully predicted some enhancers within target loci but was not efficient (Supplementary Figure S5). Therefore, we tested whether using the Ngn/NeuroD consensus sequences improved enhancer prediction. We used the Ngn/NeuroD PWM, which was obtained from CompareProspector and reflects frequencies of the four E‐boxes able to respond to Ngn2 and NeuroD (Figure 3G), in combination with the Enhancer Element Locator (EEL) (Hallikas et al, 2006) and ECR Browser. This approach greatly reduced the ‘noise’ of generic E‐boxes (CANNTG), allowing us to analyze wider regions of each locus. Using a generic E‐box sequence, as above, we had frequently obtained more than three candidate enhancers within 10 kb of the transcription start site. In contrast, using the Ngn/NeuroD PWM usually predicted only a few high‐scoring enhancers per locus, including introns, untranslated regions and 50 kb each of 5′ and 3′ flanking sequences in the analysis. We applied this approach to the Dll1, Ebf2, Ebf3, HEN1, MTGR1, MyT1, NeuroD4, and Zfp238 loci. The approach successfully identified two previously defined enhancers within the Ebf3 locus (Ebf3 ECR1 and ECR2) and also predicted 12 new candidate enhancers within the Dll1, Ebf2, MTGR1, MyT1, NeuroD4 and Zfp238 loci (Supplementary Figure S2).
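To illustrate why restricting the search to Ngn/NeuroD‐responsive E‐boxes reduces noise, the following Python sketch scans a sequence for the four responsive E‐box types on either strand and groups nearby sites into clusters. It is a simplified stand‐in and not the EEL or ECR Browser algorithm: it ignores conservation and PWM scores, and the 200‐bp spacing, the minimum of two sites, and the test sequence are all assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The four E-box types that responded to Ngn2 and NeuroD, written as the
# alphabetically smaller strand of each site.
RESPONSIVE = {"CAGCTG", "CAGATG", "CAAATG", "CATATG"}


def canonical(site):
    """Represent a site by the alphabetically smaller of its two strands."""
    return min(site, site.translate(COMPLEMENT)[::-1])


def find_sites(seq):
    """Start positions (0-based) of responsive E-boxes on either strand."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 5) if canonical(seq[i:i + 6]) in RESPONSIVE]


def clusters(positions, max_gap=200, min_sites=2):
    """Chain sites whose neighbours start at most max_gap bp apart; keep chains
    with at least min_sites members. Both parameters are assumed values."""
    groups, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                groups.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        groups.append(current)
    return groups


# Made-up sequence: two sites close together, then an isolated site.
seq = "TT" + "CAGCTG" + "A" * 10 + "CATCTG" + "A" * 300 + "CAAATG"
sites = find_sites(seq)
print(sites)            # [2, 18, 324]
print(clusters(sites))  # [[2, 18]]
```

A generic CANNTG scan would accept 16 central dinucleotides, whereas this filter accepts only six (GC, GA, TC, AA, TT, and TA), so fewer chance matches are expected per kilobase and larger genomic windows can be examined.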
We used luciferase assays to test whether these 12 putative enhancers could mediate transcriptional responses to Ngn and NeuroD. Nine of the 12 enhancer constructs (75%) were induced >3‐fold by both Ngn2 and NeuroD (Figure 4A and B). This is a major improvement in enhancer prediction compared with our prior results, where only four (25%) or three (19%) candidate enhancers of 16 were induced >3‐fold by Ngn2 and NeuroD, respectively (Supplementary Figure S5, A and B). Ngn2 induced enhancer activation to a greater degree than NeuroD, suggesting that although Ngn2 and NeuroD can regulate shared target genes through the same enhancers, Ngn activates transcription more efficiently than NeuroD at these enhancers.
We next used quantitative ChIP (qChIP) to test whether Ngn2 and/or NeuroD bound these enhancers. In P19 cells transfected with myc‐tagged Ngn2 or NeuroD, all eight functional enhancers we tested within the Ebf2, Ebf3, MTGR1, NeuroD4, and Zfp238 loci showed enrichment for Ngn2 and NeuroD, while the non‐functional HEN1 ECR1 was not enriched above background levels (1.1‐fold) (Supplementary Figure S8). Furthermore, six of the eight functional enhancers (75%) were occupied by endogenous NeuroD in telencephalon tissue from e14.5 mouse embryos (Figure 4C). Thus, Ngn2 and NeuroD directly bind these enhancers in vivo and can induce their expression.
We also tested whether these conserved Ngn‐ and NeuroD‐responsive enhancers drove endogenous expression patterns of the target genes in embryos. We generated transgenic Xenopus embryos carrying 10 enhancer constructs, which efficiently mediated Ngn2 and NeuroD induction in luciferase assays, and analyzed transgene expression by in situ hybridization for the luciferase reporter. All 10 enhancers tested partly recapitulated their endogenous expression patterns (Figure 4D–R). Expression patterns of Xenopus Dll1, Ebf3, MTGR1, and NeuroD4 are shown for comparison (Figure 4E, G, I, and K). Interestingly, all 10 enhancers drove transgene expression within the brain and eye in a very similar pattern (Figure 4D, F, H, J, L–R), while embryos carrying Ebf2 EEL3 or Dll1 EEL1 additionally expressed the transgene in the tailbud (Figure 4F and L). This suggests that our Ngn/NeuroD regulatory elements, which were selected for enrichment for Ngn/NeuroD consensus sites, might be sufficient to drive gene expression in the brain and eye, while expression in other territories requires additional information. As negative controls, we generated transgenic embryos carrying the same TATA‐luciferase vector without an introduced enhancer (Figure 4S) or the pBS vector (data not shown) and we also compared transgenic versus non‐transgenic embryos (for example, Figure 4, compare L and T); these did not show localized transgene expression. Figure 4Q shows a ‘half‐transgenic’ embryo, where the transgene integrated after the first cleavage division and thus is expressed only in one side of the embryo (Kroll and Amaya, 1996), further confirming that this in situ pattern is specific to transgene expression. Taken together, these data demonstrate that these conserved Ngn‐ and NeuroD‐responsive regulatory enhancers contain sequence information sufficient to direct restricted expression to the embryonic brain and eye in vivo.
Genome‐wide prediction of Ngn and NeuroD direct target genes
Above, we found that evolutionarily conserved non‐coding sequences with more than one Ngn/NeuroD consensus site were robustly activated by Ngn and NeuroD. To determine whether this regulatory signature could predict additional Ngn and NeuroD target genes, we used the Promoter Analysis Pipeline (PAP) program (Chang et al, 2006) and Ngn/NeuroD PWM (Figure 3G). PAP uses evolutionary conservation and enrichment for a PWM for genome‐wide prediction of coregulated genes. PAP predicted 347 potential Ngn/NeuroD target genes (cutoff, P<0.01; Supplementary Figure S9A), including many experimentally defined targets (13 of the 30 targets in Table II). To test the specificity of PAP predictions, we also analyzed two heterologous E‐boxes in TRANSFAC that are diverged from the Ngn/NeuroD PWM (E‐box, M01034 and c‐Myc/Max, M00118, PWMs in Supplementary data) and predicted 365 and 127 targets, respectively (Supplementary Figure S9, B and C). These CACGTG‐type E‐boxes did not respond to Ngn or NeuroD (Supplementary Figures S2 and S6).
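PAP's own scoring scheme is not reproduced here. As a generic illustration of how a PWM can discriminate between E‐box types, the Python sketch below computes a log‐odds score for a candidate site. The matrix values, the uniform background, and the pseudocount are assumptions chosen for the example; the actual Ngn/NeuroD PWM is the one shown in Figure 3G.

```python
import math

# Hypothetical position frequency matrix, one dict per E-box position.
# These values are invented for illustration and are not from Figure 3G.
EXAMPLE_PFM = [
    {"C": 1.0},
    {"A": 1.0},
    {"G": 0.7, "A": 0.2, "T": 0.1},
    {"A": 0.55, "C": 0.45},
    {"T": 1.0},
    {"G": 1.0},
]


def log_odds_score(site, pfm, background=0.25, pseudocount=0.01):
    """Sum over positions of log2((frequency + pseudocount) / background).

    The pseudocount avoids log(0) for bases absent from a column; a uniform
    background of 0.25 per base is assumed.
    """
    if len(site) != len(pfm):
        raise ValueError("site and matrix must have the same length")
    return sum(
        math.log2((column.get(base, 0.0) + pseudocount) / background)
        for base, column in zip(site.upper(), pfm)
    )


print(round(log_odds_score("CAGCTG", EXAMPLE_PFM), 2))
print(round(log_odds_score("CACGTG", EXAMPLE_PFM), 2))
```

With this toy matrix, CAGCTG scores above zero and CACGTG below zero, so a score threshold would separate the two site types. The separation here is a property of the invented matrix and should not be read as a measured binding preference.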
We used functional clustering to compare PAP‐predicted target genes for the Ngn/NeuroD PWM versus the heterologous E‐box PWMs (Supplementary Figure S10 and Supplementary data). For PAP‐predicted Ngn/NeuroD target genes, nervous system development was the top gene ontology (GO) term, followed by cell differentiation, tyrosine kinase signaling pathway, and development. In contrast, neither of the heterologous E‐boxes predicted nervous system development as a top GO term (Supplementary Figure S10). Ngn/NeuroD target genes related to neural development or development were also largely non‐overlapping with targets predicted for these heterologous E‐boxes (M01034 or M00118; five and zero overlapping targets, respectively). For each PWM, functional clustering and genes in top‐scoring clusters are listed in Supplementary Figure S10. Therefore, PAP predicted a distinct set of putative Ngn/NeuroD targets genome‐wide and loci involved in neural development are most frequently enriched for conserved Ngn/NeuroD consensus sites. To test whether PAP‐predicted target genes were indeed Ngn/NeuroD targets, we tested whether NeuroD induced their expression. We tested 51 PAP‐predicted targets not previously analyzed in our experimental work. Of these, 20 (39%) responded to NeuroD >2‐fold and another 10 (20%) responded >1.5‐fold (Table III). As for the experimental targets, most PAP‐predicted targets that were uninduced in P19 cells were already highly expressed and NeuroD transfection did not increase expression. Also as before, most genes that were robustly induced by NeuroD in P19 cells had highly neural‐restricted expression in vivo (Supplementary Figure S4). These results verify that the computational whole genome prediction effectively identified additional NeuroD target genes.
Identification of direct transcriptional targets of Neurogenin and NeuroD
Transcription factors are key regulators of many cellular processes, but defining how they perform their functions has been limited by a lack of knowledge of their direct target genes. Microarray technologies enable genome‐wide transcription factor target gene identification by comparing mRNA abundance between two sample types (e.g. wild‐type mice versus those with a targeted gene disruption) but this technology does not distinguish between direct and indirect targets. Recently, ChIP coupled with microarray (ChIP‐on‐chip) has also been used to define direct transcription factor targets. However, this approach does not incorporate information regarding gene expression changes, and only a fraction of genes identified in ChIP‐on‐chip show changed expression upon transcription factor introduction or removal. This suggests either that some sites represent false positives or that some target loci are misidentified, since current genome annotation is imperfect and vertebrate enhancers are frequently located many kilobases away from transcription start sites or even within other loci.
As an alternative to these approaches, we employed a modified expression profiling approach to identify direct transcription factor targets. By using hormone‐inducible forms of transcription factors in Xenopus naive ectoderm, we could acutely induce target gene expression and detect only rapid transcriptional responses occurring within a 2.5‐h window. We also induced transcription factor activity in the presence of a translational inhibitor, which avoided detection of indirect targets expressed in response to primary target activities or following cell‐fate changes. Using this approach, we successfully identified direct targets, which were validated as Ngn and NeuroD responsive at high frequency. Ngn and NeuroD shared many target genes, and most were responsive in both Xenopus embryonic tissue and mammalian cells. We then analyzed the genomic loci of these targets to define Ngn and NeuroD consensus‐binding sequences and regulatory enhancers that drove expression both in mouse P19 cells and in the Xenopus nervous system. We further used these enhancer features for genome‐wide computational screening and successfully predicted additional Ngn/NeuroD direct target genes. Since Xenopus embryonic ectoderm is multipotent and can differentiate into most ectodermal, endodermal, and mesodermal cell derivatives, we propose that this approach could be used to identify direct targets for a wide range of other transcription factors.
A Ngn‐ and NeuroD‐regulated molecular network for neurogenesis
Ngn and NeuroD induced 26 shared targets, including most of the transcription factors, suggesting that they act through a common set of transcription factors to induce neuronal differentiation. In developing embryos, Ngn1 and 2 are transiently expressed in the ventricular zone at the commitment stage, and this is followed by NeuroD expression during differentiation. Our data indicate that Ngn and NeuroD recognize very similar E‐box sequences. Thus, Ngn activity could initiate target gene expression during neuronal commitment, while NeuroD activity maintains the later expression of these genes using the same enhancers, to sustain commitment to neuronal fates. It has been a long‐standing enigma that NeuroD is only expressed after the time of neuronal commitment in vivo, yet can activate the entire neurogenesis program upon overexpression. Our finding that NeuroD directly binds enhancers in genes encoding key transcriptional regulators of neurogenesis may account for this perhaps surprising ability of NeuroD to broadly regulate neurogenesis when overexpressed.
We propose that primary transcription factor targets of Ngn and NeuroD (Ebf2, Ebf3, HEN1 (Nhlh1), Hes6, MTGR1, MyT1, NeuroD, NeuroD4 (Ath3), and potentially Znf238) represent a core transcriptional network mediating Ngn‐ and NeuroD‐regulated neurogenesis (Figure 5). Ngn initiates neuronal differentiation, and NeuroD is a key Ngn‐regulated transcriptional node. There are many reciprocal and redundant regulatory relationships between these transcription factors. For example, we found that NeuroD induces Ebf2, Ebf3, MyT1, and NeuroD4, but Ebf2 also activates Ebf3 and NeuroD (Dubois et al, 1998; Pozzoli et al, 2001) and NeuroD4 (Ath3) induces Ebf2, MyT1, and NeuroD (Perron et al, 1999). This may account for the limited defects observed in NeuroD‐null mice. In contrast, to our knowledge, reciprocal regulation of Ngn expression by its primary targets has not been observed. These data suggest that Ngn acts at the top of a regulatory cascade to initiate neurogenesis, while at least three primary Ngn transcription factor targets (Ebf2, NeuroD, and NeuroD4) then act reciprocally and potentially redundantly to generate a robust network controlling neuronal differentiation. The expression of other transcription factors, such as Ebf3, Hes6, MTGR1, and MyT1, also supports this differentiation program.
Although we focused on shared targets of Ngn and NeuroD in this study, our microarray experiment identified many distinct, non‐shared targets. We found that Ngn and NeuroD recognize very similar E‐box sequences, but their binding preferences were not identical as shown in the mutagenesis analyses in Figure 3. Slight differences in the DNA‐binding properties of Ngn and NeuroD may explain their differential induction of these targets. Alternatively, Ngn and NeuroD may interact with distinct cofactors in different cellular contexts and this may account for the induction of these non‐shared targets. Understanding the nature of genes induced only by Ngn or only by NeuroD and how these targets are differentially activated by these two transcription factors will help to elucidate how Ngn and NeuroD fulfill their distinct biological functions in vivo.
A minimal enhancer signature for activation of Ngn and NeuroD targets in neural tissue
Our approach defined direct transcriptional targets, which should contain Ngn and/or NeuroD regulatory elements. We also found many shared targets, suggesting that Ngn and NeuroD may recognize common enhancers or consensus sites in those targets. We initially focused on targets induced in both Xenopus and mouse, which may employ conserved Ngn/NeuroD regulatory elements, and we tested clustered E‐boxes around target loci for enhancer activity, since transcription factors often utilize clustered binding sites. This defined some enhancers but was inefficient since E‐boxes occur frequently by chance. We then experimentally determined that Ngn and NeuroD bind a similar consensus sequence and we used these PWMs to predict a few high‐scoring enhancers within large genomic regions encompassing and surrounding each target gene locus. These predicted enhancers frequently responded to Ngn and NeuroD in mouse cells and most were also occupied by endogenous NeuroD in embryonic brain tissue. In addition, all 10 enhancers drove reporter gene expression in the brain and eye of transgenic Xenopus embryos. Thus, enhancers containing conserved, clustered consensus sites are sufficient both to respond to Ngn and NeuroD in mammalian cells and to drive neural‐restricted expression in embryos.
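As a rough illustration of the kind of search described above, the following Python sketch scores every window of a sequence against a position weight matrix and then groups high-scoring sites that lie close together. It is not the procedure used in this study: the frequency matrix, the score threshold, the minimum number of sites, and the maximum cluster span are all hypothetical placeholders, only one strand is scanned, and the cross-species conservation filter is omitted.

```python
from math import log2

BASES = "ACGT"
BACKGROUND = 0.25  # assumed uniform background base frequency

# Hypothetical base frequencies for a 6-bp E-box-like site (illustrative only;
# these are NOT the Ngn/NeuroD matrices determined in this study).
FREQS = [
    {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01},
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
]


def pwm_score(site):
    """Log-odds score of one site (same length as the matrix)."""
    return sum(log2(FREQS[i][base] / BACKGROUND) for i, base in enumerate(site))


def find_sites(seq, threshold):
    """Return start positions of forward-strand windows scoring >= threshold."""
    seq = seq.upper()
    width = len(FREQS)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if all(base in BASES for base in window) and pwm_score(window) >= threshold:
            hits.append(start)
    return hits


def find_clusters(hits, min_sites=2, max_span=200):
    """Greedily group sorted site positions that fall within max_span bp."""
    clusters = []
    i = 0
    while i < len(hits):
        j = i
        while j + 1 < len(hits) and hits[j + 1] - hits[i] <= max_span:
            j += 1
        if j - i + 1 >= min_sites:
            clusters.append((hits[i], hits[j]))
            i = j + 1
        else:
            i += 1
    return clusters
```

In this sketch, a candidate enhancer is simply any cluster returned by `find_clusters(find_sites(sequence, threshold))`; a realistic search would also scan the reverse strand and require that the sites be conserved between species.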
We further used these enhancer features for computational genome‐wide prediction of other Ngn/NeuroD target loci and defined 347 putative direct targets, many of which were induced by NeuroD in P19 cells. Our experimental approach defined a core Ngn and NeuroD‐regulated neurogenesis program in Xenopus and mammals but was unable to define some classes of Ngn/NeuroD target genes, including those induced only in specialized neuronal contexts (e.g. the mammalian neocortex) or whose induction required a cofactor absent in Xenopus neural ectoderm. As described previously, we potentially also missed detecting some NeuroD targets induced only during later neuronal differentiation or Ngn/NeuroD targets whose induction occurred in a temporally delayed manner, for example through a feed‐forward mechanism as is used during myogenic bHLH activation of target genes (Tapscott, 2005). Interestingly, our computational approach identified 6 of 26 genes previously identified as misregulated in Ngn2 or Ngn1/2 mutant mouse neocortices (AKT3, Dcc, Elavl4, Negr1, NeuroD2, and Robo1), as well as the previously defined Ngn2 direct target Dcx (Mattar et al, 2004; Schuurmans et al, 2004; Ge et al, 2006). Our data suggest that these genes are direct Ngn/NeuroD targets and that our computational approach defined additional target genes missed in our microarray experiments.
In our search for the over‐represented sequence motifs within 14 Ngn and NeuroD target genes using CompareProspector, the top‐ranking motif was GATTTGCA (Figure 3G), which resembles a consensus‐binding sequence for class 2 POU transcription factors (e.g. Oct‐1 and Oct‐2). Indeed, many of our enhancers also contain this motif in close proximity to Ngn/NeuroD E‐boxes (Supplementary Figure S2A). This is reminiscent of the recent finding that Mash1 synergizes with POU class 3 proteins by direct binding of Mash/Brn proteins to adjacent cis‐sequences in target genes (Castro et al, 2006). We found that disrupting POU sites did not affect Ngn and NeuroD responsiveness of our enhancers in P19 cells (data not shown). This may indicate differences in Ngn and NeuroD versus Mash1 target gene regulation. Alternatively, POU sites may contribute to Ngn/NeuroD‐mediated target activation in specific tissues in vivo, although they are not required for enhancer activation in our cell‐based assay. Further study is required to clarify this.
We propose that clustered, conserved consensus sites constitute a ‘minimal neural enhancer signature’ through which Ngn and NeuroD selectively activate target genes in neural tissue. Enhancers with these features robustly responded to Ngn and NeuroD in mammalian cells and drove expression in the brain and eye of transgenic Xenopus embryos. Furthermore, neural development‐related genes throughout the genome were preferentially enriched for this enhancer signature and many were indeed regulated by NeuroD. In vivo, this enhancer signature may allow Ngn and NeuroD to activate their target genes specifically in neural tissue, so that these targets can mediate Ngn and NeuroD's general neuronal commitment and differentiation promoting activities.
While this enhancer signature is sufficient for Ngn‐ and NeuroD‐mediated gene expression in neural tissues, it is insufficient to drive expression in some other endogenous expression domains. For example, in transgenic Xenopus embryos, our 10 Ngn/NeuroD‐responsive enhancers all drove gene expression in the brain and eye, where corresponding genes are commonly expressed, but could not drive expression in territories like the cranial placodes where the endogenous genes show differential expression. Thus, these regulatory elements, which were selected for enrichment for Ngn/NeuroD consensus sites, may lack other transcription factor‐binding sites needed to drive expression in these tissues.
Our data also suggest that non‐neural bHLH target genes do not contain the same regulatory signature. For example, neither our experimental nor our computational approach predicted NeuroD's targets in the pancreas (insulin (INS), PDX‐1/IPF1, secretin, glucokinase (GCK), IGRP/G6PC2, Pax6, and SUR1; Chae et al, 2004 and references therein; Naya et al, 1995; Mutoh et al, 1997; Sharma et al, 1997). Therefore, NeuroD appears to use different enhancer signatures to induce targets in non‐neural tissues. Likewise, there was little overlap between our computationally predicted NeuroD targets and previously defined targets of the myogenic bHLH transcription factors, MyoD and Myogenin, although these have a somewhat similar E‐box core nucleotide preference (Bergstrom et al, 2002; Cao et al, 2006).
In summary, our data indicate that clustered, conserved consensus sites represent a minimal neural enhancer signature that is used for Ngn and NeuroD recognition of neural development‐related target genes. Ngn and NeuroD bind these enhancers to activate targets that execute core programs regulating neurogenesis.
Materials and methods
Capped RNA preparation
Xenopus Ngnr1, NeuroD and GR‐fusion variants were previously described (Lee et al, 1995; Ma et al, 1996; Perron et al, 1999; Pozzoli et al, 2001). Capped RNAs were made by in vitro transcription with the mMessage mMachine SP6 kit (Ambion) and these templates: CS2MT‐XNeuroD, CS2MT‐XNeuroD‐GR, CS2MT‐XNgnr1‐GR, and CS2NLS‐βGal.
Xenopus embryos were obtained by in vitro fertilization and raised as described (Seo et al, 2005). To minimize nonspecific target gene induction, we predetermined minimal doses of Ngnr1 and NeuroD RNAs that moderately induced neurogenesis but did not overtly affect gastrulation or morphology of embryos. Both blastomeres of two‐cell‐stage pigmented embryos were injected with X‐Ngnr‐GR (10 pg), NeuroD‐GR (30 pg), or β‐galactosidase (50 pg) RNAs and raised until stages 8–9. Animal caps (50–60 per sample) were isolated and raised in 0.7 × MMR at 25°C. When sibling embryos were at stages 10–10.5, explants were pretreated with cycloheximide (final concentration 10 μg/ml) for 30 min and DEX (final concentration 10 μM) was then added. Explants were incubated at 25°C for 2.5 h and frozen in liquid nitrogen (when sibling embryos were at stage 12.0–12.5). Total RNA (20 μg per sample), prepared with Trizol (Invitrogen), was used for probe synthesis and hybridization to Affymetrix Xenopus laevis Genome Arrays (Washington University Genechip facility). Results were analyzed with dChip software (http://biosun1.harvard.edu/complab/dchip/). See Supplementary data for details.
Xenopus explants and total RNAs were obtained as above. P19 cells were cultured as described (Seo et al, 2005) and transfected with 400 ng of US2MT‐mNgn2 or US2MT‐mNeuroD and 2.1 μg of US2MT in six‐well plates with FuGene6 (Roche). In negative controls, cells were transfected with 2.5 μg of US2MT. At 44–48 h after transfection, total RNA was extracted with Trizol (Invitrogen) and 1 μg per 20 μl reaction was used for cDNA synthesis with oligo(dT) primers (Invitrogen) and SuperScript II reverse transcriptase. qRT–PCR was performed with the MyiQ real‐time PCR Detection System (Bio‐Rad) and iQ SYBR Green Supermix (Bio‐Rad) or Platinum SYBR Green Supermix (Invitrogen). Primer sequences are in Supplementary Figure S1. Relative gene expression was calculated following normalization with EF1α for Xenopus or RPL19 for P19 cells. PCR was carried out in triplicate and entire experiments were repeated three times with independently prepared RNA samples. Tables I, II and III show the average of three experiments. For each primer pair, the melt curve was analyzed and the PCR product was examined on a 2% agarose gel to ensure that a single fragment of the predicted molecular weight was amplified.
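The text above does not state the formula used to compute relative expression. As one common possibility, the Python sketch below applies the comparative threshold-cycle (2^-ΔΔCt) calculation, which assumes that each PCR cycle doubles the product for both the target and the reference gene. The function name, argument names, and Ct values are hypothetical and serve only to show the arithmetic of normalizing to a reference gene such as EF1α or RPL19.

```python
from statistics import mean


def relative_expression(target_ct, ref_ct, control_target_ct, control_ref_ct):
    """Fold change of a target gene versus control by the 2^-ddCt method.

    Each argument is a list of replicate Ct values (e.g. a PCR triplicate).
    Assumes ~100% amplification efficiency for target and reference genes.
    """
    # Normalize the target to the reference gene within each condition.
    delta_sample = mean(target_ct) - mean(ref_ct)
    delta_control = mean(control_target_ct) - mean(control_ref_ct)
    # Compare the normalized values between conditions.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values for one target gene, for illustration only.
fold = relative_expression(
    target_ct=[22.0, 22.0, 22.0],          # target gene, bHLH-transfected cells
    ref_ct=[18.0, 18.0, 18.0],             # reference gene, same sample
    control_target_ct=[25.0, 25.0, 25.0],  # target gene, vector-only control
    control_ref_ct=[18.0, 18.0, 18.0],     # reference gene, control sample
)
```

With these made-up values the target crosses threshold three cycles earlier than in the control after normalization, which corresponds to an eightfold induction.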
Microinjection and whole‐mount in situ hybridization
To examine target gene induction by NeuroD, one blastomere of two‐cell‐stage albino embryos was injected with 30 pg each of NeuroD and β‐galactosidase RNAs. Embryos were grown to stages 16–17 (Nieuwkoop and Faber, 1967), fixed, X‐Gal stained, and in situ hybridized as described (Seo et al, 2005). cDNA clones indicated in Figure 1 were purchased from OpenBiosystems or ATCC.
Cloning of target promoter/enhancer regions, luciferase constructs, and mutagenesis
Enhancers were amplified with KlenTaqLA (DNA Pol. Tech. Inc.) and mouse genomic DNA as a template and inserted into the E1X3‐TATA luciferase reporter plasmid (Huang et al, 2000) after removing its E‐boxes. Site‐directed mutagenesis used the QuikChange Site‐Directed Mutagenesis protocol (Stratagene) and Pfu Turbo polymerase (Stratagene). E‐box sequences (CANNTG) were changed to CTCGAG (XhoI site) and mutants were screened by XhoI digestion and sequence confirmed. For primer sequences and ECR/EEL sequences, see Supplementary Figures S1 and S2. See text and Supplementary data for further information.
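The E‐box substitution described above can be mimicked in silico when designing or checking mutant constructs. The Python sketch below locates CANNTG motifs in a sequence and replaces a chosen one with the XhoI site CTCGAG. The function names and the example sequence are hypothetical, and the regular expression reports non‐overlapping matches only, so overlapping E‐boxes would need separate handling.

```python
import re

EBOX = re.compile(r"CA[ACGT]{2}TG")  # E-box consensus CANNTG
XHOI_SITE = "CTCGAG"                 # replacement sequence (XhoI site)


def ebox_positions(seq):
    """Return 0-based start positions of non-overlapping E-boxes."""
    return [match.start() for match in EBOX.finditer(seq.upper())]


def mutate_ebox(seq, pos):
    """Replace the E-box starting at pos with the XhoI site."""
    seq = seq.upper()
    if not EBOX.fullmatch(seq[pos:pos + 6]):
        raise ValueError(f"no E-box at position {pos}")
    return seq[:pos] + XHOI_SITE + seq[pos + 6:]


# Hypothetical enhancer fragment containing two E-boxes.
fragment = "AACAGCTGTTCATATGGG"
sites = ebox_positions(fragment)
mutant = mutate_ebox(fragment, sites[0])
```

Because the substitution creates a new XhoI site, the presence of CTCGAG in the mutant sequence corresponds to the restriction digest used to screen clones.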
P19 cells were transfected in 12‐well plates with 0.5 μg of indicated luciferase reporters and 50 ng of CS2‐βGal with or without 250 ng of US2MT‐mNgn2 or US2MT‐mNeuroD. pUS2MT plasmid was added as needed to adjust total DNA to 1.2 μg. After 44–48 h of incubation, lysates were analyzed with the luciferase and β‐galactosidase enzyme assay systems (Promega) according to the manufacturer's instructions. Samples were assayed in duplicate and experiments were repeated three times.
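The text above does not spell out how the reporter readings were combined. A conventional treatment, sketched below in Python under that assumption, divides each luciferase reading by the β‐galactosidase reading from the same well to correct for transfection efficiency, averages the duplicates, and expresses activity as fold induction over the reporter transfected without Ngn2 or NeuroD. All function names and numbers are hypothetical.

```python
from statistics import mean


def normalized_activity(luciferase, beta_gal):
    """Mean luciferase/beta-gal ratio across replicate wells."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    return mean(luc / gal for luc, gal in zip(luciferase, beta_gal))


def fold_induction(with_factor, reporter_alone):
    """Ratio of normalized activity with versus without the bHLH factor.

    Each argument is a (luciferase_readings, beta_gal_readings) pair.
    """
    return normalized_activity(*with_factor) / normalized_activity(*reporter_alone)


# Hypothetical duplicate readings, for illustration only.
induction = fold_induction(
    with_factor=([1000.0, 1200.0], [10.0, 12.0]),   # reporter + bHLH factor
    reporter_alone=([200.0, 200.0], [10.0, 10.0]),  # reporter + empty vector
)
```

Here both wells with the factor give a normalized ratio of 100 and the control wells give 20, so the sketch reports a fivefold induction.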
Telencephalon tissue was dissected from e14.5 mouse embryos and 50–60 mg of material was immunoprecipitated with 5 μg of NeuroD (Santa Cruz (sc)‐1084) or isotype‐matched IgG antibodies, after standard protocols (Upstate; details in Supplementary data).
Transgenic Xenopus embryos
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).
We thank Drs Tomas Pieler, Timothy Grammer, and David Turner for kindly providing plasmids (IS‐EF1α‐eGFP, IS‐pBS, and US2) and Jim Skeath and David Gottlieb for their thoughtful comments on the manuscript. We are also grateful to Dominique Thomson for his help in qRT–PCR and cell culture. LWC was supported by NIH grants HG00249 and GM63340 to GD Stormo. This work was supported by grants from the NIH (GM66815‐01), the American Cancer Society (RSG‐06‐148‐01‐DDC), and the March of Dimes (#1‐FY06‐374) to KLK.
Copyright © 2007 European Molecular Biology Organization