A κB sequence code for pathway‐specific innate immune responses

Matthew S Busse, Christopher P Arnold, Par Towb, James Katrivesis, Steven A Wasserman

Author Affiliations

All authors: Section of Cell and Developmental Biology, University of California at San Diego, La Jolla, CA, USA

*Corresponding author: Steven A Wasserman, Section of Cell and Developmental Biology, University of California, San Diego, Bonner Hall Rm 4402, MC 0349, 9500 Gilman Drive, La Jolla, CA 92093‐0349, USA. Tel.: +1 858 822 2408; Fax: +1 858 822 3201; E-mail: stevenw{at}


The Toll and Imd pathways induce humoral innate immune responses in Drosophila by activating NF‐κB proteins that bind κB target sites. Here, we delineate a κB site sequence code that directs pathway‐specific expression of innate immune loci. Using bioinformatic analysis of expression and sequence data, we identify shared properties of Imd‐ and Toll‐specific response elements. Employing synthetic κB sites in luciferase reporter and in vitro binding assays, we demonstrate that the length of the (G)n element in the 5′ half‐site and of the central (A,T)‐rich region combine to specify responsiveness to one or both pathways. We also show that multiple sites function to enhance the response to either or both pathways. Together, these studies elucidate the mechanism by which κB motifs direct binding by particular Drosophila NF‐κB family members and thereby induce specialized innate immune repertoires.


In vertebrates and invertebrates alike, innate immune systems mediate pathogen recognition, direct expression of defensive molecules, and trigger secondary responses. The signaling pathways of innate immunity are broadly conserved, utilizing as primary effectors members of the NF‐κB family of transcription factors, for example, vertebrate p50 and p65 (Ghosh et al, 1998; Karin and Ben‐Neriah, 2000; Brennan and Anderson, 2004; Xiao and Ghosh, 2005). In the absence of a stimulus, NF‐κB proteins are retained in the cytoplasm through direct interaction with IκB‐family inhibitors. Signals from receptors in the Toll, Interleukin‐1, and TNF receptor families trigger proteolytic events that allow NF‐κB factors to enter the nucleus and direct expression of antimicrobial agents and secondary signals.

NF‐κB proteins function as sequence‐specific DNA‐binding proteins. Their target sequences, termed κB sites for the context in which they were first described, have a consensus in mammals of GGG G/A NN T/C T/C CC (Baeuerle, 1991). NF‐κB proteins bind as dimers to κB sites, with each half‐site containing a cluster of guanines (5′ end) or cytosines (3′ end).
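As a minimal illustration, the mammalian consensus quoted above (GGG G/A NN T/C T/C CC) can be written as a regular expression and matched on both DNA strands. The function names and the test sequences are hypothetical and are not taken from the study; this is a sketch of consensus matching in general, not a method used by the authors.

```python
import re

# Mammalian kB consensus from the text: GGG G/A NN T/C T/C CC.
# The lookahead lets overlapping matches be reported.
KB_CONSENSUS = re.compile(r"(?=(GGG[GA][ACGT]{2}[CT]{2}CC))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_kb_sites(seq):
    """Return sorted (forward_start, strand, site) tuples for consensus matches."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in KB_CONSENSUS.finditer(seq)]
    rc = reverse_complement(seq)
    for m in KB_CONSENSUS.finditer(rc):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)
```

A palindromic site such as GGGAATTCCC is reported once per strand, which is worth keeping in mind when counting sites.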

In Drosophila melanogaster, three NF‐κB family members govern antimicrobial responses under the control of the Toll and Imd signaling cascades. Two of the NF‐κB proteins, Dorsal and DIF, each interact with Cactus, an IκB‐related inhibitor. Degradation of Cactus in response to signaling by the Toll receptor releases Dorsal or DIF for nuclear translocation and interaction with target gene regulatory sites. The third family member in flies is Relish. Like mammalian p105, Relish is a composite protein, containing both an NF‐κB‐like domain and an IκB‐like inhibitory domain (Dushay et al, 1996; Stoven et al, 2000). Activation of the Imd pathway triggers endoproteolytic cleavage of Relish. Relish processing, carried out at least in part by the caspase Dredd, releases an active NF‐κB‐like fragment (Stoven et al, 2003).

Acting through Relish, Dif, and Dorsal, flies respond to infection by expressing a plethora of defense molecules, including antimicrobial peptides and pathogen recognition proteins. For the most part, the Toll pathway mediates the Drosophila response to Gram (+) bacteria and fungi, whereas the Imd pathway activates defenses against Gram (−) bacteria. Flies mutant for both pathways fail to induce any of the known antimicrobial peptides (Tzou et al, 2002). To effect responses specific to pathogen class, the two pathways each direct expression of a set of immune response loci; some loci are pathway‐specific, whereas others can be induced by either the Toll or Imd pathway (Lemaitre et al, 1997; De Gregorio et al, 2001, 2002; Irving et al, 2001; Hedengren‐Olcott et al, 2004).

It has been thought that variation in the affinity of distinct NF‐κB dimers for particular κB sequences might underlie differential transcription of innate immune loci in insects and mammals. Consistent with this hypothesis, studies in vitro have revealed small but significant differences in binding specificity among vertebrate NF‐κB proteins (Kunsch et al, 1992). Furthermore, the IKKβ‐independent pathway for NF‐κB activation in mammals acts via target sites that are distinct from canonical κB sites (Bonizzi et al, 2004). However, studies of the classical pathway in mammals have suggested that κB site sequence does not direct binding by particular NF‐κB proteins (Leung et al, 2004). Similarly, in Drosophila, NF‐κB protein‐binding specificity in vitro has not for the most part been found to correlate with in vivo requirements for gene activation (Gross et al, 1996; Han and Ip, 1999; Senger et al, 2004). Instead, there is evidence in the mammalian system that variations in κB sequence determine which combination of NF‐κB protein and coactivator can productively interact at a particular site (Leung et al, 2004).

We report here a molecular genetic, bioinformatic, and biochemical dissection of the κB code for innate immunity in Drosophila. These experiments reveal an unanticipated interplay between κB motif elements in determining the specificity of response and delineate the logic linking signaling pathways, transcription factors, binding sites, and effector loci.


A transfected attacin A construct faithfully reports Toll and Imd signaling

To explore how pathway responsiveness is encoded at the DNA level, we used a molecular genetic analysis of a single locus as the foundation for a global bioinformatic approach. Attacin A (AttA), an antimicrobial peptide gene, is responsive to both Toll and Imd signaling (Asling et al, 1995; Lemaitre et al, 1997; De Gregorio et al, 2002; Hedengren‐Olcott et al, 2004). Using cultured Drosophila Schneider (S2*) cells, which express all three fly NF‐κB proteins and mediate robust Toll and Imd responses (Samakovlis et al, 1990; Han and Ip, 1999), we assayed transcription of an AttA reporter construct (Figure 1A) upon innate immune signaling.

Figure 1.

An AttA reporter system recapitulates endogenous innate immune responses. (A) AttA reporter construct. The AttA genomic fragment extending from the 3′ UTR of the drosocin gene (white box) to the AttA translational start site was fused to the luciferase gene (bar). The locations of four potential κB motifs (Dushay et al, 2000; Senger et al, 2004), numbered from proximal to distal relative to the transcriptional start site, are indicated. Arrows designate the orientation of each κB motif, with the forward sequence shown below. (B) RNAi against Toll and Imd pathway components, followed by EGF or B. subtilis PG stimulation. EGFR‐Toll cells were treated with the indicated dsRNA and transfected with the AttA reporter. For comparison, cells were incubated without dsRNA (none) or treated with dsRNA for easter (control), which acts upstream of Toll in embryonic patterning. Values were normalized to the induction measured without dsRNA and are each the average of at least four independent experiments. Capped lines indicate standard deviation. (C) Role of fly NF‐κB proteins in AttA regulation. EGFR‐Toll cells were treated singly or in combination with dsRNA for Dif, dorsal, or Relish; transfected with the AttA reporter; and subjected to Toll stimulation (EGF) or Imd stimulation (PG). Controls and analysis were as in (B). A full‐color version of this figure is available at the EMBO Journal online.

To stimulate the Toll pathway, we applied epidermal growth factor (EGF) to cells expressing EGFR‐Toll, a chimera fusing the extracellular and transmembrane domains of the human EGF receptor to the intracellular domain of Toll (Sun et al, 2004). To specifically induce the Imd pathway, we used a peptidoglycan (PG) preparation from Bacillus subtilis (Leulier et al, 2003; Stenbak et al, 2004). The AttA reporter construct exhibited a robust response to either inducer, with 5‐ to 11‐fold activation on EGF treatment and a 12‐ to 28‐fold increase on exposure to PG.

To further validate the cultured cell system, we used RNA interference (RNAi) to inactivate innate immune response loci in the Toll or Imd signaling cascades. RNAi against genes in the Toll pathway—MyD88, tube, or pelle—specifically blocked induction by EGF, but not PG (Figure 1B). Similarly, inactivation of Imd pathway components—imd, Tak1, or key (IKKγ)—eliminated induction by PG, but not EGF. AttA induction by either innate immune pathway was strictly dependent on endogenous NF‐κB factors (Figure 1C). For the Toll response, Dif and Dorsal had overlapping function, with either factor alone being sufficient for some signaling. For the Imd response, Relish alone was necessary and sufficient. The S2* cell system thus effectively recapitulates endogenous regulation of the AttA gene.

Promoter proximal κB sites govern AttA induction by the Toll and Imd pathways

The transcriptional start site for AttA lies approximately 650 bp downstream of the 3′ end of the neighboring transcription unit (Figure 1A). Within the intergenic region lie four potential κB sites (Dushay et al, 2000; Senger et al, 2004). As is characteristic of prototypical κB motifs, the 5′ half‐site in each case contains either GGG or GGGG.

Using site‐directed mutagenesis to inactivate or reposition the κB‐related motifs, we assayed motif function in directing expression of the AttA reporter gene. Inactivation of pairs of sites revealed that only the κB motifs at positions −46 and −118 were necessary for activation by either Imd or Toll (Figure 2A). Furthermore, each of these two sites, κB1 and κB2, preferentially mediates signaling by one innate immune pathway (Figure 2B). Constructs containing one or more copies of κB1, but not κB2, responded more strongly to Imd than to Toll. Conversely, the presence of κB2, but not κB1, directed a stronger response to Toll than to Imd.

Figure 2.

κB sites determine pathway‐specific transcriptional responses. (A) Proximal κB sites govern AttA induction. Mutational inactivation of κB motifs was achieved by converting both the second and third G residues in the 5′ core element to C residues. AttA reporter constructs in which pairs of κB motifs were inactivated (−) were transfected into EGFR‐Toll cells. Following Toll or Imd pathway stimulation, the induction of the reporter construct was measured and normalized to the wild‐type level. Data are the average of at least six independent transfections. (B) Effects of κB motif sequence, context, and number on Toll and Imd induction of AttA. Site‐directed mutagenesis was used to inactivate (−) or replace κB sites. All values are normalized to the wild‐type construct (top row). (C) Imd and Toll responsiveness of consensus motifs identified by MEME. The −118 site was inactivated in the AttA reporter and the −46 site was replaced with a synthetic site. Normalization was as in (B). A full‐color version of this figure is available at the EMBO Journal online.
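The inactivating substitution described in panel (A), conversion of the second and third G residues of the 5′ core element to C, can be sketched as a simple string operation. The function name is hypothetical, and the sketch assumes the site is written 5′ to 3′ with the G cluster at the first position of the string.

```python
def inactivate_kb_site(site):
    """Replace the 2nd and 3rd G of the 5' G cluster with C residues.

    Assumes `site` is written 5'->3' and begins with the GGG core element.
    """
    site = site.upper()
    if not site.startswith("GGG"):
        raise ValueError("expected a site beginning with a GGG core element")
    return site[0] + "CC" + site[3:]
```

For example, the Toll consensus GGGAAAACCC would become GCCAAAACCC under this rule.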

The sequence of a single κB site can thus determine the signaling pathway to which a Drosophila innate immune gene is preferentially responsive.

Bioinformatic analysis reveals that Toll‐ and Imd‐responsive loci differ in κB site structure and sequence

To establish whether there is a general κB sequence code for innate immunity, we carried out a bioinformatics analysis based on published microarray data sets for innate immune responses in wild‐type and mutant strains of Drosophila (De Gregorio et al, 2001, 2002). To identify loci responsive to Toll but not Imd, we screened for strong induction by fungal infection and by a constitutively active Toll receptor, as well as a dependence on a functional Toll pathway for induction by bacterial infection. Similarly, we identified genes specifically responsive to Imd as those that responded robustly to bacterial infection only in the presence of a functional Imd pathway, but were not appreciably induced by fungal infection. Using quantitative expressions of these criteria to define screening algorithms (see Materials and methods), we identified 16 Toll‐responsive genes and 11 Imd‐responsive loci. We noted good agreement of these two gene sets with a classification based on clustering of temporal expression patterns (Boutros et al, 2002), eleven of the Toll loci and nine of the Imd loci being common to both analyses.
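The screening logic described above can be sketched as a pair of boolean filters over fold-induction values. The actual quantitative criteria are given in Materials and methods and are not reproduced here, so the cutoffs `STRONG` and `WEAK`, the dictionary keys, and the function name are all hypothetical placeholders chosen only to illustrate the structure of the screen.

```python
STRONG = 3.0  # hypothetical cutoff for "strong" or "robust" induction
WEAK = 1.5    # hypothetical cutoff for "not appreciably induced"


def classify_locus(fold):
    """Return "Toll", "Imd", or None for a dict of fold-induction values.

    Expected keys (all hypothetical names):
      fungal, toll_active, bacterial_wt,
      bacterial_toll_mutant, bacterial_imd_mutant
    """
    # Toll-specific: induced by fungal infection and by constitutively
    # active Toll, and bacterial induction is lost without a Toll pathway.
    toll = (
        fold["fungal"] >= STRONG
        and fold["toll_active"] >= STRONG
        and fold["bacterial_toll_mutant"] < WEAK
    )
    # Imd-specific: bacterial induction requires a functional Imd pathway,
    # and fungal infection does not appreciably induce the gene.
    imd = (
        fold["bacterial_wt"] >= STRONG
        and fold["bacterial_imd_mutant"] < WEAK
        and fold["fungal"] < WEAK
    )
    if toll:
        return "Toll"
    if imd:
        return "Imd"
    return None
```

Loci that satisfy neither filter, including dual-responsive genes, fall outside both sets, which matches the intent of restricting the analysis to pathway-specific loci.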

To narrow the focus of our analysis, we postulated that the κB sites relevant to innate immune induction would typically lie within 200 bp upstream of the transcriptional start site, as for AttA and as reported for rapid response genes from both flies and mammals (Engstrom et al, 1993; Kappler et al, 1993; Whitley et al, 1994; Thanos and Maniatis, 1995). Extracting the corresponding genomic sequence for each locus (see Supplementary Figure 1), we used the MEME motif discovery program (Bailey and Elkan, 1994) to search for overrepresented sequence motifs.

The results of the bioinformatic analysis were striking. For both the Toll‐ and Imd‐responsive gene sets, MEME analysis identified a κB‐type sequence as the highest scoring motif (Table I). The Toll and Imd gene sets were decidedly different, however, with regard to κB site composition and number.

Table 1. Pathway‐specific properties of innate immune κB motifs

Whereas nearly two‐thirds (62%) of κB motifs in the Imd set had a GGGGA 5′ half‐site, such a half‐site was absent from the κB motifs in the Toll set. The Toll κB motifs instead typically had either GGGA (12 examples) or GGAA (3 examples) as the 5′ half‐site (Table II). Moreover, the difference in half‐site sequence was not an accident of motif definition during MEME analysis. A scan of the entire 3.2 kb sequence space comprising the upstream regions for the Toll gene set detected only a single example of GGGGA or its reverse complement, TCCCC; a parallel scan of the smaller (2.2 kb) sequence space for the Imd upstream regions detected 15 such instances.
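A scan of this kind amounts to counting a word and its reverse complement across a set of upstream sequences. The sketch below shows one way to do that; the function name and input format are assumptions, and the toy sequences in the usage note are not real upstream regions.

```python
def count_half_site(sequences, half_site="GGGGA"):
    """Count occurrences of `half_site` or its reverse complement.

    `sequences` is an iterable of A/C/G/T strings; overlapping hits count.
    """
    complement = str.maketrans("ACGT", "TGCA")
    half_site = half_site.upper()
    rev_comp = half_site.translate(complement)[::-1]
    total = 0
    for seq in sequences:
        seq = seq.upper()
        # A set avoids double counting when the word is its own reverse complement.
        for word in {half_site, rev_comp}:
            start = seq.find(word)
            while start != -1:
                total += 1
                start = seq.find(word, start + 1)
    return total
```

Calling `count_half_site(["AAGGGGATT", "TCCCCGG"])` counts one forward hit and one reverse-complement hit.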

Table 2. Innate immune κB motifs

Divergence between the potential κB sites in the Toll and Imd gene sets extended throughout the motifs. Representative Toll‐responsive κB motifs had four or five bases between the G cluster and the first C residue, for example, GGGAAAACCC. Conversely, those Imd elements containing strings of G's and C's typically had a two or three base separation, for example, GGGGATTCCT. Statistically, these differences were marked. Overall, a 4–5 bp (A,T)‐rich region separated G's and C's in 94% of the Toll motifs, but only 14% of the Imd motifs. Similarly, we found a 2–3 bp (A,T)‐rich region separating G's and C's in 52% of the Imd motifs, but only one of the Toll motifs.
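The separation between the G cluster and the first C residue can be measured directly from a motif string. The following sketch is one hypothetical way to do so; it counts every base in the gap and does not itself check that the gap is (A,T)-rich.

```python
def spacer_length(motif):
    """Number of bases between the 5' G cluster and the first C residue.

    Returns None when the motif lacks a leading G cluster or a downstream C,
    as for noncanonical sites such as GGGGATTTTT.
    """
    motif = motif.upper()
    g_run = len(motif) - len(motif.lstrip("G"))
    first_c = motif.find("C", g_run)
    if g_run == 0 or first_c == -1:
        return None
    return first_c - g_run
```

Applied to the examples in the text, GGGAAAACCC gives a spacer of 4 and GGGGATTCCT gives a spacer of 3.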

In approximately half of the Imd motifs, the 3′ half‐site diverged significantly from a canonical κB motif. In place of two or more C residues, the 3′ half‐site of these motifs consisted largely or entirely of a string of T residues, for example, GGGGATTTTT. Studies in mammalian systems have demonstrated that there are motifs of this type, that is, having only a single cognate κB half‐site, that can nevertheless bind specifically to Rel proteins in vitro and exhibit cis‐regulatory activity in vivo (see, e.g., Whitley et al, 1994).

The Toll‐ and Imd‐responsive gene sets differed not only in κB site sequence, but also κB site number. Of the 14 Toll genes for which κB sites were detected, 12 had only a single presumptive κB site and none had more than two sites. In contrast, eight of the nine Imd genes with predicted κB sites had two or more sites.

A κB sequence code governs fly innate immunity

To determine whether the observed differences between the κB sites in the Toll and Imd gene sets correspond to a cis‐regulatory code, we returned to the cultured cell system. We inactivated the −118 site in the AttA reporter construct, introduced synthetic versions of the κB consensus motifs at the −46 position, and assayed responses to innate immune signaling.

The consensus κB motifs defined by the MEME analysis mediated pathway‐specific transcriptional responses in the context of the AttA reporter (Figure 2C). In response to Toll signaling, the Toll consensus sequence, GGGAAAACCC, directed reporter gene expression many times that seen for the wild‐type gene. In contrast, the response to Imd signaling was well below that of the wild type. The two Imd consensus sites—GGGGATCCCC and GGGGATTTTT—also discriminated between pathways (Figure 2C). On Imd induction, both sites provided a significant increase in reporter gene expression, as substantial as that seen for any single site introduced into AttA (Figure 2B and unpublished results). On Toll induction, the increase in reporter expression with either Imd consensus site was an order of magnitude less than with the Toll consensus site.

The κB sequence code functions by specifying NF‐κB protein binding

There are at least two distinct mechanisms by which a κB sequence code could dictate specific transcriptional responses. First, NF‐κB proteins activated by different pathways could vary significantly in their affinity for particular κB motifs. Available data implicate this mechanism in the IKKβ‐independent NF‐κB response in humans (Bonizzi et al, 2004). Second, a single κB sequence could bind a range of NF‐κB proteins indiscriminately, but motif sequence could determine which coactivators interact with bound NF‐κB proteins. Such a model can explain the regulation of several mammalian genes by pairs of κB motifs (Leung et al, 2004).

In exploring which mechanism is operative in flies, we used a gel shift assay to determine whether or not NF‐κB proteins differ in their target site‐binding preference. We expressed Relish and DIF from pGEX vectors in bacteria, purified the GST fusion proteins by affinity chromatography, and carried out binding assays with labeled oligonucleotides.

The gel shift assays demonstrated that the Relish and Dif GST fusion proteins bind synthetic κB motifs in vitro with a specificity identical to that observed in vivo for signaling by Imd and Toll, respectively (Figure 3A). Using GST–Relish, we observed a strong gel shift signal for the Imd‐specific sites GGGGATCCCC and GGGGATTCCC. Relish also bound well to the noncanonical Imd site, GGGGATTTTT, but not to the Toll consensus sequence, GGGAAAACCC. Results with GST–DIF were the inverse: strong binding to the Toll consensus site, but not the Imd consensus motifs.

Figure 3.

Dif and Relish exhibit selective binding to synthetic and endogenous κB sites. GST‐tagged Relish (R) or DIF (D) was incubated with labeled oligonucleotides and the resulting nucleoprotein complexes resolved by native polyacrylamide electrophoresis. (A) Synthetic κB sites. (B) Endogenous κB sites.

The correlation between interaction seen in the gel shift assay and pathway specificity in vivo held true not only for synthetic κB sites, but also for κB sites found upstream of innate immune loci (Figure 3B). For example, GST–Relish gave a strong gel shift signal with the κB site from an Imd‐specific locus, diptericin, but had no detectable interaction with the κB site from a Toll‐specific gene, IM1. Furthermore, GST–Relish appeared to interact to a much greater extent with the −46 motif from AttA than with the −118 motif, consistent with the specificity observed in cells. GST–DIF exhibited complementary specificity, binding to the greatest extent with the IM1 motif and the −118 motif from AttA.

Given the consistent relationship between pathway responsiveness and Rel protein‐binding specificity, we extended our gel shift analysis to all 38 of the sites identified by the MEME program. Thirty‐four of the 38 sites bound to either DIF or Relish (Table II). Of the 16 κB sites identified in the Toll gene set, 15 had significant binding to GST–DIF and none bound to GST–Relish. The results with the Imd motifs were likewise very clear. Of the 22 potential κB sites in the Imd gene set, 19 bound well to Relish and none bound to DIF.

DIF and Relish tolerate κB site variation

Together, the bioinformatic, cell culture, and gel shift analyses revealed that the differential response to the Toll and Imd pathways of Drosophila is governed by a κB sequence code that directs binding of DIF and Relish. We noted, however, variation in both the sequence and length of the active κB sites bound by DIF and Relish. Work in mammalian systems had demonstrated that NF‐κB proteins interact with a variety of related target sites via alternative patterns of hydrogen bonding mediated by flexible polypeptide loops (Chen et al, 2000). To determine the extent to which DIF and Relish accommodate κB sequence variation, we extended our DNA binding and cell culture studies of synthetic κB motifs.

Illustrative examples from our parallel studies of κB site activity in vivo and in vitro are presented in Figure 3A and Table III. The preferred half‐site for Toll responsiveness and DIF binding was 5′GGGAA, with an additional G at the 5′ end being well tolerated. The preferred 5′ half‐site for Imd responsiveness and Relish binding was GGGGA, with a GGGAA tolerated in palindromic sites. Both pathways tolerate some variation in motif length (Table III, compare rows 1 and 2 and rows 3 and 4). For example, the sites that bind Dif and are most responsive to Toll can have either one or no central residue separating half‐sites based on GGGAA.

Table 3. Assays with synthetic sites delineate the κB sequence code

The κB sequence code can specify dual responsiveness

Expression studies have shown that many innate immune loci are dual‐responsive, that is, activated by either Toll or Imd signaling. The cell culture studies revealed that dual responsiveness can reside in a single κB site. As shown by the examples in Table III (rows 6 and 7), some κB sequences bind strongly to both Relish and DIF. Such sites were not identified in our bioinformatic study, as expected for an analysis restricted to loci responsive only to a single pathway.

We envision two mechanisms for dual responsiveness. One would be regulation by a single site that responds to either pathway. Alternatively, a dual‐responsive gene could have multiple cis‐regulatory sites, with at least one responsive to each pathway. Regulation of metchnikowin (Mtk), a locus that has a robust response to both Toll and Imd (De Gregorio et al, 2001), appears to reflect an amalgamation of the two strategies. The Mtk gene contains three potential κB sites upstream of the transcriptional start. As predicted from bioinformatic analysis and confirmed by gel shift studies (Supplementary Figure 2), one, GGGAAGTCCCC, binds to both DIF and Relish, whereas two others, GGGGACTTTTT and GGGGAACCC, bind only to Relish.
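The two features emphasized in the text, a GGGGA 5′ half-site for Imd and Relish, and a four or five base gap between the G cluster and the first C for Toll and DIF, can be combined into a deliberately crude heuristic. This is an illustrative sketch only, not the authors' method: the function names are hypothetical, the rule ignores exceptions noted in the text (for example, GGGAA being tolerated by Relish in palindromic sites), and it makes no claim to reproduce Table III.

```python
_COMP = str.maketrans("ACGT", "TGCA")


def _revcomp(site):
    return site.translate(_COMP)[::-1]


def _spacer(site):
    """Bases between the leading G run and the first C; None if undefined."""
    g_run = len(site) - len(site.lstrip("G"))
    first_c = site.find("C", g_run)
    if g_run < 2 or first_c == -1:
        return None
    return first_c - g_run


def predict_pathways(site):
    """Heuristic guess at pathway responsiveness; returns a set of labels."""
    site = site.upper()
    strands = (site, _revcomp(site))
    labels = set()
    # Feature 1: a GGGGA 5' half-site on either strand is treated as Imd-like.
    if any(s.startswith("GGGGA") for s in strands):
        labels.add("Imd")
    # Feature 2: a 4-5 base gap between G cluster and first C is Toll-like.
    if any(_spacer(s) in (4, 5) for s in strands):
        labels.add("Toll")
    return labels
```

For the Mtk sites named above, this heuristic labels GGGAAGTCCCC as both Toll- and Imd-like, and GGGGACTTTTT and GGGGAACCC as Imd-like only, in line with the reported gel shift results; agreement on a handful of sites should not be read as validation of the rule.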

κB‐site sequence and number regulate the immune response

Turning to the question of how κB site number influences transcriptional responses, we investigated two additional innate immune loci, IM1 and AttD (Figure 4A). Using luciferase reporter genes and site‐directed mutagenesis, we demonstrated that the Toll‐specific IM1 gene has a single functional κB site, whereas the Imd‐specific AttD gene contains three κB sites that are each required for a wild‐type response (Supplementary Figure 3). We generated a pair of additional constructs for each gene, replacing one endogenous site with a synthetic site specific for Relish or DIF binding.

Figure 4.

(A) The IM1 and AttD reporters. The 2 kb upstream of IM1 contains the transcript of IM23, oriented in the opposite direction. The 2 kb upstream of AttD contains two tRNA genes oriented in the opposite direction. Functional κB sites were identified through mutational analysis as described in Figure 2A. Imd and Toll responses were calculated as described in Materials and methods, then presented as a ratio of Imd:Toll response × 100. Luciferase assays were performed as described in Figure 2. Values indicated are for ratios measured with synthetic binding sites (left, middle) or endogenous sites (right). (B) Simultaneous stimulation of Imd and Toll pathways results in greater than additive activation of AttA and AttD reporter constructs. Toll and Imd responses were calculated as described in Materials and methods. To directly compare Toll, Imd, and simultaneous induction, responses were not normalized to wild type. (C) Multiplicative effects require two functional κB sites. The mutated κB sequences are indicated in lower case letters. An asterisk marks the substitution at −139 with the GGGAATTCCC synthetic motif. (D) A mechanistic model for transcriptional regulation of innate immunity by the Toll and Imd pathways. (i, ii) Pathway‐specific transcriptional responses. (iii, iv) Alternative strategies used for dual responsiveness. A full‐color version of this figure is available at the EMBO Journal online.

Within the divergent contexts of the IM1 and AttD loci, the Relish‐ and Dif‐specific κB sites had behaviors comparable to those observed in the AttA reporter. In particular, when we compared expression induced by each pathway, the ratio of the Imd response to the Toll response was in each case greater for the Relish site than for the Dif site. The κB sequence code thus functions predictably in the context of loci varying in pathway responsiveness and κB site number.

In the course of our investigation of both the AttA and AttD genes, we observed a greater than additive effect of inducing the Toll and Imd pathways simultaneously (Figure 4B). For each gene, the fold‐increase in reporter gene expression on activating both Toll and Imd was significantly greater than the sum of the increases seen upon activating the two pathways individually. Furthermore, concomitant Toll and Imd activation had an apparently multiplicative effect not only in these loci, but also in drosomycin, which in S2 cells has a significant response only to Toll (Figure 4C).
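The comparison described here can be made explicit by setting the combined response against additive and multiplicative expectations built from the single-pathway responses. The sketch below takes the text's wording literally, using the sum of the individual fold-increases as the additive expectation; the function name is hypothetical, and the numbers in the usage note are invented for illustration and are not measurements from the study.

```python
def interaction_summary(toll_fold, imd_fold, both_fold):
    """Compare a combined fold-induction with simple expectations.

    All arguments are fold-inductions relative to unstimulated cells.
    """
    additive = toll_fold + imd_fold        # sum of the individual increases
    multiplicative = toll_fold * imd_fold  # product of the individual increases
    return {
        "additive_expectation": additive,
        "multiplicative_expectation": multiplicative,
        "greater_than_additive": both_fold > additive,
        "at_least_multiplicative": both_fold >= multiplicative,
    }
```

With invented inputs of 5-fold (Toll), 12-fold (Imd), and 80-fold (both), the additive expectation is 17 and the multiplicative expectation is 60, so the combined response would count as greater than additive.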

To explore the origin of these effects in drosomycin expression, we mapped and mutated the functional κB sites (Supplementary Figure 3). There were two such sites, varying significantly in their behavior. The site at −303 bound DIF in vitro; inactivation of this site eliminated responsiveness to Toll. The site at −139 bound to Relish; inactivation of this site left the response to Toll signaling largely intact, but abrogated the synergistic effect of activating both pathways. We interpret these results to indicate that Relish bound at −139 is in itself insufficient to direct significant gene expression, but that the presence of Relish at the −139 site enhances the transcriptional response directed by Dif bound at the −303 site. Consistent with this hypothesis, we observed only additive effects for the pathways with a drosomycin construct having a single, albeit dual‐responsive, site (Figure 4C).

Taken together, the studies of AttA, AttD, and Drs reveal that pairs of κB sites can act together to enhance the strength of a transcriptional response. Furthermore, the greater than additive effect can result from pairs of sites responsive to the same pathway (Figure 2B) or to different pathways (Figure 4B and C). The most parsimonious explanation for these observations is an interaction between NF‐κB protein homodimers bound to distinct κB sites. Based on similar findings, Ip and co‐workers suggest the formation of active heterodimers when both Dif and Relish sites are present (Tanji et al, 2007), an explanation we cannot rule out.


Although innate immune loci vary with regard to their responsiveness to the Toll and Imd pathways, sorting out the regulation of individual loci has been difficult. Experiments in the past typically involved flies injected with a mixture of Gram (+) and Gram (−) bacteria, leading to the activation of at least three defensive systems—the Toll, Imd, and wound‐response pathways (Lemaitre et al, 1996; Rutschmann et al, 2000; Khush et al, 2001; De Gregorio et al, 2002). To avoid this complication, we have employed a cell culture system in which we can reproducibly and rapidly activate individual response pathways. We have shown here that such an approach, specifically the use of transiently transfected reporter constructs and endogenously expressed Rel family members, recapitulates the response specificity seen in whole flies.

Taken together, the data from our molecular genetic, bioinformatic, and biochemical studies lead us to a general model for innate immune gene regulation in Drosophila (Figure 4D). The critical components of this model are as follows: first, pathway responsiveness is dictated by a small number of κB sites located 5′ to binding sites for the basal transcriptional machinery; second, the sequence of each site dictates specific responses through differences in binding affinity for the Imd effector Relish and the Toll effector DIF (and, by extrapolation, Dorsal); third, genes that respond to both Toll and Imd possess either a combination of pathway‐specific sites or at least one site capable of binding both to DIF and to Relish. Such a scheme allows the fine‐tuning of response strength by the influence of selective pressures on site sequence and number. Thus, for example, a gene that responds detectably only when both pathways are active might have a pair of weak binding sites, one for DIF and one for Relish.

Our results, as well as previously published studies with cecropin and diptericin (Engstrom et al, 1993; Kappler et al, 1993), indicate that κB sites within several hundred base pairs of the transcriptional start site control induction of Drosophila innate immune loci. Such findings are reminiscent of earlier work on the E‐selectin and IFN‐β promoters (Whitley et al, 1994; Thanos and Maniatis, 1995). Those studies defined the critical function of κB motifs in regulating human acute phase genes and demonstrated that induction is mediated by a combination of binding factors with promoter‐proximal targets. A compact arrangement of cis‐regulatory sequences thus appears to be a conserved property of rapid response loci.

By systematic site‐directed mutagenesis and bioinformatic analysis, we have demonstrated that κB motifs differ from one another in their intrinsic ability to support signaling by the Toll or Imd pathway. That is, κB motifs are not only necessary for induction, but are also sufficient to determine the specificity of induction. Whereas such a result is not unanticipated, evidence for this mechanism has been lacking, particularly on a global scale. Our results suggest a reason for the paucity of supporting data. The underlying scheme, while simple, involves variation in motif length as well as sequence. Such a code is to some extent invisible to an analysis based on varying sequence at particular positions within a motif of fixed length and structure. Moreover, the κB sites of endogenous innate immune loci vary substantially in sequence, masking the general characteristics of active sites that emerged from the bioinformatic analysis. Lastly, the presence of both pathway‐specific and dual‐responsive motifs in known innate immune gene sets has likely further confounded previous attempts to discern the transcriptional response code.

Nature of the sequence code

Our findings demonstrate a remarkable agreement among the sequence rules derived from our bioinformatic studies, the binding specificity we detect in vitro, and the pathway responsiveness observed in cells. Moreover, the rules make sense. For example, it is at first puzzling that 5′GGGGA half‐sites are absent from the Toll‐responsive gene set, as the presence of this half‐site does not preclude a motif from supporting Toll‐directed transcription. Note, however, that a 5′GGGGA renders a site Imd‐responsive, whether or not it is Toll‐responsive. The key is therefore to recognize that a Toll‐specific motif must be responsive to the Toll pathway and unresponsive to the Imd pathway.

Although the sequence rules have allowed us to identify previously unannotated innate immune loci (SH Sze, PA Pevzner, and SA Wasserman, unpublished results), we have not identified κB sites upstream of every Toll‐ or Imd‐responsive gene. There are two likely explanations. First, as the structure and extent of the spliced transcript is not well established for some loci, the sequence examined may not in fact be immediately upstream of the promoter. Second, κB sites further removed from the promoter could contribute to gene expression in some cases. Indeed, the −303 Toll‐responsive site in drosomycin (see Figure 4C) lies outside of the boundaries within which we carried out our analysis. Furthermore, innate immune response loci are often tandemly arrayed or nested (e.g., CG15067), raising the possibility of control elements shared among genes.

Although the mean number of κB sites differs between Imd‐ and Toll‐responsive genes, a single κB site is sufficient to confer pathway specificity. In Imd‐specific genes, additional sites apparently function to increase the strength of the transcriptional response. It has been shown previously that the two κB sites upstream of diptericin, an Imd‐specific gene, act cooperatively to direct gene expression (Kappler et al, 1993). Some Toll loci may behave similarly. For example, the IM10 gene has two sites that bind DIF relatively weakly, but is strongly responsive to Toll activation. It may be, therefore, that innate immune responses frequently rely on the cooperative effects of multiple sites for robust gene expression.

The scheme presented in Figure 4D invokes DIF and Relish homodimers, not heterodimers. Consistent with this idea, published studies indicate that only 1–2% of the Relish protein in Drosophila cells is in a complex with either DIF or Dorsal (Han and Ip, 1999). Although the formation of functional heterodimers might underlie cooperative effects observed upon co‐transfection of Relish and Dif constructs, cooperativity might also reflect the interaction of dissimilar homodimers bound to pairs of specific κB sites. Supershift experiments demonstrating the presence of both Relish and DIF at a given promoter (Han and Ip, 1999) are similarly open to alternative explanations. We cannot at present say, therefore, whether or not NF‐κB protein heterodimers have a role in fly innate immunity.

A κB code in mammals?

With regard to potential conservation of the sequence code, the closest mammalian counterparts to Relish and Dif are p50 and p65, respectively (Silverman and Maniatis, 2001; Friedman and Hughes, 2002). Data on p50 and p65 from biochemical and biophysical analyses (Kunsch et al, 1992; Ghosh et al, 1995; Muller et al, 1995; Chen et al, 1998) reveal striking parallels with the fly system. The most frequently selected targets for p50/p50 dimers are GGGGATTCCC and GGGGATCCCC. Both sites contain a GGGGA half‐site and a central region of three or fewer base pairs, just as observed for Relish binding in Imd pathway targets. Similarly, p65/p65 dimers select motifs typified by GGGAATTCCC, which contains the GGGAA half‐site and four‐residue long central region typical of motifs responsive to Toll and Dif. However, the fact that in vivo these two mammalian proteins exist almost exclusively as p50/p65 heterodimers argues against conservation of the sequence code. The types of synergistic effects we see in flies may nevertheless contribute to the observed role of pairs of κB sites in regulating some mammalian innate immune loci (Leung et al, 2004).
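The two features compared above, the length of the 5′ (G)n run and the length of the central (A,T)‐rich region, can be read directly off a motif. The Python sketch below is illustrative only: the function name `kb_features` and the parsing convention (leading run of G, followed immediately by a run of A or T) are assumptions made for this example and are not the procedure used in our analysis.

```python
import re


def kb_features(motif: str) -> tuple[int, int]:
    """Return (length of the 5' G run, length of the following A/T run).

    Hypothetical helper: it assumes the motif is written 5'->3' and
    begins with the (G)n element of the 5' half-site.
    """
    match = re.match(r"(G+)([AT]*)", motif.upper())
    if match is None:
        raise ValueError(f"motif does not begin with G: {motif!r}")
    return len(match.group(1)), len(match.group(2))


# The three mammalian consensus targets quoted in the text.
for site in ("GGGGATTCCC", "GGGGATCCCC", "GGGAATTCCC"):
    g_len, at_len = kb_features(site)
    print(f"{site}: (G)n = {g_len}, central (A,T) region = {at_len}")
```

Applied to the three sites quoted in the text, this convention gives a four‐G run with a central region of three or two residues for the p50/p50 targets, and a three‐G run with a four‐residue central region for the p65/p65 target.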

Transcriptional control of innate immunity

Whereas our data suggest that pathway specificity resides exclusively in κB sites, a range of studies has identified other sequence elements as bona fide or potential regulators of innate immune gene transcription. The best characterized of these at the functional level are the GATA and R1 elements (Kadalayil et al, 1997; Uvell and Engstrom, 2003; Senger et al, 2004). Such sequences, or others identified on the basis of similarity to mammalian regulatory elements (Georgel et al, 1993; Levashina et al, 1998), might contribute to discrimination between signaling pathways by providing contextual features required for NF‐κB binding and activity. It appears more likely, however, that the major function of these additional motifs is to modulate the level or location of the response, as indicated by in vivo experiments with GATA elements (Senger et al, 2004). As the promoter‐proximal region of AttA contains a well‐conserved GATA element, as well as an R1 element, it should be possible to begin to address these questions by the types of experimental approaches presented here.

Materials and methods

Reagents and site‐directed mutagenesis

Murine EGF was purchased from Calbiochem and the B. subtilis PG preparation was purchased from Sigma. All luciferase reporter constructs are based on the pGL3‐Basic vector (Promega). PCR amplification of the AttA promoter utilized a 5′ primer that introduced a KpnI site and a 3′ primer that introduced an NheI site.

Mutagenic oligonucleotides and the Expand High‐Fidelity PCR System (Roche) were used to introduce site‐directed alterations into the AttA reporter construct. The construct length was kept invariant and all motifs were introduced such that the position of the last G in the 5′ core element was kept constant.

Cell culture and transient transfections

S2*/EGFR‐Toll cells were maintained and transfected as described previously (Sun et al, 2002). Six‐well plates were seeded with 3 ml of 10⁶ cells/ml. After 6 h, transfections were performed as described for the Drosophila Expression System (Invitrogen), with 100 ng each of reporter plasmid and pAc‐LacZ. After 16–20 h, each transfection was split into thirds in fresh media and transferred to 12‐well plates. One‐third (1 ml) was treated with 10 μg/ml PGN, one‐third was treated with 0.4 μg/ml EGF, and one‐third was left untreated. After 4 h, cells were washed with 500 μl PBS and then lysed in 100 μl Luciferase Reporter Buffer (Promega). A 15 μl portion of lysate was assayed for luciferase activity (Luciferase Assay System, Promega), and 15 μl was assayed for β‐galactosidase (β‐gal) activity (Galacto‐Light Plus, Tropix). Luciferase activity was normalized against β‐gal activity to control for transfection efficiency. The increase in expression on induction was calculated by subtracting the uninduced value from the induced value. Dividing this number by the uninduced value and expressing the result as a percentage yielded the percentage increase in expression. As indicated in the figure legends, the percentage increase was in many cases normalized to a value of one for the response of a wild‐type construct to a particular pathway. RNAi experiments were carried out as described previously (Sun et al, 2002).
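The normalization described above can be written out as a short calculation. The following Python sketch is a minimal illustration; the function names and the example readings are hypothetical and are not taken from our data.

```python
def percent_increase(luc_induced, bgal_induced, luc_uninduced, bgal_uninduced):
    """Percentage increase in reporter expression on induction.

    Luciferase readings are first divided by the matching beta-gal
    readings to control for transfection efficiency.
    """
    induced = luc_induced / bgal_induced
    uninduced = luc_uninduced / bgal_uninduced
    return 100.0 * (induced - uninduced) / uninduced


def relative_to_wild_type(construct_pct, wild_type_pct):
    """Scale a response so that the wild-type construct equals one."""
    return construct_pct / wild_type_pct


# Hypothetical readings in arbitrary light units, for illustration only.
wild_type = percent_increase(900.0, 100.0, 300.0, 100.0)
mutant = percent_increase(450.0, 100.0, 300.0, 100.0)
print(wild_type, mutant, relative_to_wild_type(mutant, wild_type))
```

Keeping the two steps separate mirrors the text: the percentage increase is computed per construct and per pathway, and scaling to the wild‐type response is applied only where a figure legend indicates it.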

Bioinformatic analysis of Drosophila innate immune response loci

Using published microarray data (De Gregorio et al, 2002), we defined a set of loci that were strongly induced in adult flies by mixed (Gram(+) and Gram(−)) bacterial infection or by fungal infection. We then selected loci for which induction was substantially diminished by a mutation in spätzle, blocking Toll signaling, or a mutation in relish, disrupting Imd signaling. The specific criteria for assigning loci to each class are presented in Supplementary Table 2.

For each locus, we used available cDNA sequence data, as well as sequence profiles for TATA box and initiator elements (Ohler et al, 2002), to define the core promoter region. For 11 genes, this resulted in assignment of a transcriptional start site more than 10 bp away from the annotated gene start; the sequences for the 5′ ends of all genes are presented in Supplementary Figure S1. For MEME analysis (Bailey and Elkan, 1994), we extracted 150 bp of genomic sequence extending from −190 to −40 relative to the calculated transcriptional start site of each gene.
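Extraction of the promoter‐proximal window can be sketched as follows. This Python example is an assumption‐laden illustration, not the script used in our analysis: the function names, the 0‐based indexing of the start site, and the half‐open treatment of the interval (positions −190 up to, but not including, −40, which yields 150 positions) are all choices made for the example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(genome, tss, strand, start=-190, end=-40):
    """Return the sequence from `start` up to (not including) `end`.

    `genome` is the forward-strand sequence, `tss` the 0-based index of
    the transcriptional start site, and `strand` is "+" or "-". The
    result is always written 5'->3' on the strand of the gene.
    """
    if strand == "+":
        lo, hi = tss + start, tss + end
    elif strand == "-":
        # Upstream of a minus-strand gene lies at higher coordinates.
        lo, hi = tss - end + 1, tss - start + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if lo < 0 or hi > len(genome):
        raise ValueError("window extends beyond the available sequence")
    window = genome[lo:hi]
    return window if strand == "+" else reverse_complement(window)
```

Returning the window in the orientation of the gene means that motif positions reported by a downstream tool can be interpreted relative to the start site without further strand bookkeeping.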

Protein purification

The pGex‐DifNX plasmid (Ip et al, 1993) was a gift from Tony Ip. A pGex‐Rel construct was generated by inserting bp 1–1635 of Relish (amplified from cDNA) into the BamHI and EcoRI sites of the polylinker of pGex. This fragment corresponds to amino acids 1–545, ending at the natural proteolytic cleavage site of Relish (Stoven et al, 2003). Plasmids were transformed into BL21 and expressed proteins were purified as described previously (Ip et al, 1992), with the following modifications. A 500 ml culture was grown at 30°C to OD550 ∼0.5, then induced for 1 h with 1.6 mM IPTG. Cells were pelleted by centrifugation at 6000 g for 15 min at 4°C. The pellet was resuspended in 10 ml cold lysis buffer (25 mM HEPES (pH 7.5), 20 mM KCl, 2.5 mM EDTA, 1% Triton‐X 100, 1 mM DTT, 0.5 mM PMSF, 1 × EDTA‐free protease inhibitor cocktail (Roche)). Cells were lysed by performing three freeze‐thaw cycles, followed by three cycles of 10 s sonication and 30 s incubation on ice. To 5 ml of bacterial cell suspension, 500 μl of 5 M NaCl was added, and the suspension was incubated on ice for 15 min. Cellular debris was pelleted by centrifugation at 12 000 g for 15 min at 4°C. Next, 750 μl of a 50% slurry of cleaned glutathione‐Sepharose 4B (Amersham Biosciences) in PBS was added to the lysate and rocked gently at 4°C for 1 h. Sepharose beads were pelleted by centrifugation at 500 g for 5 min at 4°C, then washed three times with 10 ml cold lysis buffer and once with 10 ml cold 50 mM Tris (pH 8.0). Three elutions were performed with 500 μl of 50 mM Tris (pH 8.0) plus 10 mM glutathione and gentle rocking at 4°C for 1 h.

Gel‐shift assays

Electrophoretic mobility shift assays were performed as described previously (Ip et al, 1991) with the following modifications. Reactions were carried out in a volume of 10 μl containing 0.1 μg/μl poly(dI‐dC), 10 mM HEPES, 50 mM NaCl, 1 mg/ml BSA, 3 mM MgCl2, 6 mM βME, 10 mM EDTA, and 10% (v/v) glycerol. End‐labeled probe (0.8 pmol) was combined with protein, and incubated for 10 min at room temperature. Protein was used at a concentration of 1.5 μg per reaction in experiments with synthetic binding sites and 7.5 μg protein per reaction with endogenous binding sites. Binding complexes were resolved by electrophoresis in a 4% polyacrylamide gel in TBE buffer for 140 min at 7.5 V/cm.

All oligonucleotides were 24 bp in length, centered on the κB motif. Synthetic κB motifs were substituted for the −46 motif in AttA such that the position of the last G in the 5′ core element was kept constant. The κB motifs from AttA (−46 and −118), diptericin (−145), and IM1 (−86) were each analyzed in their endogenous context.

Supplementary data

Supplementary data are available at The EMBO Journal Online.

Supplementary Information

Supplementary Information [emboj7601798-sup-0001.doc]

Supplementary Table S1 [emboj7601798-sup-0002.doc]

Supplementary Figure S2 [emboj7601798-sup-0003.pdf]

Supplementary Figure S3 [emboj7601798-sup-0004.pdf]

Supplementary Figure S1 [emboj7601798-sup-0005.pdf]


Acknowledgements

We thank S Ho and C Glass for use of luminometers in their laboratories, K Pogliano for supplying an initial PG preparation, the laboratory of J Posakony for assistance with gel shift experiments, and Y Ip for both plasmid DNA and informative conversations. E Bier, G Ghosh, A Hoffmann, A Letsou, W McGinnis, and L Zipursky provided helpful comments on a draft manuscript. This work was supported by NIH Grant 5R01‐GM50545 to SAW. MSB was supported by funds from the NIH Training Program in Basic Clinical Genetics, 2T32GM008666‐06. CPA was supported in part by Cal (IT)2.

