The SeqA protein binds to the post‐replicative forms of the origins of replication of the Escherichia coli chromosome (oriC) and the P1 plasmid (P1oriR) at hemimethylated GATC adenine methylation sites. It appears to regulate replication by preventing premature reinitiation. However, SeqA binding is not exclusive to replication origins: different fragments with hemimethylated GATC sites can bind SeqA in vitro when certain rules apply. Most notably, more than one such site must be present on a bound fragment. The protein appears to recognize individual hemimethylated sites, but must undergo an obligate cooperative interaction with a nearby bound protein for stable binding. SeqA contacts both DNA strands in a discrete patch at each hemimethylated GATC sequence. All four GATC bases are contacted and are essential for binding. Although the recognized sequence is symmetrical, the footprint on the methylated strand is always broader, suggesting that the bound protein is positioned asymmetrically with its orientation dictated by the position of the unique methyl group. Studies of alternative spacings and relative orientations of adjacent sites suggest that each site may be recognized by a symmetrical dimer with an induced asymmetry in one of the subunits similar to that seen with certain type II restriction endonucleases.
The replication origin of Escherichia coli, oriC, is regulated by a process known as sequestration (Ogden et al., 1988; Campbell and Kleckner, 1990). Sequestration regulates the timing of initiation by recognizing newly replicated origins, and imposes a block to further replication until the system is reset for the next cell cycle (Russell and Zinder, 1987; Boye et al., 1988; Ogden et al., 1988; Bakker and Smith, 1989; Boye and Løbner‐Olesen, 1990; Campbell and Kleckner, 1990). Newly replicated DNA is recognized by monitoring the methylation state of the adenine bases in GATC sequences (Russell and Zinder, 1987). GATC sequences are normally methylated at the N6 position of the adenine bases on each strand by Dam methyltransferase. However, immediately after semi‐conservative replication, the newly replicated DNA is only methylated on one strand. The hemimethylated GATC sequences in the origin are recognized and bound by the SeqA protein (Brendler et al., 1995; Slater et al., 1995), and the protein is required for sequestration (Lu et al., 1994; von Freiesleben et al., 1994). Sequestration imposes a negative regulation of DNA replication by the formation of a complex, probably consisting of hemimethylated oriC sequences, SeqA protein and other factors. Sequestration is accompanied by binding of the hemimethylated origin sequences to the cell membrane (Ogden et al., 1988) and membrane fractions can inhibit in vitro replication of DNA from a hemimethylated oriC template (Landoulsi et al., 1990). The hemimethylated oriC binding activity of the membrane can be resolved into two components, one of which is SeqA protein and the other, a second protein component, SeqB (Shakibai et al., 1998). SeqB does not bind to oriC independently, but enhances the ability of SeqA to do so in a hemimethylation‐specific manner. A third protein, HobH (NapA) (Herrick et al., 1994; Reshetnyak et al., 1999), may also be involved in the formation of this membrane complex.
Although other components may be involved in origin sequestration, purified SeqA has an efficient in vitro DNA binding activity that recognizes certain DNA sequences in a hemimethylation‐specific fashion in the absence of other components. These sequences include portions of oriC (Brendler et al., 1995; Slater et al., 1995), P1 plasmid oriR and a random sequence with multiple GATC sites (Brendler et al., 1995). Binding can be detected at DNA concentrations of <10 pM under the appropriate conditions (T.Brendler, unpublished data) and is highly specific for the hemimethylated forms (Brendler et al., 1995), except in the case of the intact oriC, where a weaker binding to fully methylated DNA has been reported (Slater et al., 1995).
Although it is generally assumed that the replication origin is the principal target of the SeqA protein and its regulatory activities, its ability to recognize hemimethylated targets other than oriC suggests that it may be profitable to consider potential sites of action other than at the origin. Recent fluorescent antibody staining studies reveal discrete foci of SeqA protein in positions in the cell that are distinct from the positions of chromosomal origins and which persist when oriC is deleted. It has been suggested that these correspond to special sites involved in chromosome segregation (Hiraga et al., 1998). Here, we investigate further the DNA binding specificity of the SeqA protein, both to attempt to understand its role in binding to origins and to explore its potential for recognizing other sequences.
SeqA binding in vitro requires at least two hemimethylated GATC sites on a single DNA fragment
SeqA binding studies were carried out using a 71 bp double‐stranded synthetic oligonucleotide containing three GATC sequences (Figure 1). Apart from the GATC bases, the sequence of the fragment was chosen at random (Brendler et al., 1995). The fragment was made by annealing complementary single‐stranded oligonucleotides 91 bases in length. The resulting DNA duplex was digested with XbaI to generate the 71 bp fragment. By substituting one or more of the GATC adenine bases in one or both of these oligonucleotides with N6‐methyladenine, a variety of double‐stranded fragments was produced with different numbers of hemimethylated sites.
Figure 2 compares the binding of two series of related hemimethylated constructs that have three, two or one hemimethylated sites. A retarded gel band indicating efficient binding was detected with fragments with three or two hemimethylated sites. However, no binding was detected when only one site was present, irrespective of the position of the single hemimethylated site. The results shown used substrates with all methyl groups in the bottom strand of the sequence (Figure 1), and the GATC sites either hemimethylated or unmethylated. Essentially the same results were obtained with a series of fragments in which all methyl groups were on the top strand or with a series in which combinations of hemimethylated and fully methylated sites were used (data not shown). In all cases, at least two hemimethylated sites were needed to detect binding by gel band mobility shift.
Failure of the single‐site hemimethylated fragments to give a retarded gel band could be due to lack of SeqA binding, or alternatively, to the inability of the protein bound to a single site to alter the mobility of the DNA. To address this, the ability of an unlabeled fragment with a single hemimethylated site to compete for binding with labeled fragments with three sites was studied. No competition was observed, even when a 100‐fold excess of the unlabeled species was used (Figure 3). Unlabeled fragments containing two (data not shown) or three hemimethylated sites competed efficiently (Figure 3). Thus, no measurable SeqA binding occurs unless two or more hemimethylated sites are present on the same molecule.
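The binding rule that emerges from these experiments can be stated compactly. The following Python sketch is only an illustration of that rule, not a model of binding energetics: the names GatcSite, count_hemimethylated and predicts_band_shift are hypothetical, and the threshold of two sites is taken directly from the gel shift and competition results described above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GatcSite:
    """Methylation state of one GATC site (hypothetical helper type)."""
    top_methylated: bool
    bottom_methylated: bool

    @property
    def hemimethylated(self) -> bool:
        # Hemimethylated: exactly one of the two strands carries the methyl group.
        return self.top_methylated != self.bottom_methylated


def count_hemimethylated(sites: Iterable[GatcSite]) -> int:
    """Number of hemimethylated GATC sites on one DNA fragment."""
    return sum(1 for site in sites if site.hemimethylated)


def predicts_band_shift(sites: Iterable[GatcSite], minimum_sites: int = 2) -> bool:
    """Empirical rule from the text: a fragment is bound only if it carries
    at least two hemimethylated GATC sites on the same molecule."""
    return count_hemimethylated(sites) >= minimum_sites


# Shorthand for the four possible states of a site.
UNMETHYLATED = GatcSite(False, False)
HEMI_TOP = GatcSite(True, False)
HEMI_BOTTOM = GatcSite(False, True)
FULLY_METHYLATED = GatcSite(True, True)

three_sites = [HEMI_BOTTOM, HEMI_BOTTOM, HEMI_BOTTOM]
one_site = [UNMETHYLATED, HEMI_BOTTOM, UNMETHYLATED]
mixed = [FULLY_METHYLATED, HEMI_BOTTOM, HEMI_TOP]
```

Under this rule, three_sites and mixed are predicted to give a retarded band, whereas one_site is not, whichever position the single hemimethylated site occupies.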
Two hemimethylated sites can be recognized when the methyl groups are more widely spaced
In the experiments described in Figure 3, fragments had hemimethylated GATC sites adjacent to each other and the methyl groups were 11 or 12 bp apart. By annealing the appropriate oligonucleotides, fragments that have any combination of hemimethylated sites can be produced (Figure 4), including those with the non‐adjacent sites hemimethylated such that the two methyl groups are 23 bp apart. These fragments also bind SeqA (Figure 4). Therefore, binding does not require one fixed spacing between hemimethylated sites. It was irrelevant whether the intervening site was fully methylated or unmethylated (data not shown).
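The spacings quoted here follow from simple coordinate arithmetic. The Python sketch below illustrates it under stated assumptions: TOP_STRAND is a made-up sequence (it is not the sequence of Figure 1) constructed so that adjacent GATC sites start 11 and 12 bp apart, and the function names are hypothetical. The methylated adenine lies at the second position of GATC on the top strand; on the bottom strand it pairs with the T at the third position.

```python
import re

# Hypothetical top-strand sequence, written 5' to 3'. It is NOT the sequence
# of Figure 1; the filler was chosen only so that the three GATC sites start
# 11 and 12 bp apart.
TOP_STRAND = "ACCTGTTCAA" "GATC" "TTACCAG" "GATC" "CATTGCAA" "GATC" "TGCAAT"


def gatc_starts(top_strand: str) -> list:
    """0-based start coordinates of every GATC site on the top strand."""
    return [match.start() for match in re.finditer("GATC", top_strand)]


def methyl_position(site_start: int, strand: str) -> int:
    """Top-strand coordinate of the base pair carrying the methyl group."""
    if strand == "top":
        return site_start + 1  # the A of 5'-GATC-3' on the top strand
    if strand == "bottom":
        return site_start + 2  # the bottom-strand A pairs with the top-strand T
    raise ValueError("strand must be 'top' or 'bottom'")


def methyl_spacing(start_a: int, strand_a: str, start_b: int, strand_b: str) -> int:
    """Distance in bp between the methyl groups of two hemimethylated sites."""
    return abs(methyl_position(start_b, strand_b) - methyl_position(start_a, strand_a))


site_i, site_ii, site_iii = gatc_starts(TOP_STRAND)
adjacent_cis = methyl_spacing(site_i, "bottom", site_ii, "bottom")
outer_cis = methyl_spacing(site_i, "bottom", site_iii, "bottom")
adjacent_trans = methyl_spacing(site_i, "bottom", site_ii, "top")
```

When both methyl groups are on the same strand, their spacing equals the spacing of the sites themselves; when they are on opposite strands, it differs from the site spacing by 1 bp in either direction.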
Two hemimethylated sites can be recognized when the methyl groups are on opposite strands
Figure 5 shows the results obtained with constructs in which two hemimethylated sites are present with the methyl groups on opposite strands. Although the efficiency of binding varies with these constructs, all showed significant SeqA binding.
SeqA requires intact GATC sequences in addition to the N6‐methyladenine
Although the cellular Dam methyltransferase can only methylate adenine residues in the context of a GATC sequence, the method used to produce the synthetic oligonucleotides allows the placement of the methylated base in any context. The starting sequence for these experiments consisted of an oligonucleotide shown in Figure 1 (bottom) with only the first two GATC sequences present. As expected from the results above, this construct bound SeqA efficiently when both GATC sequences were hemimethylated (Figure 6). Variants of this sequence were then made where the G‐C base pair in the first position of each hemimethylated GATC was changed to a T‐A pair (Figure 1, bottom). No SeqA binding was observed with this variant (Figure 6). Similar changes were made altering the T or C bases in each GATC sequence to A (Figure 1). Neither of these variants bound SeqA (Figure 6). Thus, in addition to the essential N6‐methyl A‐T base pair, the rest of the GATC bases are also recognized and are essential for SeqA binding.
DNA footprinting of bound SeqA
The results of 1,10‐phenanthroline‐copper footprinting of SeqA protein bound to DNA are shown in Figure 7. In this and similar experiments, protection of all three GATC sites appeared complete, especially at higher SeqA concentrations, suggesting that all three sites can be bound by SeqA simultaneously on the same molecule. The DNA substrate consisted of the entire 91 bp sequence shown in Figure 1. When SeqA was bound to this standard sequence with three hemimethylated GATC sites, three discrete areas of protection covering the GATC bases were seen (Figure 7A). Figure 7A follows the footprint on the bottom strand. When the bottom strand was methylated, the footprints included the GATC bases, the two bases immediately to the 5′ side, and, in two of the three positions, the two bases immediately to the 3′ side of the GATC bases. When the top strand was methylated, the bottom‐strand footprint was narrower, with just the four GATC bases being well protected and some partial protection of one adjacent base sometimes evident (Figure 7A).
Note that sites I and III show some enhanced cutting near one boundary of the footprint. This occurs only when the methylation is on the top strand and is only seen on the bottom‐strand footprints (data not shown). This enhanced cutting may indicate some distortion of the bound complex due to the particular sequence configuration of these sites.
Figure 7B shows the footprint obtained when the substrate had only two hemimethylated sequences. This DNA had the third GATC sequence unmethylated. The third position showed no footprint, confirming that SeqA binding is directed specifically to hemimethylated sites. When the two methyl groups were both on the bottom strand (the cis configuration), the bottom strand showed two of the broader footprints typical of recognition of the methylated strand. When the methyl groups were on opposite strands (the trans configuration), the bottom strand showed a broader footprint directed to the methylated position (site I) and a narrower one at site II where this bottom strand is unmethylated (Figure 7B).
Figure 8 summarizes the data shown in Figure 7 and some data (not shown) obtained by assaying the protection of the upper strand. A pattern is evident: the methylated strand always has a broader footprint than the unmethylated strand, where the footprint is generally limited to the four GATC bases. This pattern holds irrespective of the position or orientation of the hemimethylated site concerned and is conserved when one of the sites is removed or when two adjacent sites have methyl groups on opposite strands (Figure 7B). Thus, the contacts of the SeqA protein are asymmetric with respect to each site. SeqA appears to bind to the otherwise symmetrical GATC sequence in a polar fashion, with its orientation dictated by the polarity of the methyl group.
SeqA recognizes the symmetrical sequence GATC and has no other specific sequence requirement. It forms a discrete footprint over hemimethylated GATC sequences and requires at least two such sequences to be present on a DNA fragment in order to bind. It can tolerate at least two widely different spacings between the two sites. When three sites are present, all three appear to be occupied on the same molecule. Thus, the consensus sequence for SeqA does not appear to be an extended one that encompasses two hemimethylated GATCs. Rather, SeqA appears to recognize the individual hemimethylated GATC sequence and shows an obligate cooperativity between SeqA units bound to two or more individual binding sites. Presumably, stable binding requires side‐to‐side contact between proteins bound to individual sites (Figure 9). Occupation of three sites simultaneously (Figure 7A) is consistent with parallel contacts between adjacent units. The fact that the spacing between two sites can be varied by at least 11 bp suggests that adjacent proteins can communicate at a distance. The proteins could be highly flexible, the region between two sites could be filled with additional proteins or the excess DNA between them could be looped out. We are currently investigating further the effects of spacing changes to clarify this point.
We have shown that footprints of the protein complexes on the DNA show some asymmetry, suggesting that the orientation of the protein on each individual GATC site is dictated by the position of the methyl group (Figure 9A). If this is the case, when the methyl groups of two closely spaced sites are in cis, the two protein complexes should be in a parallel configuration, and when in trans at the same spacing, they should be symmetrically disposed with respect to each other (Figure 9A and B). Two different types of cooperative contact would have to be involved (Figure 9A and B). However, if the basic unit of binding is a symmetrical homodimer where each subunit recognizes one strand, the interdimer contact could be satisfied irrespective of the relative orientation of the dimers (Figure 9C and D). This type of configuration would resemble that of type II restriction endonucleases bound to their symmetrical sites (Pingoud and Jeltsch, 1997). In the case of BamHI, binding appears to induce a conformational change in one of the two subunits (Newman et al., 1995). It is possible that SeqA also undergoes a change in conformation of one of two monomers, and that this change is induced by the presence of the methyladenine on one strand (Figure 9C). This could account for the methyl‐directed asymmetry of the SeqA footprint. If such an asymmetry were essential for stable dimer formation, the specificity for hemimethylated sites could also be explained.
Materials and methods
Aldrich (Milwaukee, WI) supplied the following chemicals: 1,10‐phenanthroline, 2,9‐dimethyl‐1,10‐phenanthroline, 3‐mercaptopropionic acid and copper(II) sulfate pentahydrate. T4 polynucleotide kinase was obtained from Amersham Pharmacia Biotech (Piscataway, NJ), Klenow enzyme was obtained from Roche Molecular Biochemicals (Indianapolis, IN) and Thermo Sequenase DNA polymerase was obtained from Amersham Pharmacia Biotech (Arlington Heights, IL). New England Biolabs (Beverly, MA) supplied all other enzymes. [α‐32P]dATP and [γ‐32P]ATP were obtained from Amersham Pharmacia Biotech (Arlington Heights, IL). All chromatography columns and media were purchased from Amersham Pharmacia Biotech (Piscataway, NJ).
Buffer A was 25 mM HEPES–KOH pH 7.5, 2 mM dithiothreitol (DTT) and 10% (w/v) sucrose; buffer B was 2.5 M potassium chloride, 10 mM magnesium acetate, 200 mM spermidine and 20 mM DTT; buffer C was 40 mM HEPES–KOH pH 7.5, 1 mM magnesium acetate, 0.1 mM EDTA, 2 mM DTT and 15% (v/v) glycerol; and buffer D was 25 mM HEPES–KOH pH 7.5, 1 mM magnesium acetate, 0.1 mM EDTA, 2 mM DTT and 15% (v/v) glycerol.
Preparation of synthetic DNA fragments
Synthesis of unmethylated and adenomethylated oligonucleotides was previously described (Brendler et al., 1995). Crude oligonucleotides were purified using the QIAquick Nucleotide Removal Kit (Qiagen Inc., Santa Clarita, CA) according to the manufacturer's instructions. Subsequent annealing of oligonucleotides and polyacrylamide gel purification of the duplex DNA was also previously described (Brendler et al., 1995). The sequences of the DNA fragments used are shown in Figure 1. The top sequence is the 91 bp parent sequence. This sequence was randomly chosen except for the three GATC sites. The spacing of these sites was modeled on the P1 phage Pac protein binding site which is known to bind SeqA when hemimethylated (Brendler et al., 1995). For DNA footprinting experiments, the entire 91 bp sequence was used. For gel electrophoretic mobility–shift assays, the DNA was digested with XbaI, and the resulting 3′ recessed ends were labeled by filling in with Klenow enzyme. The resulting 71 bp fragment is shown in upper case letters (Figure 1). The bottom sequence represents the XbaI‐digested fragment used as the framework for studies that simultaneously mutated GATC sites I and II. GATC site III was removed by transposing the central AT to TA, which results in the sequence GTAC.
Radioactive labeling of DNA fragments
The XbaI‐digested DNA fragments were radiolabeled using Klenow enzyme and [α‐32P]dATP (Sambrook et al., 1989). Following the labeling reactions, the DNA fragments were extracted with phenol and chloroform and then purified by Sephacryl S‐100 HR chromatography. The uniquely 5′‐end labeled DNA used for footprinting studies was made by radiolabeling a single‐stranded oligonucleotide using T4 polynucleotide kinase and [γ‐32P]ATP (Sambrook et al., 1989). The labeled oligonucleotide was annealed to the unlabeled complementary oligonucleotide and the resulting DNA duplex was purified by polyacrylamide gel electrophoresis (Brendler et al., 1995).
Electrophoretic mobility‐shift assays
SeqA protein binding to hemimethylated DNA was determined by gel electrophoretic mobility‐shift assays (Brendler et al., 1995). Each 25 μl assay contained a total of 1 ng (0.025 pmol) of labeled DNA (10 000 c.p.m.). The SeqA protein was diluted into buffer containing 200 mM potassium glutamate, 1 mM magnesium acetate, 20 mM Tris acetate pH 7.5, 0.1 mM EDTA, 1 mM DTT, 0.1% (v/v) Nonidet–P 40 (NP‐40) and 15% (v/v) glycerol prior to use. Unless otherwise stated, each assay contained 50 ng of SeqA protein.
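As a rough check on the quantities above, the mass of DNA can be converted to moles and to a molar concentration in the assay. The sketch below is only an approximation: the average mass of 650 g/mol per base pair is a common rule of thumb and an assumption here, the 71 bp length ignores the bases added by the fill‐in reaction, and the function names are hypothetical.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
# This is a rule-of-thumb value and an assumption, not a figure from the text.
AVERAGE_BP_MASS = 650.0


def dsdna_pmol(mass_ng: float, length_bp: int, bp_mass: float = AVERAGE_BP_MASS) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * bp_mass)
    return moles * 1e12


def concentration_nM(pmol: float, volume_ul: float) -> float:
    """Molar concentration (nM) of `pmol` picomoles in `volume_ul` microlitres."""
    # pmol per microlitre is micromolar; multiply by 1000 to express as nM.
    return pmol / volume_ul * 1e3


fragment_pmol = dsdna_pmol(mass_ng=1.0, length_bp=71)
fragment_nM = concentration_nM(fragment_pmol, volume_ul=25.0)
```

With these assumptions, 1 ng of a 71 bp duplex corresponds to roughly 0.02 pmol, the same order as the 0.025 pmol quoted, or a little under 1 nM DNA in a 25 μl assay.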
Purification of the SeqA protein
Strain BL21(DE3/pLysS) was transformed with pSS1, which contains the seqA gene in the pET‐11a vector (Studier et al., 1990; Slater et al., 1995). A 5 l culture of the transformed strain was grown in M9/ZB media plus glucose, ampicillin (20 μg/ml) and timentin (16 μg/ml, Smith Kline Beecham Corp.). The culture was grown at 37°C to an OD600 of 0.6–0.8. The cells were then induced for 1 h by adding IPTG to a final concentration of 0.4 mM. The cells were harvested by centrifugation for 15 min at 4500 g at 4°C. The pellets were rinsed with 50 ml of buffer A, and resuspended in an equal weight of buffer A. A 0.11 volume of buffer B was added. The cell suspension was frozen in liquid nitrogen and stored at −70°C. The frozen cells were lysed by thawing on ice for 30 min, and the lysate was centrifuged for 45 min at 4°C at 183 000 g in a Beckman 50Ti rotor. The supernatant was discarded after measuring its volume. An equal volume of buffer A containing 500 mM potassium chloride, 10 mM magnesium acetate, 20 mM spermidine and 0.1% (v/v) NP‐40 was added to the pellet. The mixture was sonicated four times with a microprobe at 105 W for 20 s with 1 min cooling intervals on ice (Branson Sonifier, Model 185). The sonicate was centrifuged for 45 min at 4°C at 183 000 g in a Beckman 50Ti rotor. The supernatant was saved (fraction I). The protein was precipitated from fraction I by the addition of 0.35 g/ml of ammonium sulfate. The precipitate was dialyzed against buffer C containing 200 mM ammonium sulfate and 0.1% (v/v) NP‐40 (fraction II). Fraction II was diluted 20‐fold in buffer D plus 150 mM sodium chloride and then loaded onto a 20 ml heparin–Sepharose CL‐6B column equilibrated in the same solution. The column was developed with a gradient of 150 mM to 1.5 M sodium chloride in buffer D. The SeqA protein eluted around 0.9 M sodium chloride. The fractions active for hemimethylated DNA binding were pooled and the protein was precipitated with 0.35 g/ml ammonium sulfate.
The precipitated protein was dialyzed against buffer C containing 200 mM ammonium sulfate and 0.1% (v/v) NP‐40 (fraction III). The SeqA protein was >95% pure at this point. Fraction III was loaded onto a 5 ml SP‐Sepharose HP HiTrap column equilibrated with buffer C plus 100 mM potassium chloride. The column was developed with a gradient of 100 mM to 1.5 M potassium chloride in buffer C. The SeqA protein eluted at 0.8 M potassium chloride. The protein peak was pooled and dialyzed against buffer C containing 200 mM ammonium sulfate and 0.1% (v/v) NP‐40. The resulting SeqA protein was frozen at −70°C and was stable for several months.
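The effect of adding a 0.11 volume of buffer B to the cell suspension during lysis can be estimated by simple dilution arithmetic. The sketch below assumes that the 0.11 volume is measured relative to the volume of the suspension and that mixing is ideal; it counts only what buffer B contributes and ignores the DTT already present in buffer A and any solutes from the cells. The function names are hypothetical.

```python
# Composition of buffer B as listed in the text, in mM.
BUFFER_B_MM = {
    "potassium chloride": 2500.0,
    "magnesium acetate": 10.0,
    "spermidine": 200.0,
    "DTT": 20.0,
}


def diluted(stock_mM: float, added_volume: float, starting_volume: float = 1.0) -> float:
    """Final concentration (mM) after adding `added_volume` of a stock
    to `starting_volume` of a solution that lacks that component."""
    return stock_mM * added_volume / (starting_volume + added_volume)


def buffer_b_contribution(added_volume: float = 0.11) -> dict:
    """Final concentration contributed by buffer B for each component (mM)."""
    return {name: diluted(conc, added_volume) for name, conc in BUFFER_B_MM.items()}


final_mM = buffer_b_contribution()
```

Under these assumptions, buffer B contributes approximately 250 mM potassium chloride, 1 mM magnesium acetate, 20 mM spermidine and 2 mM DTT to the suspension, that is, about one‐tenth of each stock concentration.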
DNA footprinting was done by a modification of a technique previously described (Kuwabara and Sigman, 1987; Sigman et al., 1991). Uniquely 5′‐end labeled DNA (250 000 c.p.m.) was incubated with 50 ng of SeqA for 20 min at 30°C. The reaction mixtures were then treated with 160 μM 1,10‐phenanthroline‐copper and 5.8 mM 3‐mercaptopropionic acid for 30 s at 30°C. Reactions were quenched by the addition of 0.1 volume of 28 mM 2,9‐dimethyl‐1,10‐phenanthroline. Polyacrylamide gels (8% w/v) were used to separate bound DNA and free DNA (Brendler et al., 1995). Both the bound and unbound DNA bands were localized by autoradiography at 4°C and excised separately. The DNA from the excised bands was extracted and further purified using the QIAEX II Gel Extraction Kit (Qiagen Inc., Santa Clarita, CA). The manufacturer's instructions were followed except that the crushed gel slices were soaked for 3 h at 37°C instead of 1 h at 50°C. The purified DNA was analyzed using denaturing gels containing 12% (w/v) polyacrylamide and 7 M urea (Sambrook et al., 1989). The relative positions of nucleolytic cleavages were determined from a DNA sequence ladder generated by the dideoxy method using Thermo Sequenase DNA polymerase (Tabor and Richardson, 1995). The radiolabeled primer cycle sequencing protocol of the Thermo Sequenase Cycle Sequencing Kit (Amersham Pharmacia Biotech, Arlington Heights, IL) was used.
We thank Marilyn Powers of the Oligonucleotide Synthesis Laboratory, SAIC, for synthesizing the oligonucleotides used in this study. This research was sponsored by the National Cancer Institute, DHHS, under contract with ABL. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.
Copyright © 1999 European Molecular Biology Organization