Genomic ribonucleotides incorporated during DNA replication are commonly repaired by RNase H2‐dependent ribonucleotide excision repair (RER). When RNase H2 is compromised, such as in Aicardi‐Goutières patients, genomic ribonucleotides either persist or are processed by DNA topoisomerase 1 (Top1) by either error‐free or mutagenic repair. Here, we present a biochemical analysis of these pathways. Top1 cleavage at genomic ribonucleotides can produce ribonucleoside‐2′,3′‐cyclic phosphate‐terminated nicks. Remarkably, this nick is rapidly reverted by Top1, thereby providing another opportunity for repair by RER. However, the 2′,3′‐cyclic phosphate‐terminated nick is also processed by Top1 incision, generally 2 nucleotides upstream of the nick, which produces a covalent Top1–DNA complex with a 2‐nucleotide gap. We show that these covalent complexes can be processed by proteolysis, followed by removal of the phospho‐peptide by Tdp1 and the 3′‐phosphate by Tpp1 to mediate error‐free repair. However, when the 2‐nucleotide gap is associated with a dinucleotide repeat sequence, sequence slippage re‐alignment followed by Top1‐mediated religation can occur which results in 2‐nucleotide deletion. The efficiency of deletion formation shows strong sequence‐context dependence.
Genomic ribonucleotides misincorporated during DNA replication can be processed by RNase H2 or topoisomerase I (Top1). In vitro reconstitution shows that Top1‐mediated ribonucleotide removal comprises two consecutive catalytic steps and formation of a covalent Top1–DNA complex, which is repaired in either an error‐free or error‐prone manner.
Top1 reversibly cleaves genomic ribonucleotides with formation of a ribonucleotide 2′,3′‐cyclic phosphate.
Top1 cleaves two nucleotides upstream of the ribonucleotide 2′,3′‐cyclic phosphate to yield a covalent Top1–DNA complex.
The Top1–DNA complex can be repaired by proteolysis and degradation of the modified 3′‐phosphate by Tdp1 and Tpp1.
Religation after slippage realignment at a dinucleotide repeat sequence can result in 2‐nt deletions.
The incorporation of ribonucleotides during DNA replication and its consequence for genome instability have been the focus of several recent studies [reviewed in Williams & Kunkel (2014)]. Ribonucleotide incorporation is more frequent than previously surmised because the cellular rNTP pools are 10‐ to 100‐fold higher than the dNTP pools (Nick McElhinny et al, 2010b). At dNTP and rNTP concentrations similar to those found in the yeast cell, replicative DNA polymerases typically incorporate 1–2 ribonucleotides per kilobase (Nick McElhinny et al, 2010b; Sparks et al, 2012). These incorporated ribonucleotides (rNMPs) constitute the most abundant replication errors in cells. Their removal from the genome is important to maintain genomic stability. RNase H2‐initiated ribonucleotide excision repair (RER) is the main pathway that removes ribonucleotides from the genome (Rydberg & Game, 2002; Kim et al, 2011; Sparks et al, 2012). In humans, mutations in each of the genes for this three‐subunit enzyme are responsible for a genetic syndrome known as Aicardi‐Goutières syndrome (AGS) (Crow et al, 2006). The mutations are associated with a reduction in RNase H2 activity, compromising the efficiency of RER (Chon et al, 2013). A complete defect of RNase H2 in mouse causes embryonic lethality (Reijns et al, 2012). Similar to yeast RNase H2 deletion strains (Nick McElhinny et al, 2010a), mouse RNase H2‐null cell lines also accumulate ribonucleotides in their genomic DNA (Reijns et al, 2012).
In yeast strains lacking a functional RNase H2, the accumulation of genomic ribonucleotides is partially suppressed by the action of DNA topoisomerase 1 (Top1) (Kim et al, 2011; Williams et al, 2013). However, these RNase H2‐deficient yeast strains show a distinct mutational pattern that is characterized by an increase in 2‐ to 5‐nt deletions that occur specifically at repeat sequences, and are dependent on Top1 (Nick McElhinny et al, 2010a; Kim et al, 2011). Top1 is a type IB topoisomerase whose action proceeds through the reversible formation of a DNA nick bounded by a DNA‐3′‐phosphate‐Top1 covalent complex through its active site tyrosine, and a 5′‐hydroxyl group. Failure to reverse this step, through the action of drugs such as camptothecin that inhibit this re‐ligation step, or because of loss of the adjacent 5′‐hydroxyl group, can result in the persistence of the covalent complex (Champoux, 2001). This covalent complex is termed the Top1‐cleavage complex (Top1‐cc).
Based on genetic studies, a model has been proposed in which a persistent Top1‐cc can be resolved using several different pathways. Generally, repair of a Top1‐cc is initiated by partial proteolytic degradation of the DNA‐bound Top1 (Desai et al, 1997), followed by removal of the remaining peptide by Tdp1 (tyrosyl‐DNA phosphodiesterase) (Pouliot et al, 1999). Tdp1 activity is specific for small tyrosyl peptides and does not hydrolyze native Top1 attached to the 3′‐phosphate (Debethune et al, 2002). Subsequently, the resulting 3′‐phosphate can be redundantly removed either by Tpp1 (three prime phosphatase) or by the abasic endonucleases Apn1 or Apn2, which should produce a small gap (Vance & Wilson, 2001). Subsequent gap repair synthesis should proceed in an error‐free manner. Alternatively, the structure‐specific endonuclease Rad1–Rad10 can excise the Top1‐cc as an initiating step in recombinational repair (Vance & Wilson, 2002). A third pathway for resolving Top1‐cc has recently been proposed, and it depends on proteolysis of Top1 by Wss1 protease, followed by DNA polymerase ζ‐dependent translesion synthesis past the lesion (Stingele et al, 2014). To what extent each of these pathways is utilized is currently unclear, but the importance of the Tdp1 pathway is apparent from the observation that, while no single deletion of TDP1, RAD1, or WSS1 confers particular sensitivity to camptothecin, both the tdp1Δ rad1Δ double mutant and the tdp1Δ wss1Δ double mutant are exquisitely sensitive to camptothecin, indicating that the repair of Top1‐cc is severely compromised when two pathways are inactivated (Vance & Wilson, 2002; Stingele et al, 2014).
Human and vaccinia viral Top1 proteins show endoribonuclease activity toward DNA‐embedded ribonucleotides (Sekiguchi & Shuman, 1997; Kim et al, 2011). When Top1 is transiently linked to the 3′‐phosphate of the ribonucleotide moiety, nucleophilic attack by the neighboring 2′‐hydroxyl group can release Top1 with the formation of a 2′,3′‐cyclic phosphate nick (Fig 1A). Currently, it is not known whether the cyclic phosphate intermediate can be resolved by DNA repair enzymes. However, DNAs with long‐lived nicks are substrates for Top1, which cleaves most prominently a few nucleotides upstream of the nick (Christiansen & Westergaard, 1999). When this occurs, the resulting small oligonucleotide can readily dissociate, leaving a gap bounded by the Top1‐cc and a 5′‐hydroxyl group (Fig 1A). The gapped Top1‐cc can be repaired by the Tdp1 or Rad1–Rad10 pathway. Alternatively, if the small gap occurs in a repetitive DNA sequence, a slippage re‐alignment by extrusion of the non‐cleaved DNA strand would place the 5′‐hydroxyl in proximity of the 3′‐phosphate‐Top1 and therefore may permit religation of the DNA with release of Top1 (Cho et al, 2013). If the bulge is not processed by mismatch repair, the following round of replication would lead to a 2‐ to 5‐nt deletion in one of the daughter cells. The Top1‐dependent rNMP‐initiated deletion mutagenesis phenomenon shows strong similarity with transcription‐associated mutagenesis (TAM) in yeast, which also occurs specifically at repetitive sequences in highly transcribed regions, and is also dependent on Top1 (Lippert et al, 2010; Takahashi et al, 2010). TAM is thought to occur by collision of RNA polymerase with a Top1‐cc that must be resolved.
The resolution of Top1‐cc has been an intense focus of study for some time, as several classes of therapeutic cancer drugs target Top1 and stabilize the cleavage complex leading to DNA damage (Pommier et al, 2010). However, the mechanistic details of how the activity of Top1 at genomic ribonucleotides can lead to deletion formation remain to be determined. The model proposed by Cho et al (2013) requires that Top1 acts independently at two steps, first at the site of ribonucleotide incision and, when this incision results in the formation of a 2′,3′‐cyclic phosphate nick, in a second step upstream of the cyclic phosphate nick. Therefore, genetic studies with Top1 deletion mutants are uninformative since these eliminate both steps. However, a biochemical analysis should allow a probing of each step, thereby providing an opportunity to test the proposed model.
In this paper, we have investigated the activities of Top1 at genomic ribonucleotides and its subsequent activity at the intermediates that have been formed as a result of Top1 action at rNMPs. Our studies highlight the catalytic activity of Top1 at various types of RNA–DNA and DNA–DNA nicks. Surprisingly, the 2′,3′‐cyclic phosphate‐terminated nick is the best substrate for Top1, leading both to a reversal process through nucleophilic attack of Top1 on the cyclic phosphate and to a forward reaction that produces Top1‐cc. We have reconstituted the error‐free pathway for the repair of Top1‐cc using purified enzymes. We also describe the sequence requirements for the formation of 2‐nt deletions in a dinucleotide repeat sequence containing a single ribonucleotide, which is the most common type of Top1‐dependent deletion mutations produced in the cell (Lippert et al, 2010; Nick McElhinny et al, 2010a).
Saccharomyces cerevisiae Top1 possesses endoribonuclease activity that produces a 2′,3′‐cyclic phosphate
We began our study using an oligonucleotide with a (GA)3 dinucleotide repeat sequence, also named the (TC)3 hotspot, derived from the yeast CAN1 gene, which was identified as a Top1‐dependent 2‐nt deletion hotspot (Fig 1B) (Lippert et al, 2010; Nick McElhinny et al, 2010a). In order to separate error‐free repair from mutagenic repair involving deletions, we screened several oligonucleotides with the rNMP situated at different positions and selected one substrate that yielded no detectable deletions for our studies of error‐free repair. The same 32‐mer oligonucleotide, with an rUMP at position 16 on the top strand (Fig 1B), was shown by others to produce a uridyl‐2′,3′‐cyclic phosphate intermediate when treated with human Top1 (Kim et al, 2011).
The addition of S. cerevisiae Top1 to the substrate, 5′‐labeled on the rUMP‐containing strand, yielded two products: the first is consistent with Top1 cleaving on the 3′‐side of the rUMP, and the second is a high molecular weight product that barely migrated into the gel (Fig 1B). These products are dependent on the presence of the rNMP in the substrate (Supplementary Fig S1A, compare lane 7 with 13). Additional control experiments with 3′‐labeled substrate, which are detailed in Supplementary Fig S1A, show that the downstream product produced by Top1 action has a 5′‐hydroxyl group, as expected from the known mechanism of Top1 (Champoux, 2001).
Vaccinia virus topoisomerase 1 cleaves DNA at an embedded rNMP with the formation of a 2′,3′‐cyclic phosphate (Sekiguchi & Shuman, 1997). A logical model for this reaction is that during the course of Top1 action, the transient covalent RNA‐3′‐phosphoryl‐Top1 intermediate is attacked by the vicinal 2′‐hydroxyl to release Top1 and produce a 2′,3′‐cyclic phosphate product (Fig 1A). We show that the product produced by yeast Top1 also has a 2′,3′‐cyclic phosphate terminus. First, it is resistant to alkaline phosphatase treatment, which can remove a 2′‐ or 3′‐phosphate, but not a 2′,3′‐cyclic phosphate (Fig 1C, lane 2). Second, it is also resistant to yeast Tpp1 (three‐prime phosphatase), an enzyme involved in the repair of DNA strand breaks (Vance & Wilson, 2001) (lane 3). Third, the cyclic phosphate is resolved by the combined action of yeast Trl1 and alkaline phosphatase (lane 6). Trl1 is a tRNA ligase involved in tRNA splicing (Sawaya et al, 2003). Trl1 is the only yeast enzyme known to process a ribonucleotide‐2′,3′‐cyclic phosphate, but it produces a 2′‐phosphate. The activity of Trl1 on the cyclic phosphate‐terminated DNA substrate could not be directly demonstrated (lane 4), because both substrate and product migrate similarly on the gel. However, the hydrolysis of the cyclic phosphate by Trl1 was revealed when alkaline phosphatase was also included (lane 6). Inclusion of Tpp1 with Trl1 was ineffective (lane 5), because Tpp1 is specific for 3′‐phosphates and does not hydrolyze the 2′‐phosphate produced by Trl1. Finally, T4 polynucleotide kinase, which has both 2′,3′‐cyclic phosphatase and 3′‐phosphatase activity (Das & Shuman, 2012), also removed the cyclic phosphate moiety (lane 7).
Top1 processes the rNMP‐2′,3′‐cyclic phosphate nick into a Top1‐cc
A very slowly migrating product was also observed, its abundance increasing with increasing Top1 concentrations (Fig 1B). When analyzed on an SDS–polyacrylamide gel, this product migrated as a single species (Supplementary Fig S1B). We hypothesized that this product was the Top1‐cc, in which Top1 was covalently linked to the DNA through a 3′‐phospho‐tyrosyl bond. Proteinase K treatment of the Top1‐cc, produced by incubation of the rNMP‐containing DNA substrate with a 2‐fold molar excess of Top1 for increasing times, showed a series of products reflecting covalent complexes of tyrosyl‐oligopeptides with the DNA, with either the peptide or DNA or both varying in size (Fig 1D). These products, but not the ratio between the various products, increased with time. Importantly, the 2′,3′‐cyclic phosphate product was most abundant at early times, 5% after 1 min, but then slowly decreased with time, while the formation of DNA‐peptide products, derived from Top1‐cc, proceeded more slowly and accumulated over time, up to 81% after 60 min (Fig 1D and E). These data are consistent with the model that the formation of the uridyl‐2′,3′‐cyclic phosphate‐terminated nick precedes the formation of the Top1‐cc (Fig 1A).
Repair of Top1‐cc by proteolysis, Tdp1 and Tpp1, and repair synthesis
One pathway for the removal of Top1‐cc is through the Tdp1‐dependent pathway, which hydrolyzes DNA‐3′‐tyrosyl peptides to DNA‐3′‐phosphate (Pouliot et al, 1999). We tested the ability of purified S. cerevisiae Tdp1 to remove the covalently linked Top1 from our DNA substrate. As previously reported for the human enzyme (Debethune et al, 2002), the Top1‐cc containing yeast Top1 is also resistant to Tdp1 treatment. However, the tyrosyl‐peptide‐DNA products isolated after proteinase K digestion of Top1‐cc (Fig 1D and F lane 1) were readily processed by Tdp1 to give predominantly a 14‐mer‐3′‐phosphate (lane 2), which was converted into the 3′‐hydroxyl 14‐mer upon additional treatment with Tpp1 (lane 4). In addition, small amounts of 15‐mer, 13‐mer, and 12‐mer products were also formed. These data indicate that the second incision by Top1 had occurred predominantly at a position two nucleotides upstream of the nick, releasing a dinucleotide‐2′,3′‐cyclic phosphate and a 14‐mer Top1‐cc. Incision at other positions than the −2 position from the nick accounted for 11% of the different size products formed.
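The fragment sizes reported above follow from simple position arithmetic on the 5′‐labeled strand. The sketch below is only an illustration of that bookkeeping, assuming the first cleavage falls immediately 3′ of the rNMP and the second cleavage a given number of nucleotides upstream of the resulting nick; the function name and the returned keys are hypothetical and not part of the study.

```python
def top1_fragment_lengths(rnmp_position, second_cut_offset=2):
    """Lengths (nt) of 5'-labeled fragments after two Top1 cleavages.

    rnmp_position: 1-based position of the rNMP on the labeled strand.
    second_cut_offset: how many nt upstream of the cyclic phosphate nick
    the second Top1 cleavage occurs (2 is the predominant value reported).
    """
    if not 0 < second_cut_offset < rnmp_position:
        raise ValueError("offset must be positive and smaller than the rNMP position")
    return {
        # First cleavage 3' of the rNMP: fragment ends in the 2',3'-cyclic phosphate.
        "cyclic_phosphate_fragment": rnmp_position,
        # Small oligonucleotide released by the second cleavage.
        "released_oligo": second_cut_offset,
        # DNA that remains covalently linked to Top1 (the Top1-cc).
        "top1_cc_fragment": rnmp_position - second_cut_offset,
    }


# rUMP at position 16, as in the substrate described in the text.
for offset in (1, 2, 3, 4):
    print(offset, top1_fragment_lengths(16, offset))
```

With the rUMP at position 16, an offset of 2 corresponds to the predominant 14‐mer, and offsets of 1, 3, and 4 correspond to the minor 15‐mer, 13‐mer, and 12‐mer products.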
To determine whether proteolytic degradation of Top1‐cc and hydrolysis by Tdp1 and Tpp1 are both necessary and sufficient steps to produce a substrate suitable for gap repair, the Top1‐cc was subjected to proteinase K digestion and the 5′‐labeled DNA‐tyrosyl peptides were isolated on a denaturing gel and hybridized back to a 64‐mer template (Fig 2). This family of DNA substrates failed to be extended by DNA polymerase δ (Pol δ), with its processivity factor PCNA present (lane 2). Neither did treatment of the DNA with Tdp1 or Tpp1 alone provide a suitable primer terminus for extension by Pol δ (lanes 4 and 6). Pol δ was only capable of extending the primer terminus after treatment with both Tdp1 and Tpp1 (lane 8). The extended product is resistant to treatment with sodium hydroxide (lane 9), consistent with the model in Fig 1A showing removal of the ribonucleotide by two subsequent Top1‐mediated events. Our data demonstrate that a Top1‐cc produced by Top1 action on misincorporated ribonucleotides during DNA replication, or by other pathways such as treatment of cells with camptothecin, can be efficiently repaired by partial proteolysis, followed by Tdp1‐ and Tpp1‐dependent restoration of a proper 3′‐hydroxyl terminus for repair synthesis by Pol δ.
Deletion formation at a dinucleotide repeat requires base pairing at the ligation junction
Having reconstituted the error‐free pathway of the repair of rNMP‐provoked Top1 activity, we next investigated the determinants that would induce deletion formation. Genetic studies in yeast have indicated that 2‐nt deletions in the context of a dinucleotide repeat sequence are the most frequent events of Top1‐dependent deletions produced in RER‐defective RNase H2 mutants (Kim et al, 2011). Deletions are proposed to occur by two sequential Top1 cleavages of the rNMP‐containing strand to eventually form the Top1‐cc, which is followed by an extrusion of a dinucleotide in the repeat sequence on the complementary strand that re‐aligns the 5′‐hydroxyl of the downstream strand with the Top1‐3′‐phosphate of the upstream strand, thereby allowing for religation to occur (Fig 3A). The substrate we used for our initial studies in Fig 1, with uridine at position 16, did not yield deletions, presumably because a dinucleotide extrusion in the TC repeat on the complementary strand did not give proper base pairing at the junction [Fig 3B, substrate (a)]. However, when the ribonucleotide was moved to position 14, a properly base‐paired junction was generated by TC extrusion on the bottom strand, and a 2‐nt deletion product could be observed albeit inefficiently [substrate (b)]. Reasoning that increased base pairing near the junction might stabilize the extrusion and thereby stimulate deletion formation, we increased the repeat from (GA)3 to (GA)4 and observed a marked increase in deletion formation [substrate (c)]. Changing the rG to rC in the terminal GA repeat eliminated the possibility for proper junction base pairing after extrusion of a TC dinucleotide on the bottom strand [substrate (d)]. However, some deletion product was observed, which we suggest could originate from dinucleotide extrusion to the right of the junction, which also leaves a base‐paired junction. 
The stability of this ligation junction is probably poor because the extrusion is only one nucleotide from the junction. Therefore, we changed the rG to rU, which increases the base pairing to the right of the junction, and this substrate was highly active for deletion formation [substrate (e)]. The proposed slippage re‐alignment for substrate (e) indicates that deletion formation should be independent of the (GA)3 repeat sequence. To test this, we changed the ‐GAGAGArUAT‐ sequence to ‐AGArUAT‐, which eliminates the GA repeats but still retains a strong Top1 recognition site (see below). This substrate (f) is very proficient for deletion formation, indicating that both GA and TA repeats can induce 2‐nt deletion formation as determined in the original mapping studies (Lippert et al, 2010; Nick McElhinny et al, 2010a). To our knowledge, this is the first biochemical demonstration of Top1‐dependent deletion formation at dinucleotide repeats.
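The base‐pairing argument used for substrates (c) to (f) can be made concrete with a small script. This is a simplified sketch, not the analysis used in the study: it assumes a 2‐nt gap consisting of the rNMP and the nucleotide 5′ of it, treats rU as T, and counts how many consecutive bases next to the junction would remain paired if either the upstream end slides downstream ("left" extrusion) or the downstream end slides upstream ("right" extrusion) by 2 nt. The function name is hypothetical, the fragments for (e) and (f) are the short sequences quoted in the text, and the rG and rC variants are obtained by substituting the same position, so run lengths are truncated at the fragment ends.

```python
def realignment_runs(top_strand, rnmp_position, gap=2):
    """Return (left_run, right_run) of bases that stay paired after slippage.

    top_strand: 5'->3' sequence of the rNMP-containing strand (U read as T).
    rnmp_position: 1-based position of the rNMP; the gap spans the rNMP and
    the (gap - 1) nucleotides 5' of it.
    """
    seq = top_strand.upper().replace("U", "T")
    i = rnmp_position - 1  # 0-based index of the rNMP
    if i - gap + 1 < 0 or i >= len(seq):
        raise ValueError("gap does not fit inside the sequence")

    # Upstream 3' end slides downstream by `gap`: base k from the junction
    # must match the template position `gap` nucleotides further 3'.
    left = 0
    while i - gap - left >= 0 and seq[i - gap - left] == seq[i - left]:
        left += 1

    # Downstream 5' end slides upstream by `gap`: base k from the junction
    # must match the template position `gap` nucleotides further 5'.
    right = 0
    while i + 1 + right < len(seq) and seq[i + 1 + right] == seq[i + 1 + right - gap]:
        right += 1

    return left, right


for name, fragment, pos in [
    ("rG variant", "GAGAGAGAT", 7),
    ("rC variant", "GAGAGACAT", 7),
    ("substrate (e)", "GAGAGAUAT", 7),
    ("substrate (f)", "AGAUAT", 4),
]:
    print(name, realignment_runs(fragment, pos))
```

In this toy model, the rG variant allows a long paired run to the left of the junction, the rC variant leaves only a single paired base to the right, and the rU substrates extend the right‐hand run to two bases, which mirrors the qualitative ranking described above. It ignores base‐pair strength and the flanking sequence outside the quoted fragments.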
We carried out a kinetic analysis of the rates of accumulation of Top1‐dependent products, comparing substrate (d) with (e). Within a factor of 2, the consumption of the 34‐mer substrate proceeded at comparable rates (Fig 3C). Likewise, the rate of appearance of the cyclic phosphate product and the initial rate of the formation of Top1‐cc also proceeded with comparable rates. However, while with substrate (d) the Top1‐cc kept on accumulating with time, the Top1‐cc of substrate (e) was converted into the 32‐mer deletion product. An estimate of the rates shows that deletion formation proceeded more than 100‐fold slower for (d) than for (e). Several control experiments described in detail in Supplementary Fig S2 were carried out to ascertain that the deletion product has lost the dinucleotide, as predicted by the model. First, while the substrate is susceptible to cleavage by NaOH due to the presence of the rNMP, the Top1‐dependent religation product is resistant to NaOH (Supplementary Fig S2A). Secondly, we sequenced the substrate and the product of the reaction and showed that the dArG dinucleotide had been deleted as predicted (Supplementary Fig S2B).
The nature of the nick structure determines processing by Top1
The kinetic analysis in Fig 1E shows that the cyclic phosphate intermediate is formed very rapidly, whereas Top1‐cc formation was relatively slow. Yet, the cyclic phosphate intermediate accumulated to only a few percent. This suggested to us that the formation of the cyclic phosphate might be a reversible reaction. This proposed interconversion of substrate and product is shown schematically in Fig 4D. In order to test this hypothesis, we generated the series of substrates shown in Fig 4A (boxed substrate), with the same sequence and ribonucleotide position as substrate (e) in Fig 3B, but with all possible nick configurations with regard to phosphate status. The oligonucleotide with the uridine‐2′,3′‐cyclic phosphate terminus (U>p) was generated by partial RNase A digestion of a longer oligonucleotide and was contaminated with ~25–30% of the uridine‐3′‐phosphate form (Up, Supplementary Fig S3A). Remarkably, the substrate containing the U>p nick was very rapidly converted by Top1 into a ligated 34‐mer product (Fig 4B, lanes 5–10). Conversely, formation of Top1‐cc by Top1 cutting 2 nt upstream of the U>p nick occurred at a lower rate, and so did subsequent formation of the 32‐mer 2‐nt deletion product (Fig 4C, inset). The 34‐mer but not the 32‐mer contained the ribonucleotide, as it was sensitive to alkali treatment (lane 15). The substrate remaining after 10 min is predominantly the contaminating uridine‐3′‐phosphate nicked form (Supplementary Fig S3C, lanes 24 and 25). The uridine‐3′‐phosphate nick did not form a ligated 34‐mer product, but rather, it slowly formed the 32‐mer deletion product (lanes 11–14), which was resistant to alkali (lane 16). The phosphate‐less nick was also converted to the 32‐mer deletion product (lanes 1–3). However, the normal 5′‐phosphate nick was converted very slowly into a Top1‐cc, and this Top1‐cc failed to convert into the 32‐mer deletion product, because it lacked the 5′‐hydroxyl group that is essential for religation (lane 4).
We also tested three all‐DNA substrates and determined that each of their reactivities with Top1 was comparable to that of the analogous ribo‐containing substrate (Supplementary Fig S3C). The rates of conversion of the seven nick substrates are plotted in Fig 4C, along with two additional control substrates that lacked the downstream oligonucleotide (Supplementary Fig S3C, U>p + − and Up + −). Their conversion to the Top1‐cc product occurred much more slowly than that of the analogous nicked substrates. Two substrates stand out in this comparative study. First, a regular DNA nick (T + pA), which is formed as an intermediate during virtually all DNA transactions, is basically inactive for modification by Top1. Second, the U>p + A nick is converted an order of magnitude faster than other nicked substrates, indicating that the U>p + A nick, formed by Top1 activity on genomic ribonucleotides, is an unstable high‐energy intermediate that is rapidly processed.
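The reasoning that a rapidly formed intermediate can stay at a few percent if its formation is reversible can be illustrated with a minimal kinetic sketch of the scheme substrate ⇌ cyclic phosphate nick → Top1‐cc. The first‐order rate constants below are hypothetical values chosen only so that reversal outpaces the second cleavage; they are not fitted to the data, and the function name is an assumption for illustration.

```python
def simulate_top1_scheme(k_cleave=0.1, k_revert=1.5, k_cc=0.5,
                         t_end=60.0, dt=0.01):
    """Euler integration of S <-> I -> C with first-order steps (per min).

    S: rNMP-containing substrate, I: 2',3'-cyclic phosphate nick,
    C: Top1-cc. All rate constants are hypothetical illustration values.
    Returns a list of (time, S, I, C) tuples.
    """
    s, i, c = 1.0, 0.0, 0.0
    trajectory = [(0.0, s, i, c)]
    steps = int(round(t_end / dt))
    for n in range(1, steps + 1):
        forward = k_cleave * s * dt   # first Top1 cleavage at the rNMP
        revert = k_revert * i * dt    # Top1-mediated religation of the nick
        to_cc = k_cc * i * dt         # second cleavage upstream of the nick
        s += revert - forward
        i += forward - revert - to_cc
        c += to_cc
        trajectory.append((n * dt, s, i, c))
    return trajectory


trajectory = simulate_top1_scheme()
peak_intermediate = max(row[2] for row in trajectory)
final_cc = trajectory[-1][3]
print(f"peak cyclic phosphate fraction: {peak_intermediate:.3f}")
print(f"Top1-cc fraction at 60 min: {final_cc:.3f}")
```

With these assumed constants the intermediate never exceeds a few percent of total DNA while the Top1‐cc keeps accumulating, which is the qualitative behavior that motivated the reversibility hypothesis. Setting `k_revert` to zero in the same sketch lets the intermediate build up to a much higher level before it is consumed.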
Influence of ribonucleotide position in Top1‐cc and deletion formation
In order to understand the rNMP positional determinants for the formation of Top1‐dependent products, we tested the DNA substrate (c) from Fig 3B, but with the single rNMP placed at eight different positions on the top strand within and near the (GA)4 repeat, as well as on the complementary positions on the bottom strand (Fig 5, Supplementary Fig S4). We designate the position prior to the (GA)4 repeat as the −1 position, and the first nucleotide of the repeat as the 1 position, etc. The opposite positions on the (TC)4 strand are designated as −1′, 1′, etc. Figure 5A shows the distribution of products after 90‐min incubation of the oligonucleotide substrates with a 2‐fold molar excess of Top1. The analysis of the 16 substrates shows a large variation in the formation of Top1‐cc and of 2‐nt deletions (Fig 5A, Supplementary Fig S4).
Topoisomerase 1 shows strong sequence preference for cleavage. The consensus sequence for mammalian Top1 is 5′‐(A/T)(G/C)(A/T)T with the covalent linkage occurring at the 3′‐T residue (Champoux, 2001). We determined the DNA cleavage site preference for yeast Top1 in the presence of camptothecin, which freezes Top1 in its cleavage complex, on the top strand of the main oligonucleotides used in this study (Fig 5B, Supplementary Fig S4C). Analogous to mammalian Top1, yeast Top1 also showed a strong cleavage preference for the T residue on position 9 of the sequence in Fig 5B, which is a perfect consensus site (AGAT). Therefore, it is not surprising that placing a rU at that position resulted in the rapid formation of the Top1‐cc (Fig 5A). Of equal interest, however, are the positions on the DNA that did not show cleavage by Top1, for example, A‐4 and A‐6. Since the Top1 recognition sequence is about four nucleotides long, both the A‐4 and A‐6 position would present the same sequence to Top1, that is, GAGA. When an rA was placed in each of these positions, the rNMP‐containing DNA was resistant to Top1 treatment. Therefore, if a ribonucleotide were inserted at that position during replication, and if RER were defective, as in an RNase H2‐deletion strain, this ribonucleotide would be expected to remain in the genome.
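A quick way to see why position 9 stands out, and why the GAGA windows do not, is to scan a sequence for the mammalian consensus 5′‐(A/T)(G/C)(A/T)T quoted above. The sketch below is an illustration only: the function name is hypothetical, the test fragment is a (GA)4 repeat followed by T, chosen to be consistent with the numbering described in the text, and a consensus match is not a measurement of actual cleavage by yeast Top1.

```python
import re

# Lookahead so that overlapping 4-nt windows are all examined.
TOP1_CONSENSUS = re.compile(r"(?=[AT][GC][AT]T)")


def consensus_linkage_sites(sequence):
    """Return 1-based positions of the 3'-T in each (A/T)(G/C)(A/T)T match.

    Ribonucleotide U is read as T for the purpose of the scan.
    """
    seq = sequence.upper().replace("U", "T")
    # m.start() is the 0-based start of the 4-nt window; the 3'-T is its
    # fourth base, hence +4 for the 1-based position.
    return [m.start() + 4 for m in TOP1_CONSENSUS.finditer(seq)]


print(consensus_linkage_sites("GAGAGAGAT"))  # (GA)4 repeat followed by T
print(consensus_linkage_sites("GAGA"))       # window presented at A-4 and A-6
```

In this fragment the only consensus window is AGAT ending at position 9, whereas GAGA yields no match, in line with the cleavage preference and the Top1‐resistant positions described above.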
Consistent with the known sequence preference of Top1, the GA‐containing top strand showed higher activity than the CT‐containing bottom strand. This is not clearly apparent in Fig 5, but the kinetic analysis in Supplementary Fig S4A and B provides strong support for this conclusion. After 10 min of reaction with Top1, the top‐strand positions reacted (to cyclic phosphate, Top1‐cc, and 2‐nt deletions) at an average of 24% (0–59%), whereas the bottom‐strand positions reacted at an average of 7% (0–27%).
Eight out of the sixteen rNMP‐containing positions examined were resistant to Top1; more than 90% of substrate remained after 90 min of reaction (Fig 5A). The other eight substrates showed significant product formation. Only one substrate (with the rNMP at position 3) accumulated as much as 21% of the cyclic phosphate intermediate over time. For the other substrates, a steady‐state level of up to 5% cyclic phosphate was maintained, while other products (Top1‐cc and 2‐nt deletions) accumulated over time (Supplementary Fig S4A and B). Only four of the substrates that accumulated substantial Top1‐cc over time, yielded significant (>5%) 2‐nt deletions. One possible reason why deletion formation would not occur could be because Top1 cutting upstream of the cyclic phosphate nick produced a different size gap than the 2‐nt gap we observed so far. Therefore, we investigated at which position upstream of the cyclic phosphate nick the second Top1 cleavage occurred, as described in Fig 1F (Supplementary Fig S4D). For the seven substrates investigated, the predominant second cleavage site was 2 nt upstream of the cyclic phosphate nick, and second cleavages at 3, 4, and 5 nt upstream were very infrequent. No second cleavage at 1 nt upstream of the first cleavage was observed.
The action of Top1 on genomic ribonucleotides is unique because of the potential for release of Top1 through the formation of a 2′,3′‐cyclic phosphate intermediate. With one single exception (when the ribonucleotide was at position 3), these cyclic phosphate intermediates did not accumulate to a substantial percentage (Fig 5A). Rather, Top1 has evolved the ability to recognize the 2′,3′‐cyclic phosphate and mediate fast religation to the starting ribonucleotide‐containing DNA (Fig 4A and D). The initial kinetics of the reaction between the 2′,3′‐cyclic phosphate‐containing nick and Top1 indicate that reversal is favored over additional cleavage by Top1 (Fig 4B and C inset). The reversal reaction is particularly important in situations where RER is compromised because of defects in RNase H2. A subset of Aicardi‐Goutières syndrome patients show a reduction in RNase H2 activity, thereby compromising the efficiency of RER (Chon et al, 2013). The Top1‐catalyzed conversion of the 2′,3′‐cyclic phosphate nick into rNMP‐containing DNA allows for another opportunity for processing of genomic ribonucleotides by error‐free RER.
In addition to reversal, the 2′,3′‐cyclic phosphate nick can also be processed by a second Top1 cleavage, which occurs predominantly 2 nt upstream of the nick (Supplementary Fig S4D). This reaction is irreversible because the small oligonucleotide generated by cleavage readily dissociates, as indicated in the model in Fig 6. DNA nicks have previously been shown to promote formation of covalent Top1 complexes on the 5′‐strand (Henningfeld & Hecht, 1995; Lebedeva et al, 2008). Our kinetic analysis of different nick architectures suggests that natural DNA nicks are actually very poor substrates for Top1 cleavage, while the 2′,3′‐cyclic phosphate nick is 100–1,000 times more reactive (Fig 4B and C). Phosphate‐less nicks or 3′‐phosphate nicks show intermediate reactivity. Therefore, Top1 is relatively inactive to normal intermediates of DNA metabolism, but has evolved to specifically process the cyclic phosphate intermediates that it generates at genomic ribonucleotides. At a low rate, other unusual nick architectures are also processed by Top1.
The very small gaps that are formed as a result of tandem Top1 action at genomic ribonucleotides are most often the subject of error‐free DNA repair. This repair may be initiated by ubiquitin‐mediated proteolysis of the covalently bound Top1 (Desai et al, 1997); however, processing by a specialized protease Wss1 has also been reported (Stingele et al, 2014). Hydrolysis of the DNA‐tyrosyl peptide by Tdp1 (Pouliot et al, 1999; Debethune et al, 2002), and of the 3′‐phosphate by Tpp1 or other cellular 3′‐phosphatases such as Apn1 or Apn2 (Vance & Wilson, 2001), produces a small phosphate‐less gap that should be an apt substrate for gap repair. Here, we have reconstituted the error‐free repair pathway with purified proteins, starting with the tyrosyl peptide intermediate that was generated by proteinase K rather than by ubiquitin‐mediated proteolysis or Wss1 (Fig 2).
Two alternative modes of processing of the small 2‐ to 5‐nt gaps with covalently attached Top1 can be envisioned. First, when sequence slippage re‐alignment in a repeat sequence is favorable, the 5′‐hydroxyl of the downstream DNA strand can be brought in proximity to the Top1‐3′‐phosphate of the upstream strand allowing DNA ligation with release of Top1 (Fig 6). In this paper, we focused on the most common gap of 2 nt that can produce a 2‐nt deletion. This is an extremely rare event in the cell, but it is very sensitive to genetic detection because it can produce frame‐shift mutations (Kim et al, 2011; Cho et al, 2013). Second, these gaps can be further enlarged by 5′‐nucleases such as Exo1. Gap enlargement would protect the Top1‐cc‐containing DNA from the slippage re‐alignment and ligation pathway that yields deletions. In fact, a recent study showed that yeast Exo1 and the helicase Srs2 may collaborate in this gap‐enlarging process, as mutations in either gene enhance the rate of Top1‐dependent frame‐shift mutagenesis (Potenski et al, 2014). In that study, the proposal was made that Exo1 and Srs2 may act at the 2′,3′‐cyclic phosphate‐terminated nicks that are generated after the first Top1 cleavage. However, since these nicks are very transient and are rapidly processed by a second Top1 cleavage, we propose that the Top1‐cc with the small gap is a more appropriate substrate for Exo1 and Srs2. In principle, these alternatives can be distinguished by using separation of function mutations in TOP1. We think that generating such separation of function mutations should be feasible because of a significant difference in DNA structures used by Top1 for each of the two steps. In the first step, Top1 recognizes upstream dsDNA and cuts 3′ of its binding site, while in the second step, Top1 recognizes a specialized nick and cuts a few nucleotides upstream of that nick.
If such mutants were available, one would expect that a TOP1 mutant, which cuts the first but not the second time, would accumulate 2′,3′‐cyclic phosphate products and therefore would be predicted to have a greater reliance on Exo1/Srs2 if these enzymes act on the cyclic phosphate nick, as proposed (Potenski et al, 2014).
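The expectation for a first‐cut‐only TOP1 mutant can be illustrated with a minimal kinetic sketch of the two‐step scheme: intact substrate (S) is reversibly converted to the 2′,3′‐cyclic phosphate nick (N), which is converted irreversibly to the Top1‐cc (C). The rate constants below are hypothetical and in arbitrary units; they are not measured values from this study, and the function name `simulate` is introduced only for illustration.

```python
def simulate(k_cleave1, k_revert, k_cleave2, t_end=100.0, dt=0.01):
    """Forward-Euler integration of S <-> N -> C.

    S = intact rNMP-containing DNA, N = 2',3'-cyclic phosphate nick,
    C = covalent Top1-DNA complex (Top1-cc).
    All rate constants are hypothetical (arbitrary units).
    """
    s, n, c = 1.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        ds = -k_cleave1 * s + k_revert * n
        dn = k_cleave1 * s - (k_revert + k_cleave2) * n
        dc = k_cleave2 * n
        s += ds * dt
        n += dn * dt
        c += dc * dt
    return s, n, c


# Hypothetical wild type: fast reversal of the nick, plus a second cleavage.
wild_type = simulate(k_cleave1=0.1, k_revert=1.0, k_cleave2=0.5)

# Hypothetical separation-of-function mutant: first cut only (k_cleave2 = 0).
first_cut_only = simulate(k_cleave1=0.1, k_revert=1.0, k_cleave2=0.0)

print("wild type       S, N, C:", wild_type)
print("first cut only  S, N, C:", first_cut_only)
```

Under these assumed constants, the first‐cut‐only mutant forms no Top1‐cc and holds a larger standing fraction of the cyclic phosphate nick than the wild type, which is the qualitative behavior the argument relies on.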
Repeat sequences are generally favorable sequences for slippage re‐alignment, which is a prerequisite for Top1‐mediated deletion formation. Our analysis of the influence of different sequence environments on deletion formation has given us insight into the rules that govern this process. Seven out of sixteen rNMP‐containing sequence environments examined produced a substantial fraction of Top1‐cc product with predominantly a 2‐nt gap (Fig 5A and Supplementary Fig S4D). For four of those, the Top1‐cc reacted further by slippage re‐alignment and ligation to give a 2‐nt deletion product. These four DNAs are characterized by their ability to realign either the upstream or downstream top strand to form a nick in which perfect base pairing occurs, similar to that shown in Fig 3B. In addition, one of the two base pairs at the nick junction is a strong G‐C base pair. The substrate with the rNMP at position 9 did not produce a deletion because perfect base pairing at the nick cannot be accomplished. Interestingly, the substrate with the rNMP at position 7′ also did not yield a deletion, even though perfect base pairing was possible. However, in that case, the nick was anchored by two A‐T base pairs, which may not have been able to generate a slippage re‐alignment product with enough stability for re‐ligation. Together with the data in Fig 3, we can conclude that deletion formation requires a re‐aligned sequence with perfect base pairing at the nick, which is anchored with at least one G‐C base pair.
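The two requirements stated above (perfect base pairing after re‐alignment, and at least one G‐C base pair at the nick junction) can be written as a simple sequence check. The sketch below is a simplified model, not the analysis used in this study: it assumes that perfect pairing is possible only when the 2 nt flanking the gap on either side are identical to the gap itself, and it uses a hypothetical 0‐based `gap_start` index that does not correspond to the position numbering in the figures. The 10‐mer in the last example is invented for illustration.

```python
CAN1_32MER = "ACTCGTCACGAGAGATGCCACGGTATTTCAAA"  # top strand, from Materials and Methods


def deletion_predicted(top, gap_start, gap_len=2):
    """Return True if the simplified rule predicts a deletion.

    top       : top-strand sequence (5' -> 3')
    gap_start : 0-based index of the first gapped nucleotide (hypothetical numbering)
    """
    if gap_start < 1 or gap_start + gap_len >= len(top):
        raise ValueError("gap must be flanked by at least one nucleotide on each side")
    gap = top[gap_start:gap_start + gap_len]
    upstream = top[max(0, gap_start - gap_len):gap_start]
    downstream = top[gap_start + gap_len:gap_start + 2 * gap_len]
    # Assumption: slippage gives perfect pairing only if a flanking unit repeats the gap.
    perfect_pairing = gap in (upstream, downstream)
    # The two nucleotides that abut at the re-aligned nick.
    junction = top[gap_start - 1] + top[gap_start + gap_len]
    gc_anchor = any(base in "GC" for base in junction)
    return perfect_pairing and gc_anchor


def deletion_product(top, gap_start, gap_len=2):
    """Top strand after loss of the gapped nucleotides."""
    return top[:gap_start] + top[gap_start + gap_len:]


print(deletion_predicted(CAN1_32MER, 11))   # gap inside the GAGAGA repeat
print(deletion_predicted(CAN1_32MER, 20))   # gap outside any dinucleotide repeat
print(deletion_predicted("CCATATATCC", 4))  # invented AT repeat, A-T-only junction
```

The rule treats an A‐T‐anchored junction as non‐productive even when pairing is perfect, mirroring the interpretation offered above for the substrate that paired perfectly but gave no deletion.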
Our studies provide evidence for the proposal that repetitive DNA sequences may allow for slippage re‐alignment of the DNA ends to achieve perfect base pairing with the bulging out of several nucleotides upstream or downstream of the site of religation (Cho et al, 2013). We have also reconstituted the predominant repair pathway of Top1‐initiated processing of genomic ribonucleotides, in which the Top1‐cc is processed by the Tdp1‐dependent pathway (Fig 2). Based on genetic data (Kim et al, 2011), we can make a rough estimate of the frequency of these three pathways in a wild‐type cell. RER repairs the vast majority (~0.95–0.99) of ribonucleotides, while error‐free repair via Top1‐cc occurs at a frequency of 0.01–0.05 and repeat‐induced deletion formation at a frequency of ~10⁻⁸–10⁻⁹. In rnh201Δ cells, processing by Top1 is dramatically increased, and so are the rates of deletions. These numbers are based on studies of the CAN1 locus (Kim et al, 2011) and may not accurately reflect rates in the entire genome. The increased burden of Top1‐mediated processing of ribonucleotides in an RER‐defective strain also results in genomic stress, in particular if the frequency of rNMP incorporation is increased, as in a pol2‐M644G mutant of Pol ε. The rnh201Δ pol2‐M644G double mutant shows constitutive checkpoint activation and an increased sensitivity to replication stress that is dependent on Top1 (Williams et al, 2013).
While this manuscript was under review, several papers appeared describing genome‐wide mapping data of genomic ribonucleotides in yeast RNase H2 mutants (Clausen et al, 2015; Koh et al, 2015; Reijns et al, 2015). These data should provide a basis for determining which genomic ribonucleotides are subject to attack by Top1 and which may result in the formation of 2‐ to 5‐nt deletions.
Materials and Methods
Proteins and oligonucleotides
RPA (Henricksen et al, 1994), PCNA (Eissenberg et al, 1997), RFC (Gomes et al, 2000), and FEN1 (Gomes & Burgers, 2000) were purified from E. coli overexpression systems, while Pol δ (Fortune et al, 2006) was purified from a yeast overexpression system. Tdp1, Tpp1, and Trl1 proteins were overexpressed from the Open Biosystems S. cerevisiae open reading frame library in the vector backbone BG1805 and purified as described (Gelperin et al, 2005).
Oligonucleotides were purchased from IDT (Coralville, IA) and purified by HPLC and/or PAGE. They are based on the 32‐mer yeast CAN1 sequence ACTCGTCACGAGAGATGCCACGGTATTTCAAA in which 2‐nt deletions have been detected (Kim et al, 2011). Sequence alterations and ribonucleotide modifications are as indicated. Oligonucleotides were either 5′‐ or 3′‐labeled by treatment with [γ‐32P]‐ATP and T4 polynucleotide kinase or [α‐32P]‐dATP and terminal transferase, respectively, or purchased with either a 5′‐ or 3′‐Cy3 fluorophore. Most rNMP‐containing oligonucleotides are contaminated with the 2′,3′‐cyclic phosphate product, up to ~0.5%. This contamination increased when they were 5′‐32P labeled with polynucleotide kinase. For this reason, we preferred using Cy3‐labeled oligos, which have lower background levels. Quantification of cyclic phosphate products included subtraction of the background present at t = 0. Each labeled rNMP‐containing oligonucleotide was hybridized to a 2‐fold excess of unlabeled complementary DNA oligonucleotide.
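The background subtraction at t = 0 can be sketched as follows. The exact formula used for quantification is not given here, so this is one plausible reading: the product is expressed as a fraction of the total lane signal, and the fraction already present at t = 0 is subtracted from every time point. The function name and the band intensities are hypothetical.

```python
def background_corrected_fractions(lanes):
    """Return {time: product fraction minus the fraction present at t = 0}.

    lanes: list of (time, product_intensity, total_lane_intensity) tuples;
           one entry must have time == 0.
    """
    fractions = {t: product / total for t, product, total in lanes}
    if 0 not in fractions:
        raise ValueError("a t = 0 lane is required for background subtraction")
    background = fractions[0]
    return {t: f - background for t, f in fractions.items()}


# Hypothetical band intensities (arbitrary units), for illustration only.
example_lanes = [
    (0, 5.0, 1000.0),    # 0.5% contaminating cyclic phosphate product at t = 0
    (5, 105.0, 1000.0),
    (20, 255.0, 1000.0),
]

for time, fraction in sorted(background_corrected_fractions(example_lanes).items()):
    print(f"t = {time:>2} min: corrected fraction = {fraction:.3f}")
```

Values are not clamped at zero, so noise in a lane can give a slightly negative corrected fraction; whether to clamp is a choice left to the analyst.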
Top1 overexpression and purification
Top1 was overexpressed using plasmid pRS425‐GAL‐GST‐TOP1 containing the Schistosoma japonicum glutathione S‐transferase (GST) gene fused to the N‐terminus of the TOP1 gene in vector pRS425‐GALGST‐term (Walker et al, 1993; Bylund et al, 2006). The GST tag is separated from the N‐terminus of Top1 by a recognition sequence for the human rhinoviral 3C protease (LEVLFQ/GP). Following cleavage by the protease, the N‐terminus of Top1 is extended with the GPEFDIKL sequence. Top1 overproduction was carried out in S. cerevisiae strain FM113 (MATa ura3‐52 trp1‐289 leu2‐3,112 prb1‐1122 prc1‐407 pep4‐3), transformed with plasmid pRS425‐GAL‐GST‐TOP1. Growth, galactose induction, extract preparation, and ammonium sulfate precipitation (0.3 g/ml) were similar to the procedures described previously (Bylund et al, 2006). The ammonium sulfate precipitate was resuspended in buffer A0 (buffer A: 60 mM HEPES‐NaOH [pH 7.4], 10% glycerol, 1 mM dithiothreitol, 1 mM EDTA, 0.01% polyoxyethylene (10) lauryl ether, 1 mM sodium bisulfite, 1 μM pepstatin A, 1 μM leupeptin; subscript indicates the mM sodium chloride concentration), until the lysate conductivity was equal to that of buffer A400. The lysate was then used for batch binding to glutathione‐Sepharose 4B beads (GE Healthcare), equilibrated with buffer A400, and gently rotated at 4°C for 2 h. The beads were collected at 1,000 rpm in a swinging‐bucket rotor, followed by batch washes (3 × 20 ml of buffer A400). The beads were transferred to a 10‐ml column and washed at 2.5 ml/min with 100 ml of buffer A400. The second wash used 50 ml of buffer A400 containing 5 mM Mg‐acetate and 1 mM ATP, and the third wash used 50 ml of buffer A400 and 30 ml of buffer A200. Elution was carried out at a flow rate of 0.2 ml/min with buffer A150 containing 30 mM glutathione (pH adjusted to 8.1). Fractions containing Top1 were combined and incubated overnight at 4°C with 30 U of rhinoviral 3C protease.
The following day, the Top1 protein was loaded on a heparin column in buffer A150 without protease inhibitors. The column was washed with 10 column volumes of buffer A300, and the protein was eluted with buffer A750. Fractions containing pure Top1 were collected and dialyzed overnight to A200 without protease inhibitors.
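Because the buffer A series differs only in sodium chloride concentration, any member of the series can in principle be prepared by mixing a salt‐free and a high‐salt version of buffer A. The sketch below shows that arithmetic; the 1,000 mM high‐salt stock is an assumption made for illustration and is not a buffer described in this protocol, and the function name is hypothetical.

```python
def buffer_a_mix(target_mM, total_ml, low_mM=0.0, high_mM=1000.0):
    """Volumes (ml) of low- and high-salt buffer A giving `target_mM` NaCl.

    Solves  low_mM * v_low + high_mM * v_high = target_mM * total_ml
    with    v_low + v_high = total_ml.
    `high_mM` = 1000 is an assumed stock, not one named in the protocol.
    """
    if not low_mM <= target_mM <= high_mM:
        raise ValueError("target must lie between the two stock concentrations")
    v_high = total_ml * (target_mM - low_mM) / (high_mM - low_mM)
    return total_ml - v_high, v_high


# NaCl concentrations (mM) of the buffers used in the purification above.
for nacl in (150, 200, 300, 400, 750):
    v_low, v_high = buffer_a_mix(nacl, total_ml=100.0)
    print(f"A{nacl}: {v_low:.1f} ml A0 + {v_high:.1f} ml high-salt stock")
```

This calculation applies to defined buffers only; the resuspended ammonium sulfate precipitate was adjusted by conductivity rather than by a calculated salt concentration.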
The standard 10 μl assay mixture contains 20 mM Tris–HCl (pH 7.8), 100 μg/ml bovine serum albumin, 1 mM DTT, 5 mM Mg‐acetate, 50 mM NaCl, 10 nM of 32P‐end‐labeled oligonucleotide substrate or 25 nM of Cy3‐end‐labeled oligonucleotide, and enzyme. Incubations were carried out at 30°C for the indicated time periods. Deviations from the standard assay conditions are indicated in the legends of the figures. Reactions were stopped with stop buffer containing 10 mM final concentration of EDTA, 0.05% SDS, and 40% formamide. Samples were analyzed on a 7 M urea, 17% polyacrylamide gel. Cy3 gels were directly imaged on a Typhoon fluorescence imager, while 32P gels were dried and subjected to phosphorimaging. ImageJ was used for quantification. Images were contrast‐enhanced for better visualization.
Repair assays were generally carried out in two stages. In the first stage, the rNMP‐containing substrate was treated with a twofold excess concentration of Top1 to produce a mixture of full‐length substrate, rNMP 2′,3′‐cyclic phosphate nicked substrate, and Top1‐covalently linked substrate. The reactions were stopped with stop buffer as described above. To purify the 2′,3′‐cyclic phosphate‐containing substrate, the DNA was ethanol‐precipitated and used for subsequent reactions. The Top1‐covalently linked substrate was further processed by treatment with 0.2 mg/ml proteinase K at 42°C for 30 min, followed by phenol–chloroform extraction and ethanol precipitation. These products were rehybridized to complementary DNA for further studies or treated consecutively with Tdp1 and Tpp1, as described in the figure legends, in order to generate the free 3′‐hydroxyl DNA.
All experiments were carried out three independent times or more, with the exception of the experiment in Fig 4, which was carried out twice, but showed high reproducibility. The errors of quantification were small for most products (< 5%). Errors were higher in the quantification of Top1‐cc and of complexes isolated by phenol extraction and ethanol precipitation (10–20%). The graphs in Figs 1 and 3 are representative experiments with errors of quantification shown. The graphs in Figs 5 and 6 show averages of two and three experiments, respectively, and standard deviations are shown.
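Averaging replicate experiments and reporting a standard deviation, as done for the graphs in Figs 5 and 6, can be sketched with the Python standard library. The replicate values below are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_replicates(values):
    """Return (mean, sample standard deviation) for replicate measurements.

    Requires at least two replicates; uses the n - 1 (sample) estimator,
    which is an assumption about the analysis, not a stated method.
    """
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return mean(values), stdev(values)


# Hypothetical product fractions from three independent experiments.
replicates = [0.10, 0.12, 0.14]
avg, sd = summarize_replicates(replicates)
print(f"mean = {avg:.3f}, SD = {sd:.3f}, relative error = {sd / avg:.1%}")
```

With only two replicates, as for one of the figures, the standard deviation is a weak estimate of variability and is best read as the spread between the two measurements.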
JLS and PB designed and carried out the experiments and wrote the manuscript.
Conflict of interest
The authors declare that they have no conflict of interest.
The authors thank Carrie Stith for help with protein purification. This work was supported by grant GM032431 from the National Institutes of Health.
© 2015 The Authors