Twin‐ribozyme introns contain a branching ribozyme (GIR1) followed by a homing endonuclease (HE) encoding sequence embedded in a peripheral domain of a group I splicing ribozyme (GIR2). GIR1 catalyses the formation of a lariat with 3 nt in the loop, which caps the HE mRNA. GIR1 is structurally related to group I ribozymes, raising the question of how two closely related ribozymes can carry out very different reactions. Modelling of GIR1 based on new biochemical and mutational data shows an extended substrate domain containing a GoU pair distinct from the nucleophilic residue that docks onto a catalytic core showing a different topology from that of group I ribozymes. The differences include a core J8/7 region that has been reduced and is complemented by residues from the pre‐lariat fold. These findings provide the basis for an evolutionary mechanism that accounts for the change from group I splicing ribozyme to the branching GIR1 architecture. Such an evolutionary mechanism can be applied to other large RNAs such as ribonuclease P.
The list of naturally occurring ribozymes comprises a few that are fundamental for cellular life (the ribosome, RNase P, and possibly the spliceosome), two types of splicing ribozymes that are abundant in organellar and microbial genomes (within group I and group II introns), and a number of cleavage ribozymes with a sporadic occurrence in viroids, plant satellite RNAs, bacteria and, more recently, within the human genome (hammerhead, hairpin, VS, HDV, glmS, and the CPEB3 ribozymes). Apart from the ribosome, all naturally occurring ribozymes catalyse phosphoryl transfer reactions (Ditzler et al, 2007; Scott, 2007; Serganov and Patel, 2007). A recent addition to the list is the GIR1 branching ribozyme. This ribozyme (Figure 1) catalyses cleavage of the RNA chain by transesterification resulting in the formation of a 2′,5′ phosphodiester bond between the first and the third nucleotide of the 3′‐cleavage product. The downstream cleavage product is an mRNA encoding a homing endonuclease (HE) that is thereby capped with a lariat containing 3 nt in the loop (Nielsen et al, 2005). Both GIR1 and the downstream HE mRNA are inserted into a peripheral domain of a regular splicing ribozyme (GIR2) making up the characteristic configuration of a twin‐ribozyme intron. Such introns have so far only been found in the SSU rDNA genes of a unique isolate of Didymium iridis and in several Naegleria strains where they have been vertically inherited from a common ancestor (Johansen et al, 2002; Wikmark et al, 2006; Nielsen et al, 2008). The biological function of GIR1 appears to be in the formation of the 5′ end of the HE mRNA during processing from the spliced out intron and the resulting lariat cap seems to contribute by increasing the half‐life of the HE mRNA (Vader et al, 1999; Nielsen et al, 2005), thus conferring an evolutionary advantage to the HE.
One of the interesting features of GIR1 is that the sequence and the secondary structure are very similar to those of eubacterial group IC3 introns at the second step of splicing (Figure 2), suggesting an evolutionary relationship with this specific subgroup of splicing ribozymes (Johansen et al, 2002). The secondary structure of GIR1 displays paired segments numbered P3–P10, similar to what is known in group I introns (Figure 2). The paired segments are generally shorter than those observed in group I introns consistent with the fact that the shortest form of DiGIR1 shown to catalyse branching in vitro is only 179 nt (Nielsen et al, 2005). Both ribozymes are organized as a compact bundle of three helical stacks (Figure 2; domains P3–P9, P4–P6, and P10–P2 (group I ribozyme) or P10–P15 (GIR1)). Group I intron classification is based on structural variation of peripheral elements organized around a very well‐conserved catalytic core (Michel and Westhof, 1990). In contrast, the main features distinguishing GIR1 from group I ribozymes occur within the catalytic core. Several characteristic single‐stranded junctions tether the helices of a catalytic core containing a double pseudoknot in a way that leads to significant topological modifications (Figure 2). GIR1 harbours a substrate domain different from the canonical P1 and P2 elements. However, the biochemical and mutational data presented in this study support the idea that they are replaced by a distinctive and unique 9‐bp P15 stem starting with a GoU pair that should be able to dock onto the catalytic core in a way similar to that observed for group I introns. The close resemblance of GIR1 to a splicing ribozyme in an unrelated group of organisms and the structural organization of twin‐ribozyme introns may be related to the propagation of group I introns by horizontal transfer.
Group I introns are considered mobile elements due to their sporadic occurrence in a wide variety of organisms, including protists, fungal mitochondria, bacteria, and phages (Haugen et al, 2005). Many lines of evidence point to reverse splicing and homing as mechanisms by which group I introns can transfer horizontally (Goddard et al, 2001; Bhattacharya et al, 2005; Haugen et al, 2005). The homing mechanism is well documented and appears to be particularly relevant to GIR1 because its activity is intimately related to the expression of a HE mRNA (H Nielsen, in preparation).
The intriguing observation that GIR1 and the group I splicing ribozymes are structurally related, yet carry out different reactions (splicing versus branching) prompted us to revise our previous structural model of GIR1. This model (Einvik et al, 1998b; Johansen et al, 2002) was based on structure probing and mutational studies. It predated the discovery of the branching reaction (Nielsen et al, 2005) and could not account for this reaction. The model presented in this study is based on new mutational data and furthermore benefits from the recent crystal structures of various group I ribozymes (Guo et al, 2004; Adams et al, 2004a, 2004b; Golden et al, 2005) in the sense that the GIR1 regions organized identically in group I introns could be modelled more accurately. In our new model, residues that are key to the branching reaction lie within a pocket formed at the interface of P10, P15, P7, and J5/4. All the distinctive features of GIR1 concentrate in this pocket and result in a topology very different from what is observed in group I ribozyme crystal structure models. The structure of the critical J8/7 segment of group I introns is dramatically changed and has been partly replaced by residues belonging to the GIR1 lariat fold J9/10. Other key features are the detachment of the nucleophile from the GoU pair at the catalytic site and a structural alteration of the GoU pair receptor. Taken together, these structural differences account for the different chemical reaction catalysed by GIR1. Comparison of the models of the Azoarcus tRNAIle intron at the second step of splicing and GIR1 suggests a relatively simple model for the conversion of the topology of one ribozyme to the other based on strand mispairing. Similar scenarios can apply to other RNAs, for example, RNase P, and could constitute a general way of viewing the evolution of RNA molecules.
In this section, the structure model of the Didymium GIR1 (DiGIR1) ribozyme is extensively compared with the Azoarcus group I ribozyme (Azo) crystal structure (Adams et al, 2004a). Hence, secondary structure elements and nucleotides corresponding to Azo are underlined throughout the text. The secondary structure of DiGIR1 and the similar ribozyme from Naegleria (NaGIR1) is generally supported by enzymatic and chemical probing (Einvik et al, 1998a; Jabri et al, 1997; Jabri and Cech, 1998). Furthermore, the Naegleria structure is supported by covariations observed in most of the helical stems when NaGIR1 from different strains are compared (Johansen et al, 2002; Wikmark et al, 2006). NaGIR1 performs a branching reaction similar to that of DiGIR1 (H Nielsen, unpublished data) supporting the notion that the two GIR1 ribozymes adopt similar secondary structures. DiGIR1 harbours an additional domain P2/P2.1 (Einvik et al, 2000) not found in NaGIR1. This domain is excluded from the model because it is currently impossible to discriminate between several different models.
Extension of the P15 stem
Modelling of GIR1 is facilitated by the presence of a double pseudoknot at the core. In addition to the P3–P7 pseudoknot also found in all group I ribozymes (Michel and Westhof, 1990), a second pseudoknot, P3–P15, is found as a characteristic feature of GIR1 (Einvik et al, 1998b). P15 arises from base‐pairing interactions between the 5′ strand of P2 with residues that could be derived from the 3′ strand of P8 and from J8/7, while the 3′ strand of P2 has been shortened and now makes up the J15/3 segment (Figure 3). Thus, one can visualize P15 as replacing the shallow/minor groove interactions taking place between J8/7 and P2, which are conserved in group I ribozymes (Strauss‐Soukup and Strobel, 2000; Soukup et al, 2002). Inspired by the comparison with Azo, we now propose an extension of P15 involving residues 205–207. Residues A205 and A206 appear to be equivalent to A residues in J8/7 responsible for recognition of the P1–P2 substrate (Figure 2). J8/7 is a highly conserved joining segment in group I ribozymes that is part of the active site and makes contacts with all of the three principal domains of the group I ribozyme. During the first and second steps of splicing, the two conserved adenosines at the 5′ end of J8/7 are involved in recognition of the P1–P2 interface (Pyle et al, 1992; Tanner et al, 1997; Strauss‐Soukup and Strobel, 2000; Adams et al, 2004a). In the original model of GIR1, a P1 was not included but could arise from a 3‐bp extension of P15 resulting from the base complementarity between residues A205–U207 from J15/7 and U111–G109 separating P10 from P15, respectively. The existence of these three base pairs could originate from the interaction between a P1 having lost its 5′ exon making the residues from the internal guide sequence (IGS) prone to base pair with J8/7. 
Indeed, the crystal structure of the Tetrahymena group I ribozyme (Guo et al, 2004) shows that residues in J8/7 are directed towards the solvent when the substrate domain P1–P2 is absent. It is therefore likely that some J8/7 residues could form Watson–Crick interactions with a substrate domain containing unpaired nucleotides as it occurs when the IGS is separated from the 5′ exon. To confirm this possibility, disruptive and restoring mutations of the central base pair U110–A206 were tested by kinetic cleavage analyses (Supplementary Figure S1). In vitro, GIR1 catalyses (i) a forward branching reaction in equilibrium with (ii) a very efficient reversed reaction, and (iii) an inefficient hydrolytic cleavage reaction (Nielsen et al, 2005; Nielsen and Johansen, 2007). The outcome of the reaction can be analysed by primer extension with stop signals at the branch nucleotide or at the cleavage site representing branching and hydrolysis, respectively. All disruptive mutations resulted in reduced cleavage rates. The double mutations that restored base pairing (U110A–A206U and U110C–A206G) performed branching at a rate comparable to that of wild type. The possibility for nucleotides A205 and A206 to engage in base pairing with U110 and U111 additionally suggests that G109 base pairs with U207 to form a continuous helical stack at the junction between P10 and P15.
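The base-pairing logic behind the compensatory double mutants can be illustrated with a minimal Python sketch. It assumes the three pairs of the proposed P15 extension exactly as stated in the text (A205–U111, A206–U110, and U207–G109) and treats Watson–Crick and G·U wobble pairs as the only allowed pairs. The function names are hypothetical, and the single mutant U110A is included only as an illustrative disruptive change, since the text does not list the individual disruptive mutants tested.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble (written GoU in the text).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

# Wild-type residues of the proposed 3-bp extension of P15, taken from the text.
WILD_TYPE = {109: "G", 110: "U", 111: "U", 205: "A", 206: "A", 207: "U"}

# Proposed partners: (J15/7 residue, residue between P10 and P15).
PROPOSED_PAIRS = [(205, 111), (206, 110), (207, 109)]


def apply_mutations(mutations):
    """Return the wild-type residues with substitutions such as 'U110A' applied."""
    residues = dict(WILD_TYPE)
    for mutation in mutations:
        old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
        if residues[position] != old:
            raise ValueError(f"{mutation}: residue {position} is {residues[position]}")
        residues[position] = new
    return residues


def paired_positions(mutations=()):
    """List the proposed pairs that can still form after the given substitutions."""
    residues = apply_mutations(mutations)
    return [
        (i, j)
        for i, j in PROPOSED_PAIRS
        if (residues[i], residues[j]) in ALLOWED_PAIRS
    ]


if __name__ == "__main__":
    examples = [
        ("wild type", []),
        ("U110A (illustrative single mutant)", ["U110A"]),
        ("U110A-A206U", ["U110A", "A206U"]),
        ("U110C-A206G", ["U110C", "A206G"]),
    ]
    for label, mutations in examples:
        print(label, len(paired_positions(mutations)), "of 3 pairs can form")
```

The sketch only checks pairing potential. It says nothing about cleavage rates, which in the study were measured by kinetic analysis.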
The GoU pair at the P10–P15 interface
The secondary structure of DiGIR1 allows for two different possibilities of forming a GoU pair at the catalytic site in analogy with the GoU pair in P1. In both cases, G109 is involved but the pairing partner could either be U207 or the branch nucleotide U232, as in the original model (Einvik et al, 1998b) (Figure 4A). To distinguish between these two possibilities, the effect of mutations of the involved nucleotides on the cleavage rate (Figure 4B) and on the type of reaction (branching versus hydrolytic cleavage; Figure 4C) was assessed. The wild type is characterized by predominant cleavage by branching with only a small fraction (<10%) of stop signal, indicating hydrolytic cleavage after a 4 h incubation. Mutations that disrupt the catalytic pocket would be expected to affect both the branching and the hydrolysis rates. Such an activity loss has been previously observed following mutations of the ωG nucleotide (G229; Johansen et al, 2002) and disruption of the ωG‐binding site in P7 (G174C; Decatur et al, 1995). Conversely, mutations that affect the positioning of U232 relative to the ωG without disrupting the catalytic pocket would be expected to shift the reaction from branching to hydrolysis. The mutation G109A maintains the base‐pairing potential with U232 or U207. The effect of the mutation is a moderate reduction in cleavage rate (Johansen et al, 2002). However, the reaction results in more hydrolysis than branching products. The mutations U232C and U207C similarly maintain the ability of these residues to base pair with G109. The reduction in cleavage rate is comparable to that of G109A. However, cleavage in the U232C mutant results in more branching than hydrolysis product as in wild type, whereas cleavage in the U207C mutant results in more hydrolysis than branching product. The disruptive mutant U207A displays an even more reduced cleavage rate and cleaves almost exclusively by hydrolysis.
The accumulation of more hydrolysis than branching product as seen in G109A and U207 mutants is an unusual phenotype as judged from our analysis of over 50 GIR1 mutants. Furthermore, mutants U232A and U232G that would likely disrupt base pairing involving this nucleotide cleave more by branching than by hydrolysis similar to U232C (H Nielsen, in preparation). These observations are in favour of base pairing of G109oU207 instead of G109oU232 as in the original model (Einvik et al, 1998b). In this way, the critical GoU pair at the active site belongs to a P1‐like helix (the extended P15) as in splicing group I ribozymes and not to P10 as in the original model (Einvik et al, 1998b). A further implication is that U207 forming the GoU pair does not provide the nucleophile for the branching reaction as it occurs in group I ribozymes. Rather, U232 lies in the shallow/minor groove of the G109oU207 pair, where it potentially interacts with the amino group of G109 to drive the branching reaction (Figure 5A). These mutational data furthermore validate the 3‐bp extension of P15, which contributes significantly to the re‐design of the catalytic core by forming a continuous helical stack between P10 and P15.
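The reasoning that favours the G109oU207 pair over G109oU232 can be summarised as simple bookkeeping. The Python sketch below records, for each mutant discussed above, only the predominant cleavage product reported in the text, and then groups the mutants by position. The data structure and function names are hypothetical, and no rates or quantitative values are implied.

```python
# Predominant cleavage product reported in the text for each mutant.
# Wild type cleaves predominantly by branching.
OBSERVED = {
    "G109A": "hydrolysis",
    "U207C": "hydrolysis",
    "U207A": "hydrolysis",
    "U232C": "branching",
    "U232A": "branching",
    "U232G": "branching",
}


def products_by_position(observations):
    """Group the predominant products by the mutated position."""
    grouped = {}
    for mutant, product in observations.items():
        position = int(mutant[1:-1])
        grouped.setdefault(position, set()).add(product)
    return grouped


def positions_shifted_to_hydrolysis(observations):
    """Return positions at which every listed mutant cleaves mainly by hydrolysis."""
    grouped = products_by_position(observations)
    return sorted(pos for pos, products in grouped.items() if products == {"hydrolysis"})


if __name__ == "__main__":
    print(positions_shifted_to_hydrolysis(OBSERVED))
```

Under this tabulation, the two positions at which substitutions shift the reaction towards hydrolysis are 109 and 207, the two members of the proposed GoU pair, whereas substitutions at U232 leave branching predominant.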
Recognition of the substrate domain P10–P15 by J5/4
In group I ribozymes, the GoU pair in P1 is recognized by a wobble receptor located at the interface of P4 and P5 (Michel and Westhof, 1990; Wang and Cech, 1992; Strobel and Cech, 1994; Strauss‐Soukup and Strobel, 2000). When viewed in secondary structure diagrams, the structure of this interface is a 3‐nt symmetrical internal loop. In DiGIR1, the interface between P4 and P5 is asymmetrical with a 4‐nt junction, 5′‐GUAA, as J5/4 and no intervening nucleotides at the 5′ strand (Supplementary Figure S2A). Furthermore, J5/4 is one of the most variable segments as deduced from the Naegleria GIR1 sequence alignment albeit with conserved features (Wikmark et al, 2006). To assess the importance of J5/4 in DiGIR1, systematic mutational analysis of J5/4 residues was performed. Major alterations of the structure, such as deletion of J5/4, substitution of J5/4 with 5′‐UUCG, or deletion of the bulged U156 all resulted in a complete loss of activity (data not shown). Substitution of the individual nucleotides resulted in decreased cleavage rates in all cases (Supplementary Figure S2B). The effect of mutating G150, U151, and A152 was moderate but the effect of the A153G mutant was dramatic pointing to this nucleotide as a key nucleotide for reactivity. Taken together, these results demonstrate an important function of J5/4 in GIR1 consistent with a preserved role of this structure in GoU recognition at the active site.
Alterations in the catalytic core do not affect the overall structure
The double pseudoknot provides a high level of constraint that guarantees confident model building of this region. The three stems P15, P3, and P8 together form a three‐way junction (3WJ) already constrained in the P3–P15 pseudoknot. In the present model, the extended P15 is docked along P3 and adopts a parallel orientation with the co‐axial stack occurring between stems P3 and P8. This conformation is promoted by the presence of the fairly long J15/3 stretch that forms a loop capping P15 and is able to interact in the shallow groove of P8 (Figure 3). A recent survey of 3WJ structures shows that J15/3 is part of a kind of 3WJ that occurs at 10 ribosomal RNA locations and in several other RNA crystal structures (Lescoute and Westhof, 2006). Moreover, the above‐mentioned survey shows that, when present in the longest loop, adenine residues are instrumental in stabilizing the junction architecture through the formation of A‐minor interactions in the narrow groove of the facing stem, hence mimicking the GNRA/tetraloop receptor inter‐domain interactions between P2 and P8 observed in group I introns crystal structures (see below).
Consequently, the single strands connecting P15 to the neighbouring helices can be considered as characteristic features distinguishing GIR1 from group I ribozymes. The constraints due to the 3WJ and to the double pseudoknot result in P15 occupying the same place as P2 (Figure 3). Furthermore, P15 directly stacks onto P10 by taking advantage of the 3‐bp extension of P15 that was not considered in the original model (Einvik et al, 1998b). Hence, P10 and P7 adopt a relative position to the 3‐bp P15 extension similar to what is observed for stem P1 in the crystal structures of group I introns (Adams et al, 2004b; Golden et al, 2005). This conformation is also supported by the fact that it leads to the formation of a pocket where all the structural elements necessary to form the catalytic site, namely ωG, U232, and the G109oU207 pair from P15 are gathered, a condition not satisfied by other tested models of 3WJ.
P8 and P9 were then directly connected to the double pseudoknot to form the catalytic domain. The connections between the P3–P9 catalytic core and the P4–P6 domain of GIR1 are similar to what is observed in the Azo crystal structure (Adams et al, 2004a). In other words, J3/4 and J6/7 are modelled so as to weave the same contacts as those observed in Azo with the shallow groove of P6 and the narrow groove of P4, respectively (Figure 2). The last residue from J6/7 (A171) plays the same role as in Azo by providing stacking continuity between G229 (ωG) and the closest residue from J9/10 (C230), which corresponds to the last residue of J8/7 (A172) in Azo (Figure 5B). Since the P4–P6 domain is connected to the core as in group I introns, P6 consequently resides in the vicinity of P3, and the P4–P5 interface is able to contact the P10–P15 interface. Regarding the P7–P9 interface, a subtle difference occurs. P7 is tethered to P9 without the intervening A residue frequently observed in group I ribozymes. This observation is important because J7/9 has been proposed to sequester ωG during the first step of splicing (Rangan et al, 2004), a condition not necessary in GIR1.
Apart from tertiary interactions specifically found in the core of the ribozyme, the group I intron architecture is stabilized by three sets of tertiary interactions (Figure 2). The first two interlock elements P2 and P6 with P8 and P3, respectively, and the third one allows L9 to contact P5 (Jaeger et al, 1996). The structural homology between group I introns and GIR1 ribozymes would argue for the existence of similar inter‐domain interactions. However, these interactions are necessarily affected by the fact that the secondary structure elements from GIR1 are different or shorter than in group I introns.
The double pseudoknot corresponds to a motif swap for the known interaction between P2 and P8 (Michel and Westhof, 1990; Salvo and Belfort, 1992; Costa and Michel, 1995). The resulting model suggests that J15/3 replaces the tetraloop located at the tip of P2 and interacts in the shallow groove of P8 (Figure 3). As in the Tetrahymena ribozyme (Guo et al, 2004), the loop receptor in GIR1 P8 consists of two consecutive G=C pairs instead of an 11‐nt receptor motif as observed in Azo (Adams et al, 2004b). J15/3 loops over itself to enter the 5′ strand of P3. U123 interacts with U187 and the resulting base pair intercalates at the P3–P8 interface. U122 base pairs with U118 and provides stacking continuity between P15 and J15/3. A120 and A121 form A‐minor interactions in the minor groove of P8. This conformation is supported by mutants of U residues to C that affect the structure of the 3WJ (data not shown).
In Didymium, P6a does not exist and J6/6a thus becomes the tetraloop L6 capping P6. Hence, L6 is perfectly located to form the recurrent interaction between J6/6a and P3 (Waldsich et al, 2002; Adams et al, 2004a; Golden et al, 2005) using A‐minor interactions (Doherty et al, 2001; Nissen et al, 2001) between A residues from L6 and two consecutive G=C pairs from P3 in the shallow/minor groove. Moreover, this tertiary contact is supported by mutational analysis of the two consecutive A residues from L6 (data not shown) and by the secondary structure of the NaGIR1 which displays a P6 element longer than in DiGIR1, albeit interrupted by an A‐rich internal loop that could presumably function as J6/6a (Einvik et al, 1998b). It is noteworthy that the two tertiary interactions described above are also supported by chemical probing experiments showing that A residues important for the described contacts are protected from DMS and DEPC (Einvik et al, 1998b).
In contrast to the previous interactions, the characteristic interaction formed in group I between L9 and P5 is not conserved in GIR1 ribozymes (Figure 2). In DiGIR1, P9 is short (4 bp) and does not contain any hinge point that allows it to bend towards P5. However, its 5′‐GAAA tetraloop could interact with a receptor embedded in the P2/P2.1 extension (Einvik et al, 2000; Nielsen et al, 2005) that was not included in the model (Figure 1).
Organization of the catalytic core
The next step consisted of understanding the architecture of the catalytic core in this unforeseen structural context. Around the catalytic region, the only fully identical feature shared by group I and GIR1 ribozymes resides in the organization of the ωG (G229) binding pocket (Figures 2 and 5). The Watson–Crick edge of G229 H‐bonds with the Hoogsteen edge of the second G=C pair of P7, and is stabilized by stacking interactions between the first C=G pair of P7 and A171 from J6/7.
On the opposite side of the ribozyme, the J5/4 junction on which the substrate domain docks is organized quite differently in GIR1 compared to Azo. The dramatic loss of activity in the A153G mutant is consistent with its protection from chemical modification by DMS (Einvik et al, 1998b), and justifies orienting this key residue towards the core of the ribozyme. To achieve this, the two 5′ nucleotides from J5/4 (G150 and U151) are placed so as to lie in the deep/major groove of P4 to improve the stacking continuity between P4 and P5. A kink performed around the phosphate group of A152 allows A153 to loop back into P5 ejecting A152 and A153 towards P10–P15. Thus, J5/4 becomes part of the catalytic pocket and shields P4 (Supplementary Figure S3). In such a situation, A153 can bind the GoU base pair from P15 as does A58 in J4/5 (Adams et al, 2004b). Moreover, A152 interacts with U110 to provide a tandem of A‐minor interactions. These A‐minor interactions account for the observed loss of activity in the A153G mutant since G residues are rarely observed in contact with G=C base pairs in this motif (Doherty et al, 2001; Nissen et al, 2001). We propose that the interaction between P1 and J4/5 is replaced by A‐minor tandem interactions involving the G109oU207 and U110–A206 pairs resulting from the extension of P15 with A153 and A152 from J5/4, respectively.
The lariat residues C230 and A231 replace important residues from the J8/7 junction in group I ribozyme
To suggest a relevant position for the residues involved in the lariat, a best‐fitting lariat model with a 3‐nt loop, obtained by an NMR study of an A2′‐pG branched RNA (Agback et al, 1993) was accommodated in the catalytic pocket between P7, P5, and P10. The NMR models of these lariat RNAs provide a starting model from which several structural features can be characterized. The short length of the lariat loop forces the ribose–phosphate backbone to form the inner ring of the loop while ejecting the base moieties on the outside. As a result, the base rings cannot stack together but occupy distinct volumes in which they could interact with other chemical moieties. A lariat harbouring the DiGIR1 5′‐CAU sequence while keeping the conformation of the RNA studied by NMR (Agback et al, 1993) was docked in the active site in a search for the best orientation. In the course of the refinement, the lariat was debranched to model a conformation corresponding to the pre‐cleavage state. The ring formed by the lariat is short and tight with a kink around the phosphate group of A231 forcing base moieties of residues C230 and A231 to point towards structural elements forming the catalytic pocket with which they can interact (Figure 4A).
Interestingly, the lariat residues could be placed within the pocket left free following the relocation of residues 207–209 from J15/7 that extend P15. In this position, A231 and C230 take over the role of residues C171 and A172 from J8/7 in their ability to interact with P4 and P7, respectively. C230 stacks with A171 (J6/7) strengthening the deep groove 4‐nt stack including G229, and taking the place occupied by the 3′ A residue from J8/7 in Azo (Figure 5). A231 points towards J5/4 as a consequence of the lariat sharp turn and places this nucleotide at hydrogen bonding distance of A153. Although a base‐pairing interaction is implicated, a geometry explaining the strong effect of the A231G mutant could not be clearly deduced based on chemical footprinting data (H Nielsen, in preparation). In the course of the catalytic formation of the GIR1 U2′‐pC lariat, C230 and U232 are covalently attached following the nucleophilic attack of the O2′ group of U232 onto the phosphorus atom of C230. It is thus reasonable to place these chemical groups at H‐bonding distance (2.8 Å) by taking advantage of the closest oxygen atom of the phosphate group. In the conformation proposed, the 2′‐hydroxyl group of A231 interacts with the oxygen atom of the phosphate group of C230, which is not already in contact with the O2′ atom of U232 (Figure 4A). A deoxy substitution scan experiment (Nielsen et al, 2005) pointing out the important role of the 2′‐hydroxyl group of A231 supports the proposed architecture since nucleotides at the 3′ end of J8/7 are involved in coordinating the catalytic magnesium ions using phosphate oxygen atoms and/or hydroxyl groups. Hence, the model strongly suggests that magnesium ions are relocated in J9/10 in the vicinity of C230 and A231.
J15/7 and J9/10 complement each other to stabilize the ribozyme catalytic core by forming a composite J8/7 junction that coordinates magnesium ions, and places the nucleophilic U residue in close vicinity of the targeted phosphate group.
Building a new structural model for the GIR1 ribozyme was prompted by the recent finding that the ribozyme catalyses the formation of a short lariat containing a 3‐nt loop by transesterification (Nielsen et al, 2005). The new model is consistent with previous biochemical and mutational data and incorporates new mutational data presented in this study. The model shows that the group I ribozyme substrate stems P1 and P2 are replaced in GIR1 by a distinctive and unique 9‐bp P15 stem starting with a GoU pair. The modelling strategy relied on the existence of a double pseudoknot involving stems P3 and P7 on the one hand (a mandatory feature of group I intron catalytic core structure), and stems P3 and P15 on the other hand (Figure 3). The proposed architecture of this highly constrained double pseudoknot is consistent with the conformation of the 3WJ additionally encompassing P8 (Lescoute and Westhof, 2006). Apart from the characteristic P15, the secondary structure of GIR1 is similar enough to canonical group I introns to unambiguously claim their phylogenetic relationship. Surprisingly, the elements distinguishing GIR1 from the group I splicing ribozymes lie within the usually very well‐conserved catalytic core (Michel and Westhof, 1990). The different topology results in a core that despite the marked similarity in base‐pairing scheme between GIR1 and the group I ribozymes at the second step does not carry out splicing. Rather, the position of the nucleophile is shifted from the last base pair in P1 to the interface between the analogous P15 extension and P10, thereby allowing for the branching reaction. Thus, the function of carrying the nucleophile is detached from the GoU pair and the catalytic reaction occurs in cis rather than in trans.
The topological differences between the catalytic cores of group I and GIR1 ribozymes resulting from the presence of the double pseudoknot heavily impact the architecture of the catalytic pocket. GIR1 harbours a G‐binding pocket in P7 identical to the pocket observed in the Azoarcus pre‐tRNAIle intron. The extended substrate helix P15 (analogous to P1–P2) is recognized by the protruding J5/4 using two consecutive A‐minor interactions (Doherty et al, 2001; Nissen et al, 2001) instead of the base‐pair tandem formed between the sugar and Hoogsteen edges (Leontis and Westhof, 2001) of A residues in J4/5. Residues C230 and A231 in the loop of the lariat fold replace key residues from Azo J8/7 (A172 and C171) in their ability to interact with P7 and P4, respectively. Since in Azo, C171 and A172 are involved in binding the two magnesium ions that are required for catalysis (Adams et al, 2004b), it is tempting to suggest that residues constituting the lariat fold provide some of the ligands for binding the active site metal ions.
Shortening of J8/7 may account for the appearance of the branching reaction
The most dramatic feature of the topological change in GIR1 is that the joining segment that connects to the 5′ strand of P7 comes from P15 (J15/7) rather than from P8 (J8/7) and that it has been shortened down to 3 nt as a consequence of the extension of P15. Hence, J15/7 is stretched and adopts a very different path compared to J8/7 in Azo. As a consequence, the two 5′ residues are excluded from residing inside the pocket and are located towards the outer shell of the molecule and the branching residue U232 is allowed to dock in the shallow groove of P15 and be accommodated into the catalytic pocket. The 3′ residue of J15/7 (A210) functionally replaces the fourth nucleotide of J8/7 (G170) in Azo by interacting with J7/3. Hence, the space left free by the absence of the two 3′ nucleotides of J8/7 can be occupied by the two first residues (C230 and A231) from the lariat fold. These nucleotides are followed by the branching U232, with a 3′‐hydroxyl group already tethered to the downstream RNA chain. U232 is positioned similarly with respect to the cleavage site as the 3′ U of the 5′ exon in Azo. The absence of a free 3′‐hydroxyl group in an environment prone to bind magnesium ions and generate nucleophilic oxygen atoms from hydroxyl groups may have driven the O2′ of U232 to attack the facing phosphate group of C230 and form the 3‐nt lariat characteristic of GIR1 ribozymes.
GIR1 topology may have arisen by drift of the 3′ strand of P2, the 5′ strand of P3, and J8/7 sequences of the ancestor intron
The three most notable features of GIR1 are that (i) it has so far only been found in the setting of twin‐ribozyme introns, (ii) it closely resembles the eubacterial IC3 introns, and (iii) despite this close similarity, GIR1 catalyses a branching reaction rather than splicing. Although it is generally difficult to demonstrate any evolutionary path, only a few discrete events would be required to account for the emergence of the GIR1 branching ribozyme from group I introns. For the emergence of the twin‐ribozyme configuration, it is plausible that a bacterial intron invaded a group I intron containing a HEG insertion. Myxomycetes are rich in nuclear rDNA introns, a relatively large proportion of which contain HEGs, and the possibility of an invading bacterial intron is supported by the recent observation of a sister intron to Dir.S956‐1 in the myxomycete Diderma (SD Johansen, unpublished data). The Diderma intron is located at the exact same rDNA position as the Didymium intron and has an almost identical group I splicing ribozyme with a very similar HEG inserted into the P2 segment. However, the Diderma intron lacks a GIR1 ribozyme and thus may represent the pre‐existing receptor intron in the model. The configuration of an intron within an intron is reminiscent of the case of group II/III twintrons (Copertino and Hallick, 1993) and places the invading intron in a situation with no stringent requirement to preserve the splicing activity. This could have set the stage for the subsequent transition of the invading intron into a branching ribozyme.
In the present study, we have shown that the difference between the Azoarcus intron representing the IC3 group I introns and the GIR1 branching ribozyme consists mainly of topological changes in the core. At the sequence level, we noticed that a single transposition event of the 5′‐GUGUUC stretch from the 3′ strand of P15 of the wild‐type GIR1 to A120 in J15/3 would restore the topology and the base‐pairing scheme found in Azo (Supplementary Figure S4). Further, at the 3D level, exchanges of phosphate bonds at positions that lie fairly close to one another in the Azo crystal structure would result in the GIR1 topology (Figures 2 and 6A). From a mechanistic point of view, a model based on sequence drift and strand exchange promoted by the absence of a selection pressure for splicing can be envisaged. In this scenario, a gradual sequence change leads to sequence similarities between the 5′ strand of P3, the 3′ strand of P2 and J8/7, resulting in a strand pairing switch within the core (Figure 6B). This evolutionary pathway via alternative pairing is similar to the mutational drift experimentally demonstrated between the HδV ribozyme and the artificial class III ligase (Schultes and Bartel, 2000). In the evolutionary model of GIR1, misfolding has been promoted by the loss of the 5′ exon that has driven the J8/7 from GIR1 to form a pseudo P1 that expanded to form P15. It is noteworthy that this process gives rise to the double pseudoknot (P3–P7 and P3–P15) that contributes to the energetic stabilization of the core of the ribozyme. The higher stability conferred to the ribozyme core by the appearance of the double pseudoknot may have been important to allow the peripheral domains to evolve with only minor implications for the core structure, explaining why they are reduced to short appendices.
In a context with low selection pressure towards splicing, since GIR1 was already embedded in a self‐splicing intron, this process relied on sequence similarities between the segments of the ribozyme involved in mispairing (P1, P2, P3, and J8/7). Misfolding of a group I ribozyme around J8/7 and P3 has been experimentally observed (Pan and Woodson, 1998). The misfolding generating GIR1 may have been positively selected by allowing the branching reaction to occur, which conferred a selective advantage by increasing the half‐life of the HE mRNA (H Nielsen, in preparation; Johansen et al, 2007). The topological shuffle described here fully accounts for the topological changes observed between the core of group I ribozymes and GIR1, the appearance of the double pseudoknotted structure with the extended P15, and the redefinition of the role of J8/7 (now J15/7).
The proposed mechanism for evolution of new RNA molecules may apply to other RNAs. The two main families of ribonuclease P ribozymes (Darr et al, 1992a, 1992b) can be distinguished by secondary structure changes occurring in a single contiguous region: the path from family B to family A involves lengthening of stem P3, disruption of stem P5.1 with formation of a new pseudoknot P6 (Figure 7). In contrast, in the HδV/ligase case (Schultes and Bartel, 2000), all pairing stems are involved in strand exchange. Even though the above scenario leading to GIR1 evolution seems to be the most relevant, we cannot rule out that some unknown transposition events could have taken place at the RNA level with subsequent transfer to the DNA level by reverse transcription and integration into the genome or by the recently described RNA‐directed DNA repair mechanism (Storici et al, 2007).
In conclusion, we have provided a model that correlates the branching activity of GIR1 with its topological difference compared to that of group I splicing ribozymes. We also suggest an evolutionary mechanism for the emergence of GIR1 based on the shuffling between functional motifs promoted by sequence shift and alternative pairings. Additional evidence will be needed and could come from studies of other GIR1 ribozymes, such as those found in Naegleria (Einvik et al, 1997; Jabri et al, 1997; Johansen et al, 2002; Wikmark et al, 2006).
Materials and methods
In vitro mutagenesis
Extension of P15. Mutations at U110 were introduced by PCR using Pfu DNA polymerase on a wild‐type GIR1 template (pDi162G1; Decatur et al, 1995) and oligos C377 (see Supplementary data for details) or C378 as the 5′‐oligo and OP12 as the 3′‐oligo. The PCR product was re‐amplified to make templates for in vitro transcription (see below). Mutations of A206 were introduced by in vitro mutagenesis using the QuikChange site‐directed mutagenesis kit (Stratagene) and oligos C405/C406 and C407/C408. To make double mutants, the U110 mutations were introduced into the A206 mutated templates.
GU pair. Construction of G109A and U232C was previously published (Johansen et al, 2002). Mutations at U207 were made as described above using oligos C477/C478 or C479/C480.
J5/4. Mutations were made in a wild‐type GIR1 template as described above using oligos C415/C416, C424/C425, C270/C271, and C417.
Cleavage analyses and primer extension
Templates for in vitro transcription were made from wild‐type and mutant templates by PCR using Pfu DNA polymerase (Fermentas) and oligos C287: 5′‐AATTTAATACGACTCACTATAGGGTTGGGAAGTATCAT and C288: 5′‐TCACCATGGTTGTTGAAGTGCACAGATTG. C287 carries a T7 RNA polymerase promoter. The run‐off transcript from the PCR template includes 162 nt upstream and 65 nt downstream of the cleavage site. All templates were transcribed in vitro using T7 RNA polymerase (Fermentas) with trace amounts of [α‐32P]UTP. Cleavage analysis was performed as described in Einvik et al (2000). Briefly, radioactively labelled in vitro transcripts were renatured in 1 M KCl, 25 mM MgCl2 at pH 5.5 for 10 min at 45°C. Then the reaction was jump started by increasing the pH to 7.5 by addition of Hepes‐KOH. Time samples were withdrawn and run on 6% denaturing (urea) polyacrylamide gels. The gels were analysed on storage phosphor screens and the data fitted to a nonlinear first‐order decay equation. The experiments shown are representative results of 3–5 independent experiments. Primer extension analysis was carried out as described (Einvik et al, 1998b) using end‐labelled oligo C291: 5′‐GATTGTCTTGGGAT. Sequencing ladders were made using the same primer and the plasmid pDi162G1 (Einvik et al, 1998b) as template. The reactions were analysed on 8% denaturing (urea) polyacrylamide gels.
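The first‐order fit mentioned above for the cleavage time courses can be illustrated with a minimal sketch. It is not the authors' analysis script. It assumes the quantified gel data are expressed as the fraction of uncleaved precursor at each time point, and it assumes a model with an unreactive plateau, U(t) = U_inf + (1 − U_inf)·exp(−k·t); the text specifies only a "nonlinear first‐order decay equation", so the plateau term, the time unit, and the function names (profile_plateau, fit_first_order, half_life) are illustrative assumptions. For each trial rate constant k the plateau is solved in closed form, and k is then refined by a one‐dimensional search.

```python
import math


def profile_plateau(times, uncleaved, k):
    """For a fixed rate constant k, return the least-squares plateau U_inf
    and the residual sum of squares for
    U(t) = U_inf + (1 - U_inf) * exp(-k * t)  (assumed model)."""
    num = den = 0.0
    for t, u in zip(times, uncleaved):
        e = math.exp(-k * t)
        num += (u - e) * (1.0 - e)
        den += (1.0 - e) ** 2
    u_inf = num / den if den > 0 else 0.0
    sse = sum(
        (u - (u_inf + (1.0 - u_inf) * math.exp(-k * t))) ** 2
        for t, u in zip(times, uncleaved)
    )
    return u_inf, sse


def fit_first_order(times, uncleaved, log_k_min=-4.0, log_k_max=2.0, grid=121):
    """Fit k and U_inf by nonlinear least squares.
    A coarse scan over log10(k) brackets the minimum, then a
    golden-section search refines it."""
    def sse_at(log_k):
        return profile_plateau(times, uncleaved, 10.0 ** log_k)[1]

    step = (log_k_max - log_k_min) / (grid - 1)
    candidates = [log_k_min + i * step for i in range(grid)]
    best = min(candidates, key=sse_at)

    lo, hi = best - step, best + step
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if sse_at(a) < sse_at(b):
            hi = b
        else:
            lo = a

    k = 10.0 ** ((lo + hi) / 2.0)
    u_inf, sse = profile_plateau(times, uncleaved, k)
    return k, u_inf, sse


def half_life(k):
    """Half-life of the reactive fraction for a first-order process."""
    return math.log(2.0) / k


if __name__ == "__main__":
    # Synthetic, made-up time course (arbitrary time units), not data from the paper.
    times = [0, 2, 5, 10, 20, 40, 80]
    uncleaved = [0.2 + 0.8 * math.exp(-0.1 * t) for t in times]
    k, u_inf, sse = fit_first_order(times, uncleaved)
    print(f"k = {k:.4f}, plateau = {u_inf:.3f}, t1/2 = {half_life(k):.2f}")
```

Fitting the rate constant on a logarithmic scale keeps the search well behaved when mutants differ in rate by orders of magnitude. A general‐purpose nonlinear least‐squares routine could be substituted for the search shown here.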
Molecular modelling was performed as described in Masquida and Westhof (2005). The lariat model taken from Agback et al (1993) corresponds to the RNA lariat with a 3‐nt loop in which all residues present a C2′‐endo conformation, taken from http://www.boc.uu.se/boc14www/res_proj/final_struct/pictures/Welcome.html (file cGUAC_md25_A.pdb). The lariat was debranched to allow the phosphate group of the 5′ residue to be tethered to ωG. The sequence of J9/10 was applied to the lariat fold using the program fragment embedded in the manip software (Massire and Westhof, 1998). This program was also used to build in three dimensions (pdb file format) all GIR1 pieces similar to the corresponding group I intron regions. All the 3D elements were assembled interactively on an SGI Octane graphical workstation (IRIX64 v6.5, IP30) using the manip software. Each step of manual modelling was followed by several least‐squares refinement steps (Westhof et al, 1985). The modelling/refinement cycles were iterated until a model satisfying all the constraints was obtained. Figures were prepared using the PYMOL program (DeLano WL, The PyMOL Molecular Graphics System (2002) http://www.pymol.org). Secondary structure diagrams in Figure 2 were directly generated from the PDB files using the program S2S (Jossinet and Westhof, 2005).
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).
This project was supported by the Danish Natural Science Research Council. BB was supported by the CEE BAC RNA program (LSHG‐CT‐2005‐018618) and the Lundbeck Foundation. We thank Pascale Romby for critical reading of the manuscript.
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
Copyright © 2008 European Molecular Biology Organization