Crystal structure of a PIWI protein suggests mechanisms for siRNA recognition and slicer activity

James S Parker, S Mark Roe, David Barford

Author Affiliations

Section of Structural Biology, The Institute of Cancer Research, Chester Beatty Laboratories, London, UK

*Corresponding author: David Barford, Section of Structural Biology, The Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Road, London SW3 6JB, UK. Tel.: +44 20 7153 5420; Fax: +44 20 7153 5457; E-mail: david.barford{at}icr.ac.uk

Abstract

RNA silencing regulates gene expression through mRNA degradation, translation repression and chromatin remodelling. The fundamental engines of RNA silencing are RISC and RITS complexes, whose common components are 21–25 nt RNA and an Argonaute protein containing a PIWI domain of unknown function. The crystal structure of an archaeal Piwi protein (AfPiwi) is organised into two domains, one resembling the sugar‐binding portion of the lac repressor and another with similarity to RNase H. Invariant residues and a coordinated metal ion lie in a pocket that surrounds the conserved C‐terminus of the protein, defining a key functional region in the PIWI domain. Furthermore, two Asp residues, conserved in the majority of Argonaute sequences, align spatially with the catalytic Asp residues of RNase H‐like catalytic sites, suggesting that in eukaryotic Argonaute proteins the RNase H‐like domain may possess nuclease activity. The conserved region around the C‐terminus of the PIWI domain, which is required for small interfering RNA (siRNA) binding to AfPiwi, may function as the receptor site for the obligatory 5′ phosphate of siRNAs, thereby specifying the cleavage position of the target mRNA.

Introduction

RNA silencing mechanisms mediated by small RNAs regulate gene expression at both post‐transcriptional and transcriptional levels (Montgomery, 2004; Novina and Sharp, 2004). Currently, the most extensively characterised system is targeted mRNA degradation mediated by small interfering RNAs (siRNAs). siRNAs are 21–25 nt double‐stranded RNAs (dsRNAs) with 5′ phosphate groups and 3′ dinucleotide overhangs (Zamore et al, 2000; Elbashir et al, 2001a, 2001b; Nykänen et al, 2001). siRNAs are produced from dsRNAs through the action of the RNase III enzyme Dicer (Bernstein et al, 2001; Ketting et al, 2001; Knight and Bass, 2001; Lee et al, 2004; Zhang et al, 2004). This process may have evolved as a cellular defence response to viral infection (Li et al, 2002; Baulcombe, 2004) or the activity of transposable DNA elements (Tabara et al, 1999; Sijen and Plasterk, 2003). Related to the action of siRNAs are the mechanisms of silencing mediated by microRNAs (miRNAs) (Bartel, 2004; He and Hannon, 2004). miRNAs closely resemble siRNAs in size and structure. miRNAs are, however, encoded as stem–loop precursors in the genome, which are then processed through the action of the RNase III enzyme Drosha (Lee et al, 2002, 2003) together with Dicer (Hutvágner et al, 2001). Many miRNAs regulate gene expression by binding and inhibiting the translation of mRNAs (Wightman et al, 1993; Olsen and Ambros, 1999; Aukerman and Sakai, 2003), although in some instances miRNAs can also target mRNAs for destruction (Llave et al, 2002; Palatnik et al, 2003; Yekta et al, 2004). The mechanistic differences between siRNAs and miRNAs can be attributed partly to the degree of complementarity between the short RNA and its target (Hutvágner and Zamore, 2002; Doench et al, 2003). A third mechanism of silencing operates at the transcriptional level, by targeting chromatin remodelling factors to heterochromatic regions. 
As with the siRNA and miRNA pathways, the specificity of nucleic acid targeting (perhaps this time DNA) is mediated by small RNAs.

siRNAs (and many miRNAs) operate in large ribonucleoprotein complexes termed RISC (RNA‐induced silencing complex) (Hammond et al, 2000, 2001; Caudy et al, 2002, 2003). RISC assembly requires ATP (Nykänen et al, 2001) and proceeds via a series of intermediate subcomplexes (Pham et al, 2004; Tomari et al, 2004). During this process, the double‐stranded siRNA is unwound to incorporate only a single‐stranded RNA (ssRNA) molecule in the final active complex (Martinez et al, 2002). The relative stabilities of the two ends of the double‐stranded siRNA appear to determine which strand becomes incorporated: siRNAs are functionally asymmetric (Aza‐Blanc et al, 2003; Khvorova et al, 2003; Schwarz et al, 2003). Once assembled, siRNA‐RISC functions as a multiple‐turnover complex that recognises and cleaves mRNA strands complementary to the incorporated single‐stranded siRNA (Hutvágner and Zamore, 2002; Haley and Zamore, 2004). The mRNA is cleaved between the nucleotides paired to bases 10 and 11 of the siRNA—the RISC catalytic site appears therefore to be fixed relative to the binding position of the 5′ end of the siRNA (Elbashir et al, 2001a, 2001b). Cleavage is magnesium‐dependent and yields 3′ hydroxyl and 5′ phosphate groups on the resulting mRNA strands (Martinez and Tuschl, 2004; Schwarz et al, 2004).
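The fixed register between the 5′ end of the guide strand and the cleavage site can be illustrated with a short Python sketch. It is not part of the study: the function names and the example sequences are hypothetical, and the sketch assumes a perfectly complementary target site that can be located by a plain string search.

```python
from typing import Optional, Tuple

# Watson-Crick pairs for RNA (G-U wobble pairs are deliberately ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def scissile_bond(guide: str, target: str) -> Optional[Tuple[int, int]]:
    """Locate the target phosphodiester bond opposite guide bases 10 and 11.

    Both sequences are written 5'->3'. Returns the 0-based indices of the two
    target nucleotides flanking the scissile bond, or None if the target does
    not contain a perfectly complementary site (an assumption of this sketch).
    """
    site = reverse_complement(guide)
    start = target.find(site)
    if start < 0 or len(guide) < 11:
        return None
    n = len(guide)
    # Guide base k (1-based, counted from the 5' end) pairs with
    # target index start + n - k.
    paired_to_g10 = start + n - 10
    paired_to_g11 = start + n - 11
    return (paired_to_g11, paired_to_g10)


# Hypothetical 21-nt guide and a target carrying its complementary site.
guide = "UUCGAAGUAUUCCGCGUACGU"
target = "GGG" + reverse_complement(guide) + "AAA"
print(scissile_bond(guide, target))
```

Because the result depends only on counting from the guide 5′ end, shifting the site within the target moves the reported bond by the same amount, which mirrors the observation that the cleavage position is set by the 5′ end of the siRNA.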

As mentioned above, the method of miRNA‐directed translational inhibition shares many similarities with siRNA‐directed mRNA cleavage. It is proposed that for RISC to catalyse mRNA cleavage, a contiguous A‐form helix must be formed between the short guide RNA and the mRNA target in the region of the scissile bond and towards the 3′ end of the siRNA (Chiu and Rana, 2003; Haley and Zamore, 2004). Presumably, this is required to generate the correct reaction geometry in the active site. This would then be sufficient to explain the observed mechanistic differences between perfectly complementary siRNAs and imperfectly complementary miRNAs (Hutvágner and Zamore, 2002; Doench et al, 2003).

The function of short RNAs in chromatin remodelling is understood principally from studies in the yeast Schizosaccharomyces pombe. Orthologues of components of the siRNA/miRNA systems are required for methylation of lysine 9 on histone H3 (and thus the recruitment of the crucial remodelling factor HP1), gene silencing in heterochromatin and proper centromere and telomere function (Provost et al, 2002; Volpe et al, 2002; Hall et al, 2003). However, the effector complex in this case appears to be distinct from RISC (Sigova et al, 2004). Small RNAs are incorporated instead into a complex termed RITS (RNA‐induced initiation of transcriptional gene silencing) (Verdel et al, 2004). It is proposed that the RITS complex, guided by the small RNAs, localises remodelling factors to their sites of action. Recent genetic studies have uncovered potentially similar systems in Arabidopsis, Drosophila and mammals (Zilberman et al, 2003; Fukagawa et al, 2004; Morris et al, 2004; Pal‐Bhadra et al, 2004).

Argonaute proteins have been implicated by genetic and biochemical methods in all the mechanisms involving small RNAs discussed above (Carmell et al, 2002). A striking finding is that Argonaute family members are the only proteins common to all RISC‐related complexes and the RITS complex (Hammond et al, 2001; Hutvágner and Zamore, 2002; Martinez et al, 2002; Mourelatos et al, 2002; Liu et al, 2004; Tomari et al, 2004; Verdel et al, 2004). Indeed, it has been demonstrated recently that an Argonaute protein is the only polypeptide present in a highly purified active form of Drosophila RISC (Rand et al, 2004). This suggests that Argonaute proteins possess unique functions with regard to small RNAs involved in RNA interference. Argonaute proteins are characterised by two domains: an ∼20 kDa N‐terminal PAZ domain and an ∼40 kDa C‐terminal PIWI domain. Until recently, however, the biochemical functions of Argonaute proteins have remained obscure. A breakthrough was achieved with the determination of the structure of the N‐terminal PAZ domain (Lingel et al, 2003; Song et al, 2003; Yan et al, 2003). This structure suggested a role for PAZ in RNA binding, a proposal subsequently confirmed by the solution of a structure of PAZ in complex with an siRNA‐like RNA helix (Ma et al, 2004). This structure demonstrated that PAZ does indeed interact with RNA, and moreover that PAZ may serve as the binding module for the end of short dsRNAs and as the anchor for the 3′ end of short ssRNAs in RISC, and perhaps RITS. Phylogenetic analysis of eukaryotic Argonaute proteins delineates two subfamilies (Piwi and Ago), resembling Drosophila Piwi and Arabidopsis Ago1, respectively (Carmell et al, 2002). At present, no broad functional distinction exists between these two subfamilies.

Here, we present the crystal structure of a Piwi protein (AfPiwi) from the archaean Archaeoglobus fulgidus. The protein features a prominent positively charged channel reminiscent of RNA‐binding proteins, and we demonstrate that AfPiwi forms a distinct complex with an siRNA‐like RNA duplex. The identification of an RNase H‐like fold within the C‐terminal portion of the PIWI domain, together with the mapping of conserved residues onto the AfPiwi structure, allows us to propose a mechanism for RISC‐mediated RNA cleavage. We identify two conserved regions in the molecule. The first set of residues clusters in a region of AfPiwi resembling the RNase H active site, suggesting that in eukaryotic Argonaute proteins this region may possess RNase activity. The second cluster of conserved residues, which coordinates a metal ion and is rich in basic residues, is positioned within a pocket ∼20 Å from the putative RNase H‐like catalytic site. The distance between these two sites matches the distance between the obligatory 5′ phosphate of the guide RNA and the scissile phosphate between the nucleotides paired to bases 10 and 11 of the guide strand. These findings provide a molecular explanation for RISC‐mediated mRNA cleavage.

Results and discussion

Structure determination

The structure of full‐length AfPiwi (an isolated PIWI domain protein) was determined by means of MAD phasing using selenomethionine‐labelled crystals from data collected at BM14, ESRF. The polypeptide chain was traced using a combination of ARP/wARP and manual rebuilding, and the model was refined to an R‐factor of 19% at 1.9 Å resolution (Table I). The structure is well defined apart from the N‐terminal 10 residues and three surface loops, which are disordered.

Table I. Data collection and processing statistics

Description of overall structure

AfPiwi is roughly ellipsoid in shape with dimensions 70 Å × 50 Å × 50 Å. The molecule is organised into two major domains, which meet to form an extensive interface (Figure 1). Domain A (residues 44–170) has a relatively simple α/β sandwich architecture comprising a four‐fold repetition of a β‐strand–α‐helix structural motif creating a parallel four‐stranded β‐sheet, surrounded on both sides by a pair of parallel α‐helices. Domain B, the larger of the two domains (residues 171–427), also an α/β structure, is dominated by a central mixed β‐sheet consisting of eight β‐strands, flanked on both sides by two long α‐helices. The β‐sheet is divided into two roughly equivalent sized regions; a four‐stranded mixed β‐sheet is separated from a four‐stranded antiparallel β‐sheet by a structural discontinuity that creates a prominent crevice at the top of the β‐sheet. One face of the mixed β‐sheet is exposed to solvent. The connection between the two major domains is made by a β‐strand that links αD of domain A with the β5 strand of domain B. This β‐strand contributes to a small subdomain (termed the N‐domain) comprising the N‐terminal 40 residues of the protein that consists of a small two‐stranded antiparallel β‐sheet lying against a single α‐helix that spans the interface of domains A and B.

Figure 1.

Structure of A. fulgidus Piwi (AfPiwi). (A) Ribbon representation of AfPiwi coloured salmon (domain A), cyan (domain B) and orange (N‐domain). Positions of invariant residues are coloured red (group I, invariant in all Argonaute proteins), green (group II, Piwi and Ago subfamily switch residues), blue (group III, invariant to eukaryotic Argonaute) and yellow (group IV, highly conserved in all Argonaute proteins). (B) Surface representation of AfPiwi colour‐coded as in (A). The three conserved regions are indicated CRI, CRII and CRIII, as is the cadmium ion bound to the C‐terminus. The interdomain crevice and domain B channel are labelled. (C) Stereoview showing a superimposition of AfPiwi and the PIWI domain of P. furiosus Argonaute PfAgo (pink) (PDB code: 1U04). The two proteins share identical architectures, although they display small differences in the relative orientations of domains A and B. The figure was produced using PYMOL (http://www.pymol.org).

Recently, the crystal structure of Pyrococcus furiosus Argonaute (PfAgo) was reported (Song et al, 2004). Unlike AfPiwi, which lacks a PAZ domain, PfAgo, in common with eukaryotic Argonaute proteins, features a PAZ domain N‐terminal to its PIWI domain. The A and B domains we observed in AfPiwi are exactly conserved in PfAgo; specifically, all secondary structural elements of these two domains have counterparts in PfAgo (Figure 1C). In their description of PfAgo, Song et al (2004) define the PIWI domain as the region of the protein equivalent to domain B of AfPiwi, whereas the region equivalent to domain A of AfPiwi is termed the ‘middle’ domain. Significantly, a structure‐based sequence alignment suggests that all eukaryotic Argonaute proteins comprise a fold resembling the A and B domains of AfPiwi (Figure 2 and Supplementary Figure 1). Because these domains converge on a crevice that is highly conserved among eukaryotic and bacterial Argonaute sequences (described below), suggestive of a key functional region in the protein, we propose that domains A and B represent the conserved and functional core of the PIWI domain, and have termed this structural unit the ‘PIWI fold’.

Figure 2.

Sequence alignment of AfPiwi with PIWI domains of eukaryotic Argonaute proteins from the Piwi and Ago subfamilies together with a structure‐based alignment of AfPiwi and PfAgo. Colour coding of conserved and invariant residues as in Figure 1. Residues coordinating the Cd2+ ion are indicated by red arrows, and the RNase H‐like catalytic Asp and Glu residues are indicated by blue arrows (AfPiwi residues 159, 254, 264 and 427). Argonaute proteins demonstrated to slice mRNA (HsAgo2, DmAgo2 and SpAgo) are bracketed in red. For a more extensive multiple sequence alignment, see Supplementary Figure 1. The figure was produced using ALSCRIPT (Barton, 1993).

The A and B domains resemble the lac repressor and RNase HII, respectively

Although the overall PIWI fold represents a novel protein architecture, the individual A and B domains display structural similarity to known protein structures. Domain A is most reminiscent of the core fragment domain of the lactose repressor. In the lac repressor, and related periplasmic arabinose‐binding protein, such domains occur as a tandem repeat, forming an interdomain cleft that functions to bind sugars. The finding that the topology of domain A of AfPiwi differs from that of the lac repressor, and also that domain A is not present within a tandem repeat, suggests that these proteins are not evolutionarily or functionally related.

Domain B adopts essentially an RNase H‐type fold, the fundamentals of which are three antiparallel β‐strands followed by two helix–loop–strand motifs (Figure 3) (Yang and Steitz, 1995). These secondary structural elements define a conserved catalytic core domain found in nucleases such as archaeal RNase HII, Escherichia coli and HIV RNase HI, ASV and HIV encoded integrases, and E. coli RuvC. Such enzymes catalyse the hydrolysis of phosphodiester bonds in their respective DNA or RNA substrates, and therefore the finding that PIWI domains share structural similarities to these proteins provides important insights into their function. In domain B of AfPiwi, the conserved core corresponds to the αE, αF and αH helices, and five adjacent β‐strands (β5–β9) (Figure 3). The three edge β‐strands (β10–β12) and the αG helix of domain B are accommodated as an insertion within this central core domain. Interestingly, the RNase H‐fold of RNase H and integrases incorporates two catalytic Asp residues located on adjacent β‐strands (β5 and β8 of AfPiwi). As discussed below, although the counterparts of these residues are not conserved in AfPiwi, significantly, with the exceptions of human Ago4 and Hiwi2, both of these residues are invariant in all eukaryotic Argonaute proteins.

Figure 3.

Structural similarities between domain B of AfPiwi and the RNase H‐type fold. (A) Structure of RNase HII from Methanococcus jannaschii (Lai et al, 2000; PDB code: 1EKE). (B) Domain B of AfPiwi. In both proteins, the RNase H‐like fold is coloured salmon. In (A), the catalytic Asp residues are displayed and the equivalent positions on the β5 and β8 strands of AfPiwi are indicated in red in (B).

The domain interface creates a prominent surface crevice surrounding the C‐terminus

An intriguing structural feature of AfPiwi is that the C‐terminal 10 residues of the protein thread themselves through the globular centre of the molecule, being clamped by the interface of the A and B domains (Figure 1). The C‐terminal four residues of Ago and Piwi proteins are always conserved as aliphatic and aromatic residues, and in the AfPiwi structure these side chains contribute to the hydrophobic core of the protein (Figures 1 and 2). The conformation of the C‐terminal Leu427 residue (invariant in all Piwi sequences and conserved as an aliphatic residue in Ago sequences) is tightly constrained by participation of its nonpolar side chain at this hydrophobic interface. Significantly, its C‐terminal carboxylate group projects onto the molecular surface, lying at the base of a prominent crevice created at the interface of the A and B domains. This crevice is the most noticeable surface feature of AfPiwi. One end is delineated by a ‘wall’ formed from the edge β4 strand and the αC and αD helices of domain A. Opposite to this, the crevice extends out into a channel ∼20 Å wide lying along the surface of domain B, generated by the structural discontinuity in the eight‐stranded β‐sheet. The sides of this channel are formed from the αH helix, the loop connecting the αG and αH helices, and the β5, β8 and β9 strands. It is likely that the crevice and channel perform crucial roles in mediating the functions of Argonaute proteins. Virtually all conserved and invariant residues of the PIWI fold of Argonaute proteins lie in this channel, which is also present in PfAgo, and since many of these are basic and polar residues, the channel has a pronounced positive electrostatic potential (Figures 1 and 2).

A metal ion‐binding site in the PIWI domain

In our crystal structure, a well‐ordered metal ion is bound to the C‐terminal carboxyl group of Leu427 at the base of the crevice between the two domains (Figure 4). The metal ion is hexa‐coordinated adopting an octahedral arrangement, with two ligands contributed by the carboxyl group of Leu427, and a third by the oxygen atom of the amide side chain of Gln159. Gln159 and Leu427 are invariant in all Piwi sequences (Figure 2). The three remaining ligands were modelled as chloride ions and water. Metal coordination by a C‐terminal carboxyl group is unusual, and the structure of this metal‐binding site suggests an interesting structural feature. Compared with the side chains of Asp and Glu residues, a C‐terminal carboxyl group is allowed reduced rotational freedom, and because Leu427 is clamped at the interface of the A and B domains, it is likely that the metal ion site in AfPiwi is quite rigid. AfPiwi was crystallised from 100 mM cadmium chloride, and crystallographic refinement was consistent with the assignment of this ion as cadmium.

Figure 4.

Metal‐binding site at the C‐terminus of AfPiwi involving Gln159 and Leu427. (A) Stereoview of a 2FoFc electron density map centred on the metal ion. (B) Details of the hexa‐coordination of the Cd2+ ion and surrounding conserved residues (colour‐coded as for Figures 1 and 2).

Conserved residues define three conserved regions

We constructed a multiple sequence alignment for AfPiwi, PfAgo and selected eukaryotic Argonaute proteins from the Piwi and Ago subfamilies, and classified conserved residues into four groups (Figure 2 and Supplementary Figure 1). Group I residues (coloured red in Figures 1, 2 and 4) are the universally invariant residues of AfPiwi and all eukaryotic Argonaute proteins. Group II residues (green) have been designated as class‐switch residues, that is, residues that define the Piwi and Ago subfamilies in eukaryotic Argonaute proteins. AfPiwi shares class‐switch residues with both subfamilies, making it difficult to assign AfPiwi to either. Group III residues (blue) are found in all eukaryotic Argonaute sequences but not AfPiwi. Finally, group IV residues are highly conserved in all sequences and highlighted in yellow in Figures 1 and 2. To gain insights into the location of functional regions of the PIWI domain, we mapped the positions of these conserved residues onto the AfPiwi structure (Figure 1).
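The four‐group classification described above can be sketched in Python for a single alignment column. This is an illustration only, not the procedure used in the study: the function name, the toy columns and the 80% cut‐off for "highly conserved" (group IV) are assumptions, since the text does not state a numerical criterion.

```python
from collections import Counter
from typing import List, Optional


def classify_column(af: str, piwi: List[str], ago: List[str],
                    threshold: float = 0.8) -> Optional[str]:
    """Assign one alignment column to group I-IV, or None if unconserved.

    af   : residue in AfPiwi ('-' for a gap)
    piwi : residues in the eukaryotic Piwi-subfamily sequences
    ago  : residues in the eukaryotic Ago-subfamily sequences
    threshold : assumed fraction of identical residues needed for group IV
    """
    eukaryotic = piwi + ago
    if "-" in eukaryotic:
        return None  # columns with gaps in eukaryotic sequences are skipped
    if len(set(eukaryotic)) == 1:
        # Invariant in every eukaryotic sequence: group I if AfPiwi agrees,
        # group III if it does not.
        return "I" if af == eukaryotic[0] else "III"
    if len(set(piwi)) == 1 and len(set(ago)) == 1:
        # Invariant within each subfamily but different between them.
        return "II"
    residues = [af] + eukaryotic
    top, count = Counter(residues).most_common(1)[0]
    if top != "-" and count / len(residues) >= threshold:
        return "IV"
    return None


# Hypothetical columns: (AfPiwi residue, Piwi residues, Ago residues).
columns = [
    ("K", ["K", "K"], ["K", "K"]),       # invariant everywhere
    ("Q", ["Q", "Q"], ["K", "K"]),       # subfamily-specific residues
    ("G", ["D", "D"], ["D", "D"]),       # eukaryote-only invariant
    ("L", ["L", "L", "L"], ["L", "I"]),  # highly conserved
]
print([classify_column(*col) for col in columns])
```

Checking the invariant cases before the threshold test keeps groups I to III from being absorbed into group IV, whatever cut‐off is chosen.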

Conserved region I (CRI) defines a putative phosphate‐binding site

Most conserved residues map to two distinct regions on the molecular surface of the protein, ∼20 Å apart, positioned within the channel and crevice of AfPiwi (Figure 1A and B). The largest conserved region is situated within the crevice at the domain interface, centred on the C‐terminal Leu427 residue and metal ion‐binding site. We have termed this region CRI. The αC and αD helices, and edge β4 strand of domain A, together with the αH helix and αG/αH loop of domain B contribute conserved residues to this site. The cadmium ion, coordinated by the carboxyl group of Leu427 and amide side chain of Gln159, is located at the centre of CRI. The physiological relevance of metal binding to AfPiwi is not known; however, our structural data indicate that AfPiwi has the capacity to bind a divalent metal ion at this site, and because Gln159 and Leu427 are invariant within all eukaryotic Piwi sequences, it is possible that the Piwi subfamily of Argonaute proteins may also share this capacity. Interestingly, the residue equivalent to Gln159 is an invariant Lys in all Ago subfamily members. In addition, whereas the C‐terminal residue is conserved as an aliphatic residue in Ago sequences, it is not an invariant Leu as seen in the Piwi subfamily, although the preceding stretch of ∼4–5 residues is well conserved in all Argonaute proteins (Figure 2). Together, these structural differences suggest that the putative metal ion‐binding site seen in AfPiwi may not be conserved in the Ago subfamily. The Lys of Ago proteins (equivalent to Gln159 in AfPiwi) may interfere with metal binding and could even substitute for the positive charge provided by a divalent ion, contacting the C‐terminal carboxyl group. The structural geometry of CRI is maintained in PfAgo, although a metal ion was not reported at this site (Song et al, 2004).

In the AfPiwi structure, the side chains of five highly conserved residues surround the metal ion and are within 5 Å of this site (Figures 1 and 4B). Two of these are basic residues, Lys127 and Lys163, which, together with Tyr123 and Gln137 of domain A, are invariant in all Argonaute sequences. Interestingly, the fifth residue, Arg380 of domain B, is shared with all Ago sequences but is an Asn in the Piwi subfamily. However, the structurally adjacent residue (155) is conserved as an Asn in AfPiwi and Ago sequences but is a Lys in all Piwi sequences, therefore maintaining a similar positive charge in both Piwi and Ago subfamilies (Figure 2). Together, these positively charged and polar residues result in a pronounced electrostatic positive charge at this site that is highly suggestive of a phosphate(s)‐binding site (Figure 5). Phosphates could be coordinated by the amino and guanidinium groups of Lys and Arg residues, respectively, and by the divalent metal at the centre of this site. Strikingly, all group II (class switch) residues map to this region.

Figure 5.

Model for an siRNA duplex associated with AfPiwi. (A) The molecular surface representation is coloured according to electrostatic potential, ramping from blue to red for positive to negative electrostatic potential. A 5′ phosphate of the guide strand RNA (yellow) is docked into CRI, placing the scissile phosphate of the target RNA strand (green) adjacent to the RNase H‐like catalytic site. (B) Ribbon diagram showing the RNA duplex docked into the groove of AfPiwi. CRI and CRII are indicated. Invariant basic residues are displayed in CRI. The distance between the 5′ phosphate of the guide RNA and the scissile phosphate of the target RNA is 18 Å, matching the distance between the metal ion bound to Leu427 and the side chain of the RNase H‐like catalytic Asp on β5. The catalytic Asp residues were modelled from the structural superimposition of M. jannaschii RNase HII onto AfPiwi domain B.

Conserved region II (CRII) defines an RNase HII‐like catalytic site

Basic residues of the αG/αH loop and αH helix extend the range of the positively charged area of CRI along the groove towards the second conserved region (CRII) (Figures 1B and 5). This second site is entirely contained within domain B and is centred on the adjacent β‐strands β5 and β8 (Figure 1). AfPiwi is less well conserved in this region than the eukaryotic Argonaute proteins. In the eukaryotic proteins and PfAgo, two conserved sequence motifs, the GxDV motif on β5 and the RDG motif at the C‐terminus of β8, define the structural conservation of this region (Figure 2). Strikingly, the two invariant Asp residues of these motifs are structurally equivalent to the catalytic site Asp residues of RNase H‐like nucleases (Figure 3) (Yang and Steitz, 1995). Their presence in eukaryotic Argonaute sequences suggests that a possible functional role of this domain is to catalyse RNA cleavage.
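As a rough illustration of how the two motifs named above could be located in a protein sequence, the following Python sketch scans for GxDV and RDG with regular expressions and reports the positions of the candidate Asp residues. The function name and the example fragment are hypothetical, and a sequence match alone says nothing about whether a residue is structurally equivalent to an RNase H catalytic Asp.

```python
import re
from typing import Dict, List

# Motif name -> (compiled pattern, offset of the Asp within the match).
# 'x' in GxDV stands for any residue, written '.' in the regular expression.
MOTIFS = {
    "GxDV": (re.compile(r"G.DV"), 2),
    "RDG": (re.compile(r"RDG"), 1),
}


def candidate_asp_positions(sequence: str) -> Dict[str, List[int]]:
    """Return 1-based positions of the Asp in each GxDV / RDG match."""
    hits: Dict[str, List[int]] = {}
    for name, (pattern, asp_offset) in MOTIFS.items():
        hits[name] = [m.start() + asp_offset + 1
                      for m in pattern.finditer(sequence)]
    return hits


# Hypothetical fragment used only to exercise the function.
fragment = "MKLGADVTHPSAAIVFRDGVSE"
print(candidate_asp_positions(fragment))
```

A sequence lacking either motif simply yields an empty list for that entry, so proteins in which these motifs are not conserved can be flagged by the same scan.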

Conserved region III (CRIII) contains a Dicer‐interacting motif

A third conserved region (CRIII) that is confined to the eukaryotic Argonaute proteins, and not present in either AfPiwi or PfAgo, is located on the edge β‐sheet (strands β10–β12, residues 312–352 of AfPiwi) of domain B, sharing the same face of the molecule as CRI and CRII (Figure 1B). It has recently been shown that a fragment of this region (the ‘Piwi box’) mediates interactions with the RNase III‐A domain of Dicer (Tahbaz et al, 2004). It may be the case, therefore, that CRIII constitutes the docking site for Dicer, and as such would be crucial for loading small RNAs on to RISC or RITS. In the structure of PfAgo, residues of the N‐terminal domain and segment linking the PAZ and PIWI domains contact this region.

siRNA binding and the role of CRI

To test whether AfPiwi does indeed interact with RNA, we examined the interaction of the protein with an siRNA‐like RNA duplex (Figure 6A). This substrate contains the 5′ phosphate group (labelled) and 3′ dinucleotide overhang characteristic of double‐stranded siRNAs, and might also mimic the structure of a single‐stranded siRNA guide bound to a target mRNA, or DNA. AfPiwi originates from a thermophilic archaeon and is therefore optimally active at temperatures in the range of 60–85°C. However, to avoid melting of the RNA duplex at these elevated temperatures, we carried out the binding reactions at room temperature. Despite this, and the presence of a relatively high salt concentration (400 mM) required to keep the protein in solution, we observed the formation of a distinct protein–RNA complex using an electrophoretic mobility shift assay (EMSA) (complex 1; Figure 6B). The formation of this complex was confirmed using an ultraviolet (UV) crosslinking assay (Figure 6C), in which nucleic acid–protein complexes are captured in solution by exposure to 254 nm UV light. Additional higher‐order complexes were observed in the mobility shift assay at higher protein concentrations (Figure 6B); we attribute these to aggregation events triggered by the lower salt concentration in the electrophoresis buffer.

Figure 6.

AfPiwi forms a distinct complex with an siRNA‐like RNA duplex. (A) Sequence and structure of the self‐complementary RNA oligonucleotide used in this study. The RNA was labelled with 32P at the 5′ end (*) using T4 PNK. (B) EMSA assessing complex formation between the end‐labelled RNA (<0.5 nM) and increasing concentrations of AfPiwi (0, 0.2, 0.7, 2, 7, 20 and 60 μM; lanes 2–8). Binding reactions were analysed by nondenaturing PAGE. Lane 1 is a control demonstrating the absence of an interaction with a control protein (55 μM human protein kinase B). (C) UV crosslinking assay showing covalent complex formation between the RNA and AfPiwi upon UV irradiation. Samples were analysed by SDS–PAGE. Lanes 3–9 are equivalent to lanes 2–8 in (B). Lanes 1 and 2 are controls confirming the absence of covalent complex formation without UV irradiation (20 μM AfPiwi) or with a control protein (55 μM human protein kinase B), respectively.

We propose that CRI in the PIWI domain is conserved as a phosphate‐binding site, suggesting that this region would be important for RNA binding. To test this, we generated a mutant form of AfPiwi (AfPiwiMUT) extended by an additional residue (glycine) at the C‐terminus. Such a mutation appears not to be tolerated in evolution (Figure 2 and Supplementary Figure 1). Circular dichroism analysis could not detect any structural difference between AfPiwi and AfPiwiMUT, consistent with this region playing a key functional rather than structural role (Supplementary Figure 2). In both the UV crosslinking and mobility shift assays, however, we observed a substantial reduction in the affinity of AfPiwiMUT for the siRNA‐like RNA duplex (Figure 7A and B). We conclude that contacts via CRI play a key role in the interaction between AfPiwi and RNA.

Figure 7.

Mutation of CRI reduces the affinity of AfPiwi for RNA. (A) UV crosslinking assay assessing complex formation between the RNA duplex (<1 nM) and AfPiwi and AfPiwiMUT (each 7 μM). The autoradiograph and Coomassie‐stained gel are shown. (B) EMSA comparing complex formation by AfPiwi and AfPiwiMUT (7, 2 and 0.7 μM in lanes 1–3 or 4–6).

A model for PIWI domain–RNA interactions and mRNA cleavage

Our structural analysis of AfPiwi, coupled to the finding that AfPiwi binds an siRNA‐like RNA, allows us to present a model for the function of the PIWI domain of Argonaute proteins in mediating eukaryotic RNAi. In summary, we present a model in which CRI anchors the phosphate group at the 5′ end of the guide RNA strand. This would then position the scissile phosphate of an associated mRNA within CRII, adjacent to the catalytic Asp residues of the RNase H‐like active site. Cleavage would occur if the mRNA is complementary to the guide RNA in the region of the scissile bond and adopts the required helical geometry. The case for this model is elaborated below.

The features of the positively charged channel in AfPiwi are reminiscent of dsRNA‐binding proteins. Such proteins adopt distinct tertiary folds, but have in common a positively charged channel to engage the RNA sugar–phosphate backbone. Additionally, aromatic and polar residues interact with nucleotide bases and sugars. The groove in AfPiwi is strikingly similar to p19 of Tombusvirus, a protein that functions to suppress silencing of viral RNA by the host RNAi mechanism (Vargason et al, 2003; Ye et al, 2003). We used the crystal structure of the 21‐nucleotide RNA duplex bound to p19 to model an RNA molecule within the AfPiwi groove. Figure 5 shows that an RNA duplex could be readily accommodated within the AfPiwi channel allowing the phosphate–sugar backbone to contact the side chains of basic and polar residues that line the channel.

The finding that domain B of AfPiwi features an RNase H‐type fold, and that the two invariant catalytic Asp residues associated with this fold are conserved in the majority of eukaryotic Argonaute proteins, identifies Argonaute as a candidate for ‘slicer’ in RISC. The same conclusion was drawn by Song et al (2004) from the structure of PfAgo, who also proposed that a conserved Glu residue (equivalent to Glu264 of PfPiwi; Figure 2) contributes a third carboxylate group to form a DDE catalytic triad. Moreover, in an accompanying paper, Hannon and colleagues demonstrated that mutation of the two conserved Asp residues in HsAgo2 abolished the cleavage activity of the associated RISC complex, while retaining the ability to bind siRNA (Liu et al, 2004). These results are consistent with genetic and siRNA‐knockout data showing that Argonaute proteins are required for RISC‐mediated mRNA cleavage (Liu et al, 2004; Meister et al, 2004; Okamura et al, 2004), and with biochemical studies showing that the catalytic activity of RISC resembles that of an RNase H‐type enzyme (Martinez and Tuschl, 2004; Schwarz et al, 2004). The A‐form double‐helical conformation that provides the structural substrate for RNase H would in RISC be formed from the single‐stranded guide RNA and the associated mRNA target. Perhaps the strongest indication that Argonaute plays the role of slicer in RISC comes from a recent study demonstrating that DmAgo2 is the only polypeptide in a highly purified active form of Drosophila RISC (Rand et al, 2004).

The putative catalytic Asp residues are conserved in the vast majority of eukaryotic Argonaute proteins, two exceptions being HsAgo4 and Hiwi2 (Figure 2 and Supplementary Figure 1). Do all of these proteins slice? When RISC complexes associated with HsAgo1–4 were tested, only HsAgo2‐associated complexes displayed cleavage activity (Liu et al, 2004; Meister et al, 2004). This indicates that other factors are required in addition to the presence of a catalytic site DDE motif to mediate RNA cleavage. These factors could be provided by residues from the PIWI domain itself, or from outside the PIWI domain, possibly contributed by another RISC subunit. The conservation of the DDE motif in HsAgo1 and HsAgo3, which were reported not to slice mRNA, is enigmatic. It is possible that HsAgo1 and HsAgo3 have diverged insufficiently from HsAgo2 to lose the ‘redundant’ catalytic residues. Alternatively, it is formally possible that these proteins may mediate cleavage under different untested conditions. The experiments of Liu et al and Meister et al are consistent with the emerging view that the multiple Argonaute paralogues in metazoa possess distinct, although related, biochemical functions (Caudy et al, 2002; Okamura et al, 2004). On the other hand, the single Argonaute protein present in S. pombe is responsible for mediating both siRNA‐triggered gene silencing and heterochromatin formation via the RITS complex (Sigova et al, 2004).

RNAi absolutely requires a phosphate group on the 5′ end of the guide RNA (Boutla et al, 2001; Nykänen et al, 2001; Chiu and Rana, 2002; Schwarz et al, 2002), and the position of cleavage in the target RNA strand appears to be measured from the 5′ phosphate. Specifically, Tuschl and colleagues demonstrated that hydrolysis occurs at the phosphodiester bond bridging the two target RNA bases that pair with the 10th and 11th bases of the guide RNA (Elbashir et al, 2001a, 2001b). Together, these data predict an anchor site on RISC responsible for recognising the 5′ phosphate of the guide RNA, which then allows positioning of the target RNA scissile phosphate within the slicer catalytic site. The distance between the 5′ phosphate of the guide RNA strand and the scissile phosphate of the target mRNA strand in an A‐form helical conformation is ∼18–19 Å. Significantly, this matches the distance between CRI and CRII in the PIWI domain. As noted above, CRI possesses the hallmarks of a phosphate‐binding site, because of the density of basic residues and the presence of a metal ion. By modelling the 5′ phosphate of the guide RNA strand at this site, we found that the scissile phosphate of the target RNA could be positioned close to the two putative catalytic site Asp residues of the RNase H‐like fold (Figure 5). This model places the RNA duplex within the positively charged groove of PIWI, explaining the presence of basic, polar and aromatic residues within this groove conserved between AfPiwi, PfAgo and eukaryotic Argonaute proteins (Figures 1B and 2, and Supplementary Figure 1). Incorporation of both the 5′ phosphate receptor site and slicer catalytic site within the PIWI domain is consistent with the result of Rand et al (2004), that an Argonaute protein is the only polypeptide in a highly purified active form of RISC.
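The positional rule described above can be illustrated with a minimal sketch. It assumes perfect complementarity between guide and target; the function name scissile_bond, the 0‐based indexing and the example sequence are hypothetical illustrations and are not taken from the experiments reported here. Counting from the 5′ end of the guide, the target nucleotides paired to guide positions 10 and 11 flank the bond that is cleaved.

```python
# Sketch only: locate the predicted scissile bond in a target RNA,
# assuming the target contains a site perfectly complementary to the guide.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def scissile_bond(guide, target):
    """Return 0-based target indices (i, i + 1) flanking the predicted cut.

    target[i] pairs with guide position 11 and target[i + 1] pairs with
    guide position 10 (positions counted from the guide 5' end, 1-based).
    Returns None if no perfectly complementary site is found.
    """
    if len(guide) < 11:
        raise ValueError("guide must be at least 11 nt long")
    site = reverse_complement(guide)
    start = target.find(site)
    if start == -1:
        return None
    n = len(guide)
    # Guide position k pairs with target index start + n - k.
    return start + n - 11, start + n - 10


# Hypothetical 21-nt guide and a target carrying its complementary site.
guide = "UUCGAAGUAUUCCGCGUACGU"
target = "GGG" + reverse_complement(guide) + "CCC"
bond = scissile_bond(guide, target)
```

Because the cut position is computed from the 5′ end of the guide alone, lengthening or shortening the 3′ end of the guide leaves the predicted scissile bond unchanged, which is the behaviour an anchor site for the 5′ phosphate would explain.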

We have demonstrated that disruption of CRI by extension of the polypeptide chain at the C‐terminus significantly diminishes the affinity of AfPiwi for an siRNA‐like molecule, without perturbing the overall fold. The residual binding displayed by AfPiwiMUT may result from interactions within the conserved, positively charged groove. Moreover, Verdel et al (2004) show that tagging the C‐terminus of the S. pombe Argonaute protein renders it nonfunctional in vivo. Together, these results are consistent with our model for a functional interaction between the 5′ phosphate of an siRNA and CRI.

The PAZ domain, the N‐terminal conserved domain present in Argonaute proteins (and also found in Dicer), binds RNA with a preference for single‐stranded 3′ ends (Song et al, 2003; Lingel et al, 2004; Ma et al, 2004). Song et al (2004) modelled a double‐stranded siRNA into their PfAgo structure, placing the 3′ end of the RNA guide in the PAZ domain and the scissile bond of the target strand close to the active site in the PIWI domain. This analysis is complementary to our results, providing a model for the association of both ends of a single‐stranded guide RNA within a full‐length Argonaute protein and the specification of the cleavage site in a target mRNA.

Both AfPiwi and PfAgo originate from archaea, but the nature of the mechanisms related to RNA silencing in this kingdom is unclear. Our PSI‐BLAST searches could not retrieve archaeal homologues of either Dicer or RNA‐dependent RNA polymerase, two other key proteins involved in eukaryotic RNA silencing. Indeed, only a single RNase III domain (the catalytic domain of Dicer) could be detected, present in a small protein from Methanococcus maripaludis. However, it should be noted that until the structure of PfAgo was determined no PAZ domains had been identified in archaea, so our searches may have failed due to substantial sequence divergence.

AfPiwi lacks the putative catalytic residues in CRII. Moreover, we did not find a metal located at this site in our crystal structure, despite the presence of a high concentration of cadmium in the crystallisation solution. Together with the fact that the protein does not possess a PAZ domain, this makes it difficult at this stage to predict the exact function of AfPiwi. We note that a close homologue of AfPiwi sharing these features exists in the eubacterium Streptomyces coelicolor. Genetic and biochemical analysis of this homologue could yield additional insights into RNA silencing mechanisms. Our identification of spatially distinct conserved regions of Argonaute proteins, which we propose perform defined roles, provides a framework for testing functions and mechanisms of RNA silencing by diverse Argonaute proteins.

Materials and methods

Cloning, expression and purification of AfPiwi

The ORF encoding A. fulgidus Piwi (AfPiwi) was amplified by PCR from genomic DNA and cloned into a modified version of pET‐17b (Novagen) incorporating an N‐terminal (His)6 tag and an intervening PreScission protease (Amersham) recognition site. Expression was carried out in the E. coli strain BL21(DE3) for 3 h at 37°C. AfPiwi was purified via Ni‐affinity chromatography, PreScission cleavage and gel filtration. For crystallisation, the protein was eluted in 10 mM Tris pH 8.0, 500 mM NaCl, 10% glycerol, 1 mM EDTA and 4 mM DTT. Selenomethionine‐substituted AfPiwi was produced in B834 (DE3) and purified as for the native protein. AfPiwiMUT was produced by incorporating an additional codon (GGC) in a C‐terminal primer; purification was carried out as for wild type.
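The design of the AfPiwiMUT coding sequence can be sketched as follows. This is an illustration only: the function name add_c_terminal_glycine and the short example ORF are hypothetical (the real AfPiwi sequence and primer are not reproduced here), and the sketch works on the coding strand, whereas the actual C‐terminal PCR primer would carry the reverse complement of this region.

```python
# Sketch only: insert a glycine codon (GGC) immediately before the stop
# codon of an open reading frame written 5'->3' on the coding strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_glycine(orf, codon="GGC"):
    """Return the ORF with one extra codon placed before the stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    return orf[:-3] + codon + orf[-3:]


# Hypothetical toy ORF (Met-Ala-Lys-stop), not the AfPiwi sequence.
toy_orf = "ATGGCTAAATAA"
mutant_orf = add_c_terminal_glycine(toy_orf)
```

The reading frame and stop codon are preserved, so the encoded protein differs from wild type only by the single additional C‐terminal glycine.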

Crystallisation

All crystals were grown using the hanging drop method at 14 or 4°C. A 1 μl portion of protein solution (10 mg/ml) was mixed with 1 μl of a solution containing 0.1 M sodium acetate pH 4.6, 0.1 M CdCl2, 30% PEG 400 and 5 mM DTT. Crystals could be obtained reproducibly by streak seeding after a 3 h equilibration. Crystals grew after 5 days to a final size of 0.1 mm × 0.1 mm × 0.1 mm. For cryoprotection, crystals were immersed for 2 min in 0.1 M sodium acetate pH 4.6, 0.1 M CdCl2, 30% PEG 400, 10% glycerol and 750 mM NaCl, and then flash frozen at 100 K.
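As a worked illustration of the drop composition, the sketch below computes the starting concentrations when equal volumes of protein and precipitant solutions are mixed. It assumes that the protein was in the gel filtration buffer given above (10 mM Tris pH 8.0, 500 mM NaCl, 10% glycerol, 1 mM EDTA and 4 mM DTT) and ignores subsequent vapour diffusion; the function name mix_solutions and the dictionary keys are hypothetical.

```python
# Sketch only: volume-weighted starting concentrations in a hanging drop.
def mix_solutions(solution_a, volume_a, solution_b, volume_b):
    """Mix two solutions; a component absent from one solution counts as 0."""
    total = volume_a + volume_b
    names = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) * volume_a
               + solution_b.get(name, 0.0) * volume_b) / total
        for name in names
    }


# Assumed protein buffer (see lead-in) and the precipitant solution.
protein_solution = {
    "protein_mg_per_ml": 10.0, "Tris_mM": 10.0, "NaCl_mM": 500.0,
    "glycerol_percent": 10.0, "EDTA_mM": 1.0, "DTT_mM": 4.0,
}
precipitant_solution = {
    "sodium_acetate_mM": 100.0, "CdCl2_mM": 100.0,
    "PEG400_percent": 30.0, "DTT_mM": 5.0,
}

# 1 ul of each solution, as in the crystallisation setup.
drop = mix_solutions(protein_solution, 1.0, precipitant_solution, 1.0)
```

Under these assumptions the drop starts at 5 mg/ml protein, 50 mM CdCl2 and 15% PEG 400, that is, at half the stock concentration of every component present in only one of the two solutions.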

Data collection and refinement

Native data were collected on a single crystal of wild‐type AfPiwi at ID14‐EH3, ESRF. Se‐Met MAD data were collected at station BM14, ESRF. Data were processed using MOSFLM and scaled and merged in SCALA (CCP4, 1994) (see Table I for details). The structure was solved using Se‐Met MAD data. Initial Se sites were located using SnB. Phases from SHARP were then combined with the native 1.95 Å data and extended using DM. The initial model was built automatically by ARP/wARP. The structure was refined with REFMAC5 and manually rebuilt using COOT. There are also two Cd, five Ni and four Cl atoms in the model. These have been assigned by apparent electron density and are presumed to have come from the purification and crystallisation conditions.

End‐labelling and annealing of the RNA oligonucleotide

The RNA oligonucleotide was purchased deprotected and desalted from Dharmacon. The oligo was end‐labelled with [γ‐32P]ATP using T4 PNK (NEB) according to the manufacturer's instructions. Labelled oligo was purified on a MicroSpin G‐25 column (Amersham) equilibrated in 30 mM Hepes pH 7.5, 100 mM NaCl and 2 mM MgCl2. Annealing was performed by incubating the oligo at 90°C for 1 min, then transferring it to a large beaker of prewarmed water (70°C) that was left at RT for 3 h, followed by 16 h at 4°C.

Electrophoretic mobility shift assays

Binding reactions were performed at RT for ∼45 min in 50 mM Hepes pH 7.5, 400 mM NaCl, 2 mM MgCl2, 2 mM DTT and 50 ng/μl poly(dI‐dC) (Sigma). Poly(dI‐dC) was required to prevent precipitation in the wells of the gel. Glycerol (10%) was added to the reactions and the samples were analysed on nondenaturing 5% polyacrylamide gels in 1 × TBE buffer.

Ultraviolet crosslinking assays

Reactions were performed as for EMSA but without the poly(dI‐dC). Crosslinking was induced by exposure to 254 nm UV light in a Stratalinker 2400 (Stratagene) for 12 min. Samples were analysed by SDS–PAGE on 12% gels.

Supplementary data

Supplementary data are available at The EMBO Journal Online.

Supplementary Information

Supplementary Figure 1 [emboj7600488-sup-0001.pdf]

Supplementary Figure 2 [emboj7600488-sup-0002.pdf]

Acknowledgements

We thank staff at BM14 and ID14‐EH3, ESRF for help with data collection, Mark Williams (UCL) for the CD analysis, Joan Boyes and Tony Oliver for useful discussions, and Jing Yang for PKB. This work was funded by CR‐UK and the ICR. Coordinates and structure factors have been assigned ID codes 1w9h and r1w9hsf, respectively.

References