Open Access

Csy4 relies on an unusual catalytic dyad to position and cleave CRISPR RNA

Rachel E Haurwitz, Samuel H Sternberg, Jennifer A Doudna

Author Affiliations

  1. Rachel E Haurwitz (1)
  2. Samuel H Sternberg (2)
  3. Jennifer A Doudna (*, 1, 2, 3, 4)

  1. Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
  2. Department of Chemistry, University of California, Berkeley, CA, USA
  3. Howard Hughes Medical Institute, University of California, Berkeley, CA, USA
  4. Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

  * Corresponding author. Department of Molecular and Cell Biology, University of California, 708A Stanley Hall, Berkeley, CA 94720, USA. Tel.: +1 5106430225; Fax: +1 5106430080; E-mail: doudna{at}


CRISPR‐Cas adaptive immune systems protect prokaryotes against foreign genetic elements. crRNAs derived from CRISPR loci base pair with complementary nucleic acids, leading to their destruction. In Pseudomonas aeruginosa, crRNA biogenesis requires the endoribonuclease Csy4, which binds and cleaves the repetitive sequence of the CRISPR transcript. Biochemical assays and three co‐crystal structures of wild‐type and mutant Csy4/RNA complexes reveal a substrate positioning and cleavage mechanism in which a histidine deprotonates the ribosyl 2′‐hydroxyl pinned in place by a serine, leading to nucleophilic attack on the scissile phosphate. The active site catalytic dyad lacks a general acid to protonate the leaving group and positively charged residues to stabilize the transition state, explaining why the observed catalytic rate constant is ∼10⁴‐fold slower than that of RNase A. We show that this RNA cleavage step is essential for assembly of the Csy protein‐crRNA complex that facilitates target recognition. Considering that Csy4 recognizes a single cellular substrate and sequesters the cleavage product, evolutionary pressure has likely selected for substrate specificity and high‐affinity crRNA interactions at the expense of rapid cleavage kinetics.


Many prokaryotes resist viral infection by means of an adaptive immune system that relies on one or more CRISPR (clustered regularly interspaced short palindromic repeats) loci (Haft et al, 2005; Makarova et al, 2006, 2011; Barrangou et al, 2007; Karginov and Hannon, 2010; Al‐Attar et al, 2011; Wiedenheft et al, 2012). CRISPRs contain short virus‐ or plasmid‐derived sequences that are positioned between copies of a repeated sequence (Bolotin et al, 2005; Mojica et al, 2005; Pourcel et al, 2005; Sorek et al, 2008). Small RNAs generated from the CRISPR locus (crRNAs) assemble with CRISPR‐associated (Cas) proteins to form targeting complexes that can base pair with nucleic acids containing complementary sequences, leading to their destruction (Barrangou et al, 2007; Brouns et al, 2008; Marraffini and Sontheimer, 2008; Hale et al, 2009; Garneau et al, 2010).

The production of small RNAs from the CRISPR locus is a hallmark of CRISPR‐based immunity (Marraffini and Sontheimer, 2010; Terns and Terns, 2011). Precursor transcripts encompassing the full‐length locus are cleaved within each repeat sequence to generate mature crRNAs that consist of a spacer sequence flanked by portions of the repeat sequence (Marraffini and Sontheimer, 2010). CRISPR‐Cas immune systems fall broadly into three types, in which similar tasks are accomplished using distinct sets of enzymes (Makarova et al, 2011). In the type II CRISPR system, RNase III cleaves an RNA duplex formed by the CRISPR repeat and a trans‐activating CRISPR RNA (tracrRNA) (Deltcheva et al, 2011), while in the type I and type III systems, a CRISPR‐specific endoribonuclease cleaves the repeat elements in a sequence‐specific fashion (Brouns et al, 2008; Carte et al, 2008, 2010; Haurwitz et al, 2010; Gesner et al, 2011; Lintner et al, 2011; Sashital et al, 2011; Sternberg et al, 2012). We previously demonstrated that Csy4 (also known as Cas6f) is the enzyme responsible for crRNA production in CRISPR subtype I‐F (Haurwitz et al, 2010).

Csy4 is a 21.4 kDa protein that recognizes its RNA substrate via sequence‐ and structure‐specific contacts. It cleaves cognate RNAs at the 3′ end of a five‐base‐pair stem‐loop, generating crRNAs comprising a unique spacer sequence flanked by 8 and 20 repeat‐derived nucleotides on the 5′ and 3′ ends, respectively. Csy4 has equally tight affinity for both its substrate pre‐crRNA and product crRNA, binding both with a 50 pM equilibrium dissociation constant (Sternberg et al, 2012). A single mature crRNA and one copy of Csy4 are components of the large ribonucleoprotein (RNP) Csy targeting complex (Wiedenheft et al, 2011b), but the mechanism of Csy complex assembly is currently unknown.

RNA cleavage by Csy4 is divalent metal ion‐independent and requires chemical activation of a ribosyl 2′‐hydroxyl for internal nucleophilic attack on the phosphodiester bond (Haurwitz et al, 2010). In the previously reported crystal structures of Csy4 bound to substrate RNA, we used a construct lacking the 2′‐hydroxyl nucleophile upstream of the scissile phosphate to abrogate cleavage. The structures revealed three active site‐proximal residues: Ser148, His29, and Tyr176 (Figure 1A). crRNA biogenesis was strongly inhibited by S148C and H29A mutations, while a Y176F mutation exhibited near wild‐type activity. This mutational analysis led us to speculate that Ser148 plays a role in activating and/or positioning the 2′‐hydroxyl for nucleophilic attack because it is located in close proximity to the 2′ carbon. Based on structural and biochemical evidence, we hypothesized that His29 may act as a proton donor for the 5′‐hydroxyl leaving group because mutation of His29 to lysine partially preserved catalytic activity (Haurwitz et al, 2010).

Figure 1.

Amino acid contributions to catalysis. (A) Csy4 active site from Csy4/substrate complex (PDB ID 2XLK). Active site residues are shown in stick format and the scissile phosphate is marked with an asterisk. The hydrogen bonds of the base pair between nucleotides C6 and dG20 are shown as dashed lines. (B) Representative single‐turnover cleavage assays with wild‐type and mutant Csy4. No protein (NP) controls shown at left. (C) Single‐turnover cleavage analysis of wild‐type and mutant Csy4. Data plotted are average of triplicate experiments and error bars represent the standard error of the mean (s.e.m.). Solid lines represent fits to an exponential equation. (D) pH‐rate profile for wild‐type and H29K Csy4. Rapid cleavage kinetics above pH 9.5 for wild‐type Csy4 prevented accurate determination of the rate. Each data point is an average of three independent experiments and error bars represent the s.e.m. Data were fit according to the equation described in the Materials and Methods.

Here we investigated the chemical mechanism of Csy4‐catalyzed CRISPR RNA cleavage. Three crystal structures of wild‐type and mutant Csy4 bound to product RNAs, coupled with kinetic analyses of mutant Csy4 cleavage rates, suggest a substrate positioning and cleavage mechanism in which Ser148 holds the 2′‐hydroxyl nucleophile in place and His29 deprotonates it for attack on the scissile phosphate. The lack of both a general acid and positively charged residues in the active site explains the observed rate constants that are 10³‐ to 10⁴‐fold slower relative to other metal ion‐independent ribonucleases. We additionally demonstrate that CRISPR transcript processing by Csy4 is essential for subsequent formation of the Csy complex in vivo. Given the essential role Csy4 plays in formation of this targeting complex, slow cleavage rates in conjunction with highly accurate substrate selection likely ensure that cognate pre‐crRNA substrates are cleaved with little to no off‐target activity on other cellular RNAs.


His29 functions as a general base to activate the 2′‐hydroxyl nucleophile

Our previous biochemical analysis of Csy4 implicated a serine residue as the general base or as important for substrate positioning and a histidine residue as a general acid in the transesterification reaction catalyzed by Csy4 (Haurwitz et al, 2010). In our previous experiments, we conducted single time‐point (5 min) reactions. This method may obscure mutants that have severe cleavage defects but nevertheless retain a low level of activity, and so to more accurately investigate the specific involvement of the proposed catalytic dyad and other active site‐proximal residues during pre‐crRNA cleavage (Figure 1A), we performed quantitative single‐turnover cleavage assays with various mutants and determined their corresponding first‐order rate constants (Figure 1B and C; Supplementary Table S1). Alanine substitution of the active site histidine abolished all activity, indicating that His29 contributes an essential catalytic function. To further investigate its role, we evaluated the pH dependence of Csy4‐catalyzed RNA cleavage. The resulting pH‐rate profile (Figure 1D) exhibits a sigmoidal shape and reveals that cleavage rates increase monotonically with pH. These data are consistent with the catalytic requirement of a single titratable residue having a pKa≈7.9 that is active only in its deprotonated state. Consistent with our previous work, a Csy4 mutant with lysine substitution of His29 retains cleavage activity, albeit with ∼130‐fold slower kinetics than wild‐type (Figure 1C; Supplementary Table S1). The pH‐rate profile for RNA cleavage by the H29K mutant has the same shape as wild‐type but is shifted to a higher pH (Figure 1D; pKa≈9.9), in good agreement with the corresponding shift in pKa of the imidazole and amino side groups of histidine and lysine, respectively. 
These data strongly suggest that catalytic activity requires His29 to be in its deprotonated form, and that this residue functions as a general base during cleavage by activating the 2′‐hydroxyl nucleophile through proton abstraction. Substitution of His29 with aspartate, whose side chain is negatively charged at physiological pH, resulted in a functional enzyme, further supporting the role of His29 as the general base (Figure 1C). Direct proton abstraction would require the His29 side chain to be positioned proximally to the G20 2′‐hydroxyl, but in the previously published Csy4/substrate structures (Haurwitz et al, 2010), the His29 side chain interacts instead with the scissile phosphate and is not within hydrogen bonding distance of the expected location of the 2′‐hydroxyl. Those crystals were grown at acidic pH ranges (∼4.6–5) where the His29 side chain is likely to be protonated and Csy4 is catalytically defective (Figure 1D). Thus, the previously observed interaction between the scissile phosphate and His29 side chain may result artificially from the acidic pH of the crystallization conditions (see below).
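The titration argument can be made concrete with a short calculation. The sketch below assumes simple single‐site titration (the Henderson–Hasselbalch relationship) and uses the apparent pKa values quoted above; the function name `fraction_deprotonated` is illustrative and is not part of the original analysis.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a single titratable group in its deprotonated (basic) form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

# Apparent pKa values reported in the text (Figure 1D).
PKA = {"wild-type (His29)": 7.9, "H29K (Lys29)": 9.9}

# pH 4.6 and 5.0 bracket the crystallization conditions; 7.2 is the assay pH.
for name, pKa in PKA.items():
    for pH in (4.6, 5.0, 7.2):
        f = fraction_deprotonated(pH, pKa)
        print(f"{name:18s} pH {pH:>4}: {f:.2e} deprotonated")
```

Under these assumptions, only about 0.05 to 0.13% of the wild‐type titratable group would be deprotonated at pH 4.6–5, compared with roughly 17% at pH 7.2, which is in line with the enzyme being catalytically defective under the crystallization conditions.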

Alanine substitution of Ser148 decreased the cleavage rate ∼8000‐fold relative to wild‐type (Figure 1B and C; Supplementary Table S1), suggesting that this residue plays a critical role in substrate binding, positioning, or cleavage chemistry (see below). Mutation of Tyr176 to phenylalanine or alanine reduced the cleavage rate only ∼13‐fold and ∼130‐fold, respectively (Figure 1C; Supplementary Table S1). The side chain of Tyr176 points into the active site and stacks on top of the His29 imidazole group; mutation to phenylalanine likely disrupts any role the phenolic hydroxyl plays in substrate binding, whereas mutation to alanine could also disrupt His29 positioning. Alanine substitution of either Ser150 or Thr151, both located in the active site loop, reduced the cleavage rate ∼350‐fold, suggesting that these residues may either bind the RNA substrate directly or form a network of hydrogen‐bonding interactions that orients the side chain of Ser148.
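To put the fold effects above on an energetic scale, a rate reduction can be converted into an apparent change in activation free energy using ΔΔG‡ = RT ln(fold). The sketch below is illustrative only: it assumes that kobs reports on a single rate‐limiting step (which the Methods state is not established) and uses the 24°C assay temperature; the function name `ddg_from_fold` is hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_ASSAY = 297.15    # 24 degrees C, the assay temperature given in the Methods

def ddg_from_fold(fold_slower: float, temp_k: float = T_ASSAY) -> float:
    """Apparent change in activation free energy (kcal/mol) for a fold_slower-fold rate reduction."""
    return R_KCAL * temp_k * math.log(fold_slower)

# Approximate fold effects quoted in the text.
FOLD_EFFECTS = {"S148A": 8000, "Y176F": 13, "Y176A": 130, "S150A or T151A": 350, "H29K": 130}

for mutant, fold in FOLD_EFFECTS.items():
    print(f"{mutant:15s} ~{fold:>5}-fold slower -> ~{ddg_from_fold(fold):.1f} kcal/mol")
```

On this simplified view, the S148A defect corresponds to roughly 5 kcal/mol, whereas the Y176F defect corresponds to about 1.5 kcal/mol.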

The Csy4 active site constrains the G20 ribose in the C2′‐endo sugar pucker

To determine how Csy4 interacts with the 2′‐hydroxyl nucleophile, we crystallized a Csy4/RNA product complex comprising Csy4S22C and a 19‐nucleotide RNA product that was generated by endoribonucleolytic cleavage of a 20‐nucleotide substrate RNA (Figure 2A). Csy4S22C is a mutant of Csy4 that retains wild‐type activity and yields better diffracting crystals (Haurwitz et al, 2010). Crystals of this complex diffracted X‐rays to 2.0 Å resolution, and the structure was solved by molecular replacement using the previous substrate complex structure (PDB ID 2XLK) as a search model (Table I). The structure of Csy4 in this product complex is similar to that observed in the previously published substrate complex (PDB ID 2XLK; RMSD=0.431 Å over 811 atoms) (Figure 2B; Supplementary Figure S1A). Additionally, the crRNA hairpins of the product and substrate RNAs are bound to Csy4 in the same location and align with an RMSD of 0.519 Å over 214 atoms. We observed clear density for a 3′‐phosphate (Supplementary Figure S2), consistent with previous mass spectrometry results that identified the termini of Csy4 cleavage products as a 5′‐hydroxyl and 3′‐phosphate (Wiedenheft et al, 2011b). Additionally, we observe that nucleotide A5, a single‐stranded nucleotide immediately upstream of the stem‐loop, makes two hydrogen‐bonding contacts in a base‐specific fashion with the peptide backbone of Leu139 (Sternberg et al, 2012).

Figure 2.

Crystal structure of Csy4/product RNA complex at 2.0 Å resolution. (A) Shown at left is the substrate RNA used to generate the protein/RNA complex. Cleavage by Csy4 (purple arrow) produces the product RNA (right) present in the crystal structure. Gray lettering denotes nucleotides for which there was no corresponding electron density and therefore could not be modeled. (B) Overall structure of Csy4S22C (dark green) bound to product RNA (light green). Electron density was well‐defined for all 187 amino acids of Csy4 and 16 of the 19 nucleotides in the product RNA. (C) Detailed view of the Csy4 active site (gray box, in B). The 2′‐hydroxyl nucleophile is marked with a pound sign and the scissile phosphate is marked with an asterisk. RNA/protein hydrogen‐bonding interactions are marked with dashes.

Table 1. Data collection and refinement statistics

Unique to the product complex structure is the presence of the 2′‐hydroxyl nucleophile in the active site (Figure 2C), which was readily apparent in the molecular replacement solution (Supplementary Figure S2). Upon modeling a ribonucleotide into the active site, we observed that the electron density was inconsistent with a ribose in the C3′‐endo conformation but was fit well with a ribose in the C2′‐endo form (Supplementary Figure S2). The 2′‐hydroxyl nucleophile is positioned between the side chains of Ser148 and Tyr176, both of which are within hydrogen‐bonding distance (2.9 Å and 3.2 Å) (Figure 2C), suggesting that these interactions may force the G20 ribose to adopt the C2′‐endo sugar pucker observed in the crystal structure. In‐line attack of a 2′‐hydroxyl nucleophile on the adjacent scissile phosphate requires a locally extended RNA backbone (Yang, 2011) and does not proceed when the sugar pucker is C3′‐endo. The observation of a C2′‐endo sugar pucker in the Csy4 active site is therefore representative of the extended conformation that would be required for cleavage to proceed.

Ser148 positions the RNA for cleavage

Our cleavage assays demonstrated that the S148A mutation is far more deleterious to catalysis than the Y176A mutation, suggesting that Ser148 is the primary residue responsible for positioning the 2′‐hydroxyl and maintaining the requisite extended phosphate backbone conformation. The Tyr176 side chain likely plays a redundant role in stabilization of the C2′‐endo conformation and may be more important for positioning His29. To test this hypothesis, we crystallized a complex of Csy4S148A and a 16‐nucleotide substrate RNA (Figure 3A). The resulting 2.6 Å structure (Figure 3B, Table I), solved by molecular replacement, likely contained a mixture of substrate and product RNAs (16‐ and 15‐nucleotides in length, respectively) due to the slow rate of Csy4S148A‐catalyzed cleavage. The C21 nucleotide, immediately downstream of the scissile phosphate, is disordered when present and electron density for this nucleotide is therefore not observed (Haurwitz et al, 2010). The Csy4S148A protein structure is similar to that of wild‐type Csy4 (RMSD=0.309 Å over 815 atoms), and the RNA hairpin is bound to the S148A mutant in the same location as observed in the product structure (RMSD=0.526 Å over 270 atoms; Supplementary Figure S1B). However, the active site ribose adopts a C3′‐endo sugar pucker in this case, thereby repositioning the 2′‐hydroxyl nucleophile 5.5 Å away from the Tyr176 side chain (Figure 3C). We conclude that the Tyr176 side chain is insufficient to maintain the C2′‐endo sugar pucker in the absence of Ser148, suggesting that the large catalytic defect for the S148A mutant may result from the Csy4 active site relying on the inherent sugar pucker interconversion rate in order for the substrate phosphate backbone to be properly extended for cleavage.

Figure 3.

Crystal structure of Csy4S148A/RNA complex at 2.6 Å resolution. (A) Shown at left is the substrate RNA used to generate the protein/RNA complex. Cleavage by Csy4 (purple arrow) produces product RNA (right). Because of the slow cleavage rate of the S148A mutant, crystals likely contained a mixed population of substrate and product RNAs. (B) Overall structure of Csy4S148A (dark purple) and RNA (light purple). 153/187 amino acids and 14/15 nucleotides could be modeled into the electron density. The amino acids composing the arginine‐rich helix are among those for which there is little to no electron density. (C) Superposition and close‐up of product complex (green) and S148A complex (purple) active sites (gray box, in B). The double‐headed black arrow highlights the 3.2 Å change in 2′‐hydroxyl location between the two structures. The two 2′‐hydroxyl nucleophiles are labeled with pound signs and the scissile phosphates are indicated with an asterisk.

His29 may interact directly with the 2′‐hydroxyl nucleophile

As described above, all of the Csy4/RNA crystal structures result from crystals grown at pH 4.6–5. To determine what interactions His29 may make in the absence of the potentially pH‐induced interaction with the scissile phosphate, we crystallized a complex of Csy4 and a 15‐nucleotide RNA composed of only the crRNA hairpin with a 3′‐hydroxyl terminus (Figure 4A). The 2.3 Å resolution structure of this complex (hereafter called the minimal structure) once again revealed a Csy4 conformation similar to that observed previously (RMSD=0.346 Å over 843 atoms; RNA superposition RMSD=0.499 Å over 263 atoms) (Figure 4B; Table I; Supplementary Figure S1C). While the locations of the Tyr176 and His29 side chains are nearly identical between the product and minimal structures, the G20 nucleotide and the active site loop that contains Ser148 shift 3.4 Å and 2.5 Å between the two structures, respectively (Figure 4C). The G20 ribose is in the C2′‐endo conformation, and the 2′‐hydroxyl nucleophile is 3.6 Å and 3.7 Å away from the His29 and Tyr176 side chains, respectively (Figure 4D). The lack of a 3′‐phosphate results in significant disorder in the active site loop, as evidenced by a lack of density for residue 149 and for the side chains of nearly all of the active site loop residues (Figure 4D). This structure provides evidence that there is flexibility in the location of RNA within the Csy4 active site because, in previous structures, the His29 side chain is greater than 5 Å from the G20 2′‐hydroxyl. This flexibility likely facilitates His29 activating the 2′‐hydroxyl nucleophile via proton abstraction.

Figure 4.

Crystal structure of Csy4/RNA minimal complex at 2.3 Å resolution. (A) The stem‐loop RNA used for co‐crystallography lacks a 3′‐phosphate. (B) Overall structure of Csy4 (dark red) and stem‐loop RNA (pink). 151/187 amino acids and all 15 RNA nucleotides could be modeled into the electron density. Electron density for the active site loop is severely broken, and a dashed line indicates its approximate location. There is no electron density for the arginine‐rich helix. (C) Superposition and detailed view of product complex (green) and minimal complex (red) active sites (gray box, in B). The scissile phosphate belonging to the product complex is marked with an asterisk and the two 2′‐hydroxyl nucleophiles are marked with pound signs. (D) Magnified view of the minimal complex active site. Black lines indicate the distances between active site residues and the 2′‐hydroxyl nucleophile.

Csy complex formation requires Csy4‐catalyzed cleavage of CRISPR transcripts

Recent work has demonstrated that Csy4 associates with three other Cas proteins (Csy1‐3) and a single copy of crRNA to form the Csy complex, which targets complementary nucleic acids (Wiedenheft et al, 2011b). To determine whether pre‐crRNA cleavage by Csy4 is necessary for complex formation, we co‐expressed Csy1‐3 and a pre‐crRNA with either wild‐type Csy4 or the catalytically inactive mutant, Csy4H29A (Haurwitz et al, 2010), in Escherichia coli BL21(DE3) cells. The Csy complex was affinity purified via a 6 × histidine tag appended to the N‐terminus of Csy3, followed by size exclusion chromatography. Co‐expression of the wild‐type proteins and pre‐crRNA yielded an RNP with an estimated molecular mass of ∼350 kilodaltons (Figure 5), in agreement with previous work (Wiedenheft et al, 2011b). Substitution of catalytically inactive Csy4 in the co‐expression experiment resulted in the purification of only Csy3, which was not associated with a crRNA (Figure 5). Csy3 over‐expressed on its own in E. coli BL21(DE3) cells purifies as both a large oligomeric complex containing non‐specific RNA and as a nucleic acid‐free monomer (unpublished observations), similar to the two peaks observed for Csy3 co‐expressed with mutant Csy4. To ensure that Csy4H29A is defective only in catalysis and not in its ability to interact with other Csy complex components, we mixed together Csy complex components that were individually purified and evaluated the mixtures by size exclusion chromatography. Adding either wild‐type or H29A Csy4 to Csy1‐3 and a mature crRNA resulted in Csy complex formation (Supplementary Figure S4), suggesting that the Csy4H29A mutant is defective only for catalysis and not for interaction with other Csy complex components, and that catalysis is a necessary precursor to complex formation.

Figure 5.

Csy4 cleavage of pre‐crRNA is required for Csy complex formation. (A) Schematic depicting pre‐crRNA cleavage by Csy4 and formation of the Csy CRISPR ribonucleoprotein (crRNP) complex. The CRISPR repeat and spacer sequence are in black and green, respectively. Cleavage sites are denoted with purple arrows. (B) Superose 6 gel filtration column elution profiles of affinity‐purified Csy1, Csy2, His6‐Csy3, and pre‐crRNA co‐expressed with wild‐type (blue) or H29A (red) Csy4. (C) Coomassie blue‐stained 12% SDS–PAGE showing protein components of the superose 6 fractions for wild‐type (lane 1) and H29A (lanes 2–4, as noted in B) Csy4 co‐expression assays. (D) SYBR Gold‐stained 15% denaturing PAGE showing phenol:chloroform extracted nucleic acids from superose 6 fractions (from B).

Taken together with previous work demonstrating that Csy complex assembly does not proceed in the absence of RNA (Wiedenheft et al, 2011b), we conclude that Csy4‐catalyzed biogenesis of mature crRNAs with fully processed termini is necessary for stable Csy complex formation.


The production of crRNAs is central to CRISPR‐mediated adaptive immunity in prokaryotes. The three crystal structures of Csy4/RNA complexes and quantitative cleavage assays presented here reveal an unexpected endoribonuclease active site in which a serine residue constrains the nucleophile‐containing ribose in the C2′‐endo sugar pucker and a histidine residue serves as the general base to activate the 2′‐hydroxyl nucleophile. Unlike RNase A and other well‐studied metal ion‐independent nucleases, the Csy4 active site lacks a general acid and positively charged residues near the active site that would lower the energetic barrier to the transition state, resulting in correspondingly slow cleavage rates. We propose that upon binding a pre‐crRNA substrate, the Ser148 residue rearranges the G20 ribose into the C2′‐endo conformation, providing the correct geometry for His29 to abstract a proton from the 2′‐hydroxyl nucleophile and enable nucleophilic attack of the scissile phosphate. The resulting 2′,3′‐cyclic phosphate terminus is likely opened to a 3′‐phosphate via hydrolysis by a water. Csy4 then retains its crRNA product (Sternberg et al, 2012) and serves as the nucleation point for Csy complex formation.

We observe that the G20 ribose in the wild‐type Csy4 active site adopts the C2′‐endo sugar pucker. The C2′‐endo conformation is generally rare in double‐stranded RNA but is overrepresented in catalytic active sites and RNA tertiary interactions (Cantor and Schimmel, 1980; Mortimer and Weeks, 2009). In the Csy4 active site, Ser148 and Tyr176 likely interact directly with the 2′‐hydroxyl nucleophile via hydrogen bonding, restraining the ribose ring in the C2′‐endo conformation. Mutation of Ser148 to alanine slows cleavage nearly 8000‐fold and allows the G20 ribose to retain the C3′‐endo conformation. We propose that this significant cleavage rate defect may arise from a particularly slow rate of C2′‐endo/C3′‐endo interconversion at the G20 ribose in the absence of the Ser148 side chain. While most RNA sugars interconvert between the C2′‐ and C3′‐endo conformations on a microsecond to millisecond time scale (Johnson and Hoogstraten, 2008), a discrete set of C2′‐endo nucleotides has been observed to experience local dynamics with half‐lives on the order of 10–100 seconds, significantly slower than other local RNA conformational changes (Gherghe et al, 2008; Mortimer and Weeks, 2009). For example, the folding rate of bacterial RNase P RNA is limited by the sugar pucker interconversion of a single RNA nucleotide from C3′‐endo to C2′‐endo, which occurs at a rate of ∼0.24 min⁻¹ (Mortimer and Weeks, 2009). Consistent with the observation that members of this class of slow interconverting C2′‐endo‐containing ribonucleotides are partially constrained by hydrogen‐bonding or base‐stacking interactions (Gherghe et al, 2008), the G20 nucleotide base pairs with C6, hydrogen‐bonds with Arg102 on the major groove face, and stacks below A19 and above Phe155.
We hypothesize that G20 belongs to this unusual class of C2′‐endo containing nucleotides and propose that the ∼8000‐fold defect in observed cleavage rate of the S148A mutant is due in large part to the extremely slow sugar pucker interconversion dynamics of the G20 nucleotide. However, we cannot rule out that the hydrogen bonding interaction between Ser148 and the 2′‐hydroxyl also contributes to nucleophile activation.

The observed rate of cleavage for wild‐type Csy4 (∼3 min⁻¹ at pH 7.2) is orders of magnitude slower than that of other well‐characterized RNases. For example, RNase A enzymes from a variety of organisms cleave RNA substrates with apparent single‐turnover rate constants of 910 to 40 500 min⁻¹ (Katoh et al, 1986), and the colicin E5 ribonuclease from E. coli cleaves minimal substrates with a kcat of ∼5000 min⁻¹ (Ogawa et al, 2006). In fact, Csy4 has an observed cleavage rate similar to ribozyme‐catalyzed RNA cleavage rate constants, which are typically <2 min⁻¹ (Zamel et al, 2004). Ribozymes perform the same transesterification reaction as protein RNases (Cochrane and Strobel, 2008), but are thought to be significantly slower because they typically lack general acids and bases with pKa values close to neutral pH (Yang, 2011). The well‐characterized metal‐independent RNase families of RNase A, RNase T1, and RNase T2 contain catalytic cores composed of a histidine pair; a glutamate and histidine; and a glutamate, lysine, and three histidines, respectively (Yang, 2011). Like many of these protein RNases, the Csy4 active site contains a histidine general base, but it appears to lack a general acid as there is no chemically appropriate residue positioned proximal to the 5′‐hydroxyl leaving group. Consistent with this observation is the sigmoidal shape of the Csy4 pH‐rate profile (Figure 1D). Whereas RNase A exhibits a bell‐shaped pH‐rate profile indicative of a cleavage mechanism that relies on two titratable residues (Raines, 1998), the Csy4 pH‐rate profile is consistent with only a single titratable residue that is likely to be His29.
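The rate comparison above can be checked with simple arithmetic. The following sketch uses only the approximate rate constants quoted in the text and assumes first‐order kinetics, for which the half‐life is ln 2 / k; the helper name `half_life_s` is illustrative.

```python
import math

CSY4_KOBS = 3.0                    # min^-1, wild-type Csy4 at pH 7.2 (approximate, from the text)
RNASE_A_RANGE = (910.0, 40500.0)   # min^-1, range quoted for RNase A enzymes

def half_life_s(k_per_min: float) -> float:
    """Half-life in seconds of a first-order process with rate constant k (min^-1)."""
    return math.log(2) / k_per_min * 60.0

for k in RNASE_A_RANGE:
    ratio = k / CSY4_KOBS
    print(f"RNase A at {k:g} min^-1 is ~{ratio:.0f}-fold faster (10^{math.log10(ratio):.1f})")

print(f"Csy4 cleavage half-life: ~{half_life_s(CSY4_KOBS):.0f} s")
print(f"RNase P pucker interconversion (0.24 min^-1) half-life: ~{half_life_s(0.24):.0f} s")
```

With these inputs the RNase A range is about 300‐ to 13 500‐fold faster than Csy4, and the wild‐type Csy4 half‐life at pH 7.2 is on the order of 14 seconds.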

An additional hallmark of metal ion‐independent RNases is stabilization of the pentacovalent transition state by one or more positively charged residues (Cochrane and Strobel, 2008). Like ribozymes, which lack functional groups that are positively charged at a neutral pH, Csy4 does not have any positively charged residues in or surrounding the active site. We hypothesize that Csy4 compensates for a lack of stabilizing positive charges by making additional hydrogen bonds to the transition state, analogous to the hairpin ribozyme, which makes 2–3 more contacts to the transition state than precursor or product RNAs (Rupert and Ferre‐D'Amare, 2001; Rupert et al, 2002; Cochrane and Strobel, 2008). This is consistent with the ∼350‐fold effect on cleavage observed for alanine substitution of Ser150 or Thr151, which lie in the active site loop and participate in a hydrogen bonding network that can include Ser148 and the scissile phosphate (Supplementary Figure S3). Through this network, Ser150 and Thr151 may aid in the stabilization of the pentacovalent transition state.

Using an in vivo assembly assay, we found that crRNA processing by the endoribonuclease Csy4 is essential to the stable formation of crRNA‐containing targeting complexes that bind to complementary nucleic acids and trigger their degradation. Because Csy complexes do not stably form on unprocessed pre‐crRNA, we hypothesize that the formation of the mature Csy crRNP requires a free 5′ terminus generated by Csy4‐catalyzed cleavage. Mature crRNAs across multiple CRISPR types contain 8‐nucleotides of repeat‐derived sequence at the 5′ end (Brouns et al, 2008; Carte et al, 2008; Marraffini and Sontheimer, 2008; Hale et al, 2009), and it has been proposed that these sequences, termed the 5′ handle, may serve as Cas protein binding sites (Terns and Terns, 2011; Wiedenheft et al, 2012). For example, the 5′ handle forms a hook‐like structure in the crRNP from E. coli K12 (Cascade) that correlates with termination of the ribonucleoprotein filament (Wiedenheft et al, 2011a). We speculate that the 5′ handle of the mature crRNA in P. aeruginosa recruits one or more Csy proteins to the nascent RNP. The requirement for a free crRNA 5′ terminus during complex formation would therefore point to specific recognition of the 5′ handle in the assembly of Cas protein complexes.

These observations, along with recent work demonstrating a very tight crRNA binding affinity by Csy4 (50 pM) (Sternberg et al, 2012), have led us to conclude that Csy4 evolved as a finely tuned RNA binding protein while retaining only modest cleavage kinetics. Similarly, the CRISPR type I‐E endoribonuclease (referred to as Cas6e, Cse3, or CasE) exhibits relatively slow cleavage kinetics (∼5 min⁻¹) and tight substrate and product binding (Kd≈3 nM) (Sashital et al, 2011). Both Csy4 and Cse3 retain their crRNA products and are members of the crRNPs that target invading nucleic acids. These two CRISPR systems have likely evolved CRISPR endoribonucleases whose highly accurate substrate selection ensures incorporation of the appropriate RNA into the targeting complex, while the lack of a substrate turnover requirement has not contributed selective pressure for rapid cleavage kinetics.

Materials and methods

Protein expression and purification

Csy4 and single point mutants were expressed and purified as previously described (Haurwitz et al, 2010) with minor exceptions. Briefly, His6‐MBP‐Csy4 or His6‐Csy4 fusion constructs (vectors pHMGWA and pHGWA, respectively (Busso et al, 2005)) were expressed in either E. coli BL21(DE3) cells or E. coli Rosetta 2(DE3) cells (Novagen). Following batch nickel resin affinity purification, cleavage with TEV protease, and a second nickel resin step, samples were separated on a single Superdex 75 (16/60) size exclusion column (GE Healthcare) in 100 mM HEPES pH 7.5, 500 mM potassium chloride, 5% glycerol, and 1 mM TCEP. Proteins were then dialyzed against 100 mM HEPES pH 7.5, 150 mM potassium chloride, 5% glycerol, and 1 mM TCEP; concentrated; and stored at −80°C.

RNA cleavage assays

Single‐turnover cleavage experiments were performed at 24°C in 20 mM HEPES, 100 mM potassium chloride, pH 7.2. Cleavage reactions were carried out in a 60 μl volume containing 500 pM [5′‐32P]‐crRNA repeat (5′‐GUUCACUGGCCGUAUAGGCAGCUAAGAAA‐3′), 400 nM Csy4, and 72 units RNasin Plus (Promega). At noted time points, 10 μl of the reaction was removed and quenched with 30 μl of acid phenol:chloroform (Ambion). Then, 5 μl of the aqueous layer was mixed with 5 μl of formamide loading buffer and separated on a 15% denaturing polyacrylamide gel in 1 × TBE running buffer. Cleaved and uncleaved RNAs were visualized by phosphorimaging and quantified using ImageQuant (GE Healthcare). For each sample, the percentage of RNA cleaved (intensity of the cleaved RNA band divided by the sum of the cleaved and uncleaved bands) was plotted as a function of time. Plots were fit to an exponential decay curve using Kaleidagraph (Synergy Software). Rate constants are reported as kobs because the rate‐limiting step for cleavage is unknown. All cleavage assays were done in triplicate.
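The quantification and fitting described above can be illustrated with a minimal Python sketch. It assumes that the fraction cleaved follows a single exponential approach to a plateau, f(t) = A × (1 − exp(−kobs × t)); the exact functional form used in Kaleidagraph is not stated here, so this model, the function names, and the search bounds are illustrative assumptions rather than the procedure actually used.

```python
import math


def fraction_cleaved(cleaved_intensity, uncleaved_intensity):
    """Cleaved band intensity divided by the sum of cleaved and uncleaved bands."""
    return cleaved_intensity / (cleaved_intensity + uncleaved_intensity)


def fit_single_exponential(times, fractions, k_lo=1e-4, k_hi=1e2, iters=100):
    """Least-squares fit of f(t) = A * (1 - exp(-k * t)); returns (A, k).

    Assumed model, not taken from the source. For a fixed k the best
    amplitude A has a closed form, so only k is searched numerically.
    """
    def best_amplitude(k):
        g = [1.0 - math.exp(-k * t) for t in times]
        return sum(f * x for f, x in zip(fractions, g)) / sum(x * x for x in g)

    def sse(log_k):
        k = math.exp(log_k)
        a = best_amplitude(k)
        return sum((f - a * (1.0 - math.exp(-k * t))) ** 2
                   for f, t in zip(fractions, times))

    # Golden-section search over log(k); assumes one minimum within the bounds.
    lo, hi = math.log(k_lo), math.log(k_hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    k = math.exp((lo + hi) / 2.0)
    return best_amplitude(k), k
```

Given a list of time points and the corresponding fractions cleaved from one replicate, `fit_single_exponential(times, fractions)` returns the fitted amplitude and kobs in the reciprocal of the time unit supplied. Fitting each replicate separately would allow the triplicate values to be averaged afterwards.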

Cleavage reactions for pH‐rate profiles were 55 μl in volume, contained 400 nM Csy4 and 500 pM [5′‐32P]‐crRNA repeat, and were performed in 20 mM buffer, 100 mM potassium chloride, and 1 mM dithiothreitol (DTT). Buffers used were as follows: pH 4.0–6.5, citric acid; pH 7.0–8.5, 4‐(2‐hydroxyethyl)‐1‐piperazineethanesulfonic acid (HEPES); pH 9.0–9.5, N‐cyclohexyl‐2‐aminoethanesulfonic acid (CHES); and pH 10.0–11.0, N‐cyclohexyl‐3‐aminopropanesulfonic acid (CAPS). Cleavage data were collected and analyzed as described above. pH‐rate plots in Figure 1D were fit to the following equation using Kaleidagraph (Synergy Software): kobs = (kobs,MAX × Ka)/(Ka + [H+]), where Ka is an apparent acid dissociation constant and [H+] is the proton concentration.
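As a brief illustration of the pH‐rate equation, the following Python sketch evaluates kobs as a function of pH for a single titratable group. The function and parameter names are illustrative, and no fitted values from this study are assumed; any k_max and pKa passed in are the caller's own.

```python
def k_obs(pH, k_max, pKa):
    """Evaluate kobs = (kobs,MAX * Ka) / (Ka + [H+]).

    pH and pKa are converted to [H+] and Ka in molar units; k_max sets
    the units of the returned rate constant.
    """
    h = 10.0 ** (-pH)      # proton concentration [H+]
    ka = 10.0 ** (-pKa)    # apparent acid dissociation constant Ka
    return k_max * ka / (ka + h)
```

Two properties follow directly from the equation: at pH equal to the apparent pKa the rate is half of kobs,MAX, and well below the pKa the rate increases roughly 10‐fold per pH unit, as expected when the deprotonated form of a single group is required for activity.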


Crystallization

Csy4/RNA complexes were generated and purified as previously described (Haurwitz et al, 2010). Briefly, an excess of synthetic crRNA fragment was added to Csy4 and the sample was incubated at 30°C for 30 min. For the product complex, this incubation step permitted full cleavage of the substrate RNA into product RNA. The RNA/protein complex was then separated from free RNA via size exclusion chromatography. All crystals were grown at 18°C using the hanging drop vapor diffusion method by mixing equal volumes (1 μl+1 μl) of protein/RNA sample and reservoir solution. All complexes yielded plate‐shaped crystals. Csy4(S22C)/product complex crystals were grown in 22% PEG4000, 120 mM sodium citrate pH 5.0, and 50 mM magnesium chloride. Csy4(S148A)/RNA complex crystals were grown in 20% PEG4000, 150 mM sodium citrate pH 5.0, and 100 mM magnesium chloride. Minimal complex crystals were grown in 21% PEG4000, 180 mM sodium citrate pH 5.0, and 100 mM magnesium chloride. Crystals were cryo‐protected with reservoir solution containing 25% glycerol and flash frozen in liquid nitrogen. Minimal complex crystals were soaked with mother liquor supplemented with 2 mM ammonium metavanadate for 1.5 h prior to cryo‐protection and flash freezing.

Structure determination

Diffraction data were collected at beam lines 8.2.1 and 8.3.1 of the Advanced Light Source, Lawrence Berkeley National Laboratory. Datasets were processed in XDS (Kabsch, 2010). All three structures were determined using molecular replacement in Phaser (Collaborative Computational Project, 1994; McCoy et al, 2007). Chains A and C (corresponding to protein and RNA, respectively) from the previously solved Csy4/substrate complex (PDB ID 2XLK) were used as search models for the product complex. The Csy4 protein (lacking the arginine‐rich helix) and RNA (lacking the A5 nucleotide) models from the product complex were used as search models for the S148A and stem‐loop complex structures. The models presented here resulted from iterative rounds of manual rebuilding in COOT (Emsley and Cowtan, 2004) and KiNG (Chen et al, 2009) and refinement in Phenix.refine (Adams et al, 2010). Riding hydrogens were included during refinement. Models were periodically validated using MolProbity (Chen et al, 2010).

All three complexes yielded crystals belonging to the C2 monoclinic space group that contained one complex per asymmetric unit. As in one of our previously published substrate structures (PDB ID 2XLI; (Haurwitz et al, 2010)), the RNA stems from neighboring complexes form coaxially stacked helices via an RNA kissing‐loop interaction. The RNA helix and the associated arginine‐rich alpha helix sit in a large solvent channel and exhibit elevated B factors. In the 2.0 Å resolution product structure, there is clear density for all amino acids in the arginine‐rich helix, whereas in the 2.6 Å S148A structure and the 2.3 Å minimal complex structure, there is no density for the arginine‐rich helix.

All structure figures were made using PyMol (DeLano, 2002).

Csy complex in vivo reconstitution

The four Csy proteins were co‐expressed from a polycistronic expression construct in which Csy3 had a His6 fusion tag along with a synthetic CRISPR locus containing eight repeats and seven identical spacers in Escherichia coli BL21(DE3) cells as described previously (Wiedenheft et al, 2011b). Site‐directed mutagenesis was used to introduce an alanine substitution at position 29 of the csy4 gene. Briefly, protein expression was induced with addition of 0.5 mM isopropyl β‐d‐1‐thiogalactopyranoside (IPTG) at an optical cell density at 600 nm of ∼0.5, followed by shaking at 18°C overnight. Samples were lysed and clarified as previously reported (Wiedenheft et al, 2011b). Samples were affinity purified with nickel NTA resin (Qiagen) and incubated overnight with tobacco etch virus (TEV) protease to release the His6 tag. Following a second nickel affinity step, samples were purified on a Superose 6 (10/300) size exclusion column (GE Healthcare) in 20 mM HEPES pH 7.5, 100 mM potassium chloride, and 1 mM TCEP.

Csy complex in vitro reconstitution

Csy3 was recombinantly expressed as a His6‐MBP fusion in E. coli BL21(DE3) cells. His6‐MBP‐Csy1 and untagged Csy2 were co‐expressed in E. coli BL21(DE3) cells. Both protein samples were subjected to the same purification steps as described above for Csy4. Mature crRNAs were purified from in vivo reconstituted Csy complex (see above) by acid phenol:chloroform extraction, chloroform extraction, and ethanol precipitation. Csy1/2, Csy3, Csy4, and crRNA were mixed in 1:6:1:1 molar ratios for a total of 160 μg of sample in 250 μl. Samples were subjected to size exclusion chromatography as described in the previous section.
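Splitting a fixed total mass among components at defined molar ratios can be sketched in Python as follows. The molecular weights of the individual components are not given in this section, so the function takes them as inputs; the function name and argument layout are illustrative assumptions.

```python
def split_mass_by_molar_ratio(total_mass, ratios, mol_weights):
    """Divide total_mass among components mixed at the given molar ratios.

    ratios and mol_weights are dicts keyed by component name. Each
    component's share of the mass is proportional to ratio * molecular
    weight, so the resulting mole amounts match the requested ratios.
    Returned masses carry the same unit as total_mass.
    """
    weighted = {name: ratios[name] * mol_weights[name] for name in ratios}
    total_weighted = sum(weighted.values())
    return {name: total_mass * w / total_weighted for name, w in weighted.items()}
```

Called with a total of 160 (μg), the ratios 1:6:1:1 for Csy1/2, Csy3, Csy4, and crRNA, and the molecular weight of each component, the function returns the mass of each component to add; the actual molecular weights must be supplied by the user.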

Conflict of Interest

The authors declare that they have no conflict of interest.

Supplementary Information

Supplementary Data [emboj2012107-sup-0001.pdf]


Acknowledgements

We thank M Jinek, B Wiedenheft, and D Sashital for helpful discussions and members of the Doudna lab for critical reading of the manuscript. SHS acknowledges support from the National Science Foundation and National Defense Science & Engineering Graduate Research Fellowship programs. JAD is a principal investigator of the Howard Hughes Medical Institute. Coordinates and structure factors for the Csy4‐crRNA complexes have been deposited in the Protein Data Bank under the accession codes 4AL5, 4AL6, and 4AL7.

Author contributions: REH designed experiments, purified proteins, performed single‐turnover cleavage assays, crystallized the complexes, solved the crystal structures, and wrote the manuscript. SHS designed experiments, performed pH‐rate profile experiments, and contributed to the manuscript. JAD designed experiments and wrote the manuscript.