The SAGA (Spt–Ada–Gcn5 acetyltransferase) complex is an important chromatin‐modifying complex that can both acetylate and deubiquitinate histones. Sgf29 is a novel component of the SAGA complex. Here, we report the crystal structures of the tandem Tudor domains of Saccharomyces cerevisiae and human Sgf29 and their complexes with H3K4me2 and H3K4me3 peptides, respectively, and show that Sgf29 selectively binds H3K4me2/3 marks. Our crystal structures reveal that Sgf29 harbours unique tandem Tudor domains in its C‐terminus. The tandem Tudor domains in Sgf29 tightly pack against each other face‐to‐face, with each Tudor domain harbouring a negatively charged pocket that accommodates the first residue (alanine) and the methylated K4 residue of histone H3, respectively. The H3A1 and K4me3 binding pockets and the limited binding cleft length between these two binding pockets are the structural determinants in conferring the ability of Sgf29 to selectively recognize H3K4me2/3. Our in vitro and in vivo functional assays show that Sgf29 recognizes methylated H3K4 to recruit the SAGA complex to its target sites and mediates histone H3 acetylation, underscoring the importance of Sgf29 in gene regulation.
The SAGA (Spt–Ada–Gcn5 acetyltransferase) complex has been extensively investigated due to its important role in the regulation of gene expression (Baker and Grant, 2007; Rodriguez‐Navarro, 2009). SAGA was identified in yeast as a 1.8‐MDa histone acetyltransferase complex (Grant et al, 1997). The complex is highly conserved from yeast to humans, further confirming its importance in transcriptional regulation. Structural integrity of the complex requires three core components, Spt7, Spt20 and Ada1 (Wu and Winston, 2002). The SAGA complex contains two catalytic proteins, Gcn5 and Ubp8. Gcn5 is the histone acetyltransferase catalytic component, while Ubp8 is the histone deubiquitinating component (Rodriguez‐Navarro, 2009). The complex contains about 20 proteins. Apart from the catalytic proteins Gcn5 and Ubp8, the remaining components have been shown to be important facilitators of either Ubp8 or Gcn5 catalytic function. These other proteins modulate SAGA catalytic activities through their functional domains (Lee and Workman, 2007). These include an acetyllysine‐binding bromodomain in Gcn5 and Spt7 (Zeng and Zhou, 2002); a DNA‐binding SWIRM domain in Ada2 (Qian et al, 2005; Da et al, 2006); a DNA‐ or histone‐binding SANT domain in Ada2 (Boyer et al, 2004); and WD40 domains of unknown function in Taf5 (Durso et al, 2001) and Spt8 (Sermwittayawong and Tan, 2006) (Supplementary Figure S1).
In 2002, Sgf29 was identified as a new component of the yeast SAGA complex (Sanders et al, 2002), but very little is known about its function. Like most SAGA proteins, Sgf29 is conserved from yeast to humans, and contains an N‐terminal coiled‐coil domain and C‐terminal putative Tudor domains. More recently, the rat orthologue of Sgf29 was shown to directly bind Ada3, a SAGA component and important modulator of Gcn5 activity, via its N‐terminal coiled‐coil domain and activate c‐Myc targeted gene expression (Kurabe et al, 2007). Downregulation of Sgf29 was able to suppress genes involved in c‐Myc‐mediated malignant transformation, implicating the important function of Sgf29 in proper gene regulation of STAGA, the mammalian homologue of SAGA.
Most recently, a study by Vermeulen et al (2010) used mass spectrometry‐based technologies to screen for interactors of activating and repressive tri‐methyl marks on H3 and H4 in human cells. Sgf29 was shown to bind both H3K4me3 and H3K4me2 sites, with a slight preference for H3K4me3. ChIP sequencing revealed the presence of Sgf29 at gene promoters, which overlapped with the H3K4me3 mark. Conventional biochemistry methods confirmed the H3K4me2/3‐specific binding by Sgf29. Sgf29, via its Tudor domains, was shown to be responsible for linking the human SAGA complex to H3K4me3 since knockdown of Sgf29 resulted in loss of H3K4me3 binding (Vermeulen et al, 2010). Therefore, Sgf29 acts as a molecular link between human SAGA and the post‐translational modification H3K4me3.
Many chromatin‐associated proteins or protein complexes have both histone writing and reading activities, which are crucial for their proper function. For example, Clr4, the methyltransferase of the ClrC complex in fission yeast, contains both a catalytic SET domain and a chromodomain, which methylates histone H3K9 and binds H3K9me3, respectively (Min et al, 2002; Zhang et al, 2008). The ability of Clr4 to both write and read H3K9me is essential to spread heterochromatin and facilitate heterochromatin maintenance (Zhang et al, 2008). Sgf29, as a reader of H3K4 methyl marks for the SAGA complex, joins the group of other proteins in both yeast and humans known to recognize methyl marks for their respective HAT complex and target its proper acetylation. For example, Yng1 is known to target NuA3 HAT activity to the 5′ ORF in yeast by binding tri‐methylated H3K4 via its PHD finger (Taverna et al, 2006). Similarly, in human cells, the PHD finger of the ING4 protein, a subunit in the HBO1 acetyltransferase complex, binds to tri‐methylated H3K4, increasing acetylation by the HBO1 acetyltransferase at the promoter of target genes, leading to efficient transcription (Hung et al, 2009; Saksouk et al, 2009). Misregulation of these targeted processes can have costly outcomes when they control genes that require tight regulation. Such is the case with ING4, a known tumour suppressor, where it drives expression of genes involved in anti‐cancer activities (Hung et al, 2009). Therefore, these subunit reader domains of HAT complexes are important to properly target writer functions, specifically acetyltransferase activity in this case, and misregulation or loss of their function can lead to aberrations in chromatin dynamics.
In this work, our structural studies show that Sgf29 contains unique tandem Tudor domains at its C‐terminus and our binding assays show that these tandem Tudor domains selectively bind H3K4me2/3. To elucidate the molecular mechanism of this specific recognition by Sgf29, we determined the crystal structures of the tandem Tudor domains of human and Saccharomyces cerevisiae Sgf29 in complex with different modified histone H3K4 peptides. Furthermore, our in vivo functional assays show that Sgf29 is required for histone H3 acetylation by the SAGA complex.
Results and discussion
Sgf29 preferentially recognizes histone H3K4me2/3 via its tandem Tudor domains
Based on secondary structure prediction, we found that Sgf29 contains a coiled‐coil domain at its N‐terminus and putative tandem Tudor domains at its C‐terminus (Figure 1A). In contrast to the sequence diversity at its N‐terminus, we found that the C‐terminal region of Sgf29 has relatively higher sequence identity than the N‐terminus, especially within the conserved Tudor domains (Figure 1B).
The Tudor domain, as an important member of the ‘Royal Family’ of histone‐binding modules, is structurally similar to the chromo, PWWP and MBT domains (Maurer‐Stroh et al, 2003), and has been shown to bind methylated histones (Adams‐Cioaba and Min, 2009). Thus, it was compelling to speculate that Sgf29 may preserve this histone methyllysine binding ability. To better understand the binding specificity of human hsSGF29 and its yeast orthologue scSgf29, we used isothermal titration calorimetry (ITC), surface plasmon resonance (SPR) and fluorescence polarization (FP) assays to measure the binding affinity of both hsSGF29 and scSgf29 for histone H3K4, H3K9, H3K27, H3K36, H3K79 and H4K20 peptides bearing different methylation states. We found that neither hsSGF29 nor scSgf29 exhibits detectable binding to any of the H3K27, H3K36, H3K79 and H4K20 peptides, regardless of their methylation states (Table I). Instead, both Sgf29 proteins show strong binding to methylated H3K4 peptides and preferentially bind H3K4me2 and H3K4me3 marks (Table I). Yeast scSgf29 shows no detectable binding to the unmodified H3K4 peptide. Human hsSGF29 can still bind the unmodified H3K4 peptide, but with nearly 50‐fold weaker affinity (Kd=24.0 μM). Interestingly, hsSGF29 can also bind H3K9me3 peptides as strongly as the unmodified H3K4 peptide (Table I). Because the H3K9 peptides we used share the first 11 residues with the H3K4 peptides, and scSgf29 binds neither the H3K9me3 peptide nor the unmodified H3K4 peptide (Table I), hsSGF29 presumably binds H3K9 peptides via the K4‐containing sequence. To corroborate this, we synthesized a peptide, H3(6–20)K9me3, covering residues 6–20 of histone H3. This peptide has the same length as the wild‐type (WT) H3K9me3 peptide, and its K9me3 residue occupies the position, counted from the N‐terminus, that K4me3 occupies in H3. Our ITC‐binding assay shows that binding to hsSGF29 is completely abolished for the H3(6–20)K9me3 peptide (Table I). Hence, the tandem Tudor domains of Sgf29 selectively recognize H3K4me2/3 marks.
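As a rough illustration of how a fold difference in affinity relates to a difference in binding free energy, the following minimal sketch compares two dissociation constants. Only the 24.0 μM value and the 25°C assay temperature come from the text; the 0.5 μM reference Kd is a placeholder chosen so that the ratio lands near the roughly 50‐fold difference quoted above, and is not read from Table I. The function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fold_weaker(kd_test_um, kd_ref_um):
    """Return how many times weaker the test ligand binds than the reference."""
    return kd_test_um / kd_ref_um


def ddg_kj_per_mol(kd_test_um, kd_ref_um, temp_k=298.15):
    """Difference in binding free energy, RT*ln(Kd_test/Kd_ref).

    A positive value means the test ligand binds less favourably.
    The default temperature is 25 degrees C, the ITC temperature in the text.
    """
    return R_KJ * temp_k * math.log(kd_test_um / kd_ref_um)


kd_unmodified = 24.0  # uM, unmodified H3K4 peptide (from the text)
kd_reference = 0.5    # uM, ASSUMED placeholder for a tight-binding peptide

print(f"fold weaker: {fold_weaker(kd_unmodified, kd_reference):.0f}")
print(f"ddG: {ddg_kj_per_mol(kd_unmodified, kd_reference):.1f} kJ/mol")
```

Under these assumed values, a roughly 50‐fold loss of affinity corresponds to a free‐energy penalty of just under 10 kJ/mol, which is in the range of one or two hydrogen bonds or a salt bridge.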
Sgf29 contains unique tandem Tudor domains
To uncover the molecular architecture of the putative tandem Tudor domains of Sgf29, we determined the crystal structures of human hsSGF29 (residues 129–293) and S. cerevisiae scSgf29 (residues 113–259). The crystal structures show that both human and yeast Sgf29 indeed contain tandem Tudor domains at their C‐termini. The scSgf29 and hsSGF29 structures are highly conserved, with an RMSD of 1.6 Å for all aligned Cα atoms, although scSgf29 and hsSGF29 share only 20% amino‐acid sequence identity (Figure 1B). Each Tudor domain consists of five twisted anti‐parallel β strands forming a typical barrel‐like fold (Figure 1C and D). scSgf29 was crystallized with a fused maltose‐binding protein (MBP) tag to aid crystallization (Supplementary Figure S2A–C). The complexes of scSgf29 with the methylated H3K4 peptides were crystallized at pH 4.0. At such low pH, scSgf29 can still bind H3K4me2/3, although the binding affinity decreases dramatically (Supplementary Figure S2D and E). The tandem Tudor domains in Sgf29 tightly pack against each other face‐to‐face, an arrangement distinct from other known tandem Tudor domain structures (Botuyan et al, 2006; Huang et al, 2006; Adams‐Cioaba et al, 2010), as discussed below.
Structural basis for the selective binding of Sgf29 to histone H3K4me2/3 peptides
To shed light on the molecular mechanism of selective binding of Sgf29 to methylated histone H3K4, we determined the crystal structures of hsSGF29 (residues 115–293) and scSgf29 (residues 113–259) in complex with di‐ and tri‐methylated H3K4 peptides, respectively. The structures of the H3K4me2–Sgf29 and H3K4me3–Sgf29 complexes are almost identical for both hsSGF29 and scSgf29 (Figure 2; Supplementary Figures S3 and S4). We used a longer hsSGF29 construct for crystallization of the complexes because crystals were of higher quality than those of the short construct (residues 129–291). The longer construct contains an extra α helix in the N‐terminus, which is located between the two Tudor domains and sits outside the histone binding cleft (Figure 2B). Hence, this extra N‐terminal α helix is not directly involved in histone binding, which is also confirmed by our binding results. Our studies show that the short hsSGF29 fragment binds as tightly as the long hsSGF29 fragment to the H3K4me3 histone peptide (Supplementary Figure S5).
From the complex structures of both scSgf29 and hsSGF29, we can see that the first four residues of the H3K4me3 peptide are snugly embedded between the two Tudor domains (Figure 2A and B). The first residue in the H3K4me3 peptide (H3A1) is anchored in a small negatively charged pocket created by the first Tudor domain in both hsSGF29 and scSgf29 (Figure 2). The backbone amine group of H3A1 forms a conserved salt bridge with the carboxylate oxygen of D194 in hsSGF29 (D163 in scSgf29) (Figure 2). The importance of this H3A1‐binding pocket is exemplified by the significant reduction in binding affinity (∼70‐fold weaker) when the H3A1 residue is deleted from the H3K4me3 peptide (Table I). Because the H3A1‐binding pocket is negatively charged and rigidly formed, acetylation of H3A1 completely abolishes H3K4me3 binding to hsSGF29 (Table I). Hence, the short side chain H3A1 residue is tightly secured in the small negatively charged pocket formed in the first Tudor domain of Sgf29.
The tri‐methyllysine K4me3 is bound in a conserved negatively charged pocket located in the second Tudor domain in both hsSGF29 and scSgf29, and is flanked by two aromatic residues (Y238 and Y245 in hsSGF29, and Y205 and Y212 in scSgf29) (Figure 2). The backbone amine group of the K4me3 also forms a hydrogen bond with the carbonyl oxygen of Y245 in hsSGF29 (Y212 in scSgf29). A third aromatic residue, F264 in hsSGF29 (F229 in scSgf29), lies underneath and buttresses Y238 (Y205 in scSgf29), although it also provides additional hydrophobic interaction with the methyllysine. A negatively charged residue D266 (E232 in scSgf29) on the other side of the K4me3‐binding pocket interacts with the methyllysine via a salt bridge. Therefore, the tri‐methyllysine residue K4me3 is anchored by the second Tudor domain through cation‐π, van der Waals, hydrophobic, electrostatic and hydrogen bond interactions.
We performed a series of mutagenesis experiments in hsSGF29 in order to verify the importance of the K4me3 and H3A1 binding residues. Mutating Y245 to alanine completely abolished the binding, whereas mutating F264, Y238 or D266 to alanine markedly reduced the binding affinity (Table II). This is consistent with our structural observations that Y245 is essential in binding H3K4me3, and that Y238, F264 and D266 reinforce the binding.
Mutating D194 to alanine or D194/D196 to alanines in hsSGF29 disrupts binding to H3K4me3 (Table II), underscoring the importance of D194 in H3K4me2/3 binding through anchoring the H3A1 residue. The D196A mutant could still bind H3K4me3, although with an ∼12‐fold weaker affinity (Table II). In human but not yeast Sgf29, D196 forms an extra hydrogen bond with the H3A1 residue. D196 is replaced by E165 in scSgf29, which does not form the hydrogen bond with H3A1 (Figure 2A). The D196R mutation completely disrupts binding to H3K4me3 (Table II), which indicates that the negative charge around the H3A1‐binding pocket is necessary to stabilize the complex.
The backbone of Arg2 in H3 (H3R2) forms two hydrogen bonds with the backbone amine group and the side chain carbonyl oxygen of Thr242 in hsSGF29. Mutating Thr242 to alanine severely diminishes the binding (Table II). Nevertheless, the H3R2 side chain is flexible and does not form any conserved interactions with hsSGF29 (Figure 2; Supplementary Figure S3D). This explains why methylation of H3R2 does not affect its binding to hsSGF29 (Kd=0.5 μM; Table I).
Sgf29 has unique tandem Tudor domains, distinct from those of JMJD2A, FMR1, 53BP1 and SND1
So far, the crystal structures of a few tandem Tudor domains have become available. Based on the architecture of these Tudor domains, tandem Tudor domains can be classified into five subfamilies, i.e., Sgf29, JMJD2A (Huang et al, 2006), 53BP1 (Botuyan et al, 2006), FMR1 (Ramos et al, 2006; Adams‐Cioaba et al, 2010) and SND1 (Liu et al, 2010a, 2010b) subfamilies. The two Tudor domains in Sgf29 face each other with the first two β strands from each Tudor domain packing against each other (Figure 3A and B), and are accordingly named ‘face‐to‐face’ tandem Tudor domains. The two Tudor domains in 53BP1 line up one after another with the first two β strands from the first Tudor packing against the last two β strands from the second Tudor (Figure 3C) (Botuyan et al, 2006), and are named ‘lineup’ tandem Tudor domains. The two Tudor domains in the FMR1 subfamily of Tudor proteins, such as FMR1 (Ramos et al, 2006), FXR1/2 (Adams‐Cioaba et al, 2010) and UHRF1, form a head‐to‐head architecture with the same ends of the β strands packing against each other (Figure 3D). JMJD2A has hybrid Tudor domains (Figure 3E) (Huang et al, 2006). The extended Tudor domain in SND1 and other TDRD members was predicted to contain a single Tudor domain, but recent structural studies demonstrated that the N‐ and C‐terminal extensions surrounding the canonical Tudor domain in SND1 and other TDRD members fold together and form a Tudor‐like domain (Figure 3F) (Liu et al, 2010a, 2010b). All five kinds of tandem Tudor domains bind lysine/arginine methylated proteins, but only in Sgf29 do both Tudor domains extensively bind methylated histones (Supplementary Figure S6). In Sgf29, each individual Tudor domain forms one negatively charged pocket, and the distance between these two pockets determines the selectivity of Sgf29. The other four kinds of tandem Tudor domains mainly use one Tudor domain to interact with their corresponding ligands, which explains why they have less strict sequence selectivity. For example, JMJD2A not only binds H3K4me3, but also binds H4K20me3 in the opposite orientation (Lee et al, 2008).
Interaction of Sgf29 tandem Tudor domains with methylated H3K4 is important for the acetyltransferase activity of SAGA complex in vivo
Although recombinant Gcn5 can only acetylate histone H3 lysine 14 (H3K14) in vitro, the substrate specificity of Gcn5 expands to histone H3 lysine 9, 14, 18, 23 and histone H4 when it functions with other SAGA components as a complex (Zhang et al, 1998; Grant et al, 1999). To investigate whether Sgf29 is able to regulate the acetyltransferase activity of the SAGA complex in budding yeast in vivo, the global acetylation levels of H3 and H4 were examined. We found that, similar to Gcn5 and Ada3 deletion (Figure 4A), Sgf29 deletion reduces the global levels of H3K9ac and H3K18ac and, to a lesser extent, H3K14ac, but has no significant effect on the levels of H3K23ac and H4ac, which is consistent with H3K9, H3K14 and H3K18 being the primary acetylation sites for SAGA (Grant et al, 1999; Lee and Workman, 2007) (Figure 4A). Notably, the yeast Sas3‐dependent NuA3 acetyltransferase complex also acetylates H3K14 in vivo, explaining the modest reduction in H3K14ac seen in our SAGA mutants (Howe et al, 2001; Martin et al, 2006; Taverna et al, 2006).
Since the deletion of scSgf29 affects the acetylation of H3 markedly, we introduced the scSgf29 gene into an Sgf29 knockout yeast strain to investigate if scSgf29 could rescue histone acetylation of H3 (Figure 4B, compare lanes WT, ΔSgf29 and pSgf29_FL). The results show that the global acetylation of H3K9, H3K14 and H3K18 is restored in the rescue assay. We next made some point or deletion mutants of scSgf29, which would reduce or disrupt scSgf29 binding to H3K4me2/3, and examined the effect of these mutations on the histone H3 and H4 acetylation by transforming these scSgf29 mutants into the Sgf29 knockout yeast strains. We found that these mutants could not rescue H3K9, H3K14 and H3K18 global acetylation (Figure 4B). The expression levels of WT and mutant scSgf29 were monitored by quantitative RT–PCR. We found that the global reduction of H3K9, H3K14 and H3K18 acetylation is not caused by the loss of scSgf29 expression in mutant yeast strains (Figure 4C).
In order to explore whether Sgf29 has a similar role in mammalian cells, we used shRNA to knock down hsSGF29 in MDA‐MB231 cells (Figure 4D), and found that hsSGF29 knockdown reduced the acetylation levels of H3K9, H3K14, H3K18 and H3K23, but not that of H4 (Figure 4E). Hence, in both S. cerevisiae and humans, Sgf29 regulates H3K9, H3K14 and H3K18 acetylation. Interestingly, it was reported recently that knockdown of the Spt20 subunit of the SAGA complex leads to a reduction in the global acetylation levels of histone H3K9 and H3K14, but not histone H4 (Nagy et al, 2010). Together, these results suggest that recognition of H3K4me2/3 marks by Sgf29 has an important role in H3 acetylation by the SAGA complex.
Sgf29 is important for recruiting SAGA to its target sites
Because tri‐methylation of histone H3K4 is required for subsequent acetylation of histone H3 (Jiang et al, 2007), and Sgf29, as a component of the SAGA complex, binds H3K4me2/3 marks and is required for H3 acetylation, we speculated that the recognition of the methylated histone mark H3K4me2/3 by Sgf29 tandem Tudor domains would contribute to the targeting of the acetyltransferase activity of the SAGA complex and mediate the crosstalk between the post‐translational modifications of histones. If this hypothesis holds, Sgf29 would not be directly required per se for the HAT activities of SAGA. To verify our hypothesis, first we asked whether the loss of Sgf29 would affect the integrity of the SAGA complex by multidimensional protein identification technology (MudPIT) mass spectrometry analysis. SAGA was purified from an Spt8TAP‐tagged strain with Sgf29 deleted and compared with that purified from a WT strain. Our MudPIT mass spectrometry analysis revealed that deletion of Sgf29 resulted only in the loss of the Sgf29 protein (Supplementary Table SI), showing that Sgf29 was not required for the complex integrity. Next, in order to determine a role for Sgf29 in SAGA complex in vitro, a variety of substrates including core histones, nucleosome or histone H3 tails with various modifications were used to examine the HAT activity of the ΔSgf29 SAGA complex on these substrates. We found that deleting Sgf29 had marginal effects on SAGA activity on all these substrates in vitro (Figure 5A). Additionally, in vitro HAT activity was not dependent on the H3K4 methylation and deleting the first five residues of histone H3 had no significant effect on its HAT activity (Figure 5A). Therefore, Sgf29 is required for complete and efficient HAT activity of SAGA in vivo, but dispensable in vitro.
Since deleting Sgf29 reduced H3 acetylation levels in vivo, we wanted to see if deletion of Sgf29 affected genome‐wide H3 acetylation or localization of SAGA to genes. To do this, we employed genome‐wide ChIP‐Chip analysis. We found that deletion of Sgf29 resulted in a loss of H3K9 acetylation at promoters when compared with the WT strain (Figure 6A and B). We also found that SAGA localization was decreased at promoters genome‐wide when compared with the WT strain, as determined by ChIP through the Spt8 subunit (Figure 6C and D). These results are consistent with a recent study by Vermeulen et al (2010), in which ChIP‐Seq data revealed that human Sgf29 is highly associated with promoter regions and this association coincides with H3K4me3 at the promoter. In addition, we looked at a specific subset of genes previously identified as requiring SAGA for their expression (Lee et al, 2000). This subset of 593 genes was analysed in the same manner as the genome‐wide data and similar trends were found in both data sets (Figure 6A–D). In both cases, deletion of Sgf29 resulted in a decrease in both H3K9 acetylation and SAGA localization (Figure 6A–D). Furthermore, we dissected the genome into different categories based on gene length and transcription rate, and determined how deleting Sgf29 would affect acetylation and SAGA recruitment on these different categories of genes (Supplementary Figures S7–S10). In general, we found that deletion of Sgf29 caused similar decreases in acetylation and SAGA recruitment across the different categories analysed. Interestingly, the genome‐wide reduction of SAGA was more dramatic than the reduction at the SAGA‐specific genes (Figure 6D). This is likely due to the fact that SAGA is present at many more genes than it has been shown to regulate (Venters et al, 2011). Additionally, our genome‐wide analysis was conducted using the SAGA‐specific subunit, Spt8, while the expression studies were done using deletions that affect not only SAGA, but also the SAGA‐related complex SliK/SaLSA (Lee et al, 2000).
Since Sgf29 is required to maintain global H3 acetylation, we examined what effects deleting Sgf29 or one or both of its Tudor domains would have on the well‐studied SAGA target gene, GAL1. SAGA and H3 acetylation are present at the GAL1 UAS, 5′ and 3′ ORF under inducing conditions (Govind et al, 2007). SAGA targeted to the transcribed coding region of the GAL1 gene specifically increases H3 acetylation and nucleosome eviction and also promotes RNA Pol II processivity (Govind et al, 2007). Thus, loss of proper recruitment of SAGA to all regions of GAL1 would severely affect efficient transcription of GAL1. At the GAL1 locus, we found that scSgf29 deletion or Tudor domain deletions resulted in H3K9 hypoacetylation across the entire GAL1 locus (Figure 5C). We next examined whether scSgf29 deletion or mutations would result in the loss of SAGA at GAL1. We found that, indeed, scSgf29 deletion or mutations resulted in the loss of SAGA recruitment to GAL1 (Figure 5D). Consistently, transcription of GAL1 was severely reduced in all the Sgf29 deletion or Tudor domain deletion strains (Figure 5B). Recruitment of SAGA by Sgf29 and the subsequent acetylation are dependent upon Set1‐specific methylation, since deletion of the COMPASS subunits Spp1 (required for tri‐methylation of H3K4) and Swd1 (required for di‐ and tri‐methylation of H3K4) also results in loss of proper SAGA localization and H3 acetylation (Figure 5C and D). Previous experiments have shown that Set1‐dependent H3K4me is present at the GAL1 UAS and 5′ region during gene activation (Bryk et al, 2002; Ingvarsdottir et al, 2007). Presumably, lack of Sgf29 compromises the recruitment of the SAGA complex to the GAL1 locus, reducing the acetylation of histone H3 and thereby impairing the transcription of the GAL1 gene. Thus, Sgf29 is an important component in SAGA to bridge histone methylation and histone acetylation.
These findings contribute to the extensive reading and writing functions of chromatin‐associated proteins. Specifically, in the SAGA complex, we find that Sgf29 is an important chromatin reader that helps target SAGA to promoters via its Tudor domains. Interestingly, recent studies show that human Sgf29 binding to H3K4me3 is augmented by the presence of H3K9 and H3K14 acetylation (Vermeulen et al, 2010). Most likely, this is the result of a synergistic effect of the methyl reader, Sgf29, and the acetyl reader, Gcn5 via its bromodomain that together may help to propagate transcription of a specific SAGA‐dependent gene.
In summary, Sgf29 exhibits strong binding affinity and selectivity for H3K4me2/3 owing to the extensive intermolecular interactions between the complementary surfaces of Sgf29 and the first four residues of the H3K4me peptide. In particular, the H3A1 and K4me3 binding pockets created by each of the tandem Tudor domains and the fixed distance between these two pockets are the structural determinants that enable Sgf29 to selectively recognize H3K4me2/3 rather than other histone lysine sites. The ability of Sgf29 to recognize and bind methylated H3K4 is required for proper histone H3 acetylation by SAGA.
Materials and methods
Protein expression and purification
Two fragments of human SGF29 (residues 115–293 and 129–291) covering the tandem Tudor domains were subcloned into a pET‐28a‐MHL vector. The protocol for protein expression and purification is similar to that described in Adams‐Cioaba et al (2010). The his‐tag was cleaved from SGF29 (residues 115–293) by the addition of 0.05 mg of TEV protease per milligram of SGF29 protein, followed by incubation on ice for 12 h. The sample was then passed through a Ni‐NTA column and the flow‐through was collected and further purified by size exclusion chromatography (Superdex 75, GE Healthcare). For the other SGF29 construct (residues 129–291), the his‐tag was not removed. The fragment of yeast Sgf29 (residues 113–259) containing the tandem Tudor domains was cloned into a pET‐23b vector with N‐terminal fusion to MBP (MBP‐Sgf29). Recombinant MBP‐Sgf29 was overexpressed in Escherichia coli BL21 (DE3) at 20°C for about 20 h. MBP‐Sgf29 was purified by affinity chromatography (amylose resin, New England Biolabs), anion exchange chromatography (Resource Q, GE Healthcare) and size exclusion chromatography (Superdex 200, GE Healthcare). Purified MBP‐Sgf29 was concentrated to 10 mg/ml in Tris–HCl, pH 8.0 and 150 mM NaCl.
FP‐binding assays were performed at 25°C in 20 mM Tris–HCl, pH 7.5, 50 mM NaCl, 1 mM DTT and 0.01% Tween‐20 as described in Xu et al (2010). The ITC assays were carried out at 25°C in 25 mM Tris–HCl, pH 7.5, 50 mM NaCl and 1 mM DTT as described in Xu et al (2011). The SPR assays were performed at 8°C using a BIAcore 3000 instrument. The peptides were biotinylated and immobilized on a streptavidin‐precoated sensor chip (SA chip), and all measurements were carried out in 50 mM Tris–HCl, pH 8.5, 350 mM NaCl and 5% glycerol (solution A). The protein, at concentrations ranging from 0.625 to 3 μM, was flowed for 3 min over the peptide‐coated SA chip at a flow rate of 30 μl/min and the change in response units was recorded. Protein dissociation was monitored for 3 min by flowing solution A over the sensor chip at a flow rate of 30 μl/min. Assuming a single‐site equilibrium model, the Kd was determined by global non‐linear regression fitting of the association and dissociation curves according to the Langmuir binding isotherm model. The peptides were purchased from Millipore.
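The single‐site (1:1 Langmuir) model used for the SPR fits can be sketched as follows. This is a minimal illustration of the model equations only, not the fitting routine of the instrument software; the rate constants and maximal response in the example are hypothetical values chosen for illustration, while the 0.625 μM concentration and the 3 min phases come from the text.

```python
import math


def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def equilibrium_response(conc, kd, r_max):
    """Langmuir isotherm: steady-state response at analyte concentration conc (M)."""
    return r_max * conc / (conc + kd)


def association_response(t, conc, k_on, k_off, r_max):
    """Response at time t (s) during analyte injection, starting from zero."""
    k_obs = k_on * conc + k_off  # observed rate constant of the approach to equilibrium
    r_eq = equilibrium_response(conc, k_off / k_on, r_max)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Response at time t (s) after the switch to buffer, decaying from r_start."""
    return r_start * math.exp(-k_off * t)


# HYPOTHETICAL parameters, for illustration only
k_on, k_off, r_max = 1.0e5, 0.05, 100.0
conc = 0.625e-6  # lowest protein concentration in the text, in M

r_end = association_response(180.0, conc, k_on, k_off, r_max)  # 3 min injection
print(f"Kd = {kd_from_rates(k_on, k_off) * 1e6:.2f} uM")
print(f"response after 3 min association: {r_end:.1f} RU")
print(f"response after 3 min dissociation: {dissociation_response(180.0, r_end, k_off):.3f} RU")
```

In a global fit, a single pair of k_on and k_off values is optimized against the association and dissociation curves of all protein concentrations at once, and Kd is then obtained as their ratio.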
Purified Sgf29 (residues 115–293) was mixed with methylated histone peptides by addition of a two‐fold molar excess of peptide to the protein solution and crystallized at 18°C in 0.1 M Bis‐Tris, pH 5.5, 20–34% PEG3350, 100 mM ammonium sulphate and 10 mM strontium chloride. Crystals of SGF29 (residues 129–291) were obtained at 18°C in 0.1 M HEPES, pH 7.5, 0.2 M NaCl, 25% PEG3350. Crystals of MBP‐fused yeast Sgf29 (113–259) (MBP‐Sgf29) were grown at 8°C in 0.1 M NaAc‐HAc, pH 4.0, 2.0 M (NH4)2SO4. Crystals of MBP‐Sgf29 in complex with methylated histone peptides were grown at 8°C in 0.1 M NaAc‐HAc, pH 4.0, 20–25% PEG3350.
Diffraction experiments were performed at 100 K. Further experimental details are listed in Table III. Diffraction intensities were indexed and scaled with HKL2000/HKL3000 (Minor et al, 2006). The structure of Sgf29–H3K4me3 (115–293) was solved by the single wavelength anomalous diffraction method (Wang, 1985) using a selenomethionine derivative (Hendrickson et al, 1990) of the protein and data collected on the Rigaku FRE generator with a Cu target. Heavy atom substructure determination and initial phase calculation were carried out using SHELXD and SHELXE (Schneider and Sheldrick, 2002), respectively. Automated model building in ARP/warp (Perrakis et al, 2001) was used to generate the initial model. Manual model improvements and refinement were carried out with COOT (Emsley and Cowtan, 2004) and REFMAC (Vagin et al, 2004), respectively. The MOLPROBITY server (Davis et al, 2004) was used for the validation of protein model geometry. The other human SGF29 structures were solved by molecular replacement using the program MOLREP (Vagin and Teplyakov, 2000). X‐ray diffraction data of apo MBP‐Sgf29 and MBP‐Sgf29 in complex with methylated histone peptides were collected at 100 K at beamline BL17U of SSRF and 3W1A of BSRF and processed with HKL2000 (Minor, 1997; Otwinowski and Minor, 1997). The initial phases were obtained by molecular replacement with the program MOLREP using the MBP structure (PDB ID: 1HSJ) as template (Vagin and Teplyakov, 2000), and manual model improvements and refinement were carried out with the programs COOT (Emsley and Cowtan, 2004) and REFMAC (Vagin et al, 2004), respectively. Crystal diffraction data and refinement statistics are displayed in Table III.
Purification of TAP‐tagged complexes was carried out as previously described (Lee and Workman, 2007) with the following modifications: Elutions were carried out in a volume of 500 μl and repeated five times for a total volume of 3 ml of purified complexes. TAP‐purified complexes were resolved on a 10% SDS–PAGE gel and visualized by silver staining.
Yeast strains and media
Yeast strain BY4742_ΔSgf29 (Sgf29 knockout) (MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 YCL010c∷kanMX4) and its isogenic parent strain BY4742 (MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) were obtained from EUROSCARF. Rich medium (YP, 2% peptone, 1% yeast extract) plus 2% glucose (YPD) and YC complete medium plus 2% glucose or 3% glycerol, respectively, were used for growth as required. Standard transformation protocols were used for strain construction (Gietz and Woods, 2002).
The pRS316‐scSgf29 centromere plasmid was constructed by cloning a 1607‐bp fragment, which contains the scSgf29 coding region and its upstream 400 bp and downstream 400 bp regions, into the BamHI–XhoI sites of pRS316. The plasmid was confirmed by DNA sequencing. Mutants of scSgf29 were created by site‐directed mutagenesis with primer‐specific two‐step PCR, using the WT pRS316‐scSgf29 plasmid as template, to change the desired amino acids into alanine (A). All the mutants were cloned into the yeast centromere plasmid pRS316 and confirmed by DNA sequencing.
Semi‐quantitative growth assay
Phenotypic analysis was performed using serial dilutions of yeast cells grown at 30°C to early mid‐log phase (OD600 of 1.0–1.5). Serial dilutions were carried out by spotting cells onto the indicated plates, diluting four‐fold for each spot, with a starting cell density of 10^7 cells/ml. Plates were grown at 30°C for 2–3 days and imaged.
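The cell densities implied by this spotting scheme can be worked out with a short calculation. The sketch below is illustrative only: the starting density and four‐fold dilution factor come from the text, whereas the number of spots per row (`n_spots`) is not stated and is assumed here to be five.

```python
def dilution_series(start_density, fold=4.0, n_spots=5):
    """Return the cell density (cells/ml) of each successive spot.

    start_density and fold follow the text; n_spots is an assumed value.
    """
    return [start_density / fold ** i for i in range(n_spots)]


# Starting density of 10^7 cells/ml, diluted four-fold per spot.
series = dilution_series(1e7, fold=4.0, n_spots=5)
for spot, density in enumerate(series, start=1):
    print(f"spot {spot}: {density:.3g} cells/ml")
```

Under these assumptions the fifth spot is 256‐fold more dilute than the first.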
TCA‐precipitated proteins were urea‐denatured, reduced, alkylated and digested with endoproteinase Lys‐C (Roche) followed by modified trypsin (Promega) as described (Washburn et al, 2001). Peptide mixtures were loaded onto 100 μm fused silica microcapillary columns packed with 5‐μm C18 reverse phase (Aqua, Phenomenex), strong cation exchange particles (Partisphere SCX, Whatman) and reverse phase (McDonald et al, 2002). Loaded microcapillary columns were placed in‐line with a Quaternary 1100 series HPLC pump (Agilent) and an LTQ or XP linear ion trap mass spectrometer equipped with a nano‐LC electrospray ionization source (ThermoFinnigan). Fully automated 10‐step MudPIT runs were carried out on the electrosprayed peptides, as described (Florens and Washburn, 2006). Tandem mass (MS/MS) spectra were interpreted using SEQUEST (Eng et al, 1994) against a database of 11 986 amino‐acid sequences, consisting of 5880 S. cerevisiae proteins (non‐redundant entries from NCBI 2008‐02‐11 release), 177 usual contaminants (such as human keratins, IgGs and proteolytic enzymes), and, to estimate false discovery rates, 5993 randomized sequences for each non‐redundant protein entry. Peptide/spectrum matches were selected and compared using DTASelect/CONTRAST (Tabb et al, 2002) with the following criteria set: spectra/peptide matches were only retained if they had a DeltCn of at least 0.08, and minimum XCorr of 1.8 for singly‐, 2.5 for doubly‐ and 3.5 for triply charged spectra. In addition, peptides had to be fully tryptic and at least seven amino acids long. Combining all runs, proteins had to be detected by at least two such peptides, or one peptide with two independent spectra.
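The selection criteria above amount to a two‐stage filter, first on individual peptide/spectrum matches and then on proteins. The following is a minimal sketch of that logic, not the DTASelect/CONTRAST implementation; the `PSM` record, its field names and the example protein and peptide names are hypothetical, and dropping charge states other than 1–3 is an assumption, since the text gives no threshold for them.

```python
from collections import defaultdict
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_XCORR = {1: 1.8, 2: 2.5, 3: 3.5}  # minimum XCorr per charge state
MIN_DELTCN = 0.08
MIN_LENGTH = 7


@dataclass(frozen=True)
class PSM:
    """One peptide/spectrum match (hypothetical record layout)."""
    protein: str
    peptide: str
    charge: int
    xcorr: float
    deltcn: float
    fully_tryptic: bool


def passes(psm):
    """Apply the per-spectrum criteria."""
    threshold = MIN_XCORR.get(psm.charge)
    if threshold is None:  # assumption: other charge states are discarded
        return False
    return (psm.deltcn >= MIN_DELTCN
            and psm.xcorr >= threshold
            and psm.fully_tryptic
            and len(psm.peptide) >= MIN_LENGTH)


def accepted_proteins(psms):
    """Keep proteins with >= 2 passing peptides, or 1 peptide with >= 2 spectra."""
    spectra = defaultdict(lambda: defaultdict(int))
    for psm in psms:
        if passes(psm):
            spectra[psm.protein][psm.peptide] += 1
    return {protein for protein, peptides in spectra.items()
            if len(peptides) >= 2 or max(peptides.values()) >= 2}


example = [
    PSM("protA", "AAAAAAAK", 2, 3.0, 0.10, True),
    PSM("protA", "CCCCCCCR", 2, 2.6, 0.09, True),   # second peptide for protA
    PSM("protB", "DDDDDDDK", 3, 3.6, 0.12, True),
    PSM("protB", "DDDDDDDK", 3, 3.9, 0.20, True),   # same peptide, second spectrum
    PSM("protC", "EEEEEEEK", 1, 1.9, 0.05, True),   # fails DeltCn
    PSM("protD", "FFFFFFFK", 2, 2.7, 0.10, True),   # one peptide, one spectrum
]
result = accepted_proteins(example)
```

In this toy input only the hypothetical proteins `protA` and `protB` satisfy the protein‐level rule.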
Antibodies and western blotting for scSgf29
Yeast strains were grown to stable log phase (OD600 of 0.6–1.0) in YC complete medium plus 2% glucose. Cells were then harvested and washed twice in 1 ml PBS. The pellets were resuspended in 500 μl Lyticase Buffer (LB; 50 mM Tris–HCl at pH 7.5, 10 mM MgCl2, 1 M sorbitol) containing 50 U lyticase and incubated at 30°C for 1 h. Spheroplasts were then collected, washed twice in LB and resuspended in 500 μl EBX (20 mM Tris–HCl at pH 7.4, 100 mM NaCl, 0.25% Triton X‐100, 5 mM DTT, 50 mM Na‐butyrate, plus protease inhibitor cocktail). Triton X‐100 was added up to 0.5% to lyse the outer cell membrane, and the lysate was layered over 1 ml NIB Buffer (20 mM Tris–HCl at pH 7.4, 100 mM NaCl, 1.2 M sucrose, 5 mM DTT, 50 mM Na‐butyrate, plus protease inhibitor cocktail) and centrifuged. The nuclear pellets were then resuspended in 500 μl EBX, Triton X‐100 was added up to 1% to lyse the nuclear membrane, and the samples were centrifuged to obtain the chromatin and nuclear debris. The chromatin was washed three times with EBX and suspended in 50 μl 1 M Tris (pH 8.0). SDS–PAGE loading buffer was added and samples were incubated at 95°C. For western blotting assays, histone samples were electrophoresed on SDS‐15% polyacrylamide gels, electroblotted onto PVDF membrane and visualized immunochemically by standard methods. Immunodetection was performed with an enhanced chemiluminescence kit (34079, Thermo). The primary antibodies used were anti‐acetyl‐H3K9 (07‐352, Upstate), anti‐acetyl‐H3K14 (07‐353, Upstate), anti‐acetyl‐H3K18 (ab1191, Abcam), anti‐acetyl‐H3K23 (ab46982, Abcam), anti‐acetyl‐H4 (06‐598, Upstate), anti‐mono‐methyl‐H3K4 (ab8895, Abcam), anti‐di‐methyl‐H3K4 (ab32356, Abcam), anti‐tri‐methyl‐H3K4 (9727s, Cell Signaling), anti‐Histone H3 (ab1791, Abcam), anti‐Histone H4 (ab10158, Abcam) and anti‐GCN5p (ab63810, Abcam); the secondary antibody was goat anti‐rabbit IgG, HRP conjugate (12‐348, Millipore).
Antibodies and western blotting for hsSGF29
MDA‐MB231 cells were cultured in RPMI with 10% FBS. shRNAs were obtained from the RNAi Consortium (TRC), courtesy of Dr J Moffat, and were processed and used as outlined in the TRC protocols (http://www.broadinstitute.org/rnai/trc). The Sgf29 and Gcn5l2 shRNAs, chosen on the basis of best knockdown, were clones NM_138414.1‐269s1c1 and NM_021078.2‐1848s21c1, respectively. For western blot experiments, histones were extracted by the standard acid extraction method using 0.1 N HCl overnight and acetone precipitated; the protein amount was quantified and 0.1 μg was run on SDS Tris‐Bis gels (Invitrogen). The antibodies used to probe were from Abcam: total histone H3 (#1791), H3K18ac (#1191) and Upstate: H3K9ac (#06‐942).
Chromatin immunoprecipitation assays were performed as described by Kuo and Allis (1999). The wild‐type, ΔSgf29, sgf29ΔT1 and sgf29ΔT2 strains were derivatives of BY4742 (Research Genetics, Huntsville, AL). Yeast cultures (50 ml) were grown in SC medium containing 2% dextrose to an absorbance at 600 nm of 0.5; cells were then reseeded in SC medium containing 2% galactose and grown until they reached an absorbance at 600 nm of 1.0. Equal amounts of yeast extracts (1.5 mg total protein) were immunoprecipitated with 1 μl of antibody against acetyl histone H3, lysine 9 (Millipore, 07‐352) or antibody against Ada2 in a total reaction volume of 400 μl. DNA was amplified using oligonucleotide primers (Invitrogen) specific for GAL1 UAS and GAL1 5′ and 3′ intergenic regions. The recovered DNA was quantified in triplicate by real‐time PCR using Power SYBR Green PCR Master Mix (Applied Biosystems) and the Bio‐Rad MyiQ Single Color real‐time PCR detection system. ChIP values are represented as percent input, where the signal obtained from the ChIP is divided by the signal obtained from the respective input sample.
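As an illustration of the percent input calculation, the sketch below divides the mean ChIP signal by the mean signal of the matching input sample. It assumes that the triplicate measurements are already expressed as linear quantities (for example, derived from a standard curve) and that replicates are averaged before taking the ratio; neither detail is specified in the text, and the numbers are made up.

```python
from statistics import mean


def percent_input(ip_signals, input_signals):
    """Percent input = 100 * (mean ChIP signal) / (mean input signal).

    Both arguments are replicate measurements on a linear scale
    (assumed; conversion from raw Ct values is not shown here).
    """
    return 100.0 * mean(ip_signals) / mean(input_signals)


# Hypothetical triplicate signals for one primer pair.
value = percent_input([2.0, 2.0, 2.0], [40.0, 40.0, 40.0])
print(f"{value:.1f}% of input")
```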
HAT assays were carried out as previously described (Eberharter et al, 1996), with the following changes: in the peptide HAT assay, 0.5 μg of each individual histone peptide, synthesized by Tufts University Core Facility, was used in a 30‐μl reaction. Both the core histones and the nucleosomes were isolated from HeLa nuclear extract and 0.5 μg of each was used in the assays described. Additionally, the reactions were carried out for 30 min at 30°C and activity was monitored as DPM (disintegrations per minute). Equal amounts of Spt8TAP and Spt8TAP (ΔSgf29) were used as the HAT enzyme in the reaction.
mRNA isolation and qRT–PCR
The WT, ΔSgf29, sgf29ΔT1, sgf29ΔT2 and ΔSpp1 strains were derivatives of BY4742 (Research Genetics). Yeast cultures (50 ml) were grown in SC medium containing 2% dextrose to an absorbance at 600 nm of 0.5; cells were then reseeded in SC medium containing 2% galactose and grown until they reached an absorbance at 600 nm of 1.0. Total RNA was extracted from cell pellets and then DNase treated using the TURBO DNA‐free kit from Ambion. RNA (1 μg) was reverse transcribed using the SuperScript VILO cDNA synthesis kit (Invitrogen, # 11750). Generated cDNA was amplified in real time using iQ SYBR Green Supermix (Bio‐Rad, # 2012‐03) and the Bio‐Rad MyiQ Single Color real‐time PCR detection system. cDNA was amplified using primers specific for GAL1 and ACT1 genes.
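The text does not state how the GAL1 signal was normalized. One common approach, shown here purely as an assumption, is the comparative Ct method with ACT1 as the reference gene and an amplification efficiency of about 100% for both primer pairs. The function names and Ct values below are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target abundance relative to the reference gene: 2^-(Ct_target - Ct_ref).

    Assumes ~100% amplification efficiency (a doubling per cycle).
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Normalized target level in a sample relative to a control strain."""
    sample = relative_expression(ct_target_sample, ct_ref_sample)
    control = relative_expression(ct_target_control, ct_ref_control)
    return sample / control


# Made-up Ct values: GAL1 and ACT1 in a mutant versus the WT strain.
ratio = fold_change(22.0, 18.0, 20.0, 18.0)
print(f"GAL1 in mutant relative to WT: {ratio:.2f}")
```

With these illustrative values, a two‐cycle delay in GAL1 amplification at equal ACT1 levels corresponds to a four‐fold lower normalized signal.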
To monitor the expression of WT scSgf29 and mutants, yeast strain BY4742 with WT scSgf29 or mutants was grown in YPD overnight. Overnight cultures were reseeded into fresh medium to an OD600 of 0.4–0.5 and grown at the same temperature to early mid‐log phase (OD600 of 1.0–1.5). Cells were then collected and RNA was obtained with the RNeasy® Mini Kit (Qiagen, 74104).
To analyse GAL1 expression, yeast strains BY4742_SGF29Δ and BY4742 were grown in YPD overnight. Cells were then collected and washed twice with autoclaved ddH2O. After washing, yeast cells were diluted in YP‐raffinose (2%) or YC‐raffinose (2%) to mid‐log phase and grown for 2–3 h. Galactose (2%) was then added to induce the expression of GAL1. Cells were collected after induction for 2 h and RNA was extracted with TRIzol (Invitrogen, 15596‐026).
Reverse transcription was performed using the TAKARA PrimeScript® Reverse Transcription System with 200–500 ng total RNA to yield 20 μl cDNA. cDNA was amplified in real time using TAKARA SYBR® Premix EX Taq™ with an ABI StepOnePlus™ Real‐Time PCR System (Applied Biosystems). The primers used for RT–PCR are listed below.
Median gene plots
GPR files were loaded into R and normalized within arrays with the Limma package, using default methods. For each set of strain/IP replicates, normalized expression ratios were averaged per probe. These were converted into wiggle‐format track files, yielding one track per strain/IP combination. These average tracks were used to create two sets of derivative tracks, one set being mutant/WT ratios for each IP, the other being mark/H3 for each non‐H3 IP. These three sets of tracks (Averages, Mutant/WT, Mark/H3) were summarized using the average gene analysis pipeline described (Li et al, 2007), with one modification: for each bin, instead of averaging the expression values, the median of non‐zero values was taken, as we wanted to reduce sensitivity to outliers.
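The modification to the binning step can be illustrated with a small example. This is a sketch of the summary statistic only, not of the average gene analysis pipeline itself; returning `None` for a bin with no non‐zero values is an assumption, since the text does not say how such bins were handled, and the bin contents are made up.

```python
from statistics import mean, median


def bin_median_nonzero(values):
    """Median of the non-zero values in one bin.

    Returns None when the bin has no non-zero values (assumed behaviour).
    """
    nonzero = [v for v in values if v != 0]
    return median(nonzero) if nonzero else None


# Hypothetical per-bin probe values; the first bin contains an outlier (9.0).
bins = [[0.0, 1.0, 1.25, 9.0], [0.0, 0.0], [0.5, 1.5]]
profile = [bin_median_nonzero(b) for b in bins]

# For comparison, the plain mean of the first bin is pulled up by the outlier.
first_bin_mean = mean(bins[0])
```

In the first bin the median of non‐zero values is 1.25, whereas the mean over all values is 2.8125, which shows why the median is less sensitive to a single extreme probe.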
To observe the variation in H3K9Ac and Spt8 profiles among expressed versus non‐expressed genes, or short versus long genes, the median gene profiles were divided into subsets based on gene properties (Supplementary Figures S7–S10). Genes were divided into five quantiles based on microarray expression values for sgf29Δ, and median gene profiles were generated for these quantiles in three conditions (mutant, WT and mutant/WT), for both K9Ac and Spt8 data. Genes were then divided into five quantiles by length, and the process was repeated. To observe changes in the above effects on SAGA‐dependent genes, we selected a list of genes whose expression levels fell more than two‐fold on deletion of SAGA, from the Young lab website (http://web.wi.mit.edu/young/pub/holstege.html). The above analyses were then repeated using only this subset of genes. Final gene counts for whole‐genome and SAGA‐dependent genes are slightly smaller in these analyses (6229/6693) because we only used genes that were measured on both the ChIP and expression arrays. This also shortened the SAGA‐dependent list slightly (593/611).
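The subsetting described above can be sketched as two steps: restrict the gene list to genes present on both array types, then rank the remaining genes by a property (expression value or length) and split them into five groups of nearly equal size. The function names, gene identifiers and values below are hypothetical, and the handling of ties or group boundaries in the original analysis is not known.

```python
def shared_genes(chip_genes, expression_genes):
    """Genes measured on both the ChIP and the expression array."""
    return set(chip_genes) & set(expression_genes)


def quantile_groups(value_by_gene, n_groups=5):
    """Rank genes by value and split them into n_groups nearly equal groups."""
    ranked = sorted(value_by_gene, key=value_by_gene.get)
    n = len(ranked)
    return [ranked[i * n // n_groups:(i + 1) * n // n_groups]
            for i in range(n_groups)]


# Made-up data: ten genes with expression values, one of them absent from ChIP.
expression = {f"g{i}": float(i) for i in range(10)}
chip = [f"g{i}" for i in range(10) if i != 3]

common = shared_genes(chip, expression)
groups = quantile_groups({g: expression[g] for g in common})
```

Each group can then be passed separately to the median gene profile calculation, once for expression‐based groups and once for length‐based groups.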
Protein Data Bank: Atomic coordinates and structure factors have been deposited under the following accession codes: 3MP8, 3MP6, 3MP1 for scSgf29 and its complexes with histone H3K4me peptides; 3LX7, 3ME9, 3MEA, 3MET, 3MEU, 3MEV, 3MEW for hsSGF29 and its complexes with histone H3K4me peptides.
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary Figures S1–S10
We thank SK Swanson, S Nakanishi, MP Washburn, A Shilatifard, G Wasney, M Vedadi, Z Zhu, J Kania, F MacKenzie and A Bochkarev for advice and technical assistance. We also thank Aled Edwards and Cheryl Arrowsmith for critical reading of the manuscript. This research was supported by the Structural Genomics Consortium, a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research, the Canadian Foundation for Innovation, Genome Canada through the Ontario Genomics Institute, GlaxoSmithKline, Karolinska Institute, the Knut and Alice Wallenberg Foundation, the Ontario Innovation Trust, the Ontario Ministry for Research and Innovation, Merck & Co., Inc., the Novartis Research Foundation, the Swedish Agency for Innovation Systems, the Swedish Foundation for Strategic Research and the Wellcome Trust to JM. This work was also supported by the grants from the Chinese Ministry of Science and Technology (Nos. 2009CB825502, 2006CB806501), ‘100 talents program’ of the Chinese Academy of Sciences, and the grant from National Natural Science Foundation of China (No. 30970576) to JZ, by a post‐doctoral fellowship to KKL from the Damon Runyon Cancer Research Foundation (1751‐03) and NIH Grants GM46787 to JLW and CA132878 to GM. Results shown in this report are derived from work performed at Argonne National Laboratory, Structural Biology Center at the Advanced Photon Source. Argonne is operated by UChicago Argonne, LLC, for the US Department of Energy, Office of Biological and Environmental Research under contract DE‐AC02‐06CH11357.
Author contributions: The overall study was conceived and designed by CX, JR, JZ and JM with important contributions from CB, KKL, TLB, PAG and JLW. CB, CX, JR, KKL, TLB, WT, DB, JL, MW, BOZ, BEF, AP and AAH performed the experiments. All authors analyzed the data. JZ and JM wrote the paper with substantial contributions from CX, JR, KKL and TLB.
Copyright © 2011 European Molecular Biology Organization