Related RNA polymerases (RNAPs) carry out cellular gene transcription in all three kingdoms of life. The universal conservation of the transcription machinery extends to a single RNAP‐associated factor, Spt5 (or NusG in bacteria), which renders RNAP processive and may have arisen early to permit evolution of long genes. Spt5 associates with Spt4 to form the Spt4/5 heterodimer. Here, we present the crystal structure of archaeal Spt4/5 bound to the RNAP clamp domain, which forms one side of the RNAP active centre cleft. The structure revealed a conserved Spt5–RNAP interface and enabled modelling of complexes of Spt4/5 counterparts with RNAPs from all kingdoms of life, and of the complete yeast RNAP II elongation complex with bound Spt4/5. The N‐terminal NGN domain of Spt5/NusG closes the RNAP active centre cleft to lock nucleic acids and render the elongation complex stable and processive. The C‐terminal KOW1 domain is mobile, but its location is restricted to a region between the RNAP clamp and wall above the RNA exit tunnel, where it may interact with RNA and/or other factors.
There is a Have you seen? (April 2011) associated with this Article.
Structural studies of cellular RNA polymerases (RNAPs) from all three kingdoms of life revealed a conserved enzyme architecture and active centre (Zhang et al, 1999; Cramer et al, 2000, 2001, 2008; Vassylyev et al, 2002; Hirata et al, 2008; Korkhin et al, 2009; Grohmann and Werner, 2011). In contrast, RNAP‐associated factors are not conserved between bacterial, archaeal, and eukaryotic lineages, except for the transcription elongation factor Spt5 that is called NusG in bacteria. NusG consists of an N‐terminal (NGN) domain and a flexibly linked C‐terminal Kyrpides–Ouzounis–Woese (KOW) domain (Knowlton et al, 2003; Mooney et al, 2009). Archaeal Spt5 is highly homologous to NusG and its NGN domain associates with the zinc‐binding protein Spt4 to form the Spt4/5 heterodimer (Hirtreiter et al, 2010; Klein et al, 2011). Eukaryotic Spt4/5 is called DSIF in metazoans and is very similar to the archaeal heterodimer, except that Spt5 contains an additional acidic N‐terminal region and an additional 3–4 C‐terminal KOW domains that are followed by a C‐terminal repeat region (CTR) (Hartzog et al, 1998; Wada et al, 1998; Guo et al, 2008).
The core function of NusG and Spt4/5 is to stimulate transcription elongation and RNAP processivity, and this function resides in the conserved NGN domain (Burova et al, 1995; Chen et al, 2009). The NGN domain binds the conserved coiled coil of the RNAP clamp (Sevostyanova et al, 2008; Mooney et al, 2009; Hirtreiter et al, 2010; Sevostyanova and Artsimovitch, 2010). NusG and Spt4/5 also have additional roles in transcription‐coupled processes. In bacteria, NusG is required for ρ factor‐dependent transcription termination (Sullivan and Gottesman, 1992; Cardinale et al, 2008), and it couples transcription to translation (Burmann et al, 2010; Proshkin et al, 2010). In eukaryotes, Spt4/5 is involved in mRNA 5′‐capping (Wen and Shatkin, 1999), promoter‐proximal gene regulation by the negative elongation factor NELF (Palangat et al, 2005), transcription‐coupled DNA repair (Jansen et al, 2000), organism development (Guo et al, 2000), and recruitment of activation‐induced cytidine deaminase to DNA during antibody diversification (Pavri et al, 2010). Spt4/5 is present on all transcribed yeast genes, and is apparently a general component of the elongation complex (Mayer et al, 2010).
To understand the mechanisms used by the Spt5/NusG elongation factor, structural details of its interaction with RNAP are required. Here, we present the crystal structure of a conserved complex of Spt4/5 from the archaeon Pyrococcus furiosus (Pfu) with the RNAP clamp domain. This structure leads to a reliable atomic model of the eukaryotic RNAP II elongation complex with Spt4/5, suggests the molecular basis of transcription processivity, and provides a framework for further studies of elongation‐coupled processes. The Spt5 NGN domain binds over the RNAP active centre cleft between the clamp on one side and the protrusion on the other side, enclosing the DNA–RNA hybrid and maintaining the transcription bubble. After our work had been completed, an electron microscopic reconstruction of an RNAP–Spt4/5 complex was reported that provided a medium‐resolution view of the Spt4/5‐containing RNAP elongation complex and resulted in similar overall conclusions (Klein et al, 2011).
A recombinant RNAP clamp that binds Spt4/5
Through long‐standing efforts, we were able to prepare milligram quantities of complexes of recombinant Saccharomyces cerevisiae Spt4/5 with endogenous yeast RNAP II and of Pfu Spt4/5 with the highly homologous endogenous Pfu RNAP (Materials and methods; Figure 1A and B). This demonstrated that recombinantly expressed Spt4/5 binds to endogenously purified RNAP, but these preparations never co‐crystallized. We thus considered determining the structure of the isolated RNAP clamp domain in complex with Spt4/5, which could enable accurate modelling of the RNAP–Spt4/5 complex. We chose to prepare the Pfu complex because the Pfu RNAP clamp contains several shorter loops and was thus predicted to exhibit less surface flexibility. Based on the free RNAP II structure (Armache et al, 2005), we designed a fusion protein of the three RNAP polypeptide parts that constitute the clamp domain. We fused residues 1053–1115, 3–318, and 334–371 of the three largest Pfu RNAP subunits B, A′, and A′′, respectively, separated by short linkers (Figure 1C). After expression of the fusion protein in bacteria, a soluble recombinant clamp domain was obtained (rClamp, Figure 1D). The rClamp protein was correctly folded since it formed a stable, apparently stoichiometric complex with recombinant Spt4/5 (Figure 1D).
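As a quick check on the size of the fusion construct, the three segments can be tallied as in the minimal sketch below. The variable and function names are illustrative, and the linker lengths are not given in the text, so they are left out of the total.

```python
# Residue ranges (inclusive) of the three clamp-forming segments, as quoted in the text.
segments = [
    ("B", 1053, 1115),
    ("A'", 3, 318),
    ("A''", 334, 371),
]


def segment_length(start, end):
    """Number of residues in an inclusive residue range."""
    return end - start + 1


# Length of each segment, keyed by the Pfu RNAP subunit it comes from.
lengths = {name: segment_length(start, end) for name, start, end in segments}

# Total clamp residues in the fusion, excluding the linkers (lengths not stated).
total_without_linkers = sum(lengths.values())

print(lengths)
print(total_without_linkers)
```

Under these numbers, most of the rClamp comes from subunit A′, with the B and A′′ segments contributing much shorter stretches.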
Structure of RNAP clamp–Spt4/5 complex
The purified rClamp–Spt4/5 complex could be crystallized and its X‐ray structure determined at 3.3 Å resolution (Materials and methods). For structure solution, we combined experimental phases obtained from anomalous diffraction of four zinc ions (three in the clamp and one in Spt4) with model phases obtained by molecular replacement with the S. cerevisiae clamp structure (Armache et al, 2005). The Methanococcus jannaschii Spt4/5 structure (Hirtreiter et al, 2010) was then fitted into the experimentally phased electron density map alongside the clamp structure, and after repeated cycles of rebuilding and refinement, an atomic model of the complex was refined that only lacked the Spt5 KOW domain, which was disordered (Table I; Figure 2). In the rClamp–Spt4/5 complex, the structures of free Spt4/5 and the clamp in free RNAP are essentially unaltered, except for minor local conformational changes.
Conserved clamp–Spt5 interaction
The structure revealed that the NGN domain of Spt5 binds to the RNAP clamp coiled coil as predicted (Mooney et al, 2009; Hirtreiter et al, 2010). The clamp–Spt5 interface comprises the tip and one side of the coiled coil, and a hydrophobic concave surface patch on the Spt5 NGN domain (Figure 3). The interaction involves the clamp coiled‐coil residues 255–268 from the Pfu RNAP subunit A′, which correspond to residues 279–292 of S. cerevisiae RNAP II subunit Rpb1 and residues 282–295 of Escherichia coli RNAP subunit β′ (Figures 2C and 3). The interaction patch on the Spt5 NGN domain involves 11 residues in three different regions of the primary sequence that cluster on the domain surface (Figure 2C). A structure‐based alignment of NGN domains from eukaryotic, archaeal, and bacterial homologues revealed that the surface patch is generally conserved, including most hydrophobic residues (Figures 2C and 3). These results are consistent with mutagenesis data that indicated that the concave patch on the NGN domain interacts with the clamp (Mooney et al, 2009; Hirtreiter et al, 2010). The conservation of the clamp–Spt5 interface indicates that our structure is a good model for all complexes of Spt5/NusG with RNAPs, and suggests a general architecture of the Spt5/NusG‐containing RNAP elongation complex, the minimal physiological form of the elongation complex.
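The three residue ranges quoted above span the same number of residues, so positions can be related by a simple offset. The sketch below illustrates this under the assumption of a gap‐free, one‐to‐one correspondence across the stretch; that assumption and all names in the code are illustrative and are not taken from the alignment in Figure 2C.

```python
# Coiled-coil interface ranges quoted in the text (inclusive residue numbers).
RANGES = {
    "Pfu A'": (255, 268),
    "S. cerevisiae Rpb1": (279, 292),
    "E. coli beta'": (282, 295),
}


def map_from_pfu(residue, target):
    """Map a Pfu A' residue number onto the target subunit.

    Assumes a gap-free, one-to-one correspondence across the stretch,
    which is an illustrative simplification, not a statement from the text.
    """
    pfu_start, pfu_end = RANGES["Pfu A'"]
    if not pfu_start <= residue <= pfu_end:
        raise ValueError("residue lies outside the quoted interface range")
    target_start, _ = RANGES[target]
    return residue - pfu_start + target_start


# All three quoted ranges cover the same number of residues.
span_lengths = {name: end - start + 1 for name, (start, end) in RANGES.items()}

print(span_lengths)
print(map_from_pfu(260, "S. cerevisiae Rpb1"), map_from_pfu(260, "E. coli beta'"))
```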
Spt5/NusG closes the RNAP active centre cleft
To obtain a model of the archaeal RNAP–Spt4/5 complex, we superimposed the clamp domain in our structure with the clamp in the structure of free Sulfolobus solfataricus RNAP (Hirata et al, 2008). To obtain models of the bacterial RNAP–NusG complex and the eukaryotic RNAP II–Spt4/5 complex, we repeated the superposition with the structures of Thermus thermophilus RNAP (Vassylyev et al, 2007) and S. cerevisiae RNAP II (Armache et al, 2005), respectively, and then replaced the archaeal Spt4/5 by T. thermophilus NusG (Reay et al, 2004) or yeast Spt4/5 (Guo et al, 2008) via superposition of their NGN domains. The resulting three models of corresponding complexes from all three kingdoms of life were free of steric clashes, and even a non‐conserved domain in the bacterial RNAP (Chlenov et al, 2005) could be accommodated (Figure 4; Supplementary data). The models showed that the NGN domain resides above the RNAP active centre cleft, essentially closing the cleft (Figure 4). In the bacterial model, the NGN domain reaches over the cleft and resides in contact distance to the RNAP lobe and protrusion (Figure 4). In the archaeal and eukaryotic models, a contact of the NGN domain with the protrusion and lobe may also be possible if the clamp closes slightly further. Spt4 points away from the RNAP surface, consistent with its non‐essential nature in eukaryotes and with the lack of an Spt4 homologue in bacteria.
The NGN domain locks nucleic acids in the cleft
We next modelled the RNAP II–Spt4/5 complex with the DNA template/non‐template duplex and the RNA product by including the nucleic acids from the complete elongation complex (Kettenberger et al, 2004; Andrecka et al, 2009). To obtain a model that was free of clashes, only a minor shift of the upstream DNA was required (Figure 5). The model shows that Spt4/5 is positioned on the elongation complex such that the nucleic acids, in particular the DNA–RNA hybrid and the DNA strands forming the transcription bubble, are locked in the enzyme active centre cleft (Figure 5). Upstream and downstream DNA are thus kept separated by Spt4/5. The DNA upstream duplex and non‐template strand within the bubble run along a positively charged surface of Spt4/5 (Figure 5B). At least part of this surface of the NGN domain is positively charged in all species investigated, even though only two basic residue positions are conserved (Figure 2C). The modelling is consistent with biochemical data, showing that NusG binds near the upstream edge of the transcription bubble (Sevostyanova and Artsimovitch, 2010) and that the NusG paralogue RfaH maintains the upstream bubble (Belogurov et al, 2010).
Restricted location of the KOW domain above exiting RNA
We next investigated the possible location of the universally conserved and flexible KOW domain located just C‐terminal of the NGN domain (Figure 2A). The last ordered residue in Spt5 (residue E85) is located between the top of the clamp and wall, about 55 Å above the RNA exit tunnel (Figure 6). Since the linker from this last ordered residue to the KOW domain is restricted to a length of 11–13 residues over species, the location of the KOW domain is restricted to a sphere of a maximum radius of ∼45 Å. The sphere encompasses the region between Spt4 and the RNAP clamp, wall, and Rpb4/7 subcomplex (Figure 6). To model possible locations of the KOW domain, we superimposed the NGN domains in available NusG/Spt5 structures that contain both domains (Steiner et al, 2002; Knowlton et al, 2003; Klein et al, 2011), with the NGN domain in our elongation complex model. The resulting positions of the KOW domain fall within the sphere and are realistic, as no clashes with RNAP were observed with the exception of one structure (PDB code 1M1G) (Figure 6 and data not shown). None of the modelled KOW domains are close to exiting RNA. Modelling also showed that the KOW domain cannot reach the RNA exit tunnel, even when the linker between the NGN and KOW domains is fully extended. The KOW domain may however contact RNA that has emerged well beyond the exit tunnel and has grown to 25–30 nucleotides in length.
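The ∼45 Å radius is consistent with a simple back‐of‐the‐envelope estimate for a fully extended linker of 11–13 residues. The sketch below assumes a rise of about 3.5 Å per residue for an extended polypeptide; this value is an assumption made for illustration and is not stated in the text.

```python
# Assumed rise per residue of a fully extended polypeptide chain, in angstroms.
# This value is an illustrative assumption; the text does not state it.
ASSUMED_RISE_PER_RESIDUE = 3.5


def max_linker_reach(n_residues, rise=ASSUMED_RISE_PER_RESIDUE):
    """Upper bound on the end-to-end distance of a fully extended linker."""
    return n_residues * rise


# Linker lengths across species, as quoted in the text (11-13 residues).
reach = {n: max_linker_reach(n) for n in (11, 12, 13)}

for n_residues, distance in reach.items():
    print(f"{n_residues} residues: up to {distance:.1f} A")
```

With this assumed rise, the longest linker gives an upper bound close to the quoted radius, while shorter linkers restrict the KOW domain further.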
Highly extended eukaryote‐specific Spt5 regions
We finally considered the possible location of eukaryote‐specific Spt5 regions located C‐terminal to the KOW1 domain (Figure 2A). Because of a very long linker between KOW domains 1 and 2 (118 and 114 residues in yeast and human Spt5, respectively), KOW domain 2 and subsequent regions could reach any position on the Pol II surface. If all linkers between the KOW domains and the CTR were fully extended, the C‐terminus of Spt5 would be located around 2000 Å away from the RNAP II surface. This corresponds to twice the length of a fully extended C‐terminal repeat domain (CTD) of yeast RNAP II (Cramer et al, 2001). In human Pol II, the Spt5 C‐terminus could reach up to 2600 Å from the Pol II surface, and this would be about 1.6 times the length of a hypothetical totally extended CTD.
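The two comparisons above imply approximate lengths for the fully extended CTD in each species. The sketch below uses only the rounded figures quoted in the text, so the results are rough; the dictionary names are illustrative.

```python
# Maximal Spt5 C-terminus distance from the polymerase surface, in angstroms (from the text).
SPT5_REACH = {"yeast": 2000.0, "human": 2600.0}

# Ratio of that distance to the fully extended CTD length (from the text).
RATIO_TO_CTD = {"yeast": 2.0, "human": 1.6}

# Extended CTD length implied by the two quoted figures for each species.
implied_ctd_length = {
    species: SPT5_REACH[species] / RATIO_TO_CTD[species] for species in SPT5_REACH
}

for species, length in implied_ctd_length.items():
    print(f"{species}: about {round(length)} A")
```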
Here, we report the crystal structure of a recombinant RNAP clamp in complex with Spt4/5, the only universally conserved transcription factor. The structure revealed a conserved clamp–Spt5 interface and enabled accurate modelling of RNAP complexes with Spt4/5 counterparts from all three kingdoms of life. These results represent a significant advance in our understanding of transcription complex architecture since atomic details of RNAP interactions with transcription factors are to date limited to the bacterial factors σ70 and Gfh1 (Vassylyev et al, 2002; Murakami et al, 2002a; Tagami et al, 2010), and the eukaryotic factors TFIIS and TFIIB (Kettenberger et al, 2003; Bushnell et al, 2004; Kostrewa et al, 2009; Liu et al, 2010).
Our work revealed that the Spt5 NGN domain resides above DNA and RNA bound in the RNAP active centre cleft, and provides an explanation for the universal function of NusG and Spt4/5 in transcription processivity during RNA elongation. The NGN domain locks nucleic acids in the cleft, preventing their dissociation and increasing elongation complex stability. In addition, interaction between the positively charged Spt5 surface and the negatively charged DNA non‐template strand may prevent collapse of the transcription bubble. Many of these conclusions could be drawn from a recently published electron microscopic reconstruction of an RNAP–Spt4/5 complex (Klein et al, 2011) and are confirmed here and extended based on high‐resolution data.
The data further provide insights into the initiation–elongation transition. During initiation, straight promoter DNA is melted and loaded into the cleft to trigger RNA synthesis. This results in upstream and downstream DNA duplexes that extend from RNAP at approximately right angles (Figure 5). Subsequent Spt4/5 binding may render the initiation–elongation transition irreversible, because it sterically enforces the nucleic acid arrangement and prevents RNA release and reassociation of DNA strands. Premature binding of Spt4/5 during initiation is likely prevented by initiation factors that occupy overlapping binding sites on the clamp, in particular the bacterial factor σ70 (Vassylyev et al, 2002; Murakami et al, 2002a, 2002b) and the archaeal/eukaryotic factor TFE/TFIIE (Chen et al, 2007). In addition, the results indicate that any model for transcription termination must explain how Spt4/5 is released from RNAP, to set free the nucleic acids.
Additional mechanisms contribute to Spt4/5 function during elongation and remain to be explored on a structural level. First, in an intact RNAP elongation complex, the NGN domain may contact the side of the cleft opposite the clamp, in particular the lobe and/or protrusion. This may involve subtle alterations in clamp position that could alter catalytic properties of RNAP allosterically (Hirtreiter et al, 2010) consistent with normal mode analysis (Yildirim and Doruker, 2004). Second, the conserved KOW1 domain adjacent to the NGN domain may contact DNA and/or exiting RNA, provided that the RNA has reached a length of 25–30 nucleotides, and such contacts could contribute to elongation complex stability and may also involve the RNAP II subcomplex Rpb4/7 (Ujvari and Luse, 2006; Cheng and Price, 2008; Missra and Gilmour, 2010). The KOW1 domain also contacts the bacterial termination factor ρ and may mediate ρ action on nearby exiting RNA. Consistent with this model, RNAP contacts are limited to the NGN domain and the KOW domain is mobile. Third, additional KOW domains that are present in eukaryotic Spt5 could contact exiting RNA and could reach anywhere on the RNAP II surface to assist in eukaryote‐specific functions. For example, they could reach to the foot domain of RNAP II that was implicated in mRNA capping (Suh et al, 2010). Finally, the CTR that is present in eukaryotic Spt5 is subject to phosphorylation, and contributes to the recruitment of the PAF complex, which in turn recruits factors involved in chromatin modification and mRNA maturation (Liu et al, 2009; Zhou et al, 2009).
Materials and methods
Construct design and cloning
The full‐length Pfu genes encoding Spt4 and Spt5 and the S. cerevisiae genes encoding full‐length Spt4 and Spt5 residues 283–853 were cloned into a pET24d‐derived bicistronic vector. This vector contained a second ribosomal‐binding site introduced between the SalI and NotI sites by the primer GTCGACAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCGGCCGC (the primer starts with the SalI site GTCGAC and ends with the NotI site GCGGCCGC; the ribosomal‐binding site lies upstream of the NdeI site CATATG). The Spt4 and Spt5 genes were cloned in the vector flanked by the sites NcoI/EcoRI and NdeI/NotI, respectively. Spt5 contained a C‐terminal hexahistidine tag. The DNA encoding the rClamp was cloned into a pOPINE vector, with the three fragments connected by two linkers composed of glycines (G), alanines (A), and serines (S) (Figure 1C).
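The layout of the primer can be checked by locating the recognition sequences within it, as in the minimal sketch below. The SalI, NdeI, and NotI recognition sequences are standard; treating AAGGAG as the core of the ribosomal‐binding site is an assumption made for illustration, as are the function and variable names.

```python
# Primer sequence as given in the text.
PRIMER = "GTCGACAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCGGCCGC"

# Standard recognition sequences of the restriction enzymes named in the text.
SITES = {
    "SalI": "GTCGAC",
    "NdeI": "CATATG",
    "NotI": "GCGGCCGC",
}

# Assumed core of the ribosomal-binding site (illustrative; not stated in the text).
ASSUMED_RBS_CORE = "AAGGAG"


def locate(sequence, motifs):
    """Return the 0-based start of the first occurrence of each motif (-1 if absent)."""
    return {name: sequence.find(motif) for name, motif in motifs.items()}


positions = locate(PRIMER, SITES)
rbs_position = PRIMER.find(ASSUMED_RBS_CORE)

print(positions)
print(rbs_position)
```

The positions confirm the order SalI, ribosomal‐binding site, NdeI, NotI along the primer, with the NdeI and NotI sites directly adjacent.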
Preparation of protein complexes
All proteins were expressed in Rosetta (DE3) pLysS cells (Novagen) grown in LB medium at 37°C to an OD600 of 0.6. Expression was induced with 0.5 mM IPTG for 16 h at 20°C. Cells were lysed by sonication in buffer L1 (50 mM Tris pH 7.5, 300 mM KCl, and 3 mM DTT) for the rClamp and in buffer L2 (25 mM HEPES pH 7.5, 500 mM KOAc, 10 mM imidazole, 0.1 mM ZnCl2, 10% glycerol, and 3 mM DTT) for Spt4/5. After centrifugation at 15 000 g for 20 min, the supernatant was loaded onto a 1.5‐ml Ni‐NTA column (Qiagen) for rClamp and onto two 1‐ml Ni‐NTA columns for Spt4/5. Each column was equilibrated with the respective L1 or L2 buffer. The rClamp column was washed with 10 ml and each Spt4/5 column with 5 ml of L1 (rClamp) or L2 (Spt4/5) buffer plus 100–300 mM imidazole. For archaeal proteins, a heat step (70°C, 10 min) was used to remove the E. coli contaminant proteins. The samples were centrifuged at 14 000 g for 10 min and the supernatant was applied to a Superdex 75 10/300 column (GE Healthcare) equilibrated in buffer GF1 (20 mM HEPES pH 7.0, 200 mM KCl, 5 mM DTT, and 10% glycerol) for rClamp and in buffer GF2 (20 mM HEPES pH 7.5, 200 mM KCl, 0.1 mM ZnCl2, 20 mM imidazole, 2.5 mM DTT, and 10% glycerol) for Spt4/5. S. cerevisiae RNAP II and Pfu RNAP were prepared as described (Kusser et al, 2007; Sydow et al, 2009). To form the archaeal complexes, a three‐fold molar excess of Spt4/5 was added to the rClamp or RNAP. For the eukaryotic complex, a 30‐fold molar excess of Spt4/5 was added to the Pol II‐nucleic acid complex (Kettenberger et al, 2004). Proteins were incubated for 1 h at 20°C. The archaeal proteins were then incubated at 70°C for 10 min. The samples were centrifuged at 14 000 g for 5 min before loading onto a Superose 12 10/300 column (GE Healthcare).
Crystal structure determination
The Pfu rClamp–Spt4/5 complex was concentrated to 4 mg/ml. Crystals grew within 3–4 days at 20°C in hanging drops over a reservoir solution containing 10% PEG 8000, 100 mM Na/K phosphate pH 6.2, 150 mM guanidine hydrochloride, and 200 mM NaCl. The crystals were cryo‐protected by stepwise transfer to their mother liquor supplemented with increasing concentrations of glycerol (7, 14, and 22%) and were flash‐frozen in liquid nitrogen. Crystals were mounted at 100 K on beamline X06SA of the Swiss Light Source, Villigen. We collected 360° of data in 0.25° increments on a PILATUS 6M detector (DECTRIS) at the K‐absorption edge of zinc. Diffraction images were integrated and scaled with XDS/XSCALE (Kabsch, 2010) or MOSFLM/CCP4 (CCP4, 1994; Leslie, 2006), to a high‐resolution limit of 3.3 Å. Molecular replacement was carried out with PHASER (McCoy et al, 2005) using a search model from yeast RNAP II (Armache et al, 2005) truncated to the clamp coiled‐coil domain. PHASER located the search model but revealed poor density for Spt4/5. A SAD phasing approach was then pursued where the intrinsic zinc sites were located using an anomalous difference Fourier map with phases calculated from the molecular replacement search solution. Three sites were located within the clamp coiled‐coil domain and used as input sites in SAD phasing with autoSHARP (Global Phasing Limited), which found an additional fourth site corresponding to the zinc ion in Spt4. However, the resulting maps were inadequate for building until the partial model phases from PHASER were combined with SAD phases. Subsequent density modification by autoSHARP showed clear density for Spt5 and weak density for Spt4. To reduce potential model bias during phasing, a polyalanine model was built into the initial map and refined before side chain modelling. Model building and refinement were carried out with COOT (Emsley and Cowtan, 2004) and autoBUSTER (Global Phasing Limited), respectively.
Molecular modelling and figure preparation
Superpositions and molecular modelling were carried out with COOT (Emsley and Cowtan, 2004), and figures were prepared with PYMOL. Sequence alignments were edited with ALINE (Bond and Schuttelkopf, 2009).
Database accession numbers
Coordinates and structure factors for the Pfu RNAP clamp–Spt4/5 complex have been deposited at the Protein Data Bank under accession number 3QQC.
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).
Conflict of Interest
The authors declare that they have no conflict of interest.
We thank Sandra Schildbach and members of the Cramer laboratory for help. We thank the Crystallization Facility at the Max‐Planck‐Institute for Biochemistry in Martinsried. Part of this work was performed at the Swiss Light Source (SLS) at the Paul Scherrer Institut, Villigen, Switzerland. FWMR and SS were supported by the Alexander‐von‐Humboldt Stiftung. PC was supported by the Deutsche Forschungsgemeinschaft, SFB646, TR5, FOR1068, NIM, the Bioimaging Network BIN, and the Jung‐Stiftung.
This is an open‐access article distributed under the terms of the Creative Commons Attribution License, which permits distribution, and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.
Copyright © 2011 European Molecular Biology Organization