Fine-Tuning of RNA Functions by Modification and Editing (Topics in Current Genetics)


Author: Henri Grosjean (Editor)


Modification and editing of RNA: historical overview and important facts to remember Henri Grosjean

Abstract

RNA plays a central role in many cellular processes, and several peculiarities of RNAs are probably relics of an ancient primordial RNA World. To fulfill their multiple present-day functions, these molecules need more than just the four canonical bases. The numerous modified nucleosides that are formed during processing of nascent precursor RNA transcripts clearly serve this purpose. The recent discoveries of RNA-guided RNA modification machineries, and of RNA editing processes leading to selected conversions of one base into another in the pre-RNA, add new dimensions to the problems surrounding the biosynthesis and functions of modified and edited nucleosides in RNA. The majority of these so-called minor or edited nucleosides appear to improve the performance of the mature RNA, allowing it to work more efficiently and accurately in various steps of cellular metabolism. However, their effects can be subtle and not easy to demonstrate either in vivo or in vitro. Here, we review some basic characteristics of the modified nucleosides and of the enzymes leading to such post-transcriptional modifications and editing of RNA.

1 Short historical background

1.1 Discovery of modified nucleosides

Figure 1 points out the most important discoveries concerning modified nucleosides in nucleic acids (mostly RNA) and their corresponding enzymes. These discoveries span about five decades, while the RNA editing process was discovered more recently, about 20 years ago. Before 1948, naturally occurring nucleic acid polymers (DNA and RNA) were thought to contain only four canonical nucleosides: the ribo- or deoxyribo-derivatives of adenine, cytosine, guanine, and uracil or thymine. Hotchkiss (1948) reported the first evidence for the presence of trace amounts of a non-canonical nucleoside in DNA. This nucleoside was identified as deoxy 5-methylcytosine (dm5C; Wyatt 1950). Soon after, Cohn and Volkin (1951) also detected small amounts of another compound (designated “?”), this time in RNA hydrolysates. The structure of that minor compound was identified in 1956 as 5-ribosyluracil, an

Topics in Current Genetics H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106848 / Published online: 27 January 2005 © Springer-Verlag Berlin Heidelberg 2005

2 Henri Grosjean

Fig. 1. Milestone discoveries related to post-transcriptional modification, splicing, editing and interference of nucleic acids (DNA and RNA). Gray circles along the abscissa correspond to the periods of greatest scientific excitement about the novelty of the discoveries: 1) the identification of numerous modified nucleosides in RNA hydrolysates and in newly sequenced RNAs (period 1955-1970); 2) the discovery of intron splicing phenomena (period 1975-1985); 3) the RNA editing phenomena (period 1985-1995); and 4) more recently, the discoveries related to RNA interference and DNA editing processes (period 1995-present). For details, see text and information within the figure.

isomer of the canonical 1-ribosyluracil (uridine), thereafter called the “fifth ribonucleoside” (Davis and Allen 1957) and soon after designated pseudouridine (abbreviated Ψ) (Cohn 1960). This unusual ribonucleoside was present at a level of about 12% in “salt-soluble” RNA fractions (in fact tRNA), while the “salt-insoluble” fraction (in fact rRNA) also contained Ψ, but less (below 1%). We now know that Ψ is indeed present in every isoacceptor tRNA (at least one mole per mole, at position 55 in the so-called TΨ-loop), in all rRNA molecules and in most sn(o)RNAs. Only mRNA has not been shown to date to contain Ψ; however, an experiment to prove or disprove that possibility has not really been carried out. Pseudouridine results from enzymatic isomerization (internal transglycosylation, see below) of the genetically encoded U into Ψ, catalyzed by RNA:pseudouridine synthases. To date, many distinct RNA:pseudouridine synthases have been identified, and several of them (mostly from Escherichia coli) have been crystallized (see e.g. Hoang and Ferré d’Amaré 2000;

Historical overview and facts to remember 3

Kaya et al. 2004). Only recently have the detailed mechanistic aspects of pseudouridine isomerization in RNA begun to be understood (reviewed in Mueller 2002; Spedaliere et al. 2004). Soon after the discovery of Ψ in RNA, a great deal of effort was made in many laboratories (1950-60) to identify other ‘rare’ or ‘minor’ nucleosides in RNA. Those identified were 2’-O-methylribose derivatives (Cm, Gm, Um, Am), 5-methylribouridine (m5U, also named riboT) and 5-methylribocytosine (m5C). Between 1955, when Chargaff and Davidson published their book “The Nucleic Acids”, which made almost no allusion to modified nucleosides in nucleic acids, and 16 years later, when Ross Hall (1971) published the first book devoted to “The Modified Nucleosides in Nucleic Acids”, at least 35 well-characterized modified nucleosides had been identified in DNA (in fact only dm5C at that time) and RNA (34 new structures). These included inosine (a deaminated form of adenosine, abbreviated I; Wagner and Ofengand 1970; Kammen and Spengler 1970) and many hypermodified nucleosides such as mnm5s2U, ms2i6A, and t6A (for chemical structures as well as names of these compounds, consult Limbach et al. 1995; also http://medstat.med.utah.edu/RNAmods/). These heavily chemically modified derivatives are now known to be present in stoichiometric amounts in the anticodon loop of certain isoacceptor tRNAs (for details, consult Sprinzl et al. 1998; or http://rna.wustl.edu/snoRNAdb/). Thanks to the development of methods for purifying individual RNA species (isoacceptor tRNAs, different rRNAs, and later also various snRNAs) from diverse organisms of the three domains of life (Bacteria, Archaea, and Eukarya) and of methods for sequencing them, more than 100 non-canonical nucleosides (now simply called modified nucleosides) have been identified.
Among them, over 80 distinct modified nucleosides have been found to occur naturally in tRNAs, and about 20 more were shown to be present in other types of RNA (rRNA, mRNA, sn(o)RNA, and even chromosomal RNA). Only the 2’-O-methylribose derivatives (Am, Gm, Cm, Um) of the four canonical NMPs and N6-methyladenosine (m6A) have been found in all major types of RNA, the 2’-O-methyl derivatives and pseudouridines being by far the most frequently encountered ‘minor’ ribonucleosides. Most classic textbooks on Biochemistry or Molecular Biology give only a few examples of modified nucleosides, the most popular being inosine because of its presence in the anticodon of yeast tRNAAla (anticodon IGC), the first tRNA to be sequenced (Holley et al. 1965). The famous Wobble Hypothesis (Crick 1966), stating that inosine in position 34 of tRNA could base pair with A in the third position of a codon, has been demonstrated only recently (Murphy and Ramakrishnan 2004).

1.2 Discovery of RNA modification enzymes

The first RNA modification enzyme, a tRNA:m5U methyltransferase (now designated TrmA in Bacteria and Trm2 in Eukarya), was identified almost simultaneously by the group of Borek (Fleissner and Borek 1962) in the USA and by Svensson and collaborators in Sweden (Svensson et al. 1963). This enzyme catalyzes the site-specific S-AdoMet-dependent formation of m5U (also known as ribothymidine) at position 54 of the TΨ-loop in almost all tRNAs. The existence of this enzyme was an important discovery because it demonstrated for the first time that, whereas DNA polymerase incorporates deoxythymidine into DNA from triphosphate precursors, ribothymidine arises in RNA by post-transcriptional modification of an encoded uridine, that is, at the precursor RNA level. A great deal of research to identify and characterize the other enzymes catalyzing the formation of the many different modified nucleosides present in RNA (as well as in DNA) was undertaken in several laboratories (reviewed in Borek and Srinivasan 1966; Söll and Kline 1982; Kline and Söll 1982). However, the lack of adequate substrates to test their activities in vitro, together with the difficulty of purifying enzymes that are present in low amounts in cells, were major obstacles that were not easily overcome for almost three decades. Not only did they hinder progress of research on RNA modification, but they also discouraged many scientists from working on the purification and characterization of a given RNA modification enzyme. Fortunately, recent developments in recombinant DNA and RNA technologies, together with techniques for chemical synthesis of DNA and RNA and for detection of modified nucleosides in nucleic acids (reviewed in Grosjean et al. 1998, 2004; Crain 1998; Zimmermann et al. 1998), now make it possible to obtain synthetic or semi-synthetic substrates and then to analyze their modified nucleoside content after incubation with cell extracts.
These simple tests have allowed important progress in the identification and characterization of many additional enzymes, including those that were the most difficult to study, namely the enzymes involved in the formation of hypermodified nucleosides (reviewed in Garcia and Goodenough-Lashua 1998). Moreover, access to complete genome sequences of many different organisms, coupled with routine cloning techniques and production of purified recombinant enzymes from almost any organism, opens an extraordinary avenue to study this huge family of enzymes. It is now possible to compare them at the level of their primary and tertiary structures (when crystal structures are available) and to explore their putative evolutionary origins (see e.g. Anantharaman et al. 2002) in relation to that of tRNA (Marck and Grosjean 2002).

1.3 RNA editing, a new concept

1.3.1 Insertion/deletion of nucleotides

An entirely unexpected RNA modification process, originally called ‘RNA editing’, was discovered by Benne and collaborators (Benne et al. 1986; reviewed in Benne 1994). They demonstrated that addition or deletion of several non-genomically encoded uridine residues occurs in mitochondrial mRNA of kinetoplastid protozoa. This post-transcriptional process was initially observed in the coding region of mRNAs, where it alters (‘edits’) the protein coding sequence. The same phenomenon has now been observed in RNAs other than mRNA, such as rRNA and tRNA of mitochondria of several myxomycetes. Several enzymes are implicated in this process, which involves breakage of phosphodiester bonds and insertion/deletion of one or several uridine residue(s) within a pre-RNA: an endonuclease, a 3’-uridylyl exonuclease (in the case of U-deletion) or a 3’-uridylyl transferase (TUTase, in the case of U-insertion), and an RNA ligase, together with a trans-acting guide RNA (gRNA) that provides the necessary information on where editing must occur (reviewed in Estévez and Simpson 1999; Stuart and Panigrahi 2002). Such an RNA-guided insertion/deletion mode of RNA editing is reminiscent of the RNA-guided RNA modification machinery (see below) that catalyzes the multi-site specific formation of many 2’-O-methyl riboses and pseudouridines in pre-RNAs. Alternative but probably related mechanisms should also exist for C, G, or A insertions, as observed in several types of mitochondrial RNAs of the slime mold Physarum polycephalum, or for 5’- and 3’-terminal editing in mitochondrial tRNAs of amoeboid protozoa or metazoan animals. However, the detailed mechanisms and the enzymes involved remain unknown (discussed in Gott and Emeson 2000; Gray 2001).

1.3.2 Conversion of bases mediated by deaminases

Soon after the discovery of the insertion/deletion mode of RNA sequence alteration, the existence of a completely different RNA editing process was demonstrated. It involves conversion of C-to-U in the mRNA encoding apolipoprotein B (apoB) of human cells (Powell et al. 1987; Chen et al. 1987) and an A-to-I conversion (Bass and Weintraub 1988; Wagner et al. 1989). The latter occurs in viral RNAs of infected eukaryotic cells, in cellular mRNAs of the brain of vertebrates, and in double-stranded RNA microinjected in Xenopus oocytes (reviewed in Hough and Bass 2001; Emeson and Singh 2001).
In contrast to the insertion/deletion mode of RNA editing (see Section 1.3.1 above), C-to-U and A-to-I base conversions occur without breakage of the phosphodiester bond and are mediated solely by proteins, including RNA deaminases (‘Cdar’ specific for cytosine and ‘Adar’ specific for adenine). Close inspection of both mechanisms and of the enzymes involved in such RNA editing processes reveals that the C-to-U type of editing corresponds to a special case of RNA modification, in which the ‘modified’ cytidine is simply a canonical uridine. Likewise, the A-to-I mode of editing corresponds to an enzymatic process very similar to the one already known to catalyze the formation of inosine in the anticodon loop of cellular tRNAs of many types of organisms (Auxilien et al. 1996; Gerber et al. 1998; Maas et al. 1999; Wolf et al. 2002). Moreover, the deaminase domains of RNA adenine deaminases (Adar) acting on mRNAs and double-stranded (ds)RNAs have clear evolutionary relationships with those involved in the formation of inosine in tRNAs (Tad) and, to some extent, also with those of cytidine deaminases acting on RNA, DNA, and on CMP/CTP precursors (Keller et al. 1999; Gerber and Keller 2001). This finding strongly argues in favor of a fundamental unity and, possibly, a common evolutionary origin of all RNA deaminases (editing and modifying enzymes) identified so far (see also Conticello et al. 2005).
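
The contrast between the two editing modes described in Sections 1.3.1 and 1.3.2 can be illustrated with a minimal Python sketch. It treats an RNA as a plain string, which is an assumption made purely for illustration and not a model of the enzymology. The function names `deaminate`, `insert_uridines` and `delete_uridines` are hypothetical, positions are 0-based string indices rather than biological coordinates, and inosine is written as `I`.

```python
# Toy model: an RNA is a string over A, C, G, U (plus I for inosine).
# All names here are hypothetical and exist only for this illustration.

DEAMINATION = {"C": "U", "A": "I"}  # C-to-U and A-to-I base conversions


def deaminate(rna: str, pos: int) -> str:
    """Convert the base at 0-based index pos; the chain length is unchanged."""
    base = rna[pos]
    if base not in DEAMINATION:
        raise ValueError(f"{base!r} at index {pos} is not a deamination substrate")
    return rna[:pos] + DEAMINATION[base] + rna[pos + 1:]


def insert_uridines(rna: str, pos: int, count: int) -> str:
    """Insert count U residues before index pos (the chain is cut and rejoined)."""
    return rna[:pos] + "U" * count + rna[pos:]


def delete_uridines(rna: str, pos: int, count: int) -> str:
    """Delete count U residues starting at index pos; only U may be removed."""
    segment = rna[pos:pos + count]
    if segment != "U" * count:
        raise ValueError(f"expected {count} U residue(s) at index {pos}, found {segment!r}")
    return rna[:pos] + rna[pos + count:]


if __name__ == "__main__":
    pre_rna = "GCAUUACG"
    print(deaminate(pre_rna, 1))           # C-to-U conversion
    print(deaminate(pre_rna, 2))           # A-to-I conversion
    print(insert_uridines(pre_rna, 3, 2))  # U insertion
    print(delete_uridines(pre_rna, 3, 2))  # U deletion
```

In this sketch, deamination returns a string of the same length, mirroring the intact phosphodiester backbone, whereas U insertion and deletion change the length, mirroring the cut-and-religate mode of editing.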

1.3.3 Base substitution in RNA

Other types of post-transcriptional RNA alterations, distinct from deaminase-mediated C-to-U and A-to-I reactions, have been shown to occur in various RNAs (mRNA, rRNA, and/or tRNA) of plant mitochondria, myxomycetes, and some unicellular protozoa. They consist of site-specific conversion of one canonical base into another canonical base (mostly U-to-C, but also occasionally U-to-A or U-to-G, C-to-A, A-to-G…), for which the mechanisms and types of enzymes involved have not yet been elucidated. Some of these editing processes may occur by a deletion reaction followed by an insertion reaction, as in the UMP deletion/insertion type of RNA editing in the mRNA of kinetoplastid mitochondria (Section 1.3.1). In that scenario, alteration of the pre-RNA sequence involves cleavage of the RNA phosphodiester bond. However, alternatives could involve enzymes that leave the sugar-phosphate backbone intact. For example, the U-to-C base conversion could be mediated by an amidase-type reaction like the one known for CTP synthetase activity (Zalkin 1985; Marchfelder et al. 1998). The pyrimidine-to-purine ‘conversions’ could occur via a transglycosylase-type reaction, possibly involving enzymes of the same family as those catalyzing the formation of pseudouridine, queuosine, and archaeosine in RNAs (see below, Section 1.4.2). If this were the case, then the term ‘base substitution’ would be more appropriate than ‘base conversion’.

1.3.4 RNA editing or RNA modification?

Obviously, ‘RNA editing’ and ‘RNA modification’ are in some cases ‘two sides of the same coin’. The term ‘RNA editing’ should be reserved for cases where the consequence of a given RNA ‘alteration’ is to change the genetic meaning of the edited product (as initially proposed by Rob Benne 1993). ‘RNA modification’ should be used when the change is purely structural, despite the fact that the same enzymatic machinery can act as an ‘editor’ or a ‘modifier’, depending on the consequence of the enzymatic RNA alteration. With this definition, some chemical alterations that are traditionally called ‘RNA modifications’, such as the formation of inosine or, better still, lysidine (a C-modified nucleoside that decodes A instead of G, discussed in the chapter by Suzuki in this volume) in the wobble position of a tRNA, should be called RNA editing. Moreover, in some cases, RNA modification and RNA editing are two distinct processes that can be intimately integrated and even interdependent (see the review article by Rubio and Alfonzo in this volume). The important point is that modification and/or editing of a particular nucleoside is only a small part of a more complex post-transcriptional maturation process that also includes splicing, 5’- and 3’-trimming, 5’-capping and correct folding. The precise interplay of these various types of RNA alterations ultimately produces fully mature RNAs with ‘new’ chemical and structural properties, different from those strictly encoded by the corresponding gene in the non-functional primary RNA transcript (e.g. see Bratt and Ohman 2003; Tang et al. 2002).

1.4 Different mechanisms for RNA modification/editing

1.4.1 The ‘classical’ RNA modification enzymes

During the last decade, the number of characterized RNA modification enzymes and corresponding genes, mostly from E. coli and the yeast Saccharomyces cerevisiae, has increased considerably, and a few of the corresponding recombinant enzymes have been crystallized (reviewed in Hopper and Phizicky 2003; Bujnicki et al. 2004; De Crécy-Lagard 2004; see also the chapters by Suzuki, Johansson and Byström in this volume). The majority of these enzymes alter the chemical nature of a genetically encoded nucleoside by reduction, deamination, or thiolation, or they add a chemical group (such as a methyl group or an even bigger one such as an isopentenyl group) to one atom of a base or a ribose of the RNA substrate. As a rule, the phosphodiester bond of the ribose-phosphate backbone is never broken. The cofactors required for these enzymatic reactions are metabolites originating from central metabolism, such as S-adenosyl-L-methionine, FAD, NADH, methylene tetrahydrofolate, ATP, GTP, isopentenyl-pyrophosphate, and/or various amino acids. These ‘classical’ RNA modification enzymes can be site-specific, multi-site specific on a given RNA substrate, or dual-specific (acting on different types of RNA). Some of these enzymes act only on intron-containing pre-RNA, while others work only on partially matured RNA from which the intron has been removed (reviewed in Björk 1996; Garcia and Goodenough-Lashua 1998; Grosjean et al. 1998).

1.4.2 Transglycosylases

The case of the RNA pseudouridine synthases, discussed above in Sections 1.1 and 1.2, is special since the enzymatic reaction involves cleavage of the canonical C1’-N1 glycosidic bond, followed by a 180° rotation of the genetically encoded uracil with respect to the ribose, and reformation of a non-canonical C1’-C5 glycosidic bond. This reaction does not require an external source of energy and is usually referred to as an isomerization of U into Ψ. However, that reaction would be more appropriately designated an intramolecular (or cis-) transglycosylation, and the corresponding enzymes tRNA-uridine transglycosylases. The existence of transglycosylation reactions involving replacement in trans of an encoded guanine at position 34 in the anticodon of certain pre-tRNAs by a pre-modified base precursor (preQ1 in bacteria or the queuine base in eukaryotes) has also been demonstrated. The corresponding enzymatic activity was initially discovered by Farkas’ group (Farkas and Hankins 1973; Howes and Farkas 1978) and later confirmed and well characterized by two Japanese groups (Okada et al. 1976; Itoh et al. 1977; Okada and Nishimura 1979). The enzyme, now designated tRNA-guanine-34 transglycosylase (abbreviated tgt, also called Q-insertase), exists in almost all bacteria and eukaryotes (reviewed in Nishimura 1983). In archaea, archaeosine, also a modified deazaguanine derivative related to the bacterial preQ1, is present at position 15 in the D-loop of most archaeal tRNAs (Gregson et al. 1993). Its formation is catalyzed by an evolutionarily related tRNA-guanine transglycosylase with a substrate specificity different from that of the bacterial and eukaryal enzymes (Watanabe et al. 1997). The fact that the tgt enzyme or a tgt-like ORF is detected in almost all genomes, together with the universal existence of a nucleoside deoxyribosyltransferase and DNA glycosylase involved in the replacement of a base moiety of DNA by a free purine or pyrimidine (Savva et al. 1995; reviewed in Mol et al. 1995), indicates that enzymes catalyzing transglycosylation-type reactions (in trans) existed very early in evolution, probably already in primordial cells. They now appear to be universally used for various types of specific post-transcriptional/post-replicative modifications (perhaps also editing, see Section 1.3.3 above) of nucleic acids from the three domains of life (also discussed in Garcia and Kittendorf 2005).

1.4.3 Guide-RNA-mediated RNA modification machinery

In 1996, the discovery in yeast and vertebrate cells of the snoRNA-guided RNA modification machinery that catalyzes 2’-O-methylation of many ribose residues (more than 50 in S. cerevisiae and more than 200 in vertebrate rRNAs) (Kiss et al. 1996; Nicoloso et al. 1996; reviewed in Decatur and Fournier 2003) added a new dimension to the amazing variety of mechanisms leading to RNA alterations. Soon after, in 1997, the same phenomenon was revealed for the ‘isomerization’ of many U’s into Ψ’s in rRNA (Ni et al. 1997; Ganot et al. 1997). These two distinct types of RNA-assisted machinery have now been demonstrated to catalyze the formation of almost all 2’-O-methylriboses (except for a few, see the review by Lapeyre in this volume) and pseudouridines in rRNA of S. cerevisiae, and probably also in vertebrate and archaeal cells. They both also work on pre-sn(o)RNAs, and on pre-RNAs (tRNA and rRNA) in archaea (Omer et al. 2003; Ziesche et al. 2004; reviewed in this volume by Yu et al.).
Here, an enzyme, either an S-AdoMet-dependent methylase or an RNA-uracil transglycosylase (alias RNA pseudouridine synthase), works within a ribonucleoprotein (RNP) complex containing a few additional proteins and a characteristic snRNA (hereafter called guide RNA or gRNA). The antisense element of the gRNA, as already anticipated by Bachellerie and Fournier (1995), forms an RNA duplex about 10-30 base pairs long with consecutive nucleotides of the corresponding region of the pre-RNA. This allows the enzyme of the RNP complex to efficiently locate and modify its target nucleoside. Interestingly, while all guide-RNA-mediated RNA modification systems identified so far work with gRNA in trans, the archaeal intron-containing pre-tRNATrp at least appears able to self-guide (in cis) its own 2’-O-methylation at two distinct positions (34 and 39), because its long intron contains the ad hoc motif specifying 2’-O-methylation (Clouet d’Orval 2001). However, as pointed out by Singh et al. (2004), motifs present in the intron of one pre-tRNATrp can serve as guide in trans for a second intron-containing tRNATrp, and vice versa, at least in vitro. Moreover, certain guide RNAs in humans, harboring the characteristic structures of both the 2’-O-methylation guide and the pseudouridylation guide, were shown to serve in trans for both types of RNA modification (Jady and Kiss 2001; Darzacq et al. 2002). The main advantage of this RNP enzymatic machinery, which is somewhat reminiscent of the guide-RNA-mediated RNA editing system discussed above (Section 1.3.1), is that only a few proteins, including only one methyltransferase or one uracil transglycosylase, and a set of interchangeable gRNAs within an RNP complex make it possible to catalyze very specifically the formation of many 2'-O-methylated riboses or pseudouridines in different cellular RNAs (for details see the chapter by Yu et al. in this volume). Neither bacteria nor any organelle examined so far seems to use this RNA modification strategy. Nevertheless, a very early evolutionary origin of RNA-guided nucleotide-modification complexes has been proposed (Gray 2001; Terns and Terns 2002; Tran et al. 2004).

1.4.4 Guide-protein-mediated RNA editing machinery

The RNA deaminase APOBEC-1, when associated with other proteins within an editosome complex, can discriminate among thousands or more of other pre-mRNAs and precisely deaminate a single cytidine at position 6666 of the 43 kb pre-mRNA transcript coding for the 512 kDa ApoB-100 protein (Powell et al. 1987; Chen et al. 1987; see Section 1.3.2 above). The same enzyme, when tested alone as a pure recombinant enzyme in vitro, also catalyzes the deamination of the precursor CMP into UMP (Navaratnam et al. 1995), although in vivo this function is normally fulfilled by cytidine deaminases. Nevertheless, purified APOBEC-1 cannot edit ApoB mRNA in vitro unless auxiliary proteins are added to the reaction (Navaratnam et al. 1993; Teng et al. 1993). Thus, the high specificity of an 'intrinsically unspecific' enzyme can clearly be achieved through its association with other proteins within a multi-protein complex that serves as a guide for the deaminase to reach its correct target (Navaratnam et al. 1998; for reviews see Driscoll and Innerarity 2001; and the chapter by Smith et al. in this volume).
On a more microscopic scale, the modular composition of many RNA modifying/editing enzymes, such as those containing a well-defined THUMP, TRAM, PUA, or S4 RNA binding domain (for definition of the acronyms, see Aravind and Koonin 1999, 2001; Anantharaman et al. 2001) or a dsRBM (double-stranded RNA Binding Motif, reviewed in Hough and Bass 2001) fused to a well-defined catalytic module, can be considered a lower limiting case of a protein-guided RNA modification mini-machinery. Which of the two, the complex heteromeric multicomponent system or the more ‘compact’ all-in-one enzyme, came first during cellular evolution is, of course, an interesting question to debate.

1.5 Localization and temporal order of RNA modification/editing

The location of most RNA modification and editing enzymes within distinct cellular compartments, as well as their possible association with cellular (sub)structures or engagement in multienzymatic complexes, are important elements of RNA maturation processes. The specificity of certain enzymatic reactions and the temporal order in which the many enzymatic RNA maturation steps occur, such as RNA modification/editing, 5’- and 3’-end processing, intron splicing, 5’-capping, and CCA addition, may depend in large part on the physical accessibility of the substrate to the many different processing enzymes. Moreover, as RNA maturation proceeds, the structure and conformation of the RNA substrate also change, thus allowing other macromolecules to bind the unfolded/refolded RNA substrate along the various steps of RNA maturation (see e.g. Hoang and Ferré d’Amaré 2001; Clouet d’Orval 2001; Ishitani et al. 2003; Marck and Grosjean 2003). With that argument in mind, it might well be that many of the reconstituted in vitro systems we are studying with purified or partially purified components of the complex maturation machinery are irrelevant compared to the real complexity of the cellular situation. Obviously, unpredictable new properties of a given biological system could exist at the cellular level that cannot be explained by simply studying its individual parts. Although RNA modification/editing enzymes are numerous and present in low amounts in the cell, the entire in vivo process of many different nucleoside modifications or editing events within the same RNA molecule is astonishingly efficient, with no apparent accumulation of intermediates. This may indicate high affinity of the multiple enzymes for their substrates and/or channeling of RNA intermediates by temporary or stationary multifunctional enzyme complexes, possibly associated with components of the cell structure. Thus, not all RNA modification/editing enzymes might be amenable to purification as single proteins with retention of full function.
This is particularly true for those enzymes that have already been demonstrated to belong to multicomponent modification machineries, such as the guide-RNA- or guide-protein-mediated RNA modification/editing machineries described above (Sections 1.4.3 and 1.4.4) or the mRNA N6-adenosine methyltransferase of higher eukaryotes (for details, see the review article by Bokar in this volume). As a rule, it is becoming evident that RNA biosynthetic activities are spatially organized in all kinds of cells. This is well established in eukaryotic cells, in which an important part of the very diverse biochemical processes leading to full maturation of a nascent RNA transcript begins in the nucleus, which itself is segregated into various substructures (such as the nucleolus, Cajal bodies…). The almost, but not fully, mature RNAs then pass through the nuclear membrane and are subsequently end-processed in the cytosol. In addition, since the eukaryotic mitochondrial genome does not code for any of the RNA modification/editing enzymes, the question of how the nuclear-encoded enzymes find their appropriate subcellular location is an interesting one. For more discussion of this important aspect of RNA modification and editing, the reader should consult the review articles by Maden (1998), Bass (2001), Hopper and Phizicky (2003), and Decatur and Fournier (2003).

2 Unraveling the functions of RNA modification/editing enzymes

2.1 Need to expand the limited RNA vocabulary

In cells, RNA molecules exist in many different forms; the best studied in terms of their modified nucleotide content are tRNA, rRNA, mRNA, and small RNA molecules (snRNAs, snoRNAs). All these RNAs play major roles in cell metabolism, including accurate synthesis of DNA, RNA, and proteins. In addition, in contrast to DNA, RNA is very flexible and can fold into many different structural conformations, with 3D-motifs playing specific roles in interactions with other biomolecules, or can perform essential catalytic reactions such as self-splicing of introns or peptide bond formation during translation. However, being composed of only four different major ribonucleotides, as opposed to proteins with 20 different amino acids, primary RNA transcripts are obviously limited in the functions they can perform and need to be extensively processed. The introduction during evolution of selected chemically altered bases and/or riboses at a few key positions of the precursor RNA molecules allowed the modified mature RNA to considerably increase its ability to, for example, avoid certain dead-end misfoldings during maturation, adopt ad hoc functional conformations, enhance recognition (or avoid misrecognition) by other macromolecules, and finally work efficiently and accurately in the various metabolic pathways in which modified RNAs are involved. This is evident from the amazing diversity of nucleotides identified at the wobble position 34, as well as at position 37, 3’-adjacent to the anticodon in tRNAs, where many different chemical structures have been identified in the almost 600 tRNAs sequenced so far from different organisms (Grosjean et al. 1995; Rozenski et al. 1999).
These characteristically modified, often hypermodified, nucleotides were shown to play essential roles in accurately decoding the genetic information of mRNAs on the ribosome (reviewed in Curran 1998; Agris 2001; see reviews by Suzuki, Rubio and Alfonzo, Namy et al., Colson et al. in this volume). Likewise, rRNAs contain many modified nucleotides that are clustered in functional regions of the rRNA molecules: the decoding center of the small subunit rRNA and the peptidyl transferase center of the large subunit rRNA (Ofengand and Rudd 2000; Decatur and Fournier 2002). The precise role of these modified nucleosides in rRNAs is still largely unknown; however, their high conservation between very distantly related species strongly argues for a role of these modifications in RNA function (see chapters by Lapeyre and Douthwaite et al. in this volume).

2.2 Not all nucleosides in RNAs are fully modified/edited

The pattern of modification/editing (type and location) in each individual RNA depends on its origin (Bacteria, Eukarya, or Archaea). Even phylogenetically closely related organisms may show subtle differences in their patterns of modified nucleotides. As far as modification is concerned, RNAs from simple organisms such as Mycoplasma, or from cellular organelles such as mitochondria and chloroplasts, are much less modified (both in number and in variety of modified residues) than cytoplasmic RNAs from higher eukaryotes. As far as editing is concerned, much of the RNA editing studied so far occurs in mitochondria, chloroplasts, or specialized cells or tissues of higher eukaryotes (oocytes, brain, liver…), and not in lower eukaryotes or in bacteria. Nothing is known yet about editing in Archaea. For a given modified nucleoside in a particular RNA, the degree of modification may vary according to the physiological constraints of the cell from which the RNA originates (aerobic/anaerobic conditions, temperature, availability of intermediate metabolites or cofactors of the modification enzymes, various metabolic stress conditions, malignancy…). This is clearly the case for many of the modified nucleotides present in the anticodon of tRNAs and for certain 2’-O-methylriboses present in rRNAs. Many such well-studied cases are discussed in several review papers (Persson 1993; Kowalak et al. 1994; Björk and Rasmuson 1998; Winkler 1998). Two main questions lie behind all these observations. i) Is the cell able to regulate its modification/editing machinery in response to different physiological conditions? ii) Do the kinds or levels of modification in a given RNA molecule serve as a regulatory device, or as sensors for other metabolic processes? While the limited data available make it possible to answer each of these two important questions positively, this area of research remains largely unexplored. The main difficulty is the lack of an easy test to evaluate the degree of modification of a given nucleotide in an RNA.
All the information available in the RNA modification data bank (tRNA, rRNA and snRNA) relates to the ‘presence’ of a given modified nucleotide at a given position of an RNA molecule, never to the ‘degree’ of modification one can expect in a naturally occurring RNA. In other words, within a population of a given RNA species, individual molecules (for instance the individual tRNA molecules of one species, or the individual rRNA molecules in a population of ribosomes) may differ in the presence or absence of one or several modified nucleotides, a problem one must remember when interpreting certain data.

2.3 Fine-tuning of RNA structure and function

RNA modification is obviously related to RNA function, structure, and evolution, and its importance has always been of prime interest. Modifications alter the local electrostatic and topological landscape of regions within RNAs that must interact in cis with another part of the RNA molecule or in trans with another molecule, usually a macromolecule such as RNA, DNA, an RNA-binding protein or an enzyme. However, despite the abundance of modified nucleotides in tRNAs and their presence in strategic regions of RNA molecules, the removal of any one of them often seems to have little or no effect on the growth rate of the mutant cells. There are, however, a few exceptions, where the removal of a given modification present only in a subset of tRNAs has a measurable or even profound effect on cell growth. Furthermore, while deletion or inactivation of a gene corresponding to a given modified nucleoside may have no detectable effect on cell growth, a second mutation affecting the formation of another modified nucleotide sometimes causes the cell to become sick or even die (see e.g. Grosshans et al. 2001; reviewed in Hopper and Phizicky 2003; Johansson and Byström; and Anderson and Droogmans in this volume). Despite the difficulties in obtaining information on the function of modified nucleosides, several lines of evidence from in vivo as well as in vitro experiments demonstrate the importance of certain modified nucleosides in selected cellular processes. Maturation of snRNAs and mRNAs, ribosome assembly, translation of the genetic code, stabilization of the 3D structure of RNAs, recognition processes of tRNAs, and many others all depend in one way or another on the presence of certain modified nucleotides (for reviews, see every chapter in this volume). In the majority of cases, the effect appears as fine-tuning, which is always difficult to tackle, and only rarely as an absolute requirement. For instance, the many modified nucleosides that are normally present in the 16S rRNA do not seem to be important for the assembly of the small 30S ribosomal subunit of E. coli. However, the activity of the resulting unmodified subunit is reduced compared with that of wild-type ribosomes, which suggests a functional role for at least some of the many modified nucleosides in the decoding process (see Widerak et al. 2005). Likewise, the absence of just one of the several Ψ residues in the 23S rRNA of E. coli, as well as in the 28S rRNA of S. cerevisiae, affects ribosome assembly (Gutgsell et al. 2001; King et al. 2003). As far as tRNA is concerned, the majority of the modified nucleotides in this molecule have no or only minor effects on most aminoacylation reactions; however, in some cases they are of the utmost importance. Indeed, completely unmodified tRNAAsp from yeast and tRNAGln and tRNAIle from E. coli are charged poorly owing to the lack of m1G37, mnm5s2U34 and t6A37, respectively.
Furthermore, m1G37 in yeast tRNAAsp and k2C34 in E. coli tRNAIle prevent mischarging by arginine and methionine, respectively (reviewed in Giegé et al. 1998; see also the review chapter by Suzuki in this volume). The most abundant examples of the fine-tuning role of modified nucleotides exist for tRNAs in translational decoding and recoding (see Namy et al. in this volume). For instance, m1G37, t6A37, ms2i6A37, mnm5s2U34, and k2C34 in tRNAs improve both the efficiency and the fidelity of translation. The role of modified nucleosides outside the anticodon region of tRNAs has been more difficult to reveal, but the modification at position 64 of yeast initiator tRNA is clearly an antideterminant towards eukaryotic elongation factor eEF-1a (Förster et al. 1993; see also the chapter in this volume by Johansson and Byström). Other modifications, like m1A9 in mitochondrial tRNALys, help to avoid misfolding, while other modified nucleosides in the T-loop like D, Gm or s2T influence the dynamic 3D structure (reviewed in Agris 1996). For example, in tRNAs of highly thermophilic organisms, a correlation exists between the kind of modified nucleosides in the D- and T-loops and the upper temperature limit of life (Kowalak et al. 1994; Shigi et al. 2002; Droogmans et al. 2003). Taken together, these few examples demonstrate that subtle or important effects of modified nucleosides can sometimes be detected only under special test conditions. In addition, they should not hide the reality that the function of the large majority of modified nucleosides is not yet fully understood. It might even be that some modified nucleosides have no function at all. Indeed, because certain modification enzymes have a relatively broad specificity and recognize simple motifs in RNAs (for example certain multi-site-specific RNA pseudouridine synthases or RNA methylases), they may catalyze the formation of a modified nucleoside at several locations of an RNA molecule, only one of which is functionally relevant. If the irrelevant modifications do not prevent the cell from functioning correctly, why would Nature try to eliminate them?

2.4 A few modifying enzymes play a dual role in RNA maturation

Within the complexity of the cellular context, RNA modification enzymes can have additional roles beyond RNA modification. In a few cases, it has been demonstrated that the catalytic activity of a given modification enzyme can be abolished by site-directed mutagenesis without affecting the growth rate of the mutated cells, while the disruption or deletion of the corresponding gene leads to a severe slow-growth phenotype or even to cell death. This is the case for several RNA modifying enzymes of E. coli or yeast such as, for example, Dim1p, Mrm1p (alias Pet56), TrmA, Spb1p, and RluD (Lafontaine et al. 1998; Masson 1998; Persson et al. 1992; Pintard et al. 2000; Gutgsell et al. 2001, respectively). A quality-control or chaperone function of the corresponding modification enzymes, independent of their catalytic activity, has been postulated (Lafontaine and Tollervey 1998), but a clear demonstration of the exact secondary roles is still missing (see also the discussion in this volume by Johansson and Byström). Perhaps the tRNA modification enzymes could also serve in the transport of fully mature tRNAs from the nucleus to the cytoplasm, or in the transport of selected tRNAs from the cytoplasm to the mitochondria.
Thus, reducing the function of a given modification/editing enzyme to solely its catalytic function might not reveal its exact evolutionary ‘raison d’être’ within the metabolically complex and very efficient cellular context.

3 Conclusion and further prospects: unravel biological complexity

RNAs have long been considered passive vectors of genetic information, although they actually play a key role not only in the translation processes leading to the biosynthesis of all cell proteins but also in many other essential metabolic processes. Indeed, over the last decade, it has become increasingly clear that untranslated RNA plays a crucial role in epigenetic silencing processes, such as imprinting and X-chromosome inactivation in mammals, and in post-transcriptional gene silencing in organisms as diverse as plants and fungi. This link with silencing recently reached its apotheosis with the discovery that small RNAs produced through RNA interference (either microRNAs or small interfering RNAs) are involved in translational repression, heterochromatin formation or DNA elimination. RNA entities are thus crucial effectors of both transcriptional and post-transcriptional phenomena. It is of the utmost importance to determine whether or not all these RNAs contain modified nucleosides. The limiting factor for such analyses is the lack of an easy method to determine the presence, and evaluate the degree, of modification of a given RNA molecule. Recent developments in the highly sophisticated sequencing of tiny amounts of RNA by mass spectrometry (MS) coupled with high performance liquid chromatography (HPLC-MS) should open the door to interesting new discoveries related to the universal presence of modified nucleotides in almost all kinds of naturally occurring RNA molecules.

Despite the considerable progress made during the last five decades in our understanding of the biosynthesis and functions of modified nucleosides in RNA, much remains to be solved. It will be important to identify the remaining individual RNA modification/editing enzymes and their corresponding genes in different cell types, to locate them within different compartments, subcellular structures or multienzyme complexes, and to define their mechanism, specificity, and mode of regulation. It is also important to determine how similar they are to other enzymes acting on RNA, what links exist between them and the other metabolic processes that provide the necessary cofactors, and how the enzymes emerged during evolution. In trying to solve all these problems, one has to keep in mind that what we want to know is not necessarily the ‘intrinsic’ properties of individual components of the cellular machineries. From studying the complex interplay between the various cellular components in vivo, as well as in vitro, new ‘global’ and probably unexpected properties will emerge.
Even if we knew everything about present-day enzymology, the challenge of understanding the whole process within an evolutionary framework would remain: do certain modified nucleosides correspond to relics of a prebiotic RNA World or, on the contrary, do they correspond to a more evolutionarily elaborate program allowing the progressive acquisition of discrete new functions? Judging from the enormous progress made over the last decade in our understanding of how RNAs are synthesized and matured into molecules with many diverse and versatile functions, one can be optimistic and hope that the roles of many more modified nucleosides will soon be elucidated.

Acknowledgements

Originating from Belgium, I want to thank the CNRS for providing the facilities to develop a research group in France on RNA modification enzymes. I also thank the Ministry of Scientific Research and the non-governmental organization ‘Actions pour la Recherche sur le Cancer’ for providing me with research funds during most of the last 15 years, since I migrated to France. I acknowledge Anne-Lise Haenni (University Paris 7, Jussieu) and Jaunius Urbonavicius, presently a FEBS postdoctoral fellow in my laboratory, for helpful criticisms and for suggestions to improve the manuscript. I dedicate this review article to the memory of a colleague and good friend, James Ofengand, who passed away while this review was being written.

References

Agris PF (1996) The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function. Prog Nucleic Acid Res Mol Biol 53:79-129
Agris PF (2004) Decoding the genome: a modified view. Nucleic Acids Res 32:223-238
Anantharaman V, Koonin EV, Aravind L (2001) TRAM, a predicted RNA-binding domain, common to tRNA uracil methylation and adenine thiolation enzymes. FEMS Microbiol Lett 197:215-221
Anantharaman V, Koonin EV, Aravind L (2002) Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res 30:1427-1464
Aravind L, Koonin EV (1999) Novel predicted RNA-binding domains associated with the translation machinery. J Mol Evol 48:291-302
Aravind L, Koonin EV (2001) THUMP, a predicted RNA binding domain shared by 4-thiouridine, pseudouridine synthases and RNA methylases. Trends Biochem Sci 26:215-217
Auxilien S, Crain P, Trewyn RW, Grosjean H (1996) Mechanism, specificity and general properties of the yeast enzyme catalysing the formation of inosine 34 in the anticodon of transfer RNA. J Mol Biol 262:437-458
Bachellerie JP, Nicoloso M, Balakin A, Jingwey N, Fournier MJ (1995) Antisense snoRNAs: a family of molecular RNAs with long complementarities to rRNA. Trends Biochem Sci 20:261-265
Bass BL (2001) RNA Editing. Oxford University Press, Oxford UK
Bass BL, Weintraub H (1988) An unwinding activity that covalently modifies its double-stranded RNA substrate. Cell 55:1089-1098
Benne R (1994) RNA-editing in trypanosomes. European J Biochem 221:9-11
Benne R (1993) RNA Editing: The alteration of protein coding sequences of RNA. Ellis Horwood, Chichester UK
Benne R, Van den Burg D, Brakenhoff JP, Sloof P, Van Boom JH, Tromp MC (1986) Major transcript of the frameshifted coxII gene from trypanosome mitochondria contains four nucleotides that are not encoded in the DNA. Cell 46:819-826
Björk GR (1996) Stable RNA modification. In: Neidhardt FC, Curtiss III R, Ingraham JL, Lin ECC, Low KB, Magasanik B, Reznikoff WS, Riley M, Schaechter M, Umbarger HE (eds) Escherichia coli and Salmonella, Cellular and Molecular Biology, 2nd ed, ASM Press, Washington DC, pp 861-886
Björk GR, Rasmuson T (1998) Links between tRNA modification and metabolisms and modified nucleosides as tumor markers. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 26, pp 471-491
Borek E, Srinivasan PR (1966) The methylation of nucleic acids. Ann Rev Biochem 35:275-298
Bratt E, Ohman M (2003) Coordination of editing and splicing of glutamate receptor pre-mRNA. RNA 9:309-318

Bujnicki JM, Droogmans L, Grosjean H, Purushothaman SK, Lapeyre B (2004) Bioinformatics-guided identification and experimental characterization of novel RNA methyltransferases in nucleic acids. In: Bujnicki JM (ed) Practical Bioinformatics series, Springer-Verlag Berlin Heidelberg, Molecular Biol vol 5, pp 139-168
Chargaff E, Davidson JN (1955) The Nucleic Acids, vol 1, Academic Press Inc, New York NY
Chen S-H, Habib G, Yang C-Y, Gu Z-W, Lee BR, Weng S-A, Silberman SR, Cai S-J, Deslypere JP, Rosseneu M (1987) Apolipoprotein B48 is the product of a messenger RNA with an organ-specific in frame-stop codon. Science 238:363-366
Clouet d’Orval B, Bortolin ML, Gaspin C, Bachellerie JP (2001) Box C/D RNA guides for the ribose methylation of archaeal tRNAs: the tRNA-Trp intron guides the formation of two ribose-methylated nucleosides in the mature tRNA-Trp. Nucleic Acids Res 29:4518-4529
Cohn WE (1960) Pseudouridine, a carbon-carbon linked ribonucleoside in ribonucleic acids: isolation, structure and chemical characteristics. J Biol Chem 235:1488-1498
Cohn WE, Volkin E (1951) Nucleoside-5’-phosphates from ribonucleic acid. Nature 167:483-484
Conticello SG, Thomas CJ, Petersen-Mahrt SK, Neuberger MS (2005) Evolution of the AID/APOBEC family of polynucleotide (deoxy)cytidine deaminase. Mol Biol Evol 22:367-377
Crain PF (1998) Detection and structure analysis of modified nucleosides in RNA by mass spectrometry. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 3, pp 47-57
Crick FHC (1966) Codon-Anticodon pairing: the wobble hypothesis. J Mol Biol 19:548-555
Curran JF (1998) Modified nucleosides in translation. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 27, pp 493-516
Darzacq X, Jady B, Verheggen C, Kiss AM, Bertrand E, Kiss T (2002) Cajal body-specific small nuclear RNAs: a novel class of 2’-O-methylation and pseudouridylation guide RNAs. EMBO J 21:2746-2756
Davis RD (1998) Biophysical and conformational properties of modified nucleosides in RNA (nuclear magnetic resonance studies). In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 5, pp 85-102
Davis FF, Allen FW (1957) Ribonucleic acids from yeast which contain a fifth nucleotide. J Biol Chem 227:907-915
Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351
Decatur WA, Fournier MJ (2003) RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278:695-698
De Crécy-Lagard V (2004) Finding missing tRNA modification genes: a comparative genomics goldmine. In: Bujnicki JM (ed) Practical Bioinformatics Series, Springer-Verlag Berlin Heidelberg, Molecular Biology vol 15, pp 169-190
Driscoll DM, Innerarity TL (2001) RNA editing by cytidine deamination in mammals. In: Bass BL (ed) RNA Editing, Oxford University Press, Oxford UK, Chap 4, pp 61-76
Droogmans L, Roovers M, Bujnicki JM, Tricot C, Hartsch T, Stalon V, Grosjean H (2003) Cloning and characterization of tRNA (m1A58) TrmI from Thermus thermophilus HB27, a protein required for cell growth at extreme temperatures. Nucleic Acids Res 31:2148-2156

Emeson RB, Singh M (2001) Adenosine-to-inosine RNA editing: substrates and consequences. In: Bass BL (ed) RNA Editing, Oxford University Press, Oxford UK, Chap 6, pp 109-138
Estevez AM, Simpson L (1999) Uridine insertion/deletion RNA editing in trypanosome mitochondria – a review. Gene 240:247-260
Farkas WR, Hankins WD, Singh R (1973) The guanylation of transfer RNA: an enzymatic reaction. Biochim Biophys Acta 294:94-105
Fleissner E, Borek E (1962) A new enzyme of RNA synthesis: RNA methylase. Proc Natl Acad Sci USA 48:1199-1203
Förster C, Chakraburtty K, Sprinzl M (1993) Discrimination between initiation and elongation of protein biosynthesis in yeast: identity assured by a nucleotide modification in the initiator tRNA. Nucleic Acids Res 21:5679-5683
Ganot P, Bortolin ML, Kiss T (1997) Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNA. Cell 89:799-809
Garcia GA, Goodenough-Lashua DEM (1998) Mechanisms of RNA-modifying and -editing enzymes. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 8, pp 135-168
Garcia GA, Kittendorf JD (2005) Transglycosylation: a mechanism for RNA modification and editing. Bioorganic Chem (in press)
Gerber AP, Grosjean H, Melcher T, Keller W (1998) Tad1p, a yeast tRNA-specific adenosine deaminase, is related to the mammalian pre-mRNA editing enzymes ADAR1 and ADAR2. EMBO J 17:4780-4789
Gerber AP, Keller W (2001) RNA editing by base deamination: more enzymes, more targets, new mysteries. Trends Biochem Sci 26:376-384
Giegé R, Sissler M, Florentz C (1998) Universal rules and idiosyncratic features in tRNA identity. Nucleic Acids Res 26:5017-5035
Gott JM, Emeson RB (2000) Functions and mechanisms of RNA editing. Ann Rev Genet 34:499-531
Gray MW (2001) Speculation on the origin and evolution of editing. In: Bass BL (ed) RNA Editing, Oxford University Press, Oxford UK, Chap 8, pp 160-184
Gregson JM, Crain PF, Edmonds CG, Gupta R, Hashizume T, Phillipson DW, McCloskey JA (1993) Structure of the Archaeal transfer RNA nucleoside G*-15 (2-amino-4,7-dihydro-4-oxo-7-β-D-ribofuranosyl-1H-pyrrolo[2,3-d]pyrimidine-5-carboximidamide (Archaeosine)). J Biol Chem 268:10076-10086
Grosjean H, Keith G, Droogmans L (2004) Detection and quantification of modified nucleotides in RNA using thin-layer chromatography. Methods Mol Biol 265:357-391
Grosjean H, Motorin Y, Morin A (1998) RNA-Modifying and RNA-Editing Enzymes: Methods for their identification. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 2, pp 21-46
Grosjean H, Sprinzl M, Steinberg S (1995) Posttranscriptionally modified nucleosides in transfer RNA: their locations and frequencies. Biochimie 77:139-141
Grosshans H, Lecointe F, Grosjean H, Hurt E, Simos G (2001) Pus1p-dependent tRNA pseudouridinylation becomes essential when tRNA biogenesis is compromised in yeast. J Biol Chem 276:46333-46339
Gutgsell NS, Del Campo M, Raychaudhuri S, Ofengand J (2001) A second function for pseudouridine synthases: a point mutant of RluD unable to form pseudouridine 1911, 1915, and 1917 in Escherichia coli 23S ribosomal RNA restores normal growth to an RluD-minus strain. RNA 7:990-998

Hall RH (1971) The Modified Nucleosides in Nucleic Acids. Columbia University Press, New York, NY
Hoang C, Ferré-D’Amaré AR (2001) Cocrystal structure of a tRNA Psi-55 pseudouridine synthase: nucleotide flipping by an RNA-modifying enzyme. Cell 107:929-939
Holley RW, Apgar J, Everett GA, Madison JT, Marquise M, Merill JR, Penswick JR, Zamir A (1965) Structure of a ribonucleic acid. Science 147:1462-1473
Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. Genes and Development 17:162-180
Hotchkiss RD (1948) The quantitative separation of purines, pyrimidines and nucleosides by paper chromatography. J Biol Chem 175:315-332
Hough RF, Bass BL (2001) Adenosine deaminases that act on RNA. In: Bass BL (ed) RNA Editing, Oxford University Press, Oxford UK, Chap 5, pp 77-108
Howes NK, Farkas WR (1978) Studies with a homogeneous enzyme from rabbit erythrocytes catalyzing the insertion of guanine into tRNA. J Biol Chem 253:9082-9087
Ishitani R, Nureki O, Nameki N, Okada N, Nishimura S, Yokoyama S (2003) Alternative tertiary structure of tRNA for recognition by a posttranscriptional modification enzyme. Cell 113:383-394
Itoh YH, Itoh T, Haruna I, Watanabe I (1977) Substitution of guanine for a specific base in tRNA by extracts of Ehrlich ascites tumour cells. Nature 267:467
Jady BE, Kiss T (2001) A small nucleolar guide RNA functions both in 2’-O-methylation and pseudouridylation of U5 spliceosomal RNA. EMBO J 20:541-551
Kammen HO, Spengler SJ (1970) The biosynthesis of inosinic acid in transfer RNA. Biochim Biophys Acta 213:352-364
Kaya Y, Del Campo M, Ofengand J, Malhotra A (2004) Crystal structure of TruD, a novel pseudouridine synthase with a new protein fold. J Biol Chem 279:18107-18110
Keller W, Wolf J, Gerber A (1999) Editing of messenger RNA precursors and of RNA by adenosine-to-inosine conversion. FEBS Letters 452:71-76
King TH, Liu B, McCully RR, Fournier MJ (2003) Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell 11:425-435
Kiss-Laszlo Z, Henry Y, Bachellerie J-P, Caizergues-Ferrer M, Kiss T (1996) Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell 85:1077-1088
Kline LK, Söll D (1982) Nucleotide Modification in RNA. In: Boyer PD (ed) The Enzymes, vol XV, Academic Press, New York, NY, pp 567-585
Kowalak JA, Dalluge JJ, McCloskey JA, Stetter KO (1994) The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochem 33:7869-7876
Lafontaine D, Delcour J, Glasser AL, Desgrés J, Vandenhaute J (1994) The DIM1 gene responsible for the conserved m62Am62A dimethylation in the 3’ terminal loop of 18S rRNA is essential in yeast. J Mol Biol 241:492-497
Lafontaine LJ, Tollervey D (1998) Regulatory aspects of rRNA modification and pre-rRNA processing. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 15, pp 281-288
Limbach PA, Crain PF, Pomerantz SC, McCloskey JA (1995) Structures of posttranscriptionally modified nucleosides from RNA. Biochimie 77:135-138

Maden BEH (1998) Intracellular locations of RNA-modifying enzymes. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 24, pp 421-440
Marchfelder A, Binder S, Brennicke A, Knoop V (1998) RNA editing by base conversion in plant organellar RNAs. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 17, pp 307-323
Marck C, Grosjean H (2002) RNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA 8:1189-1232
Marck C, Grosjean H (2003) Identification of BHB splicing motif in intron-containing tRNA from 18 archaeons: evolutionary implications. RNA 9:1516-1531
Mass S, Gerber AP, Rich A (1999) Identification and characterization of a human tRNA-specific adenosine deaminase related to the ADAR family of pre-mRNA editing enzyme. Proc Natl Acad Sci USA 96:8895-8900
Masson T (1998) Functional aspects of the three modified nucleotides in yeast mitochondrial large-subunit rRNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 14, pp 273-280
Mol CD, Parikhi SS, Putman CD, Lo TP, Tainer JA (1999) DNA repair mechanisms for the recognition and removal of damaged DNA bases. Ann Rev Biophys Biomol Struct 28:101-128
Mueller EG (2002) Chips off the old block. Nature Structural Biol 9:320-322
Murphy FV, Ramakrishnan V (2004) Structure of a purine-purine wobble base pair in the decoding center of the ribosome. Nat Struct Biol 11:1251-1252
Navaratnam N, Bhattacharya S, Fujino T, Patel D, Jarmuz AL, Scott J (1995) Evolutionary origins of apoB mRNA editing: catalysis by a cytidine deaminase that has acquired a novel RNA-binding motif at its active site. Cell 81:187-195
Navaratnam N, Fujino T, Bayliss J, Jarmuz A, How A, Richardson N, Somasekaram A, Bhattacharya S, Carter C, Scott J (1998) Escherichia coli cytidine deaminase provides a molecular model for apoB RNA editing and a mechanism for RNA substrate recognition. J Mol Biol 275:695-714
Navaratnam N, Morrison JR, Bhattacharya S, Patel D, Funahashi T, Giannoni F, Teng B-B, Davidson NO, Scott J (1993) The p27 catalytic subunit of the apolipoprotein B mRNA editing enzyme is a cytidine deaminase. J Biol Chem 268:20709-20712
Ni J, Tien AL, Fournier MJ (1997) Small nucleolar RNAs direct site-specific synthesis of pseudouridines in ribosomal RNA. Cell 89:565-573
Nicoloso M, Qu LH, Michot B, Bachellerie JP (1996) Intron-encoded, antisense small nucleolar RNAs: the characterisation of nine novel species points to their role as guides for the 2’-O-ribose methylation of rRNAs. J Mol Biol 260:178-195
Nishimura S (1983) Structure, biosynthesis and function of queuosine in transfer RNA. Prog Nucl Acid Res Mol Biol 28:49-73
Ofengand J, Rudd KE (2000) Bacterial, Archaeal, and organellar rRNA pseudouridines and methylated nucleosides and their enzymes. In: Garrett RA, Douthwaite SR, Liljas A, Matheson AT, Moore PB, Noller HF (eds) The Ribosomes: Structure, Function, Antibiotics and Cellular Interactions, ASM Press, Washington DC, pp 175-189
Okada N, Harada F, Nishimura S (1976) Specific replacement of Q base in the anticodon of tRNA by guanine catalyzed by a cell-free extract of rabbit reticulocytes. Nucleic Acids Res 3:2593-2603

Okada N, Nishimura S (1979) Isolation and characterization of a guanine insertion enzyme, a specific tRNA transglycosylase from Escherichia coli. J Biol Chem 254:3061-3066
Omer AD, Ziesche S, Decatur WA, Fournier MJ, Dennis PP (2003) RNA-modifying machines in archaea. Mol Microbiol 48:617-629
Persson BC (1993) Modification of tRNA as a regulatory device. Mol Microbiol 8:1011-1016
Persson BC, Gustafsson C, Berg DE, Björk GR (1992) The gene for a tRNA modifying enzyme, m5U54-methyltransferase, is essential for viability in Escherichia coli. Proc Natl Acad Sci USA 89:3995-3998
Pintard L, Kressler D, Lapeyre B (2000) Spb1p is a yeast nucleolar protein associated with Nop1p and Nop58p that is able to bind S-adenosyl-L-methionine in vitro. Mol Cell Biol 20:1370-1381
Powell LM, Wallis SC, Pease RJ, Edwards YH, Knott TJ, Scott J (1987) A novel form of tissue-specific RNA processing produces apolipoprotein-B48 in intestine. Cell 50:831-840
Rozenski J, Crain PF, McCloskey JA (1999) The RNA modification database: 1999 update. Nucleic Acids Res 27:196-197
Savva S, McAuley-Hecht K, Brown T, Pearl I (1995) The structural basis of specific base excision repair by uracil-DNA glycosylase. Nature 373:487-493
Shigi N, Suzuki T, Tamakoshi M, Oshima T, Watanabe K (2002) Conserved bases in the TPsi-C loop of tRNA are determinants for thermophile-specific 2-thiouridylation at position 54. J Biol Chem 277:39128-39135
Singh SK, Gurha P, Tran EJ, Maxwell ES, Gupta R (2004) Sequential 2’-O-methylation of archaeal pre-tRNATrp nucleotides is guided by the intron-encoded but trans-acting box C/D ribonucleoprotein of pre-tRNA. J Biol Chem 279:47661-47671
Söll D, Kline LK (1982) RNA methylation. In: Boyer PD (ed) The Enzymes, vol XV, Academic Press, New York, NY, pp 557-566
Spedaliere CJ, Ginter JM, Johnston MW, Mueller EG (2004) The pseudouridine synthases: revisiting a mechanism that seemed settled. J Am Chem Soc 126:12758-12759
Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153
Stuart K, Panigrahi AK (2002) RNA editing: complexity and complications. Mol Microbiol 45:591-596
Svensson I, Boman HG, Eriksson KG, Kjellin K (1963) Studies on microbial RNA I. Transfer of methyl groups from methionine to soluble RNA from Escherichia coli. J Mol Biol 7:254-271
Tang TH, Rozhdestvensky TS, Clouet d’Orval BC, Bortolin ML, Huber H, Charpentier B, Branlant C, Bachellerie JP, Brosius J, Huttenhöfer A (2002) RNomics in Archaea reveals a further link between splicing of archaeal introns and rRNA processing. Nucleic Acids Res 30:921-930
Teng B, Burant CF, Davidson NO (1993) Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260:1816-1819
Terns M, Terns R (2002) Small nucleolar RNAs: versatile trans-acting molecules of ancient evolutionary origin. Gene Expression 10:17-39
Tran E, Brown J, Maxwell ES (2004) Evolutionary origins of the RNA-guided nucleotide-modification complexes: from the primitive translation apparatus? Trends Biochem Sci 29:343-350

22 Henri Grosjean Wagner LP, Ofengand J (1970) Chemical evidence for the presence of inosinic acid in the anticodon of an arginine tRNA of Escherichia coli. Biochem Biophys Acta 204:620623 Wagner RW, Smith JE, Cooperman BS, Nishikura K (1989) A double-stranded RNA unwinding activity introduces structural alterations by means of adenosine to inosine conversion in mammalian cells and Xenopus eggs. Proc Natl Acad Sci USA 86:26472651 Watanabe M, Matsuo M, Tanaka S, Akimoto H, Ashahi S, Nishimura S, Katze JR, Hasizume T, Crain PF, McCloskey JA, Okada N (1997) Biosynthesis of archaeosine, a novel derivative of 7-deazaguanozine specific to archael tRNA, proceeds via a pathway involving base replacement in the tRNA polynucleotide chain. J Biol Chem 272:20146-20151 Widerak M, Kern R, Malki A, Richarme G (2005) U2552 methylation at the ribosomal Asite is a negative modulator of translational accuracy. Gene (in press) Winkler ME (1998) Genetics and regulation of base modification in the tRNA and rRNA of prokaryotes and eukaryotes. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap 25, pp 441-469 Wolf J, Gerber AP, Keller W (2002) TadA, an essential tRNA-specific adenosine deaminase from Escherichia coli. EMBO J 21:3841-3851 Wyatt GR (1950) Occurrence of 5-methyl-cytosine in nucleic acid. Nature (London) 166:237-238 Zalkin H (1985) CTP synthase. Methods Enzymol 113:282-287 Zimmermann RA, Gait MJ, Moore MJ (1998) Incorporation of modified nucleotides into RNA for studies on RNA structure, function and intermolecular interactions. In: Grosjean H, Benne R (eds) Modification and Editing of RNA, ASM Press, Washington DC, Chap. 4 pp 59- 84

Grosjean, Henri
Laboratoire d'Enzymologie et Biochimie Structurales, Bldg 34, CNRS, F91198 Gif-sur-Yvette, France

Biosynthesis and function of tRNA wobble modifications
Tsutomu Suzuki

Abstract
Post-transcriptional modifications at the first (wobble) position of the tRNA anticodon participate in the precise decoding of the genetic code, which is mediated by the codon-anticodon interaction. However, the biosynthesis and functions of many wobble modifications remain unknown. Here we describe a reverse genetic approach (Ribonucleome analysis) that we used to explore the uncharacterized genes of Escherichia coli and yeast that are responsible for wobble modifications. By combining this method with a comparative genomics approach, we identified an essential gene (tilS) that is responsible for the biosynthesis of lysidine at the wobble position of the bacterial tRNAIle that is specific for the AUA codon. Lysidine is an essential wobble modification that is required for the identity of this tRNA and for its AUA codon specificity. In vitro reconstitution of the wobble modification revealed the detailed mechanism by which lysidine is synthesized. Accurate maintenance of wobble modifications is thus required for various biological functions. We also show that the subcellular localization of tRNAs in Leishmania tarentolae is controlled by different wobble modifications. Moreover, we describe our recent studies showing that the lack of wobble modification in mitochondrial tRNAs leads to translational defects that are associated with mitochondrial diseases, which suggests that disordered RNA modification may be a causative factor of human diseases.

Topics in Current Genetics, Vol. 12
H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing
DOI 10.1007/b106361 / Published online: 7 January 2005
© Springer-Verlag Berlin Heidelberg 2005

1 Introduction

1.1 The wobble rule and the role of RNA modification in decoding

The genetic code is deciphered by the anticodons of transfer RNAs (tRNAs), the adaptor molecules that carry amino acids at their 3’ ends and bind to specific codons in messenger RNAs (mRNAs), thereby transferring their amino acid to the growing peptide chain. The anticodon (positions 34, 35, and 36) of the tRNA base-pairs with a specific codon (positions 1, 2, and 3) in the mRNA strand by hydrogen bonding at the ribosomal A site (Fig. 1). In this interaction, the 2nd and 3rd letters of the anticodon (positions 35 and 36) base-pair with the 2nd and 1st letters of the codon, respectively, following Watson-Crick pairing rules. A recent study of the crystal structure of the 30S ribosomal subunit revealed that the conserved bases A1492, A1493, and G530 in the decoding center of 16S rRNA specifically monitor these two Watson-Crick base pairs through A-minor interactions (Ogle et al. 2002). These interactions induce a large conformational rearrangement in the 30S subunit that may be involved in codon selection and, therefore, in the fidelity of decoding. In contrast, the base-pairing between the 1st letter of the anticodon (position 34) and the 3rd letter of the codon does not always conform to these rules: G34 pairs with U3 as well as with C3, so that codons ending with a pyrimidine (e.g. UUU and UUC) are translated as the same amino acid by a single anticodon (Phe for UUU/C, translated by tRNAPhe with the anticodon GAA). Such irregular pairing is called ‘wobble’ pairing. The scheme that governs wobble pairing was first proposed by Crick (1966) and is known as the ‘wobble’ rule. Wobble pairing is a sophisticated system by which the 61 sense codons are deciphered by a limited number of tRNA species. There is enough room at the A site of the 30S subunit to accommodate wobble pairing (Ogle et al. 2002), which indicates that the wobble pair is not strictly monitored by 16S rRNA during decoding, thus allowing a number of modified bases to be used at this position (Table 1). I shall discuss this point later in the present chapter.

Fig. 1. Base pairing between a transfer RNA (tRNA) anticodon and a messenger RNA (mRNA) codon. Base pairing between the nucleoside at position 34 of the tRNA (wobble nucleoside) and that at position 3 of a codon does not always conform to the Watson–Crick base-pairing rule.

Modified nucleosides are often found at the wobble position (position 34), where they are involved in wobble pairing (Bjork 1995; Yokoyama and Nishimura 1995; Curran 1998). Such wobble modifications play critical roles in modulating codon recognition by restricting, expanding, or altering the decoding properties of the tRNAs. The latest version of the codon-anticodon pairing rules, including the wobble rules, is given in Table 1. It should be noted that some of these rules have been deduced on the basis of codon usage and analysis of tRNA anticodons but still lack biochemical evidence in particular organisms.

1.2 Modified uridines

Unmodified uridine is frequently found at the wobble position of tRNAs from Mycoplasma spp. and mitochondria (Barrell et al. 1980; Bonitz et al. 1980; Andachi et al. 1987; Inagaki et al. 1995) that are responsible for family boxes, in which four codons with the same 1st and 2nd letters but different 3rd letters specify a single amino acid. Although uridine is supposed to recognize only A and G at the 3rd codon letter according to the original wobble rule, U34 can actually base-pair with any of the four bases because its conformational flexibility permits U:U and U:C wobbling (four-way wobble rule). In bacterial tRNAs that are responsible for family boxes, U34 is modified to 5-hydroxyuridine derivatives (xo5U) such as cmo5U and mo5U (Bjork 1995; Yokoyama and Nishimura 1995). The xo5U wobble modifications are known to result in the efficient recognition of U and G in addition to A (Samuelsson et al. 1980). Thus, the xo5U modification expands the decoding capacity of U34 in bacterial tRNAs. This can be explained by the fact that xo5U prefers the C2’-endo form to the C3’-endo form; the former is suitable for base-pairing with U at the ribosomal A site (Yokoyama et al. 1985). In contrast, in the tRNAs for Lys, Gln and Glu, the U at the wobble position is modified to 5-methyl-2-thiouridine derivatives (xm5s2U) such as mnm5s2U, cmnm5s2U, mcm5s2U, and τm5s2U (Bjork 1995; Yokoyama and Nishimura 1995; Suzuki et al. 2002). As these tRNAs are responsible for decoding two-codon sets that end in a purine (R) (i.e. NNR), the xm5s2U modifications help to prevent the misreading of near-cognate codons that end in a pyrimidine (Y) (NNY) (Yokoyama et al. 1985). xm5s2U is largely fixed in the C3’-endo form (Yokoyama et al. 1979, 1985).
Due to this conformational rigidity, the xm5s2U modification prefers to base-pair with A and prevents misreading of NNY codons (Agris et al. 1973; Yokoyama et al. 1985; Lustig et al. 1993). In addition, the 2-thio modification of mnm5s2U in Escherichia coli tRNALys was reported to confer efficient ribosome binding (Ashraf et al. 1999; Yarian et al. 2000). Furthermore, the C5-taurinomethyl group of τm5U was shown to be required for the efficient decoding of the UUG codon by stabilizing U:G wobble base-pairing at the ribosomal A site (Kurata et al. 2003; Kirino et al. 2004).

1.3 Modified adenosine (inosine)

Inosine (I) is a deaminated adenosine whose presence at the wobble position expands the decoding capacity to three codons (NNU, NNC, and NNA), excluding the NNG codon (Crick 1966; Soll et al. 1966; Caskey et al. 1968). In bacteria, I is found only in tRNAArg, which has the ICG anticodon (Bjork 1995; Osawa 1995). In eukaryotes, I is commonly found in all tRNAs that are responsible for family boxes except for tRNAGly(GCC), tRNAPhe(GAA), and sometimes tRNALeu(GAG) (Bjork 1995; Osawa 1995). In general, the eukaryotic family boxes are deciphered by three isoaccepting tRNAs that have the anticodons INN, U*NN and CNN (Bjork 1995; Osawa 1995). Munz et al. (1981) proposed, on the basis of genetic analyses, that tRNASer with the IGA anticodon in the fission yeast Schizosaccharomyces pombe cannot recognize the UCA codon, which is not consistent with the wobble rule. However, in an in vitro translation system from the asporogenic yeast Candida cylindracea, the codon CUA was found to be translated efficiently as Leu by tRNALeu with the IAG anticodon (Suzuki et al. 1994). Thus, I34 can base-pair with A at the third codon position, even in eukaryotic systems, at least in vitro. This result is strongly supported by the fact that an exhaustive search for Leu isoacceptors in C. cylindracea failed to find another tRNALeu responsible for decoding the CUA codon, such as tRNALeu(U*AG) (Suzuki et al. 1994). As an exception, unmodified A34 is found in tRNAThr from Mycoplasma and in mitochondrial tRNAs (Samuelsson et al. 1980; Sibler et al. 1986; Andachi et al. 1987); this unmodified A34 is assumed to be capable of decoding all four codons. However, there is no experimental evidence to support this assumption.

1.4 Modified guanosines

In bacteria, the G34 of the tRNAs for Tyr, His, Asn, and Asp is modified to queuosine (Q) (Nishimura 1983). In eukaryotes, it is further modified to mannosyl queuosine (manQ) or galactosyl queuosine (galQ), in which mannose or galactose is attached to the cyclopentenediol ring of the Q base (Kasai et al. 1976). Q is known to prefer base-pairing with U over C (Harada and Nishimura 1972; Grosjean et al. 1978; Meier et al. 1985). However, the exact role of the Q modification is not fully understood. It has recently been reported that the product of the E. coli yadB gene, a paralog of glutamyl-tRNA synthetase, transfers glutamic acid to the cyclopentene moiety of Q to form glutamyl-queuosine (GluQ) (Dubois et al. 2004; Salazar et al. 2004). GluQ was observed in tRNAAsp isolated under acidic conditions. The glutamyl group is believed to be easily deacylated from GluQ during tRNAAsp isolation, which suggests that Q is present as GluQ in the native tRNAs in the cell. This example also raises the possibility that the true chemical structures of some RNA modifications may be lost during the preparation and isolation of tRNAs, and that modified nucleosides in the cell may actually have chemical structures different from those that are analyzed. Another G modification is 2’-O-methylguanosine (Gm), which is found in some tRNAPhe molecules isolated from eukaryotes and prokaryotes (Kuchino et al. 1982; Bjork 1995). Q and Gm may contribute to preventing the reading of noncognate codons. In the mitochondrial tRNASer of mollusks and echinoderms, G34 is modified to 7-methylguanosine (m7G) (Matsuyama et al. 1998; Tomita et al. 1998), which is probably capable of decoding all four codons.
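Sections 1.2 and 1.3 refer repeatedly to family boxes, so it may help to see which codon boxes qualify. The following is a minimal sketch that assumes the standard (universal) genetic code, written as a 64-letter string in U, C, A, G order; the function names are illustrative and do not come from the chapter.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, ordered by first, second and third codon base (U, C, A, G).
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_boxes(table=CODON_TABLE):
    """Group codons by their first two letters and flag the family boxes."""
    boxes = {}
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        aas = [table[prefix + third] for third in BASES]
        boxes[prefix] = {
            "amino_acids": aas,
            # A family box: all four third-letter variants specify one amino acid.
            "family_box": len(set(aas)) == 1 and aas[0] != "*",
        }
    return boxes


def family_boxes(table=CODON_TABLE):
    """Return the two-letter prefixes of all family boxes, sorted."""
    return sorted(p for p, info in classify_boxes(table).items() if info["family_box"])
```

Under these assumptions `family_boxes()` lists eight prefixes (for example CU for Leu and GU for Val), while boxes such as AU (Ile/Met) and AA (Asn/Lys) are split. Non-universal codes, such as the animal mitochondrial code discussed in this chapter, would need a different table.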


1.5 Modified cytidines

In prokaryotes and eukaryotes, C34 is sometimes modified to 2’-O-methylcytidine (Cm) or 4-acetylcytidine (ac4C) (Bjork 1995). Cm and ac4C seem to strengthen the base-pairing with G, since these modifications stabilize the nucleotide conformation in the C3’-endo form (Kawai et al. 1992). In the tRNAMet of animal mitochondria, C34 is modified to f5C (Moriya et al. 1994). As the AUA codon in the animal mitochondrial genetic code specifies Met instead of Ile (Watanabe and Osawa 1995), the f5C modification enables AUA to be deciphered as Met along with AUG. This modification thus expands the decoding capacity of tRNAMet to accommodate the non-universal genetic code. Formylation of C34 is thought to stabilize the interaction between the anticodon (f5CAU) and the codon (AUA) (Moriya et al. 1994), although the A at the third codon position must be protonated (Yokoyama and Nishimura 1995).

Lysidine (L, N*, or k2C) is a lysine-containing cytidine derivative that occurs at the wobble position (position 34) of eubacterial and some organellar AUA codon-specific Ile tRNAs (tRNAIle) that have the CAU anticodon (CAT in the gene sequence) (see Fig. 4) (Harada and Nishimura 1974; Muramatsu et al. 1988b; Weber et al. 1990; Matsugi et al. 1996). The same anticodon also occurs in the Met tRNA that is specific for AUG. The lysidine modification converts the codon specificity of the precursor tRNAIle with the CAU anticodon from AUG to AUA and its amino acid specificity from Met to Ile and thus prevents the misreading of AUG as Ile and of AUA as Met (see Fig. 4). Replacement of L34 with C34 in the E. coli tRNAIle2 molecule by the microsurgery technique has been shown to drastically reduce its Ile-accepting activity and, conversely, to raise its Met-accepting activity (Muramatsu et al. 1988a). This suggests that the conversion of both the codon and the amino acid specificity of tRNAIle is governed by a single post-transcriptional modification at residue 34.
We recently identified an essential gene, tilS (tRNAIle-lysidine synthetase), that is responsible for L synthesis (Soma et al. 2003). I shall review the biosynthesis of L later in this chapter.
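The pairing behavior described in Sections 1.1 to 1.5 can be collected into a small lookup table. The sketch below is only a reading aid: the `WOBBLE_READS` entries are transcribed from the statements in the text (not from Table 1), entries that the text describes as unproven (unmodified A34, m7G) are left out, and the names `WOBBLE_READS` and `codons_read` are hypothetical.

```python
# Third-codon-letter bases read by each wobble (position 34) nucleoside,
# as stated in the text of Sect. 1. This is a simplification for illustration.
WOBBLE_READS = {
    "G": "UC",       # G34 pairs with U3 as well as with C3
    "U": "AGUC",     # unmodified U34: four-way wobble
    "xo5U": "AGU",   # cmo5U, mo5U: U and G in addition to A
    "xm5s2U": "AG",  # NNR two-codon sets; NNY codons are not read
    "I": "UCA",      # inosine: NNU, NNC and NNA, but not NNG
    "Q": "UC",       # queuosine: both pyrimidines, with a preference for U
    "C": "G",        # Watson-Crick pairing only
    "f5C": "GA",     # animal mitochondrial tRNAMet: AUG and AUA
    "L": "A",        # lysidine: AUA only
}

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(wobble, pos35, pos36):
    """List the codons read by an anticodon given as positions 34, 35 and 36.

    Codon letters 1 and 2 pair with positions 36 and 35 by Watson-Crick rules;
    codon letter 3 pairs with the wobble nucleoside at position 34.
    """
    first = COMPLEMENT[pos36]
    second = COMPLEMENT[pos35]
    return [first + second + third for third in WOBBLE_READS[wobble]]
```

For instance, `codons_read("G", "A", "A")` gives the two Phe codons UUU and UUC, whereas `codons_read("C", "A", "U")` and `codons_read("L", "A", "U")` give AUG and AUA, respectively, which is the switch in codon specificity attributed to lysidine.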

2 Ribonucleome analysis: systematic identification of genes involved in RNA modifications by reverse genetics methods

2.1 Non-essential RNA modifications

The final goal of all genome projects is to clarify the functional role(s) played by every gene in the genome. Of the 4300 genes in the E. coli genome, about 2000 have not been characterized and their functions are unknown (Blattner et al. 1997). In yeast, about 3000 genes are uncharacterized (Goffeau et al. 1996). It is likely that many uncharacterized genes are responsible for RNA modifications. To investigate the functional roles played by RNA modifications, it is necessary to identify the genes and enzymes that are responsible for their biosynthesis. We are currently engaged in a project that employs a reverse genetic approach to identify the uncharacterized genes that are involved in RNA modifications (Fig. 2A and B).

Fig. 2. Ribonucleome analysis: systematic gene identification for RNA modifications by reverse genetics. A, A series of knockout strains of E. coli or yeast is cultured in deep-well plates and then transferred to 96-well plates, where total RNA is extracted and digested to nucleosides. The nucleosides are automatically analyzed by LC/MS using an ion trap mass spectrometer to detect the absence of a specific modified base caused by the gene deletion. This system allows us to analyze 500 strains per month. B, Mass spectrometric analysis identifies a non-essential RNA modification by using a knockout strain in which a particular non-essential gene is deleted. For essential RNA modifications, a temperature-sensitive strain or an expression-controlled strain can be used for each essential gene. In mammals, including humans, RNA interference using siRNA is available for knocking down candidate genes. An essential gene responsible for an RNA modification can be identified by observing a specific reduction of the target RNA modification in such strains.

Fig. 3. Identification of a gene responsible for 2-thiouridine synthesis. Total nucleoside analysis of a genomic deletion strain lacking about 20 open reading frames revealed a specific defect in mnm5s2U and an increased amount of its precursor, mnm5U, suggesting that a gene responsible for 2-thiouridine synthesis is located in the deleted region. The genes that reside in this deleted region were then systematically introduced into this strain one by one using a plasmid library. When a gene responsible for the 2-thiouridine synthesis of mnm5s2U was introduced into this strain, mnm5s2U was clearly restored.

This project utilizes a series of knockout strains of E. coli or yeast that are cultured in deep-well plates and then transferred to 96-well plates. Thereafter, total RNA is extracted and digested into nucleosides, which are then automatically analyzed by LC/MS using an ion trap mass spectrometer. This allows us to determine whether a particular gene deletion leads to the absence of a specific modified base and thus permits us to identify the enzyme or protein responsible for this RNA modification (Fig. 2A). In E. coli, this analysis usually reveals 25 modified nucleosides. Most are from tRNAs, although some are from rRNAs. This system helps us to identify the enzyme genes responsible for the biosynthesis of RNA modifications as well as the genes that encode proteins that lack enzymatic activity but are still involved in the biosynthesis of RNA modifications. The latter include carriers of the metabolic substrate used in the RNA modification and subunit proteins needed for RNA recognition. In theory, it is possible to identify all of the genes that are responsible for non-essential RNA modifications by using this system. We have named this system ‘Ribonucleome’ analysis.
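The comparison at the heart of this screen, a knockout strain against its parent, can be sketched as follows. This is an illustration only: the peak areas are invented, the two thresholds (`loss_ratio`, `gain_ratio`) are hypothetical choices rather than values used in the project, and the function name is not taken from the chapter.

```python
def missing_modifications(wild_type, knockout, loss_ratio=0.1, gain_ratio=2.0):
    """Compare nucleoside signals from a knockout strain with the wild type.

    Both arguments map nucleoside names to LC/MS peak areas (arbitrary units).
    Returns (lost, accumulated): nucleosides whose signal fell below
    loss_ratio x wild type, and those that rose above gain_ratio x wild type.
    """
    lost, accumulated = [], []
    for name, wt_area in wild_type.items():
        if wt_area <= 0:
            continue  # nothing to compare against
        ratio = knockout.get(name, 0.0) / wt_area
        if ratio < loss_ratio:
            lost.append(name)
        elif ratio > gain_ratio:
            accumulated.append(name)
    return sorted(lost), sorted(accumulated)


# Invented peak areas, for illustration only.
wild_type = {"mnm5s2U": 100.0, "mnm5U": 5.0, "Q": 80.0, "m7G": 120.0}
deletion_strain = {"mnm5s2U": 1.0, "mnm5U": 60.0, "Q": 78.0, "m7G": 125.0}

lost, accumulated = missing_modifications(wild_type, deletion_strain)
print("lost:", lost)                # the modification that disappears
print("accumulated:", accumulated)  # a candidate precursor that builds up
```

A lost nucleoside together with an accumulating precursor points to a gene in the deleted region that acts at that biosynthetic step; in practice the thresholds would have to be set from replicate measurements.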


Over the last few years, we have been carrying out the E. coli ribonucleome project and have, as a result, identified several genes involved in tRNA modifications. An example of the identification of a new wobble modification gene by ribonucleome analysis is shown in Figure 3. mnm5s2U is found at the wobble position of the tRNAs for Lys, Gln, and Glu. Total nucleoside analysis of a genomic deletion strain that lacks about 20 open reading frames revealed the specific absence of mnm5s2U along with an increase in the level of its precursor, mnm5U. This suggested that a gene responsible for the 2-thiouridine synthesis of mnm5s2U is located in the deleted region. By using a plasmid library, we then systematically introduced each missing gene, one by one, into the deletion strain and found that mnm5s2U levels were restored when tusA (tRNA-thiouridine synthesizing protein A) was introduced. By using this system, we also identified three additional genes (tusB, tusC, and tusD) that are involved in 2-thiouridine formation (Ikeuchi et al. submitted-a). Thus, it appears that in addition to the previously identified iscS and mnmA genes, at least four additional genes are required for the 2-thiouridine synthesis of mnm5s2U.

2.2 Essential RNA modifications

Naturally, it is not possible to obtain deletion strains of essential genes that are involved in RNA modifications. Instead, to identify such essential genes, we can generate temperature-sensitive (ts) strains. The ts strains can then be cultured at the non-permissive temperature, and the reduction of specific RNA modifications can be observed just before the cells die. We can also employ expression-controlled strains. For example, if the expression of an essential gene is controlled by the lac promoter, a reduction in an RNA modification can be observed when the cells are cultured in the absence of IPTG.
The same approach can be applied to yeast: when an essential yeast gene is expressed under the GAL1 promoter, a reduction of an RNA modification can be observed when the cells are grown in glucose medium. In fact, we have used this approach to identify an essential gene that is responsible for lysidine synthesis (Soma et al. 2003) (reviewed in the next section). In addition, we found that the essential yeast gene Nfs1 is involved in 2-thiouridine formation at the wobble position of both mitochondrial and cytoplasmic tRNAs (Nakai et al. 2004) (see Fig. 13). In mammalian cells, RNA interference (RNAi) using a small interfering RNA (siRNA) (Elbashir et al. 2001) is now a powerful tool that can be used to explore both essential and non-essential genes of unknown function (Fig. 2B). In fact, we have applied RNAi for the first time to identify a human homolog of an RNA-modifying enzyme (Umeda et al. 2005). Our goal is to identify the genes involved in RNA modifications in humans because many RNA modification genes are believed to be involved in a variety of human diseases. To this end, we have been carrying out ribonucleome analysis of E. coli and yeast cytoplasmic RNA modifications, as the human homologs of E. coli genes may be responsible for human mitochondrial RNA modifications while the human homologs of yeast genes may be responsible for human cytoplasmic RNA modifications. To confirm that a human gene is indeed a functional homolog, we can test whether it complements the corresponding yeast gene. RNAi can also be used to rapidly identify various target genes.

Fig. 4. Lysidine converts both the codon and the amino acid specificity of tRNAIle. The precursor tRNA bearing C34 can be aminoacylated with methionine and decodes the AUG codon. After modification, the tRNA bearing lysidine gains isoleucine-accepting activity and AUA codon specificity. Thus, the conversion of both the codon and the amino acid specificity is governed by this single modification.

3 Biosynthesis of lysidine

3.1 Identification of an essential gene responsible for lysidine formation

The chemical structure of lysidine (Fig. 4) prompted us to speculate that lysine is introduced directly into the wobble position by a putative lysine transferase. However, neither the existence of such an enzyme, its gene, or its substrate nor the formation of lysidine in vitro had been reported previously. To identify candidate genes for lysidine synthesis, we combined ribonucleome analysis with a comparative genomics approach. The gene for tRNAIle with the CAT anticodon has been found in all of the complete genomic sequences of eubacteria and archaea obtained to date (Marck and Grosjean 2002), which indicates that a cytidine derivative with a lysidine-like function is a universal post-transcriptional modification of eubacterial and archaebacterial tRNAIle. Thus, we hypothesized that a gene responsible for the formation of lysidine should be present in the Clusters of Orthologous Groups of proteins (COGs) database (Tatusov et al. 1997, 2001). A phylogenetic pattern search for genes commonly found in E. coli, Bacillus subtilis, and Mycoplasma genitalium retrieved 373 COGs, 48 of which are hypothetical protein genes or uncharacterized genes of unknown function (Fig. 5). As the gene responsible for lysidine formation was considered to be essential, we used bacterial gene essentiality analyses (Tatusov et al. 2001; Akerley et al. 2002; Kobayashi et al. 2003) to identify five COGs as candidate genes (Nos. 0037, 0061, 0344, 0536, and 1160) (Fig. 5). We initially analyzed COG Nos. 0037, 0061, and 0344 since IPTG-dependent conditional mutants of B. subtilis were available for these genes (yacA, yjbN, and yneS, respectively). In these mutants, each essential gene is controlled by an IPTG-regulated promoter and, thus, these mutants are unable to grow without IPTG (Kobayashi et al. 2003). Mass spectrometric analyses of the total nucleosides in digests of the tRNAs isolated from the yacA mutant strain revealed that the lysidine levels were specifically reduced when the cells were cultured in the absence of IPTG. The lysidine levels recovered when the cells were cultured in the presence of IPTG and thus again expressed yacA. To confirm this observation, we analyzed the tRNAIle2 molecules from a temperature-sensitive E. coli mutant of mesJ (mesJts), a homolog of the B. subtilis yacA gene. We found that the lysidine levels in the cells were reduced even when the E. coli mesJts strain was cultured at 30°C. These results demonstrate that partial inactivation of the yacA or mesJ gene results in the defective synthesis of lysidine, which suggests that these homologous genes are involved in the synthesis of lysidine (Soma et al. 2003). We also analyzed the E. coli tRNAIle2 molecules that were isolated from the mesJts strain and that bear a partial L modification (∆L34). The precursor form of lysidine in the mesJts strain turned out to be cytidine. These results indicate that lysidine formation is a single-step reaction that is catalyzed by the MesJ protein.

Fig. 5. Comparative genomics to refine the candidate genes for lysidine synthesis. The 373 genes commonly found in E. coli, B. subtilis, and M. genitalium were retrieved as the first candidates. Of these, 48 were hypothetical or uncharacterized genes of unknown function. Finally, five candidates were chosen through bacterial essentiality analyses.

3.2 In vitro lysidine synthesis by TilS

B. subtilis yacA and E. coli mesJ belong to the same orthologous cluster, COG0037, which includes 77 genes found in the complete eubacterial and archaebacterial genomes. Phylogenetically, this cluster can be divided into two families, ydaO and mesJ. The mesJ orthologs consist entirely of eubacterial homologs, while the ydaO orthologs are found in eubacteria and archaebacteria as well as in eukaryotes, which indicates that ydaO is not responsible for lysidine synthesis. Recently, ydaO was found to be responsible for 2-thiocytidine synthesis at position 32 of tRNAs (Jager et al. 2004).
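The candidate narrowing of Section 3.1 (Fig. 5), in which COGs shared by three genomes are reduced first to those of unknown function and then to those that are essential, amounts to a sequence of set operations. The sketch below uses invented placeholder cluster names (`cogA` to `cogD`) rather than real COG annotations, and `candidate_cogs` is a hypothetical helper, not part of any COG database tooling.

```python
def candidate_cogs(presence, characterized, essential, required_genomes):
    """Narrow COGs to uncharacterized, essential clusters shared by all required genomes.

    presence:         dict mapping a COG id to the set of genomes containing it
    characterized:    set of COG ids with a known function
    essential:        set of COG ids flagged as essential in essentiality screens
    required_genomes: set of genomes in which a candidate must be present
    Returns (candidates, number shared, number shared and uncharacterized).
    """
    shared = {cog for cog, genomes in presence.items() if required_genomes <= genomes}
    unknown = shared - characterized
    return sorted(unknown & essential), len(shared), len(unknown)


# Placeholder data, invented for illustration only.
genomes = {"E. coli", "B. subtilis", "M. genitalium"}
presence = {
    "cogA": {"E. coli", "B. subtilis", "M. genitalium"},
    "cogB": {"E. coli", "B. subtilis"},
    "cogC": {"E. coli", "B. subtilis", "M. genitalium", "yeast"},
    "cogD": {"E. coli", "B. subtilis", "M. genitalium"},
}
characterized = {"cogC"}
essential = {"cogA", "cogB"}

candidates, n_shared, n_unknown = candidate_cogs(presence, characterized, essential, genomes)
print(candidates, n_shared, n_unknown)
```

In the study itself the corresponding counts were 373 shared COGs, 48 of unknown function, and five essential candidates; the toy data here only mirror the order of the filters.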
The considerable homology between mesJ and ydaO suggests that the proteins encoded by these genes have similar reaction and tRNA recognition mechanisms. Sequence alignment of the mesJ homologs from various bacteria revealed a highly conserved N-terminal region and a non-conserved C-terminal region. The N-terminal region bears a highly conserved SGGXDS sequence (amino acids 20-25) that was predicted to be a P-loop motif, an ATP-binding motif that is common in the ATP pyrophosphatase (PPi synthetase) family (Bork and Koonin 1994). A second conserved motif (amino acids 154-160) was also found in the N-terminal region. To test whether MesJ indeed synthesizes lysidine, the 48.5 kDa E. coli MesJ protein was expressed as a recombinant protein in E. coli. A gel mobility-shift experiment showed that this protein bound specifically to isolated tRNAIle2, which indicates that it is the enzyme that recognizes the substrate tRNA and is thus directly responsible for lysidine formation. In vitro lysidine synthesis using the recombinant MesJ protein was then performed. The tRNAIle molecules with partial L34 modification that were isolated from the mesJts strain, in which the precursor accounts for ~50% of all tRNAIle molecules, were employed as the substrate for lysidine formation. We found that MesJ facilitated the ATP-dependent incorporation of [14C]lysine into these precursors (Fig. 6A). We therefore conclude that the MesJ protein is directly responsible for lysidine formation and that it has ATP-dependent lysine transfer activity. Thus, this protein was renamed tRNAIle-lysidine synthetase (TilS) (Soma et al. 2003).

Fig. 6. In vitro lysidine synthesis by the recombinant TilS. A, Analysis of the ability of the recombinant TilS protein to synthesize lysidine in vitro using the ∆L34 tRNAIle2 from the mesJts mutant. The reaction was performed with (circles) or without (squares) ATP. B, Assessment of the in vitro aminoacylation of the ∆L34 tRNAIle2 (in which only 50% of the tRNAIle2 molecules bear L34) and of the tRNAIle2 reconstituted by TilS and cold lysine (in which most tRNAIle2 molecules bear L34). The squares indicate the unreconstituted ∆L34 and the circles the reconstituted ∆L34. Isoleucylation and methionylation are indicated by the filled and unfilled squares/circles, respectively.

3.3 Mechanism of lysidine synthesis

The amino acid sequence of TilS allowed us to hypothesize a possible catalytic mechanism by which it incorporates lysine into the tRNA. The highly conserved P-loop motif (SGGXDS) in the N-terminal domain of TilS is known to participate in the binding of ATP and the hydrolysis of its α–β phosphate bond. In addition, residue R160 in the second conserved motif is predicted to interact with the γ phosphate of ATP. Moreover, the reactions catalyzed by known members of the PPi synthetase family proceed via an adenylate intermediate accompanied by hydrolysis of the α–β bond of ATP. We then demonstrated that TilS hydrolyzes ATP and produces AMP and pyrophosphate during the reaction. Thus, the α–β phosphate bond of ATP is cleaved by TilS. Furthermore, a modification intermediate in lysidine formation could be observed. In the absence of lysine in the reaction mixture, the tRNA was specifically labeled by [α-32P]ATP, which suggests that the C-2 carbonyl group of C34 in tRNAIle is likely to be activated by the addition of AMP in the first step of lysidine synthesis. This intermediate is unstable even at neutral pH, and AMP is easily released from the tRNA, which indicates that the C-2 carbonyl group is activated in the form of an adenylate. These results allowed us to propose that lysidine formation involves two separate reactions (Fig. 7). First, TilS activates the C-2 position of C34 by forming an adenylate intermediate. Subsequently, nucleophilic attack by the ε-amino group of lysine on the C-2 of the intermediate completes the reaction. This reaction mechanism is similar to that employed by the PPi synthetase family member GMP synthetase (Tesmer et al. 1996), which catalyzes the conversion of xanthosine 5’-monophosphate (XMP) to GMP by first adenylating XMP to form a covalent O2-adenyl XMP intermediate, after which the activated C-2 carbon of XMP is available for attack by the amido nitrogen of glutamine.

Fig. 7. Two-step reaction of lysidine formation. TilS activates the C-2 position of C34 by forming an adenylate intermediate. The ε-amino group of lysine then attacks this position to release AMP and complete the reaction.

3.4 Direct conversion of the amino acid specificity of tRNAIle due to the lysidine modification

It has been reported that tRNAIle with C34 at the wobble position accepts Met instead of Ile, since the anticodon CAU is a positive determinant for methionyl-tRNA synthetase (Muramatsu et al. 1988a). To examine this observation, the aminoacylation of the ∆L34 tRNAIle2 molecules, 50% of which are C34 precursors, was carried out. ∆L34 tRNAIle2 accepted both Met and Ile, as expected (Fig. 6B). To demonstrate that the in vitro lysidine formation of tRNAIle2 is directly responsible for converting its amino acid specificity from Met to Ile, the ∆L34 tRNAIle2 was subjected to the lysidine modification in vitro by using the recombinant TilS protein along with cold lysine as a substrate. The reconstituted tRNAIle showed efficient Ile-accepting activity and a loss of Met-accepting activity (Fig. 6B). Thus, the lysidine modification introduced by the TilS protein directly converts the amino acid specificity of tRNAIle from Met to Ile. These data also suggest that if only partial lysidine modification occurs in vivo, the unmodified tRNAs behave as Met tRNAs that decode AUG as Met.
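The identity switch described in Section 3.4 can be written down as a two-state model. The sketch below is an idealization: it assumes that every tRNAIle2 molecule is either fully unmodified (C34) or fully modified (L34) and is charged completely and independently of the others, which the experiments do not establish. Both function names are hypothetical.

```python
def trna_ile2_identity(wobble):
    """Amino acid accepted and codon read by tRNAIle2, given the nucleoside at position 34."""
    if wobble == "L":  # lysidine-modified, mature form
        return {"amino_acid": "Ile", "codon": "AUA"}
    if wobble == "C":  # unmodified precursor
        return {"amino_acid": "Met", "codon": "AUG"}
    raise ValueError(f"unsupported wobble nucleoside: {wobble!r}")


def accepting_fractions(fraction_modified):
    """Expected Ile- and Met-accepting fractions of a partially modified population.

    Idealized model: modified molecules accept only Ile and unmodified
    molecules accept only Met.
    """
    if not 0.0 <= fraction_modified <= 1.0:
        raise ValueError("fraction_modified must be between 0 and 1")
    return {"Ile": fraction_modified, "Met": 1.0 - fraction_modified}
```

With `accepting_fractions(0.5)`, the model gives equal Ile- and Met-accepting fractions, in line with the mixed acceptance reported for ∆L34 tRNAIle2, and `accepting_fractions(1.0)` corresponds to the reconstituted tRNA that has lost Met-accepting activity.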

Biosynthesis and function of tRNA wobble modifications 37

3.5 The lysidine modification is essential for decoding AUA codons in vivo

To demonstrate that the formation of lysidine on the tRNAIle molecule is essential for the decoding of the AUA codon in the cell, we constructed reporter plasmids carrying either the β-galactosidase (lacZ) gene or a modified lacZ gene that bears two pairs of tandem AUA codons in its N-terminal region. Each plasmid was introduced into either a tilS+ strain or a tilS-deficient strain in which genomic tilS had been disrupted and tilS was supplied in trans by another plasmid with a temperature-sensitive (ts) replication origin. At the nonpermissive temperature of 42°C, the tilS-containing plasmid was gradually lost because of its ts replicon. The β-gal activity produced by the tilS-deficient strain was reduced at 42°C relative to that at 30°C. The reduction of the activity specified by the AUA-lacZ plasmid was particularly marked. These results indicate that the TilS-mediated formation of lysidine is an essential step for AUA decoding in the cell.

3.6 Recognition of tRNAIle by TilS

It is of great interest to identify the elements in tRNAIle that are recognized by TilS because it is difficult to distinguish between tRNAMet (CAT) and tRNAIle (CAT) on the basis of their primary sequences, which are similar. In particular, they have identical anticodon loop sequences. To identify the elements in tRNAIle that are positively recognized by TilS, we constructed a series of tRNAIle variants by in vitro transcription. A point mutation at position 33 or 37 severely reduced lysidine formation, while mutations at positions 35 and 36 also reduced lysidine formation. In addition, mutation of U43 to C reduced lysidine formation, which suggests that the G27:U43 wobble pair is important for lysidine formation. In the acceptor stem, replacement of two base pairs, 4-69 and 5-68, strongly reduced lysidine formation (Ikeuchi et al. submitted-b).
We also investigated the negative determinants in tRNAMet for lysidine formation. We found that tRNAMet became modified with lysidine when its acceptor stem was replaced with that of tRNAIle. Moreover, replacement of C27:G43 with G27:U43 in the anticodon stem also enhanced lysidine formation in tRNAMet. Thus, in tRNAIle, the anticodon loop sequence and two base pairs in the acceptor stem, 4-69 and 5-68, are the main elements recognized by TilS, while in tRNAMet the same two base pairs in the acceptor stem work as negative determinants for TilS binding. The 27-43 base pair is also important both for recognition and for discrimination between the two tRNAs by TilS (Ikeuchi et al. submitted-b). When we examined the TilS-binding site on tRNAIle by a footprinting experiment using ethylnitrosourea, we found that TilS specifically footprinted the anticodon loop and the 3’ side of the acceptor stem (positions 64-79) (Ikeuchi et al. submitted-b). These regions overlap well with the TilS-recognition determinants that we had identified.

38 Tsutomu Suzuki

Fig. 8. Crystal structure of E. coli TilS. The coordinates of E. coli mesJ (1NI5) were obtained from the Protein data bank. The three-dimensional structures were displayed using DeepView/SwissPdb Viewer Ver. 3.7 (Guex and Peitsch 1997). (A) Ribbon diagram of E. coli TilS. Conserved residues in the NTD are shown in red. The P-loop and R160 residues are shown in yellow. The green arrow indicates a viewing angle for the NTD shown in C and D. (B) The electrostatic potential of the E. coli mesJ protein. The protein surface is color-coded according to its electrostatic potential. Red=-1.8 kT/e, blue=+1.8kT/e. (C) Ribbon diagram of the NTD. Conserved arginine residues are indicated. (D) Surface electrostatic potential of the NTD.


When E. coli MesJ was still known only as one of several conserved proteins of unknown function, its crystal structure was determined by structural genomics and deposited in the Protein Data Bank (1NI5, M. Gu, T. Burling, and C. D. Lima). As shown in Figure 8A and B, the highly conserved N-terminal region of this protein forms a globular domain (NTD) that is connected to the C-terminal domain by a long α-helix (H12). The C-terminal domain is divided into two subdomains (CTD1 and CTD2). The NTD has a crater with a small hole formed by highly conserved amino acids, including many arginine residues that form a positively charged surface (Fig. 8C and D). The P-loop and R160 are located in the depths of the crater. These structural features suggest that the anticodon stem-loop (ASL) of tRNAIle lands on the crater of the NTD by interacting with its positive surface, thus allowing the wobble base to come close to the ATP bound on the P-loop. This binding model is supported by the fact that a positively charged line runs down from the surface of the NTD to the valley of the CTD; this line may catch the tRNA by grasping its acceptor helix (Fig. 8B). The tRNAIle-binding sites that were predicted on the basis of the crystal structure of TilS correspond well with the TilS-recognition determinants and the TilS-footprinted sites that we identified. Thus, this binding model explains our biochemical data well. Determination of the crystal structure of the TilS-tRNAIle complex will further clarify the molecular basis of tRNA recognition and discrimination by TilS.

3.7 Evolution of wobble modifications and genetic code assignment of AUA codon

The universal genetic code basically consists of family boxes or two-codon sets. The AUN codons, however, are unique because three codons (AUU, AUC and AUA) encode Ile while the remaining AUG codon corresponds to Met.
The wobble modification of the corresponding tRNAIle has evolved to decode the AUA codon and to differentiate between AUA and AUG. In prokaryotes, there are two tRNAIle species: one with the GAU anticodon, which recognizes the AUU and AUC codons, and another bearing the LAU anticodon, which recognizes only the AUA codon. In contrast, the eukaryotic system utilizes a major tRNAIle with the IAU anticodon that recognizes all three synonymous codons, in addition to a minor tRNAIle with the ΨAΨ anticodon (in yeast) that is responsible for the AUA codon (Yokoyama and Nishimura 1995; Marck and Grosjean 2002). It is more economical to utilize inosine instead of lysidine if the AUA codon is to be decoded by a single tRNA. Inosine (I34) results from the deamination of adenosine, which is catalyzed by the tRNA adenosine deaminase (Tad) (Gerber and Keller 1999; Wolf et al. 2002; Auxilien et al. 1996). In E. coli, only tRNAArg2 bears I34, which is formed by the E. coli tRNA adenosine deaminase (TadA) in a highly specific manner. In contrast, in yeast, seven cytoplasmic tRNAs contain I34, which is probably formed in all of them by a single Tad2p/Tad3p complex with lower substrate specificity. Thus, during the evolution from prokaryotes to eukaryotes, the lysidine-biosynthesizing enzyme TilS may have been lost together with the transition from L34 to I34 as the nucleoside used to decode AUA as Ile.
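The decoding of the AUN box described above can be summarized in a short sketch. The pairing table encodes only the specificities stated in the text (G34 reads U and C, L34 reads A, I34 reads U, C and A, and unmodified C34 in a CAU anticodon reads G and confers Met identity). The names `WOBBLE_READS`, `IDENTITY` and `codons_read` are illustrative assumptions, not part of any library.

```python
# Third-position codon bases read by each wobble (position 34) nucleoside,
# restricted to the cases discussed in the text for the AUN codon box.
WOBBLE_READS = {
    "G": "UC",   # tRNAIle(GAU): AUU and AUC
    "L": "A",    # lysidine, tRNAIle(LAU): AUA only
    "I": "UCA",  # inosine, eukaryotic tRNAIle(IAU): AUU, AUC and AUA
    "C": "G",    # unmodified C34 (CAU anticodon): AUG
}

# Amino acid identity the text associates with each anticodon in this box.
IDENTITY = {"G": "Ile", "L": "Ile", "I": "Ile", "C": "Met"}


def codons_read(wobble, codon_prefix="AU"):
    """Return the codons of one box that a given wobble nucleoside reads."""
    return [codon_prefix + base for base in WOBBLE_READS[wobble]]


if __name__ == "__main__":
    for wobble in "GLIC":
        print(f"{wobble}AU anticodon -> {codons_read(wobble)} as {IDENTITY[wobble]}")
```

In this toy table, converting C34 to L34 changes both the codon read (AUG to AUA) and the amino acid identity (Met to Ile), which is the dual effect attributed to TilS in sections 3.4 and 3.5.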


In archaebacteria, it is not yet clear how tRNAIle decodes the AUA codon as Ile. Although tRNAIle genes with a CAT anticodon are commonly found in archaebacterial genomes (Marck and Grosjean 2002), homologs of tilS have not been found (Soma et al. 2003). This observation suggests that the AUA-specific archaebacterial tRNAIle possesses a modified cytidine that differs in chemical structure from lysidine but functions similarly. This speculation is supported by the fact that mass spectrometric analyses have failed to detect lysidine in the total nucleosides of archaebacterial RNA (McCloskey et al. 2001). We are currently investigating the wobble modification of the archaebacterial tRNAIle. In mammalian mitochondria, the AUA codon is read as Met instead of Ile. This non-universal codon assignment is believed to have arisen during the evolution of the mitochondrion in eukaryotic cells from its bacterial ancestor. According to the codon capture hypothesis (Osawa and Jukes 1989), once AUA codons had disappeared from mtDNA and become unassigned, the AUA-specific tRNAIle bearing the LAU anticodon became dispensable, so that TilS and/or the tRNAIle(CAU) gene might have been lost. The mt tRNAMet then acquired a new wobble modification, namely 5-formylcytidine (f5C), so that the AUA codon could be reassigned as Met when AUA codons reappeared in mtDNA. Therefore, the post-transcriptional modification of mt tRNAMet has resulted in a change in the genetic code (Yokobori et al. 2001).

4 Wobble modification and subcellular localization of tRNAs

4.1 Post-transcriptional modifications control the subcellular localization of RNA molecules

RNA modifications play additional roles apart from codon recognition since it appears that post-transcriptional modifications also serve as signals that control the subcellular localization of eukaryotic RNA molecules. For example, after mRNA and UsnRNA (uridine-rich small nuclear RNA) are transcribed in the nucleus, they form a monomethylguanosine (MMG) cap on their 5’ ends that functions as a nuclear export signal. This causes them to move to the cytoplasm, which is where mRNA is translated on the ribosome. In the case of UsnRNA, once it is in the cytoplasm, its MMG cap is hypermodified to a trimethylguanosine (TMG) cap, which works as a signal for its import back into the nucleus. Thus, the shuttle motion of the UsnRNA between the nucleus and the cytoplasm is governed by protein factors that recognize the post-transcriptional methylation of its cap structure (Mattaj and Englmeier 1998; Cougot et al. 2004). In addition, it has been reported that post-transcriptional modification of tRNA enhances its nuclear export activity (Kutay et al. 1998). These facts suggest that RNA modification plays a fundamental role in the subcellular targeting/localization of RNA molecules. Here we describe an example of tRNA localization that is associated with subcellular-specific wobble modifications (Kaneko et al. 2003).


4.2 Role of wobble modifications in the tRNA sorting mechanism in Leishmania tarentolae

Nuclear-encoded tRNA is imported into the mitochondria in a variety of organisms, including protozoa (Simpson et al. 1989; Hancock and Hajduk 1990; Schneider and Marechal-Drouard 2000), yeast (Tarassov and Martin 1996), and plants (Dietrich et al. 1996). In the kinetoplastid protists Leishmania tarentolae and Trypanosoma brucei, the mt DNA lacks tRNA genes while a complete set of mt tRNAs is encoded in the nuclear genome, which indicates that all mt tRNAs are imported into the mitochondria from the cytoplasm (Simpson et al. 1989; Kapushoc et al. 2000; Rubio et al. 2000; Tan et al. 2002). Trypanosoma tRNAs are reviewed in detail in the chapter by Rubio and Alfonzo. The import of tRNAs into the mitochondria of Leishmania and Trypanosoma has been investigated both in vivo and in vitro but the mechanism involved is not yet fully understood. Leishmania tarentolae tRNAs can be classified into three major groups based on their subcellular localization: group I tRNAs that reside mainly in the cytoplasm, group II tRNAs that reside mainly in the mitochondria, and group III tRNAs that are shared between the cytoplasm and the mitochondria (Kapushoc et al. 2000, 2002). It is possible that the subcellular localization of tRNAs in groups I and II is controlled by positive determinants in their sequences and/or tertiary structures. However, it is unclear whether the group III tRNAs are distributed passively between the mitochondria and cytoplasm in a concentration-dependent manner or whether their distribution involves a more active process that requires the participation of protein factors.
As previously proposed by Rusconi and Cech (1996), in species such as trypanosomatid protists, in which all the tRNAs required for mitochondrial translation are imported, the subcellular localization of tRNAs is likely to involve a negative determinant mechanism. In this model, a signal(s) embedded in the tRNA structure and sequence would inhibit import into the mitochondria. This model can be contrasted with a positive determinant mechanism, in which some import signal(s) or positive determinant directs the import. On the basis of the negative determinant mechanism, it can be speculated that a cytoplasm-specific modification of group III tRNAs may function as a negative determinant for their import. We thus hypothesized that structural differences induced by specific post-transcriptional modifications in the group III tRNAs may be involved in their shared localization. To test this, we analyzed the RNA modifications in two group III tRNAs (tRNAGlu and tRNAGln) purified from both the cytoplasm and the mitochondria by chaplet column chromatography (described in section 5.1). Mass spectrometric analysis revealed a unique modification difference at the wobble position of both tRNAs: the cytoplasmic (cy) tRNAs bear 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U) while the mitochondrial (mt) tRNAs bear 5-methoxycarbonylmethyl-2’-O-methyluridine (mcm5Um). Apart from the wobble modification, there were no other differences in the RNA modifications of these group III tRNAs. In addition, a small fraction of the cy tRNAs (4%) were found to bear 5-methoxycarbonylmethyluridine (mcm5U) at the wobble position, which could represent a common modification intermediate for the modified uridines in both the


Fig. 9. Proposed tRNA sorting mechanism by wobble modifications. Precursor tRNAs are transcribed from genes in the nucleus and are exported into the cytosol with the mcm5U modification at the wobble position. Cytosolic tRNA is matured by 2-thiolation of the modification intermediate and localized in the cytosol. A portion of the modification intermediate is imported into the mitochondrion, where 2’-O-methylation of mcm5U follows.

cy and mt tRNAs. Furthermore, we were able to isolate the mitochondria-specific tRNALys from both the cytoplasm and the mitochondria. Like tRNAGlu and tRNAGln, tRNALys belongs to the class that is responsible for purine-ending NAR codons. While tRNALys is classified in group II, Northern analysis showed that ~4% of the tRNALys is found in the cytoplasm (Kapushoc et al. 2002). When these cy tRNALys molecules were subjected to LC/MS analysis, their wobble modification was found to be mcm5U, the same modification intermediate that we found in a small fraction of the cytoplasmic forms of the group III tRNAs. The mitochondrial tRNALys counterpart, in contrast, bore the mcm5Um modification. Given the absence of the 2-thiolated wobble modification in cy tRNALys (group II) as well as in mt tRNAGlu/Gln (group III), it can be speculated that only cy tRNAGlu/Gln with the mcm5U modification is imported into the mitochondria, while the mature cytoplasmic form with mcm5s2U is not imported. This suggests that 2-thiolation at the wobble position works as a negative determinant against tRNA import into mitochondria. This hypothesis is also supported by the fact that the cytoplasmic form of mitochondria-specific tRNALys has mcm5U as the wobble modification, which is regarded as the counterpart of the modification intermediate for the tRNAGlu/Gln molecules.

This hypothesis was further tested by comparing the in vitro import into mitochondria of the mature form of cy tRNAGlu with that of mitochondria-specific tRNALys and of unmodified tRNAGlu, which should reveal any inhibitory effect of 2-thiolation at the wobble position. Thus, 32P-labeled purified tRNAs were subjected to in vitro import into isolated mitochondria and the efficiency of import was determined by a nuclease protection assay (Mahapatra et al. 1994, 1998; Mahapatra and Adhya 1996; Rubio et al. 2000; Kapushoc et al. 2002). We found that mt tRNALys, which lacks 2-thiolation, was efficiently imported, as expected, while cy tRNAGlu with the 2-thio modification was imported into the mitochondria much less efficiently than mt tRNALys. This supports the idea that 2-thiolated cy tRNAGlu is not an appropriate substrate for the tRNA import machinery. Then, to determine the effect of post-transcriptional modification, including the 2-thiolation, on the import of tRNAGlu into mitochondria, the import efficiency of native cy tRNAGlu was compared with that of unmodified tRNAGlu that had been transcribed in vitro. The unmodified tRNA was efficiently imported while the native cy tRNAGlu was imported at a significantly lower level. This suggests that the primary sequence of tRNAGlu does not determine its mitochondrial import efficiency and that the post-transcriptional modifications in cy tRNAGlu have an inhibitory effect on its import into mitochondria.

In Figure 9, we propose a model of the wobble modification-associated mechanisms that dictate the subcellular distribution of the L. tarentolae tRNAs that are responsible for the NAR codons. In this model, the precursor tRNAs are first transcribed from nuclear genes and exported into the cytosol with the mcm5U modification at their wobble position. In the case of the group III tRNAGlu/Gln molecules, the cytoplasmic tRNAs mature through the 2-thiolation of the wobble modification. Since the 2-thio modification appears to work as a negative determinant for mitochondrial import, the tRNAs localize in the cytosol. This localization may involve a protein factor(s) that specifically recognizes the 2-thiocarbonyl group on the tRNAGlu/Gln molecules.
A proportion of the group III tRNAGlu/Gln molecules with the mcm5U modification are imported into the mitochondrion, where mcm5U is subjected to 2’-O-methylation to form mcm5Um. With regard to the group II tRNALys, the mcm5U-bearing cy tRNALys is not 2-thiolated in the cytosol, since it is not a substrate for the 2-thiolation enzyme. Consequently, most of the transcribed cy tRNALys enters the mitochondrion, where the wobble base is 2’-O-methylated to form mcm5Um.
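The sorting model of Figure 9 can be restated as a small decision rule. The sketch below is a deliberate simplification under stated assumptions: every tRNA leaves the nucleus with mcm5U, 2-thiolation in the cytosol blocks import, and anything not thiolated is imported and 2’-O-methylated. It ignores the ~4% of mcm5U-bearing molecules observed in the cytoplasm, and the function name `sort_trna` and its arguments are hypothetical.

```python
def sort_trna(thiolation_substrate, thiolated_in_cytosol):
    """Return (compartment, wobble nucleoside) for a tRNA exported with mcm5U.

    thiolation_substrate: True for group III tRNAGlu/Gln, False for group II
        tRNALys (which, per the model, the cytosolic 2-thiolation enzyme
        does not accept).
    thiolated_in_cytosol: whether this particular molecule was 2-thiolated
        before meeting the import machinery.
    """
    if thiolated_in_cytosol and not thiolation_substrate:
        raise ValueError("a non-substrate tRNA cannot be 2-thiolated in this model")
    if thiolated_in_cytosol:
        # The 2-thio group acts as a negative determinant for import.
        return ("cytosol", "mcm5s2U")
    # Imported molecules are 2'-O-methylated inside the mitochondrion.
    return ("mitochondrion", "mcm5Um")


if __name__ == "__main__":
    print("group III, thiolated:    ", sort_trna(True, True))
    print("group III, not thiolated:", sort_trna(True, False))
    print("group II tRNALys:        ", sort_trna(False, False))
```

Written this way, the shared localization of group III tRNAs depends on a single variable, the fraction of molecules that are 2-thiolated before import, which matches the negative-determinant reading of the data.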

5 Mitochondrial wobble modifications and human diseases

5.1 Mitochondrial wobble modifications and the minimal decoding system

The animal mitochondrial decoding system utilizes a limited set of tRNAs (22 species) that are capable of deciphering the 60 sense codons in the 13 protein genes encoded in mt DNA (Watanabe and Osawa 1995). Wobble modifications play an essential role in this decoding system. However, identifying the wobble


Fig. 10. Chaplet column chromatography. A, A 3’-biotinylated DNA probe complementary to the target RNA was immobilized on streptavidin sepharose. B, Several DNA columns were connected in tandem (the ‘chaplet’ column). The crude tRNA fraction was circulated through this chaplet column by a pump at a temperature of 65°C to entrap the desired tRNA. After non-specific RNA had been washed out, the trapped tRNA was eluted from the column with low salt buffer at 65°C. C, The ‘chaplet’ column can be applied to the parallel isolation of multiple species of RNA molecules. The DNA columns for each mt tRNA were connected in tandem. The crude tRNA fraction from bovine liver or human placenta was circulated through this chaplet column to entrap each mitochondrial tRNA. In this way, we successfully isolated all species of mitochondrial tRNA at the same time.

Table 2. Codon-anticodon pairing pattern of mammalian mitochondrial decoding system

modifications involved is impeded by the fact that it is extremely difficult to isolate each mt tRNA by conventional column chromatography because of the limited amounts of the mt tRNAs in the cell. To overcome this problem, we devised a new tRNA isolation technique that employs a solid-phase DNA probe and is a form of affinity chromatography (Fig. 10). In this method, to entrap the desired tRNA, a biotinylated DNA probe complementary to the target mt tRNA is immobilized onto streptavidin sepharose (Fig. 10A) and a crude tRNA fraction that includes about 0.1% of each mt tRNA is circulated through the column by a pump at a temperature of 65°C. After the non-specific tRNAs have been washed out, the trapped tRNA is eluted from the column with low salt buffer at 65°C (Fig. 10B). We found that continuous circulation was the most important factor in maximizing the individual tRNA yields. When several columns that are specific for different tRNAs are connected in tandem, it is possible to isolate multiple tRNA species simultaneously from the same crude tRNA preparation (Fig. 10C). Since this system works very well, we employed it for detailed studies of the mechanisms used by the minimal decoding system in mitochondria. We were successful in isolating all 22 mt tRNA species from bovine liver as well as from human placenta. To do this, we connected in tandem 22 DNA columns that were each specific for a particular mt tRNA (Fig. 10C). We have named this system ‘chaplet’ column chromatography. The crude tRNA fraction from bovine liver or human placenta was then circulated through this chaplet column. The purity of each tRNA was nearly 100%. Each tRNA was then subjected to LC/MS analysis to identify the modified nucleosides. Table 2 summarizes our analyses of the wobble bases of all 22 individual mt tRNAs. Four kinds of modified nucleotides were


Fig. 11. Chemical structures of two taurine-containing uridines.

found at the wobble position of ten tRNA species; these ten tRNAs correspond to two-codon sets. We found that Met tRNA has 5-formylcytidine (f5C) at its wobble position, as reported previously (Moriya et al. 1994), while four tRNAs for Tyr, His, Asn, and Asp that are responsible for pyrimidine-ending two-codon sets have Q at the wobble position. In addition, five tRNAs have novel taurine-containing uridine derivatives (Fig. 11) that were identified by our group (Suzuki et al. 2002). Thus, 5-taurinomethyluridine (τm5U) is found in the tRNAs for Leu(UUR) and Trp, while 5-taurinomethyl-2-thiouridine (τm5s2U) is found in the tRNAs for Lys, Glu and Gln. The remaining 12 tRNA species have unmodified G or U at the wobble position; four tRNAs responsible for pyrimidine-ending two-codon sets have G, while eight that correspond to family boxes have U. The four-way wobble rule of unmodified uridine helps to reduce the total number of tRNA species. This result explains how all 60 sense codons can be decoded by only 22 tRNA species and shows that post-transcriptional modification plays a critical role in this minimal decoding system (Table 2). Our observations show that although only four species of wobble modifications, namely, f5C, τm5U, τm5s2U and Q, are required in the minimal decoding system, they play essential roles in the proper functioning of mitochondrial translation. This suggests that if the wobble base is not modified correctly, mitochondrial proteins will be incorrectly synthesized and this may lead to mitochondrial diseases. Supporting this notion, we have recently found mitochondrial diseases that are caused by a wobble modification defect of the novel taurine-containing uridines that we identified in the mitochondrial tRNAs for Leu(UUR) and Lys. I shall review this issue later in this chapter.
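The arithmetic behind "22 tRNAs for 60 sense codons" can be checked with a short sketch. The wobble nucleoside of each tRNA follows the assignments given above; the codon boxes and the third-position reading rules (unmodified U reads all four bases, G and Q read U and C, τm5U, τm5s2U and f5C read A and G) are taken here as assumptions based on the standard vertebrate mitochondrial code, since Table 2 itself is not reproduced in this text. All identifiers are illustrative.

```python
from collections import Counter
from itertools import product

# Third-position bases each wobble nucleoside is assumed to read.
WOBBLE_READS = {
    "U": "UCAG",     # unmodified U: four-way wobble in family boxes
    "G": "UC",       # unmodified G: pyrimidine-ending two-codon sets
    "Q": "UC",       # queuosine: pyrimidine-ending two-codon sets
    "tm5U": "AG",    # 5-taurinomethyluridine: purine-ending sets
    "tm5s2U": "AG",  # 5-taurinomethyl-2-thiouridine: purine-ending sets
    "f5C": "AG",     # 5-formylcytidine: AUA and AUG as Met
}

# tRNA -> (first two codon bases, wobble nucleoside)
MT_TRNAS = {
    "Phe": ("UU", "G"), "Leu(UUR)": ("UU", "tm5U"), "Leu(CUN)": ("CU", "U"),
    "Ile": ("AU", "G"), "Met": ("AU", "f5C"), "Val": ("GU", "U"),
    "Ser(UCN)": ("UC", "U"), "Pro": ("CC", "U"), "Thr": ("AC", "U"),
    "Ala": ("GC", "U"), "Tyr": ("UA", "Q"), "His": ("CA", "Q"),
    "Gln": ("CA", "tm5s2U"), "Asn": ("AA", "Q"), "Lys": ("AA", "tm5s2U"),
    "Asp": ("GA", "Q"), "Glu": ("GA", "tm5s2U"), "Cys": ("UG", "G"),
    "Trp": ("UG", "tm5U"), "Arg": ("CG", "U"), "Ser(AGY)": ("AG", "G"),
    "Gly": ("GG", "U"),
}


def decoded_codons():
    """Map each codon to the tRNA that reads it; fail if two tRNAs overlap."""
    assignment = {}
    for trna, (prefix, wobble) in MT_TRNAS.items():
        for base in WOBBLE_READS[wobble]:
            codon = prefix + base
            if codon in assignment:
                raise ValueError(f"{codon} read by {assignment[codon]} and {trna}")
            assignment[codon] = trna
    return assignment


assignment = decoded_codons()
all_codons = {"".join(c) for c in product("UCAG", repeat=3)}
unread = all_codons - set(assignment)
wobble_counts = Counter(wobble for _, wobble in MT_TRNAS.values())

if __name__ == "__main__":
    print(len(MT_TRNAS), "tRNAs read", len(assignment), "codons")
    print("codons left unread:", sorted(unread))
    print(dict(wobble_counts))
```

Under these assumptions the 22 tRNAs cover 60 codons without overlap (8 family boxes of four codons plus 14 two-codon sets), and the four codons left unread are the ones used as stop codons in the vertebrate mitochondrial code.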


5.2 Biosynthesis of taurinomethyluridines

We found a novel 381 Da uridine derivative in the tRNAs for Leu(UUR) and Trp and its 397 Da 2-thiouridine derivative in the tRNAs for Lys, Glu, and Gln. NMR analysis of the purified nucleosides showed the absence of an H6 proton cross peak in the 1H-COSY spectrum. This indicates the presence of a substituent at position 5 in the uracil ring. The molecular weight of the 2-thiouridine derivative was determined with a high degree of precision by using a Fourier-transform ion cyclotron resonance (FT-ICR) mass spectrometer. Its atomic composition was ascertained with excellent accuracy (0.03 p.p.m.) to be C12H19N3O8S2. These findings indicate that the main modification occurs in the uracil base, the most plausible structure in both cases being a taurinomethyl group that possesses a sulfonic acid group derived from taurine. The two nucleosides were named 5-taurinomethyluridine (τm5U) and 5-taurinomethyl-2-thiouridine (τm5s2U) (Fig. 11). We confirmed that the novel uridine derivative is τm5U by comparing it with a chemically synthesized product using LC/MS and NMR (Suzuki et al. 2002). Taurine is attached at the C5 position of the uracil ring through a methylene group. These taurine-containing uridines were found in the mt tRNAs from humans, cows, cats, flounder, and sea squirt (our unpublished observations). In the mt tRNAs from C. elegans and yeast, taurine-containing modifications were not observed; instead, 5-carboxymethylaminomethyluridine (cmnm5U) is used as a wobble modification (Sakurai et al. submitted). Thus, taurine-containing uridines seem to be common in vertebrate and protochordate mitochondria. Taurine is one of the most abundant free amino acids. In human body fluids, including plasma, taurine concentrations range from 10 to 100 µM, while intracellular concentrations can exceed these levels by several hundred times (Huxtable 1992).
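As a consistency check on the masses quoted at the start of this section, the average molecular mass can be computed from the elemental composition. The composition of τm5s2U (C12H19N3O8S2) is the one reported above; the composition used for τm5U (C12H19N3O9S) is an assumption obtained by replacing the 2-thio sulfur with oxygen. The atomic masses are rounded standard values, and the function name `average_mass` is illustrative.

```python
# Rounded standard atomic masses (Da).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Elemental compositions as element -> count.
TM5S2U = {"C": 12, "H": 19, "N": 3, "O": 8, "S": 2}  # reported in the text
TM5U = {"C": 12, "H": 19, "N": 3, "O": 9, "S": 1}    # assumed: S at C2 -> O


def average_mass(composition):
    """Sum atomic masses weighted by the number of atoms of each element."""
    return sum(ATOMIC_MASS[element] * n for element, n in composition.items())


if __name__ == "__main__":
    m_thio = average_mass(TM5S2U)
    m_oxo = average_mass(TM5U)
    print(f"tm5s2U: {m_thio:.2f} Da")
    print(f"tm5U:   {m_oxo:.2f} Da")
    print(f"difference (S minus O): {m_thio - m_oxo:.2f} Da")
    # Absolute mass error that a 0.03 p.p.m. accuracy implies at this mass.
    print(f"0.03 ppm of tm5s2U: {m_thio * 0.03e-6:.1e} Da")
```

The two values round to 397 Da and 381 Da, in agreement with the nominal masses given for the 2-thiouridine derivative and the uridine derivative, and the 16 Da difference corresponds to the exchange of one oxygen for one sulfur.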
It is significant that taurine appears to be an essential nutrient for cats, and possibly also for primates, including humans. Taurine is reported to play physiological roles in bile salt synthesis, modulation of calcium fluxes, cardiac contractility, maintenance of photoreceptor cells, modulation of neuronal excitability, osmoregulation, and cell proliferation and viability. However, the exact role(s) that taurine plays in these functions is not fully understood. Notably, taurine had not previously been reported to be a component of biological macromolecules such as proteins and RNAs. To confirm that taurine is indeed a direct component of the modified uridines in five of the mitochondrial tRNAs, we synthesized stable isotope-labeled taurine containing [18O] oxygen and cultured HeLa cells in its presence. After two days, the mt tRNAs were isolated from the cells and subjected to LC/MS analysis. The mass chromatogram of the total nucleosides in the mt tRNALys molecules from these cells clearly reveals τm5s2U with an increased mass due to the incorporation of [18O] taurine (Suzuki et al. 2002). This is the first demonstration that intracellular taurine is used as a component of biological macromolecules. Moreover, the fact that isolated bovine mitochondria take up taurine indicates that taurine is actively transported across the mitochondrial inner membrane. These results strongly suggest a novel pathway for the transport of cytoplasmic taurine into the mitochondria (Fig. 12). It is already known that plasma taurine is pumped into the cytoplasm through the high affinity taurine


Fig. 12. Catabolic flow of intracellular taurine - new pathway to mitochondria. Plasma taurine is taken up through the taurine transporter. Cytoplasmic taurine is excreted as such or in the form of bile salts like taurocholate. Cytoplasmic taurine is imported into mitochondria through a putative mitochondrial taurine transporter to be used as a constituent for τm5(s2)U synthesis in mt tRNAs.

transporter on the cytoplasmic membrane. As cytoplasmic taurine is known to accumulate at concentrations up to 40 mM, it appears that the taurine concentration gradient across the cytoplasmic membrane is maintained by this taurine transporter (Uchida et al. 1992). Cytoplasmic taurine is excreted as such or in the form of bile salts such as taurocholate (Huxtable 1992). The import of cytoplasmic taurine into the mitochondria may likewise involve a putative mitochondrial taurine transporter on the mitochondrial inner membrane. Our observations suggest that this imported taurine is used as a constituent for τm5U synthesis in mt tRNAs. In E. coli, mnm5U, a bacterial counterpart of τm5U, is found at the wobble position of some tRNAs. The initial step of mnm5U synthesis is known to require the mnmE gene, since the disruption of mnmE results in defective mnm5U synthesis (Elseviers et al. 1984; Hagervall et al. 1998). The gidA gene also appears to be involved, as suggested by indirect genetic evidence (Nakayashiki and Inokuchi 1998; Bregeon et al. 2001). MSS1 and MTO1 have been found to be the respective homologs of the mnmE and gidA genes in both humans and yeast (Li and Guan 2002; Li et al. 2002, 2003) (Fig. 13). However, it has never been demonstrated that these human and yeast mitochondrial proteins are actually responsible for biosynthesis of the C5-modification of the wobble uridines in the mt tRNAs. To test this notion,


Fig. 13. Putative biosynthetic pathway that introduces the τm5s2U (cmnm5s2U) modification of mitochondrial tRNAs. MSS1 and MTO1 are involved in the initial step of τm5s2U (cmnm5s2U) synthesis on mt tRNAs. Mitochondrial taurine or glycine is subsequently incorporated into mt tRNAs by unidentified transferases to build τm5s2U or cmnm5s2U, respectively. Nfs1 is responsible for the initial step in the 2-thiolation of τm5s2U (cmnm5s2U) in mt tRNAs and mcm5s2U in cytoplasmic tRNAs (Nakai et al. 2004). The sulfur from cysteine is transferred to unknown sulfur mediators by Nfs1p. MTU1 then acts as a mitochondria-specific 2-thiouridylase for τm5s2U (cmnm5s2U) by using the activated sulfur from the mediators.

yeast mt tRNALys molecules were isolated from MTO1 and MSS1 deletion strains and subjected to LC/MS analyses. Total nucleoside analysis and RNA fragment analysis by LC/MS revealed that the mt tRNALys molecules from both deletion strains bear 2-thiouridine (s2U) instead of the cmnm5s2U found in the wild-type tRNALys. This shows that the MSS1 and MTO1 genes are both involved in the biosynthesis of the 5-carboxymethylaminomethyl group of cmnm5s2U in mt tRNALys (Umeda et al. 2005). Since human MSS1 and MTO1 have been shown to be the functional homologs of yeast MSS1 and MTO1, respectively, by complementation tests in yeast (Li and Guan 2002; Li et al. 2002), it is likely that they are responsible for the biosynthesis of τm5U in humans.

5.3 Role of mitochondrial tRNA-specific 2-thiouridylase (MTU1) in the synthesis of τm5s2U

It has been reported that the E. coli iscS and mnmA (trmU, asuE) genes encode enzymes that are responsible for biosynthesis of the 2-thio group of mnm5s2U (Sullivan et al. 1985; Mihara et al. 2002) (Fig. 13). In addition, we recently found that four additional proteins (tusA, B, C, and D) are required for this biosynthetic event, as described above. Moreover, recombinant IscS has been shown to cooperate with


MnmA to synthesize the 2-thio modification of tRNA in vitro (Kambampati and Lauhon 2003). MnmA is believed to act in 2-thiouridylation by recognizing the substrate tRNA. Thus, we speculated that the mitochondrial mnmA homolog may be a candidate gene that introduces the 2-thio group of τm5s2U into mt tRNALys. To test this, we identified the mnmA homologs in yeast, Caenorhabditis elegans, Drosophila, mice and humans by BLASTP searches of protein databases. Sequence alignment revealed considerable conservation between these sequences. The N-terminal regions contain a highly conserved SGGXDS sequence that is predicted to be a P-loop motif, which is a common ATP-binding motif found in the ATP pyrophosphatase (PPi synthetase) family (Bork and Koonin 1994). YDL033c is the Saccharomyces cerevisiae homolog of mnmA. It shows 29% and 24% homology to the sequences of the E. coli mnmA gene and its human homolog, respectively. It is a non-essential gene for which a deletion strain is available (Winzeler et al. 1999). Thus, to examine the involvement of YDL033c in the 2-thio modification of the mt tRNAs for Lys, Glu, and Gln and the cytosolic tRNALys molecule, total RNAs obtained from the YDL033c-deletion strain were subjected to polyacrylamide gel electrophoresis in gels containing (N-acryloylamino) phenyl mercuric chloride (APM) combined with Northern blotting (Igloi 1988; Shigi et al. 2002). The four tRNAs from the wild-type strain show specific retardation in the APM-containing polyacrylamide gel because of the strong affinity of the 2-thio group in these tRNAs for the mercuric compound in the gel. However, in the case of the tRNAs from the deletion strain, the retarded bands for the three mt tRNAs were not observed, while the cytosolic tRNALys molecule was still specifically retarded. This suggests that YDL033c is responsible for the 2-thio modification of mt tRNAs (Umeda et al. 2005).
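The conserved SGGXDS sequence mentioned above is simple enough to locate with a regular expression, where X stands for any residue. The sketch below is only an illustration of such a motif scan: the example sequence is made up for the purpose and is not a real MnmA or MTU1 sequence, and the function name `find_p_loop` is hypothetical.

```python
import re

# SGGXDS, with X matching any single residue.
P_LOOP = re.compile(r"SGG.DS")


def find_p_loop(protein_sequence):
    """Return (1-based start position, matched residues) for each SGGXDS hit."""
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(sequence)]


if __name__ == "__main__":
    # Made-up N-terminal fragment used only to exercise the pattern.
    example = "MKKVVVGMSGGVDSSVAAALLKEQGYEV"
    print(find_p_loop(example))
```

A scan of this kind only reports where the motif occurs; the assignment of the motif as an ATP-binding P-loop rests on the sequence comparisons cited in the text, not on the pattern match itself.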
To confirm these observations, the mt tRNALys molecules were isolated from the ∆MTU1 deletion strain, subjected to LC/MS analyses and compared to the mt tRNALys molecules from the wild type strain, which bear cmnm5s2U at the wobble position. The mt tRNALys molecules from the ∆MTU1 deletion strain clearly showed a marked increase in the cmnm5U peak while the cmnm5s2U peak disappeared, which demonstrates that YDL033c encodes an enzyme that is responsible for synthesizing the 2-thio group on cmnm5s2U in the three mt tRNAs for Lys, Glu and Gln. Thus, this gene was denoted MTU1 (mitochondrial tRNA-specific 2-thiouridylase 1) (Fig. 13) (Umeda et al. 2005). The human homolog of MTU1 was introduced into the yeast ∆MTU1 strain to see whether it can complement yeast MTU1. The ∆MTU1 strain bearing the human MTU1 gene was able to grow on non-fermentable YPG plates and its rate of oxygen consumption was increased slightly relative to the ∆MTU1 strain. In addition, the 2-thio modification of the mt tRNALys was partially restored. The subcellular localization of human MTU1 was then examined by transiently expressing EGFP-fused MTU1 (MTU1-EGFP). This revealed that MTU1-EGFP indeed localizes mainly in the mitochondria. This is consistent with the observation that MTU1 is a mitochondrial tRNA-specific 2-thiouridylase. Furthermore, to obtain direct evidence that human MTU1 is responsible for the 2-thio modification of τm5s2U in the mt tRNAs, siRNAs that target human MTU1 were designed by an

Biosynthesis and function of tRNA wobble modifications 51

algorithm that predicts efficacious siRNA sequences (Katoh et al. 2003; Katoh and Suzuki submitted). HeLa cells were transfected with the siRNAs and harvested 72 hours later. The MTU1 siRNAs potently reduced the MTU1 mRNA levels. APM gel-Northern analysis was then performed to measure the degree of 2-thio modification of the mt tRNALys in the MTU1 siRNA-transfected cells. About 70% of the 2-thiolated tRNALys molecules from control luciferase siRNA-transfected cells that were retarded in the APM gel disappeared when MTU1-targeting siRNAs were introduced. The rate of oxygen consumption of the MTU1 siRNA-transfected cells was also clearly decreased as compared to the wild type cells. We also examined the mitochondrial membrane potential of the knockdown cells by staining the cells with both Mito Tracker Red and Green, which are fluorescent indicators of mitochondria. Mito Tracker Red is an indicator of mitochondrial membrane potential (∆Ψ). The mitochondria of the control luciferase siRNA-transfected cells stained well with both dyes and when the two images were superimposed they revealed a well-developed mitochondrial mesh. In contrast, the MTU1-knockdown cells had granular-shaped mitochondria that stained poorly with Mito Tracker Red. Thus, knocking down human MTU1 results in mitochondria with a defective membrane potential. Notably, this is consistent with a phenotypic feature of cells from MERRF (myoclonus epilepsy associated with ragged-red fibers) patients who carry a mutant mt tRNALys that lacks its τm5s2U wobble modification (James et al. 1996; Antonicka et al. 1999). 5.4 Wobble modification defects in mitochondrial diseases Mitochondrial DNA (mtDNA) mutations are responsible for a wide spectrum of human diseases that are caused by mitochondrial dysfunction. Point mutations in mt tRNA genes are particularly frequently found in mitochondrial diseases (Schon et al. 1997; Wallace and Lott 2003). 
Mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes (MELAS), one of the major clinical subgroups of the mitochondrial encephalomyopathies, is caused by a single base replacement in the tRNALeu gene that is responsible for the translation of the UUR (R = A or G) leucine codons (tRNALeu(UUR)) (Kobayashi et al. 1991). The majority (80%) of MELAS patients possess an A to G transition at nucleotide position (np) 3243 (Goto et al. 1990; Kobayashi et al. 1990), whereas in about 10% of the patients, a T to C transition is observed at np 3271 (Goto et al. 1991). The mutation at np 3243 has also been observed in maternally inherited diabetes with deafness (MIDD) (van den Ouweland et al. 1992), and in progressive external ophthalmoplegia (PEO) (Johns and Hurko 1991; Moraes et al. 1993). On the other hand, an A to G transition at np 8344 in the tRNALys gene is found in most patients with myoclonus epilepsy associated with ragged-red fibers (MERRF) (Shoffner et al. 1990), another major clinical subgroup of the mitochondrial encephalomyopathies. Thus, the clinical features of the mitochondrial diseases depend on the tRNA species and/or positions of the mutations. However, the exact relationship between the location of the mutations and their clinical phenotypic consequences are not fully understood.
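The mutation labels used in this section (for example A3243G) encode the reference base, the nucleotide position and the replacement base. As a minimal sketch, the snippet below parses such labels and checks that each is a transition (purine to purine or pyrimidine to pyrimidine), using only the three mutations and gene/syndrome assignments named in the text. The function names and the dictionary layout are illustrative assumptions.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Mutations, genes and syndromes exactly as named in the text above.
MUTATIONS = {
    "A3243G": ("tRNALeu(UUR)", "MELAS"),
    "T3271C": ("tRNALeu(UUR)", "MELAS"),
    "A8344G": ("tRNALys", "MERRF"),
}

def parse_mutation(label):
    """Split a label such as 'A3243G' into (reference base, position, new base)."""
    m = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if m is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return m.group(1), int(m.group(2)), m.group(3)

def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine replacements."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)

for label, (gene, syndrome) in MUTATIONS.items():
    ref, pos, alt = parse_mutation(label)
    kind = "transition" if is_transition(ref, alt) else "transversion"
    print(f"{label}: {ref}->{alt} at np {pos} in {gene} ({syndrome}), {kind}")
```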

Fig. 14. Wobble modification defect in mutant tRNAs from mitochondrial diseases. A wobble modification defect was commonly found in the three mutant tRNAs bearing the MELAS 3243, MELAS 3271, and MERRF 8344 mutations. These point mutations work as negative determinants for taurine-containing wobble uridines in mt tRNAs.

Cybrid cell lines, in which mutant mtDNA derived from patients has been transferred into human cells lacking mtDNA (ρ0 cells), have been used to demonstrate that the three mutations described above (A3243G, T3271C, and A8344G) are directly involved in the mitochondrial dysfunction associated with these mutations (King and Attardi 1989; Hayashi et al. 1991). In the case of the MELAS mutations, cybrid cells containing a high ratio of mutated MELAS mtDNA showed a decline in enzymatic activity and a decrease in protein synthesis (Chomyn et al. 1991; Hayashi et al. 1993; Dunbar et al. 1996). Several studies proposed that the MELAS mutations directly impair the proper function of the tRNALeu(UUR) molecules and that this decreases the respiratory activity of the mitochondria in these patients (Jacobs 2003). However, conclusive evidence showing that the point mutations are directly responsible for the mitochondrial dysfunction that is associated with their presence is still lacking. We have previously shown with cybrid cells possessing homoplasmic pathogenic mutations that the taurine-containing modified uridine (τm5U; 5-taurinomethyluridine) (Fig. 11) (Suzuki et al. 2002) that normally occurs at the anticodon wobble position of mt tRNALeu(UUR) remains unmodified in the mt tRNALeu(UUR) bearing the A3243G or T3271C mutation (Fig. 14) (Yasukawa et al. 2000b). These results nicely explain why these different point mutations are associated with the same clinical phenotype. In addition, we have shown with cybrid cells from MERRF patients that the mutant mt tRNALys bearing the A8344G mutation also lacks the appropriate taurine-modification (τm5s2U; 5-taurinomethyl-2-thiouridine) (Fig. 14) (Yasukawa et al. 2000a; Suzuki et al. 2002). These two types of mitochondrial diseases, thus, have in common the lack of taurine-modification of their respective mutant tRNAs. Thus, the point mutations can apparently hinder the biosynthesis of the wobble taurine-modification of mt tRNAs. As appropriate uridine modifications at the wobble position are responsible for precise and efficient codon recognition (Bjork 1995; Yokoyama and Nishimura 1995), it is likely that a wobble modification-deficiency may result in a considerable decoding disorder.

5.5 Molecular pathogeneses of mitochondrial diseases

The MELAS cybrid cells bearing the A3243G or U3271C point mutations show distinct translational activities, which suggests that the point mutations themselves have a negative effect on decoding (Chomyn et al. 1992; Hayashi et al. 1993). Thus, it may be that the molecular pathogenesis of MELAS arises from both the wobble modification-deficiency and the pathogenic point mutation. Therefore, it is necessary to discriminate the specific effect of the wobble modification deficiency in isolation from the effect of the different point mutations (3243 or 3271). To do this, we operated on the human native mt tRNALeu(UUR) molecule by using a molecular surgery technique (Suzuki et al. 1997) to construct an artificial mt tRNALeu(UUR) that has a completely normal sequence with all the modified bases but lacks τm5U at the wobble position. For this purpose, a large amount of mt tRNALeu(UUR) (160 µg) was isolated from human placenta (27 kg) by chaplet column chromatography.
The purified mt tRNALeu(UUR) was then cut in half at the wobble position by a hammerhead ribozyme and the τm5U in the 5’ half fragment was removed by periodate oxidation and replaced with an unmodified uridine by enzymatic ligation. The altered 5’ half was then religated with the 3’ half to construct a mt tRNALeu(UUR) lacking the τm5U modification (Kirino et al. 2004). We then examined whether this modified mt tRNALeu(UUR) could function properly in an in vitro mitochondrial translation system (Hanada et al. 2001). The wild type mt tRNALeu(UUR) was efficient in decoding both the UUA and UUG codons and showed no activity with the UUC non-cognate codon. The MELAS mutant tRNAsLeu(UUR) purified from the relevant mutant cybrid cells were also examined. These mutant tRNAs, which not only possess the MELAS A3243G or T3271C mutation but also lack the wobble modification, showed a considerable reduction in UUA decoding as well as a severe reduction in UUG decoding. In the case of the operated tRNALeu(UUR) whose wobble modification had been surgically removed, no appreciable reduction was observed in UUA decoding but a severe reduction was observed in UUG decoding (Kirino et al. 2004) (Fig. 15). These results demonstrate two major points. First, the severe reduction in UUG decoding by the MELAS mutant tRNAs can be mainly attributed to the lack of the wobble modification. Second, since there was a considerable reduction in UUA decoding

Fig. 15. Distinct patterns of codon recognition found in mutant tRNAs lacking wobble modification. A pathogenic point mutation (A3243G or U3271C) in mutant tRNALeu(UUR) from MELAS patients causes a τm5U-modification deficiency, which results in a UUG codon–specific translational defect. The MERRF 8344 mutation also causes a τm5s2U-modification deficiency that results in a translational defect for both cognate codons (AAA and AAG).

when the point mutations were present but not when they were absent, the MELAS point mutations themselves impose a certain negative effect on translation. Thus, by analyzing the translational efficiency of the operated tRNA lacking the wobble modification, we can estimate for the first time the negative effect of the pathogenic A3243G and U3271C point mutations in decoding the cognate UUA codon. The MELAS tRNALeu(UUR) molecule with the A3243G mutation showed a more severe reduction in UUA decoding than the tRNALeu(UUR) molecule with the T3271C mutation. This result is consistent with the translational activities of MELAS cybrid cells with these point mutations (Chomyn et al. 1992; Hayashi et al. 1993). To confirm that the wobble modification is responsible for UUG decoding, we carried out a ribosomal A-site binding experiment. Native mt tRNALeu(UUR) bound efficiently to both the UUA and UUG codons while the operated tRNA bearing the unmodified wobble uridine showed strong binding to the UUA codon but weak binding affinity for the UUG codon (Kirino et al. 2004). This finding suggests that the UUG codon-specific translational defect of the mt tRNALeu(UUR) molecule that lacks the wobble modification is caused by an inability to form codon-anticodon base pairs on the ribosomal A-site. From these results,

Fig. 16. Usage of UUR codons in human mtDNA. Leucine codon (UUA/G) usage for the 13 protein genes encoded by mtDNA. The number of UUA/G codons is shown for each gene.

we conclude that the modified wobble uridine plays a functional role in the decoding of the UUG codon by stabilizing the U:G wobble base-pairing on the ribosomal A-site. This suggests that deficient decoding of the UUG codon arising from the lack of wobble modification is one of the primary causes of MELAS. We have noticed a specific bias of leucine codon usage in the 13 proteins encoded by human mtDNA genes (Fig. 16). For example, despite the minor usage of the UUG codon by most of the proteins, the ND6 gene, which encodes a component of respiratory chain Complex I (NADH-coenzyme Q reductase), contains 8 UUG codons that constitute 42.1% of the total leucine codons and 4.6% of the total codons in ND6. It has been reported that when the A3243G or T3271C mtDNA levels in cybrid cells are increased, the translational activity of ND6 is specifically and markedly reduced without a decrease in total mitochondrial protein synthesis (Hayashi et al. 1993; Dunbar et al. 1996). Furthermore, a point mutation (G14453A) in the structural gene for ND6 was found to be associated with a severe MELAS syndrome (Ravn et al. 2001). Considering the UUG codon–specific translational defect described in this study, these facts support the idea that MELAS patients experience a translational depression of ND6. This nicely explains why a specific reduction of Complex I activity is characteristic of MELAS patients (Koga et al. 1988; Goto et al. 1992). These results indicate that the UUG codon–specific translational disorder caused by defective wobble taurine modification is primarily responsible for the molecular pathogenesis of MELAS. In addition, our study suggests that the point mutation itself, in particular the A3243G

mutation, contributes markedly to the tRNALeu(UUR) translational defect. Thus, the degree of the decoding disorder for each MELAS mutant tRNA should vary depending on the pathogenic point mutation that is present. We previously examined the translational ability of the mutant mt tRNALys molecules from MERRF patients that bear the A8344G mutation. This analysis showed that tRNALys lacking the τm5s2U modification are incapable of translating both codons (AAA and AAG). This is due to a complete loss of codon–anticodon pairing on the ribosome (Fig. 15) (Yasukawa et al. 2001), because the 2-thio modification of the wobble base is known to be critical for decoding AAR codons (Ashraf et al. 1999). This result explains why MERRF patients show a marked defect in whole mitochondrial translation (Yoneda et al. 1994; Enriquez et al. 1995; Yasukawa et al. 2001). Thus, the different symptoms exhibited by MELAS and MERRF patients may be explained by the fact that the mutant tRNAs lacking the wobble modification in these patients show a distinct pattern of codon recognition (Fig. 15). In conclusion, our study has unraveled the essential molecular mechanism causing the mitochondrial dysfunction in MELAS patients. A point mutation at nucleotide position 3243 or 3271 in the mtDNA results in a taurine modification deficiency at the anticodon wobble position of the mutant tRNALeu(UUR), which subsequently causes a UUG-codon–specific translational defect that may lead to a translational depression of ND6. This defective codon-specific translation indicates that the deficiency in taurine-modification could be a key to the clinical phenotypes characterizing the various mitochondrial diseases.
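The codon-usage figures quoted above for ND6 (8 UUG codons, 42.1% of the leucine codons and 4.6% of all codons) can be checked with simple arithmetic, and the same counting can be applied to any coding sequence. The sketch below is only illustrative: the totals of 19 leucine codons and 174 codons are back-calculated from the quoted percentages rather than taken from the text, the short open reading frame is a made-up toy sequence, and the function names are hypothetical.

```python
# Leucine codons in the human mitochondrial code: UUR is read by
# tRNALeu(UUR), CUN by tRNALeu(CUN).
LEU_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}

def split_codons(seq):
    """Split an in-frame DNA or RNA coding sequence into RNA codons."""
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return [rna[i:i + 3] for i in range(0, usable, 3)]

def uug_usage(seq):
    """Count UUG codons and express them as shares of Leu and of all codons."""
    codons = split_codons(seq)
    n_uug = codons.count("UUG")
    n_leu = sum(1 for c in codons if c in LEU_CODONS)
    return {
        "uug": n_uug,
        "leu": n_leu,
        "total": len(codons),
        "pct_of_leu": 100 * n_uug / n_leu if n_leu else 0.0,
        "pct_of_total": 100 * n_uug / len(codons) if codons else 0.0,
    }

# Toy reading frame, invented for illustration (not the ND6 sequence).
TOY_ORF = "AUGUUGUUACUAUUGGCUUAA"
print(uug_usage(TOY_ORF))

# Consistency check of the ND6 figures quoted in the text, using the
# ASSUMED totals of 19 leucine codons and 174 codons overall.
print(round(100 * 8 / 19, 1), round(100 * 8 / 174, 1))
```

Applied to the 13 mtDNA-encoded genes, a count of this kind would reproduce the type of comparison summarized in Fig. 16.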

6 RNA modification disorders as a cause of human diseases Our observations have provided a new understanding of the molecular causes of some human diseases. We have shown that wobble modification defects in mutant mt tRNAs are a primary cause for various mitochondrial diseases, although, the point mutations in the tRNAs that cause them to bear aberrant wobble modifications are also pathogenic in their own right. These are the first reported instances of human diseases that have arisen from RNA modification disorders. In retrospect, it is not surprising that a qualitative disorder of RNA molecules can cause disease, since non-coding RNAs are functional molecules that must mature by undergoing post-transcriptional modifications. Table 3 summarizes some of the human diseases that are thought to arise from RNA modification disorders. One of these is Hyper-IgM Syndrome, which is thought to result from the loss of activation-induced cytidine deaminase (AID) (Muramatsu et al. 2000; Revy et al. 2000). Loss of AID causes a deficit in class switch recombination of immunoglobulin genes and also induces somatic hypermutations of the variable region of IgG. AID is apparently a paralog of the cytidine deaminase APOBEC 1 that is responsible for the C to U editing of ApoB mRNA (Teng et al. 1993). Although it has been reported that AID works as a DNA-mutating agent that catalyzes the

Table 3. Human diseases associated with RNA modification disorder

Diseases                 Genes            Defect or disorder             References
MELAS                    mt tRNALeu       Wobble modification            Suzuki et al. 2002; Yasukawa et al. 2001
MERRF                    mt tRNALys       Wobble modification            Suzuki et al. 2002; Yasukawa et al. 2000b
Hyper-IgM syndrome       AID              RNA editing or DNA mutation    Muramatsu et al. 2000; Revy et al. 2000
Dyskeratosis congenita   DKC1             rRNA modification              Ruggero et al. 2003
Malignant gliomas        GluRB            RNA editing                    Maas et al. 2001
ALS                      GluR2            RNA editing                    Kawahara et al. 2004
B-cell lymphoma          U50 snoRNA       (rRNA modification)            Tanaka et al. 2000
Prader-Willi syndromes   Orphan snoRNAs   unknown                        Cavaille et al. 2000

deamination of DNA (Harris et al. 2002; Petersen-Mahrt et al. 2002), it is also possible that AID may function as an RNA-editing enzyme for a putative target mRNA(s) that is responsible for class switch recombination and somatic hypermutations (Honjo et al. 2004). Another human disease that is thought to arise from disordered RNA modification is dyskeratosis congenita, which is characterized by premature aging and increased susceptibility to tumors. The DKC1 gene has been suggested to be responsible for this disease. DKC1 encodes dyskerin, which serves as a pseudouridine synthase in the box H/ACA snoRNP complex. It has been shown that mutation of DKC1 causes a specific reduction of pseudouridines in rRNAs (Ruggero et al. 2003), supporting the view that this phenotype is associated with the disease. However, since dyskerin is also a component of telomerase (Mitchell et al. 1999), it may be that the initiation and exacerbation of dyskeratosis congenita involves both defective ribosome function and telomere shortening, which both arise from dyskerin dysfunction. Yet another human disease that is thought to arise from an RNA modification disorder is sporadic amyotrophic lateral sclerosis (ALS), which is a fatal paralytic disease. ADAR2 is an adenosine deaminase that catalyzes A to I editing of several mRNAs, including glutamate receptor subunit B (GluR2) (Sommer et al. 1991) and serotonin receptor 5-HT(2C) (Burns et al. 1997). A recent report has suggested that the aetiology of ALS may involve defective editing of the Q/R-site of GluR2 mRNA in the spinal motor neurons (Kawahara et al. 2004). A perturbation of this RNA editing may contribute to the neuronal death in ALS patients. That such defective GluR2 mRNA editing may be pathogenic is further suggested by studies of malignant human brain tumors, which found that Q/R-site editing of GluR2 is substantially reduced in the tumors compared with control tissues (Maas et al. 2001).
In addition, altered editing and alternative splicing of 5-HT(2C) transcripts were observed. Altered editing of serotonin receptor 5-HT(2C) mRNA may also participate in the pathogenesis of Prader-Willi syndrome (PWS), which is a neurogenetic disease resulting from a deficiency of paternal gene expression. PWS

has been suggested to result from the absence of three brain-specific human C/D box snoRNAs that are all encoded by the same region of the genome (Cavaille et al. 2000), since these snoRNAs were absent from the cortex of a PWS patient and from a PWS mouse model. Notably, while these brain-specific snoRNAs lack guide sequences that are complementary to rRNAs, one of these snoRNAs (HBII-52) has an 18-nt phylogenetically conserved complementarity to a critical segment of 5-HT(2C) mRNA. This suggests that this snoRNA may play a role in modifying 5-HT(2C) mRNA and that its absence may have pathogenic consequences that lead to the clinical features of PWS. Many RNA modifications are generated by using small metabolites, including amino acids. This suggests that there may be a link between RNA modification disorders and metabolic disorders. In other words, it may be that many RNA modification disorders arise from various inborn errors of metabolism. Our studies on the taurine-containing wobble modifications have led us to propose that a deficiency of dietary taurine may result in deficient wobble modifications that could be pathogenic. It is significant that taurine appears to be an essential nutrient for cats and foxes, and possibly also for primates, including humans (Hayes et al. 1975, 1985; Geggel et al. 1985). In the case of humans, infants and young children biosynthesize very little taurine, so dietary taurine is essential for normal human development (Sturman 1993). Cats and foxes lack a biosynthetic pathway for this amino acid and a deficiency of dietary taurine has been shown to cause cardiomyopathy in both species (Pion et al. 1987; Moise et al. 1991). Significantly, cardiomyopathy is a major manifestation of the human mitochondrial encephalomyopathies (Wallace 2000).
Although taurine plays crucial roles in myocardial functions such as calcium flux modulation and cardiac contractility, our evidence strongly suggests that incomplete modification of τm5s2U in mt tRNAs due to low taurine levels in the plasma may be one of the main causative factors of cardiomyopathy in cats. We speculate on the basis of this that mitochondrial encephalopathies in humans could similarly be generated by taurine deficiency. Further studies on the association of taurine deficiency with wobble modification defects will shed new light on the biochemical roles of taurine.

7 Conclusion and outlook

7.1 Amino acid conjugation involved in RNA modifications

Several examples of RNA modifications that result from the direct incorporation of a particular amino acid have been reported so far. As reviewed in this chapter, the lysidine (L) in eubacterial tRNAs has been shown to be synthesized by directly incorporating lysine into the precursor tRNAIle (Soma et al. 2003). This modification completely alters the decoding property and amino acid specificity of tRNAIle. In addition, two taurine-containing wobble uridines, τm5U and τm5s2U, in human mt tRNAs, were shown to be synthesized by the direct incorporation of dietary taurine (Suzuki et al. 2002). Mutant mt tRNAs from mitochondrial diseases lack these modifications

and this leads to decoding disorders. Our study revealed that τm5U of mt tRNALeu(UUR) is critical for decoding the UUG codon (Kirino et al. 2004). Thus, the 5-taurinomethyl group functions to stabilize U:G wobble base-pairing on the ribosomal A site. In yeast and C. elegans, mt tRNAs use cmnm5(s2)U as the wobble modification instead of τm5(s2)U. Because glycine takes the place of taurine in the chemical structure of cmnm5(s2)U, and on the basis of the mechanism by which τm5(s2)U is synthesized, glycine is probably the direct substrate in the synthesis of the 5-carboxymethylaminomethyl group of cmnm5(s2)U. In addition, as mnm5(s2)U in bacterial tRNAs are derivatives of cmnm5(s2)U, glycine conjugation may be extensively involved in synthesizing the xm5(s2)U-type modifications. Furthermore, it has been reported that the sulfur of thio-modifications, such as mnm5(s2)U, s2C and s4U, originates from Cys (Lauhon 2002; Nilsson et al. 2002), although Cys itself is not a direct substrate. Synthesis of thio-modifications proceeds through multistep reactions that require iron sulfur cluster proteins. It has been demonstrated that the threonyl group of N6-threonylcarbamoyladenosine (t6A), which is found at position 37 of prokaryotic and eukaryotic tRNAs, comes from free L-threonine, although the enzyme responsible for the reaction has not yet been identified (Chheda et al. 1972; Powers and Peterkofsky 1972). More recently, it has been reported that the product of the E. coli yadB gene transfers glutamic acid to Q to form glutamyl-queuosine (GluQ) (Dubois et al. 2004; Salazar et al. 2004). The use of amino acids in the synthesis of modified nucleosides, thus, appears to be a common strategy to create functional variation in RNA. It is also possible that this RNA-amino acid conjugation may have originated in the early processes from which life evolved (the RNA world).

Acknowledgement I am grateful to the people in my lab, especially Takeo Suzuki, Yohei Kirino, Yoshiho Ikeuchi, and Noriko Umeda for their contributions. Special thanks are due to Dr. Kimitsuna Watanabe (AIST) for his contributions over the years and for our fruitful discussions on the mitochondrial genetic code and tRNAs. I also thank our collaborators, especially A. Soma and Y. Sekine of Rikkyo University and S. Ohta, S. Akira, and K. Ishihara of Nippon Medical School. I also wish to express my gratitude to Drs. Susumu Nishimura (Banyu Pharmaceutical), Shigeyuki Yokoyama (University of Tokyo), Henri Grosjean (CNRS), and Glenn Björk (Umea University) for many helpful suggestions and comments on our lysidine work, and to Dr. Larry Simpson (UCLA) for our productive collaboration in studying the subcellular localization of Leishmania tRNAs in the HFSP project. This work was supported by grants-in-aid for scientific research on priority areas from the Ministry of Education, Science, Sports, and Culture of Japan, as well as by a grant from the New Energy and Industrial Technology Development Organization (NEDO) and a grant from the Human Frontier Science Program (RG0349). Lastly, I would like to pray sincerely for the repose of Dr. Francis Crick’s soul.

References

Agris PF, Soll D, Seno T (1973) Biological function of 2-thiouridine in Escherichia coli glutamic acid transfer ribonucleic acid. Biochemistry 12:4331-4337
Akerley BJ, Rubin EJ, Novick VL, Amaya K, Judson N, Mekalanos JJ (2002) A genome-scale analysis for identification of genes required for growth or survival of Haemophilus influenzae. Proc Natl Acad Sci USA 99:966-971
Andachi Y, Yamao F, Iwami M, Muto A, Osawa S (1987) Occurrence of unmodified adenine and uracil at the first position of anticodon in threonine tRNAs in Mycoplasma capricolum. Proc Natl Acad Sci USA 84:7398-7402
Antonicka H, Floryk D, Klement P, Stratilova L, Hermanska J, Houstkova H, Kalous M, Drahota Z, Zeman J, Houstek J (1999) Defective kinetics of cytochrome c oxidase and alteration of mitochondrial membrane potential in fibroblasts and cytoplasmic hybrid cells with the mutation for myoclonus epilepsy with ragged-red fibres ('MERRF') at position 8344 nt. Biochem J 342:537-544
Ashraf SS, Sochacka E, Cain R, Guenther R, Malkiewicz A, Agris PF (1999) Single atom modification (O-->S) of tRNA confers ribosome binding. RNA 5:188-194
Auxilien S, Crain PF, Trewyn RW, Grosjean H (1996) Mechanism, specificity and general properties of the yeast enzyme catalysing the formation of inosine 34 in the anticodon of transfer RNA. J Mol Biol 262:437-458
Barrell BG, Anderson S, Bankier AT, de Bruijn MH, Chen E, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG (1980) Different pattern of codon recognition by mammalian mitochondrial tRNAs. Proc Natl Acad Sci USA 77:3164-3166
Bjork GR (1995) Biosynthesis and function of modified nucleosides. In tRNA: Structure, Biosynthesis, and Function, D.R. Soll and U.L. RajBhandary, eds. (Washington, DC: American Society for Microbiology), pp 165-205
Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y (1997) The complete genome sequence of Escherichia coli K-12. Science 277:1453-1474
Bonitz SG, Berlani R, Coruzzi G, Li M, Macino G, Nobrega FG, Nobrega MP, Thalenfeld BE, Tzagoloff A (1980) Codon recognition rules in yeast mitochondria. Proc Natl Acad Sci USA 77:3167-3170
Bork P, Koonin EV (1994) A P-loop-like motif in a widespread ATP pyrophosphatase domain: implications for the evolution of sequence motifs and enzyme activity. Proteins 20:347-355
Bregeon D, Colot V, Radman M, Taddei F (2001) Translational misreading: a tRNA modification counteracts a +2 ribosomal frameshift. Genes Dev 15:2295-2306
Burns CM, Chu H, Rueter SM, Hutchinson LK, Canton H, Sanders-Bush E, Emeson RB (1997) Regulation of serotonin-2C receptor G-protein coupling by RNA editing. Nature 387:303-308
Caskey CT, Beaudet A, Nirenberg M (1968) RNA codons and protein synthesis. 15. Dissimilar responses of mammalian and bacterial transfer RNA fractions to messenger RNA codons. J Mol Biol 37:99-118
Cavaille J, Buiting K, Kiefmann M, Lalande M, Brannan CI, Horsthemke B, Bachellerie JP, Brosius J, Huttenhofer A (2000) Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization. Proc Natl Acad Sci USA 97:14311-14316
Chheda GB, Hong CI, Piskorz CF, Harmon GA (1972) Biosynthesis of N-(purin-6-ylcarbamoyl)-L-threonine riboside. Incorporation of L-threonine in vivo into modified nucleoside of transfer ribonucleic acid. Biochem J 127:515-519
Chomyn A, Martinuzzi A, Yoneda M, Daga A, Hurko O, Johns D, Lai ST, Nonaka I, Angelini C, Attardi G (1992) MELAS mutation in mtDNA binding site for transcription termination factor causes defects in protein synthesis and in respiration but no change in levels of upstream and downstream mature transcripts. Proc Natl Acad Sci USA 89:4221-4225
Chomyn A, Meola G, Bresolin N, Lai ST, Scarlato G, Attardi G (1991) In vitro genetic transfer of protein synthesis and respiration defects to mitochondrial DNA-less cells with myopathy-patient mitochondria. Mol Cell Biol 11:2236-2244
Cougot N, van Dijk E, Babajko S, Seraphin B (2004) 'Cap-tabolism'. Trends Biochem Sci 29:436-444
Crick FH (1966) Codon-anticodon pairing: the wobble hypothesis. J Mol Biol 19:548-555
Curran JF (1998) Modified nucleosides in translation. In Modification and Editing of RNA, H. Grosjean and R. Benne, eds. (Washington, DC: ASM Press), pp 463-516
Dietrich A, Small I, Cosset A, Weil JH, Marechal-Drouard L (1996) Editing and import: strategies for providing plant mitochondria with a complete set of functional transfer RNAs. Biochimie 78:518-529
Dubois DY, Blaise M, Becker HD, Campanacci V, Keith G, Giege R, Cambillau C, Lapointe J, Kern D (2004) An aminoacyl-tRNA synthetase-like protein encoded by the Escherichia coli yadB gene glutamylates specifically tRNAAsp. Proc Natl Acad Sci USA 101:7530-7535
Dunbar DR, Moonie PA, Zeviani M, Holt IJ (1996) Complex I deficiency is associated with 3243G:C mitochondrial DNA in osteosarcoma cell cybrids. Hum Mol Genet 5:123-129
Elbashir SM, Harborth J, Lendeckel W, Yalcin A, Weber K, Tuschl T (2001) Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411:494-498
Elseviers D, Petrullo LA, Gallagher PJ (1984) Novel E. coli mutants deficient in biosynthesis of 5-methylaminomethyl-2-thiouridine. Nucleic Acids Res 12:3521-3534
Enriquez JA, Chomyn A, Attardi G (1995) MtDNA mutation in MERRF syndrome causes defective aminoacylation of tRNA(Lys) and premature translation termination. Nat Genet 10:47-55
Geggel HS, Ament ME, Heckenlively JR, Martin DA, Kopple JD (1985) Nutritional requirement for taurine in patients receiving long-term parenteral nutrition. N Engl J Med 312:142-146
Gerber AP, Keller W (1999) An adenosine deaminase that generates inosine at the wobble position of tRNAs. Science 286:1146-1149
Goffeau A, Barrell BG, Bussey H, Davis RW, Dujon B, Feldmann H, Galibert F, Hoheisel JD, Jacq C, Johnston M, Louis EJ, Mewes HW, Murakami Y, Philippsen P, Tettelin H, Oliver SG (1996) Life with 6000 genes. Science 274:546, 563-567
Goto Y, Nonaka I, Horai S (1990) A mutation in the tRNA(Leu)(UUR) gene associated with the MELAS subgroup of mitochondrial encephalomyopathies. Nature 348:651-653
Goto Y, Nonaka I, Horai S (1991) A new mtDNA mutation associated with mitochondrial myopathy, encephalopathy, lactic acidosis and stroke-like episodes (MELAS). Biochim Biophys Acta 1097:238-240
Goto Y, Tojo M, Tohyama J, Horai S, Nonaka I (1992) A novel point mutation in the mitochondrial tRNA(Leu)(UUR) gene in a family with mitochondrial myopathy. Ann Neurol 31:672-675
Grosjean H, Sankoff D, Jou WM, Fiers W, Cedergren RJ (1978) Bacteriophage MS2 RNA: a correlation between the stability of the codon: anticodon interaction and the choice of code words. J Mol Evol 12:113-119
Hagervall TG, Pomerantz SC, McCloskey JA (1998) Reduced misreading of asparagine codons by Escherichia coli tRNALys with hypomodified derivatives of 5-methylaminomethyl-2-thiouridine in the wobble position. J Mol Biol 284:33-42
Hanada T, Suzuki T, Yokogawa T, Takemoto-Hori C, Sprinzl M, Watanabe K (2001) Translation ability of mitochondrial tRNAsSer with unusual secondary structures in an in vitro translation system of bovine mitochondria. Genes Cells 6:1019-1030
Hancock K, Hajduk SL (1990) The mitochondrial tRNAs of Trypanosoma brucei are nuclear encoded. J Biol Chem 265:19208-19215
Harada F, Nishimura S (1972) Possible anticodon sequences of tRNA His, tRNA Asn, and tRNA Asp from Escherichia coli B. Universal presence of nucleoside Q in the first position of the anticodons of these transfer ribonucleic acids. Biochemistry 11:301-308
Harada F, Nishimura S (1974) Purification and characterization of AUA specific isoleucine transfer ribonucleic acid from Escherichia coli B. Biochemistry 13:300-307
Harris RS, Petersen-Mahrt SK, Neuberger MS (2002) RNA editing enzyme APOBEC1 and some of its homologs can act as DNA mutators. Mol Cell 10:1247-1253
Hayashi J, Ohta S, Kikuchi A, Takemitsu M, Goto Y, Nonaka I (1991) Introduction of disease-related mitochondrial DNA deletions into HeLa cells lacking mitochondrial DNA results in mitochondrial dysfunction. Proc Natl Acad Sci USA 88:10614-10618
Hayashi J, Ohta S, Takai D, Miyabayashi S, Sakuta R, Goto Y, Nonaka I (1993) Accumulation of mtDNA with a mutation at position 3271 in tRNA(Leu)(UUR) gene introduced from a MELAS patient to HeLa cells lacking mtDNA results in progressive inhibition of mitochondrial respiratory function. Biochem Biophys Res Commun 197:1049-1055
Hayes KC (1985) Taurine requirement in primates. Nutr Rev 43:65-70
Hayes KC, Carey RE, Schmidt SY (1975) Retinal degeneration associated with taurine deficiency in the cat. Science 188:949-951
Honjo T, Muramatsu M, Fagarasan S (2004) AID: how does it aid antibody diversity? Immunity 20:659-668
Huxtable RJ (1992) Physiological actions of taurine. Physiol Rev 72:101-163
Igloi GL (1988) Interaction of tRNAs and of phosphorothioate-substituted nucleic acids with an organomercurial. Probing the chemical environment of thiolated residues by affinity electrophoresis. Biochemistry 27:3842-3849
Ikeuchi Y, Shigi N, Kato J, Nishimura A, Suzuki T (submitted-a) Identification and characterization of four genes responsible for 2-thiouridine formation of tRNA wobble modification.
Ikeuchi Y, Soma A, Kanemasa S, Ote T, Kato J, Sekine M, Suzuki T (submitted-b) Molecular mechanism of lysidine synthesis and substrate discrimination of tRNAIle lysidine synthetase.

Biosynthesis and function of tRNA wobble modifications 63 Inagaki Y, Kojima A, Bessho Y, Hori H, Ohama T, Osawa S (1995) Translation of synonymous codons in family boxes by Mycoplasma capricolum tRNAs with unmodified uridine or adenosine at the first anticodon position. J Mol Biol 251:486-492 Jacobs HT (2003) Disorders of mitochondrial protein synthesis. Hum Mol Genet 12:293301 Jager G, Leipuviene R, Pollard MG, Qian Q, Bjork GR (2004) The conserved Cys-X1-X2Cys motif present in the TtcA protein is required for the thiolation of cytidine in position 32 of tRNA from Salmonella enterica serovar typhimurium. J Bacteriol 186:750757 James AM, Wei YH, Pang CY, Murphy MP (1996) Altered mitochondrial function in fibroblasts containing MELAS or MERRF mitochondrial DNA mutations. Biochem J 318:401-407 Johns DR, Hurko O (1991) Mitochondrial leucine tRNA mutation in neurological diseases. Lancet 337:927-928 Kambampati R, Lauhon CT (2003) MnmA and IscS are required for in vitro 2-thiouridine biosynthesis in Escherichia coli. Biochemistry 42:1109-1117 Kaneko T, Suzuki T, Kapushoc ST, Rubio MA, Ghazvini J, Watanabe K, Simpson L (2003) Wobble modification differences and subcellular localization of tRNAs in Leishmania tarentolae: implication for tRNA sorting mechanism. Embo J 22:657-667 Kapushoc ST, Alfonzo JD, Rubio MA, Simpson L (2000) End processing precedes mitochondrial importation and editing of tRNAs in Leishmania tarentolae. J Biol Chem 275:37907-37914 Kapushoc ST, Alfonzo JD, Simpson L (2002) Differential localization of nuclear-encoded tRNAs between the cytosol and mitochondrion in Leishmania tarentolae. Rna 8:57-68 Kasai H, Nakanishi K, Macfarlane RD, Torgerson DF, Ohashi Z, McCloskey JA, Gross HJ, Nishimura S (1976) Letter: The structure of Q* nucleoside isolated from rabbit liver transfer ribonucleic acid. 
J Am Chem Soc 98:5044-5046 Katoh T, Susa M, Suzuki T, Umeda N, Watanabe K, Suzuki T (2003) Simple and rapid synthesis of siRNA derived from in vivo transcribed shRNA. Nucleic Acids Res Suppl:249-250 Katoh T, Suzuki T (submitted) Specific residues at every third position of siRNAs involve efficient RNAi activity. Kawahara Y, Ito K, Sun H, Aizawa H, Kanazawa I, Kwak S (2004) Glutamate receptors: RNA editing and death of motor neurons. Nature 427:801 Kawai G, Yamamoto Y, Kamimura T, Masegi T, Sekine M, Hata T, Iimori T, Watanabe T, Miyazawa T, Yokoyama S (1992) Conformational rigidity of specific pyrimidine residues in tRNA arises from posttranscriptional modifications that enhance steric interaction between the base and the 2'-hydroxyl group. Biochemistry 31:1040-1046 King MP, Attardi G (1989) Human cells lacking mtDNA: repopulation with exogenous mitochondria by complementation. Science 246:500-503 Kirino Y, Yasukawa T, Ohta S, Akira S, Ishihara K, Watanabe K, Suzuki T (2004) Codonspecific translational defect caused by a wobble modification deficiency in mutant tRNA from a human mitochondrial disease. Proc Natl Acad Sci USA: in press Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, Boland F, Brignell SC, Bron S, Bunai K, Chapuis J, Christiansen LC, Danchin A, Debarbouille M, Dervyn E, Deuerling E, Devine K, Devine SK, Dreesen O, Errington J, Fillinger S, Foster SJ, Fujita Y, Galizzi A, Gardan R, Eschevins C, Fukushima T, Haga K, Harwood CR, Hecker M, Hosoya D, Hullo MF,

64 Tsutomu Suzuki Kakeshita H, Karamata D, Kasahara Y, Kawamura F, Koga K, Koski P, Kuwana R, Imamura D, Ishimaru M, Ishikawa S, Ishio I, Le Coq D, Masson A, Mauel C, Meima R, Mellado RP, Moir A, Moriya S, Nagakawa E, Nanamiya H, Nakai S, Nygaard P, Ogura M, Ohanan T, O'Reilly M, O'Rourke M, Pragai Z, Pooley HM, Rapoport G, Rawlins JP, Rivas LA, Rivolta C, Sadaie A, Sadaie Y, Sarvas M, Sato T, Saxild HH, Scanlan E, Schumann W, Seegers JF, Sekiguchi J, Sekowska A, Seror SJ, Simon M, Stragier P, Studer R, Takamatsu H, Tanaka T, Takeuchi M, Thomaides HB, Vagner V, van Dijl JM, Watabe K, Wipat A, Yamamoto H, Yamamoto M, Yamamoto Y, Yamane K, Yata K, Yoshida K, Yoshikawa H, Zuber U, Ogasawara N (2003) Essential Bacillus subtilis genes. Proc Natl Acad Sci USA 100:4678-4683 Kobayashi Y, Momoi MY, Tominaga K, Momoi T, Nihei K, Yanagisawa M, Kagawa Y, Ohta S (1990) A point mutation in the mitochondrial tRNA(Leu)(UUR) gene in MELAS (mitochondrial myopathy, encephalopathy, lactic acidosis and stroke-like episodes). Biochem Biophys Res Commun 173:816-822 Kobayashi Y, Momoi MY, Tominaga K, Shimoizumi H, Nihei K, Yanagisawa M, Kagawa Y, Ohta S (1991) Respiration-deficient cells are caused by a single point mutation in the mitochondrial tRNA-Leu (UUR) gene in mitochondrial myopathy, encephalopathy, lactic acidosis, and strokelike episodes (MELAS). Am J Hum Genet 49:590-599 Koga Y, Nonaka I, Kobayashi M, Tojyo M, Nihei K (1988) Findings in muscle in complex I (NADH coenzyme Q reductase) deficiency. Ann Neurol 24:749-756 Kuchino Y, Borek E, Grunberger D, Mushinski JF, Nishimura S (1982) Changes of posttranscriptional modification of wye base in tumor-specific tRNAPhe. Nucleic Acids Res 10:6421-6432 Kurata S, Ohtsuki T, Wada T, Kirino Y, Takai K, Saigo K, Watanabe K, Suzuki T (2003) Decoding property of C5 uridine modification at the wobble position of tRNA anticodon. 
Nucleic Acids Res Suppl:245-246 Kutay U, Lipowsky G, Izaurralde E, Bischoff FR, Schwarzmaier P, Hartmann E, Gorlich D (1998) Identification of a tRNA-specific nuclear export receptor. Mol Cell 1:359-369 Lauhon CT (2002) Requirement for IscS in biosynthesis of all thionucleosides in Escherichia coli. J Bacteriol 184:6820-6829 Li R, Li X, Yan Q, Qin Mo J, Guan MX (2003) Identification and characterization of mouse MTO1 gene related to mitochondrial tRNA modification. Biochim Biophys Acta 1629:53-59 Li X, Guan MX (2002) A human mitochondrial GTP binding protein related to tRNA modification may modulate phenotypic expression of the deafness-associated mitochondrial 12S rRNA mutation. Mol Cell Biol 22:7701-7711 Li X, Li R, Lin X, Guan MX (2002) Isolation and characterization of the putative nuclear modifier gene MTO1 involved in the pathogenesis of deafness-associated mitochondrial 12 S rRNA A1555G mutation. J Biol Chem 277:27256-27264 Lustig F, Boren T, Claesson C, Simonsson C, Barciszewska M, Lagerkvist U (1993) The nucleotide in position 32 of the tRNA anticodon loop determines ability of anticodon UCC to discriminate among glycine codons. Proc Natl Acad Sci USA 90:3343-3347 Maas S, Patt S, Schrey M, Rich A (2001) Underediting of glutamate receptor GluR-B mRNA in malignant gliomas. Proc Natl Acad Sci USA 98:14687-14692 Mahapatra S, Adhya S (1996) Import of RNA into Leishmania mitochondria occurs through direct interaction with membrane-bound receptors. J Biol Chem 271:2043220437

Biosynthesis and function of tRNA wobble modifications 65 Mahapatra S, Ghosh S, Bera SK, Ghosh T, Das A, Adhya S (1998) The D arm of tRNATyr is necessary and sufficient for import into Leishmania mitochondria in vitro. Nucleic Acids Res 26:2037-2041 Mahapatra S, Ghosh T, Adhya S (1994) Import of small RNAs into Leishmania mitochondria in vitro. Nucleic Acids Res 22:3381-3386 Marck C, Grosjean H (2002) tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. Rna 8:1189-1232 Matsugi J, Murao K, Ishikura H (1996) Characterization of a B. subtilis minor isoleucine tRNA deduced from tDNA having a methionine anticodon CAT. J Biochem (Tokyo) 119:811-816 Matsuyama S, Ueda T, Crain PF, McCloskey JA, Watanabe K (1998) A novel wobble rule found in starfish mitochondria. Presence of 7-methylguanosine at the anticodon wobble position expands decoding capability of tRNA. J Biol Chem 273:3363-3368 Mattaj IW, Englmeier L (1998) Nucleocytoplasmic transport: the soluble phase. Annu Rev Biochem 67:265-306 McCloskey JA, Graham DE, Zhou S, Crain PF, Ibba M, Konisky J, Soll D, Olsen GJ (2001) Post-transcriptional modification in archaeal tRNAs: identities and phylogenetic relations of nucleotides from mesophilic and hyperthermophilic Methanococcales. Nucleic Acids Res 29:4699-4706 Meier F, Suter B, Grosjean H, Keith G, Kubli E (1985) Queuosine modification of the wobble base in tRNAHis influences in vivo decoding properties. EMBO J 4(3):823827 Mihara H, Kato S, Lacourciere GM, Stadtman TC, Kennedy RA, Kurihara T, Tokumoto U, Takahashi Y, Esaki N (2002) The iscS gene is essential for the biosynthesis of 2selenouridine in tRNA and the selenocysteine-containing formate dehydrogenase H. Proc Natl Acad Sci USA 99:6679-6683 Mitchell JR, Wood E, Collins K (1999) A telomerase component is defective in the human disease dyskeratosis congenita. 
Nature 402:551-555 Moise NS, Pacioretty LM, Kallfelz FA, Stipanuk MH, King JM, Gilmour RF Jr (1991) Dietary taurine deficiency and dilated cardiomyopathy in the fox. Am Heart J 121:541-547 Moraes CT, Ciacci F, Silvestri G, Shanske S, Sciacco M, Hirano M, Schon EA, Bonilla E, DiMauro S (1993) Atypical clinical presentations associated with the MELAS mutation at position 3243 of human mitochondrial DNA. Neuromuscul Disord 3:43-50 Moriya J, Yokogawa T, Wakita K, Ueda T, Nishikawa K, Crain PF, Hashizume T, Pomerantz SC, McCloskey JA, Kawai G, Hayashi N, Yokoyama S, Watanabe K (1994) A novel modified nucleoside found at the first position of the anticodon of methionine tRNA from bovine liver mitochondria. Biochemistry 33:2234-2239 Munz P, Leupold U, Agris P, Kohli J (1981) In vivo decoding rules in Schizosaccharomyces pombe are at variance with in vitro data. Nature 294:187-188 Muramatsu M, Kinoshita K, Fagarasan S, Yamada S, Shinkai Y, Honjo T (2000) Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102:553-563 Muramatsu T, Nishikawa K, Nemoto F, Kuchino Y, Nishimura S, Miyazawa T, Yokoyama S (1988a) Codon and amino-acid specificities of a transfer RNA are both converted by a single post-transcriptional modification. Nature 336:179-181 Muramatsu T, Yokoyama S, Horie N, Matsuda A, Ueda T, Yamaizumi Z, Kuchino Y, Nishimura S, Miyazawa T (1988b) A novel lysine-substituted nucleoside in the first posi-

66 Tsutomu Suzuki tion of the anticodon of minor isoleucine tRNA from Escherichia coli. J Biol Chem 263:9261-9267 Nakai Y, Umeda N, Suzuki T, Nakai M, Hayashi H, Watanabe K, Kagamiyama H (2004) Yeast Nfs1p is involved in thio-modification of both mitochondrial and cytoplasmic tRNAs. J Biol Chem 279:12363-12368 Nakayashiki T, Inokuchi H (1998) Novel temperature-sensitive mutants of Escherichia coli that are unable to grow in the absence of wild-type tRNA6Leu. J Bacteriol 180:29312935 Nilsson K, Lundgren HK, Hagervall TG, Bjork GR (2002) The cysteine desulfurase IscS is required for synthesis of all five thiolated nucleosides present in tRNA from Salmonella enterica serovar typhimurium. J Bacteriol 184:6830-6835 Nishimura S (1983) Structure, biosynthesis, and function of queuosine in transfer RNA. Prog Nucleic Acid Res Mol Biol 28:49-73 Ogle JM, Murphy FV, Tarry MJ, Ramakrishnan V (2002) Selection of tRNA by the ribosome requires a transition from an open to a closed form. Cell 111:721-732 Osawa S (1995) Evolution of the genetic code. Oxford Univ. Press Osawa S, Jukes TH (1989) Codon reassignment (codon capture) in evolution. J Mol Evol 28(4):271-278 Petersen-Mahrt SK, Harris RS, Neuberger MS (2002) AID mutates E. coli suggesting a DNA deamination mechanism for antibody diversification. Nature 418:99-103 Pion PD, Kittleson MD, Rogers QR, Morris JG (1987) Myocardial failure in cats associated with low plasma taurine: a reversible cardiomyopathy. Science 237:764-768 Powers DM, Peterkofsky A (1972) The presence of N-(purin-6-ylcarbamoyl)threonine in transfer ribonucleic acid species whose codons begin with adenine. J Biol Chem 247:6394-6401 Ravn K, Wibrand F, Hansen FJ, Horn N, Rosenberg T, Schwartz M (2001) An mtDNA mutation, 14453G-->A, in the NADH dehydrogenase subunit 6 associated with severe MELAS syndrome. 
Eur J Hum Genet 9:805-809 Revy P, Muto T, Levy Y, Geissmann F, Plebani A, Sanal O, Catalan N, Forveille M, Dufourcq-Labelouse R, Gennery A, Tezcan I, Ersoy F, Kayserili H, Ugazio AG, Brousse N, Muramatsu M, Notarangelo LD, Kinoshita K, Honjo T, Fischer A, Durandy A (2000) Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102:565-575 Rubio MA, Liu X, Yuzawa H, Alfonzo JD, Simpson L (2000) Selective importation of RNA into isolated mitochondria from Leishmania tarentolae. Rna 6:988-1003 Ruggero D, Grisendi S, Piazza F, Rego E, Mari F, Rao PH, Cordon-Cardo C, Pandolfi PP (2003) Dyskeratosis congenita and cancer in mice deficient in ribosomal RNA modification. Science 299:259-262 Rusconi CP, Cech TR (1996) The anticodon is the signal sequence for mitochondrial import of glutamine tRNA in Tetrahymena. Genes Dev 10:2870-2880 Sakurai M, Ohtsuki T, Suzuki T, Watanabe K (submitted) Characteristic modification at wobble uridine in mitochondrial tRNAs of the nematode Ascaris suum. Salazar JC, Ambrogelly A, Crain PF, McCloskey JA, Soll D (2004) A truncated aminoacyltRNA synthetase modifies RNA. Proc Natl Acad Sci USA 101:7536-7541, Epub 2004 Apr 7519 Samuelsson T, Elias P, Lustig F, Axberg T, Folsch G, Akesson B, Lagerkvist U (1980) Aberrations of the classic codon reading scheme during protein synthesis in vitro. J Biol Chem 255:4583-4588

Biosynthesis and function of tRNA wobble modifications 67 Schneider A, Marechal-Drouard L (2000) Mitochondrial tRNA import: are there distinct mechanisms? Trends Cell Biol 10:509-513 Schon EA, Bonilla E, DiMauro S (1997) Mitochondrial DNA mutations and pathogenesis. J Bioenerg Biomembr 29:131-149 Shigi N, Suzuki T, Tamakoshi M, Oshima T, Watanabe K (2002) Conserved bases in the TPsi C loop of tRNA are determinants for thermophile-specific 2-thiouridylation at position 54. J Biol Chem 277:39128-39135 Shoffner JM, Lott MT, Lezza AM, Seibel P, Ballinger SW, Wallace DC (1990) Myoclonic epilepsy and ragged-red fiber disease (MERRF) is associated with a mitochondrial DNA tRNA(Lys) mutation. Cell 61:931-937 Sibler AP, Dirheimer G, Martin RP (1986) Codon reading patterns in Saccharomyces cerevisiae mitochondria based on sequences of mitochondrial tRNAs. FEBS Lett 194:131138 Simpson AM, Suyama Y, Dewes H, Campbell DA, Simpson L (1989) Kinetoplastid mitochondria contain functional tRNAs which are encoded in nuclear DNA and also contain small minicircle and maxicircle transcripts of unknown function. Nucleic Acids Res 17:5427-5445 Soll D, Jones DS, Ohtsuka E, Faulkner RD, Lohrmann R, Hayatsu H, Khorana HG (1966) Specificity of sRNA for recognition of codons as studied by the ribosomal binding technique. J Mol Biol 19:556-573 Soma A, Ikeuchi Y, Kanemasa S, Kobayashi K, Ogasawara N, Ote T, Kato J, Watanabe K, Sekine Y, Suzuki T (2003) An RNA-modifying enzyme that governs both the codon and amino acid specificities of isoleucine tRNA. Mol Cell 12:689-698 Sommer B, Kohler M, Sprengel R, Seeburg PH (1991) RNA editing in brain controls a determinant of ion flow in glutamate-gated channels. Cell 67:11-19 Sturman JA (1993) Taurine in development. Physiol Rev 73:119-147 Sullivan MA, Cannon JF, Webb FH, Bock RM (1985) Antisuppressor mutation in Escherichia coli defective in biosynthesis of 5-methylaminomethyl-2-thiouridine. 
J Bacteriol 161:368-376 Suzuki T, Suzuki T, Wada T, Saigo K, Watanabe K (2002) Taurine as a constituent of mitochondrial tRNAs: new insights into the functions of taurine and human mitochondrial diseases. Embo J 21:6581-6589 Suzuki T, Ueda T, Watanabe K (1997) The 'polysemous' codon--a codon with multiple amino acid assignment caused by dual specificity of tRNA identity. Embo J 16:11221134 Suzuki T, Ueda T, Yokogawa T, Nishikawa K, Watanabe K (1994) Characterization of serine and leucine tRNAs in an asporogenic yeast Candida cylindracea and evolutionary implications of genes for tRNA(Ser)CAG responsible for translation of a non-universal genetic code. Nucleic Acids Res 22:115-123 Tan TH, Pach R, Crausaz A, Ivens A, Schneider A (2002) tRNAs in Trypanosoma brucei: genomic organization, expression, and mitochondrial import. Mol Cell Biol 22:37073717 Tarassov IA, Martin RP (1996) Mechanisms of tRNA import into yeast mitochondria: an overview. Biochimie 78:502-510 Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631-637 Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV (2001) The COG database: new develop-

68 Tsutomu Suzuki ments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29:22-28 Teng B, Burant CF, Davidson NO (1993) Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260:1816-1819 Tesmer JJ, Klem TJ, Deras ML, Davisson VJ, Smith JL (1996) The crystal structure of GMP synthetase reveals a novel catalytic triad and is a structural paradigm for two enzyme families. Nat Struct Biol 3:74-86 Tomita K, Ueda T, Watanabe K (1998) 7-Methylguanosine at the anticodon wobble position of squid mitochondrial tRNA(Ser)GCU: molecular basis for assignment of AGA/AGG codons as serine in invertebrate mitochondria. Biochim Biophys Acta 1399:78-82 Uchida S, Kwon HM, Yamauchi A, Preston AS, Marumo F, Handler JS (1992) Molecular cloning of the cDNA for an MDCK cell Na(+)- and Cl(-)-dependent taurine transporter that is regulated by hypertonicity. Proc Natl Acad Sci USA 89:8230-8234 Umeda N, Suzuki T, Yukawa M, Ohya Y, Shindo H, Watanabe K, Suzuki T (2005 Mitochondria-specific RNA-modifying enzymes responsible for the biosynthesis of the wobble base in mitochondrial tRNAs: implications for the molecular pathogenesis of human mitochondrial diseases. J Biol Chem, in press van den Ouweland JM, Lemkes HH, Ruitenbeek W, Sandkuijl LA, de Vijlder MF, Struyvenberg PA, van de Kamp JJ, Maassen JA (1992) Mutation in mitochondrial tRNA(Leu)(UUR) gene in a large pedigree with maternally transmitted type II diabetes mellitus and deafness. Nat Genet 1:368-371 Wallace DC (2000) Mitochondrial defects in cardiomyopathy and neuromuscular disease. Am Heart J 139:S70-85 Wallace DC, Lott MT (2003) MITOMAP: A Human Mitochondrial Genome Database" http://www.mitomap.org. Watanabe K, Osawa S (1995) Evolution of the Genetic Code. ASM press, Washington, DC Weber F, Dietrich A, Weil JH, Marechal-Drouard L (1990) A potato mitochondrial isoleucine tRNA is coded for by a mitochondrial gene possessing a methionine anticodon. 
Nucleic Acids Res 18:5027-5030 Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, Chu AM, Connelly C, Davis K, Dietrich F, Dow SW, El Bakkoury M, Foury F, Friend SH, Gentalen E, Giaever G, Hegemann JH, Jones T, Laub M, Liao H, Davis RW, et al. (1999) Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285:901-906 Wolf J, Gerber AP, Keller W (2002) tadA, an essential tRNA-specific adenosine deaminase from Escherichia coli. Embo J 21:3841-3851 Yarian C, Marszalek M, Sochacka E, Malkiewicz A, Guenther R, Miskiewicz A, Agris PF (2000) Modified nucleoside dependent Watson-Crick and wobble codon binding by tRNALysUUU species. Biochemistry 39:13390-13395 Yasukawa T, Suzuki T, Ishii N, Ohta S, Watanabe K (2001) Wobble modification defect in tRNA disturbs codon-anticodon interaction in a mitochondrial disease. Embo J 20:4794-4802 Yasukawa T, Suzuki T, Ishii N, Ueda T, Ohta S, Watanabe K (2000a) Defect in modification at the anticodon wobble nucleotide of mitochondrial tRNA(Lys) with the MERRF encephalomyopathy pathogenic mutation. FEBS Lett 467:175-178 Yasukawa T, Suzuki T, Suzuki T, Ueda T, Ohta S, Watanabe K (2000b) Modification defect at anticodon wobble nucleotide of mitochondrial tRNAs(Leu)(UUR) with patho-

Biosynthesis and function of tRNA wobble modifications 69 genic mutations of mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes. J Biol Chem 275:4251-4257 Yokobori S, Suzuki T, Watanabe K (2001) Genetic code variations in mitochondria: tRNA as a major determinant of genetic code plasticity. J Mol Evol 53(4-5):314-326 Yokoyama S, Nishimura S (1995) Modified nucleosides and codon recognition, Soll, D. and Rajbandary, U. L. edn. ASM press, Washington, DC Yokoyama S, Watanabe T, Murao K, Ishikura H, Yamaizumi Z, Nishimura S, Miyazawa T (1985) Molecular mechanism of codon recognition by tRNA species with modified uridine in the first position of the anticodon. Proc Natl Acad Sci USA 82:4905-4909 Yokoyama S, Yamaizumi Z, Nishimura S, Miyazawa T (1979) 1H NMR studies on the conformational characteristics of 2-thiopyrimidine nucleotides found in transfer RNAs. Nucleic Acids Res 6:2611-2626 Yoneda M, Miyatake T, Attardi G (1994) Complementation of mutant and wild-type human mitochondrial DNAs coexisting since the mutation event and lack of complementation of DNAs introduced separately into a cell within distinct organelles. Mol Cell Biol 14:2699-2712

Suzuki, Tsutomu Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan

Editing and modification in trypanosomatids: the reshaping of non-coding RNAs Mary Anne T. Rubio and Juan D. Alfonzo

Abstract Trypanosomatids include a number of protozoan parasites that infect over 27 million people worldwide. Besides their medical importance, these organisms have also provided a wealth of novel biological discoveries, including RNA editing, mRNA trans-splicing, eukaryotic poly-cistronic transcription, and a mechanism for large-scale mitochondrial tRNA import. For many years, the study of post-transcriptional RNA modification in trypanosomatids has lagged behind that in bacterial, yeast, and animal systems. However, the discovery of editing in tRNA and 7SL RNAs has produced renewed interest in the processing of non-coding RNAs in these organisms. This chapter compiles what is currently known about RNA editing and modification in trypanosomatids, emphasizing the role these processes play in the structural reshaping of non-coding RNAs. Because a number of substantive reviews have appeared recently, mRNA editing will not be the subject of this chapter. In addition, snoRNA-mediated modification of ribosomal RNAs will be covered in chapter 8 of this book.

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106363 / Published online: 7 January 2005 © Springer-Verlag Berlin Heidelberg 2005

1 Introduction

This chapter addresses what is currently known about the editing and modification of non-coding RNAs in trypanosomatids. Despite the medical importance of these organisms and the prospect of RNA processing events as therapeutic targets, research on editing and modification of non-coding RNAs in trypanosomatids has lagged behind that in other model systems such as S. cerevisiae, E. coli, and Xenopus (to name a few). It was the discovery of mRNA editing in 1986 by Rob Benne and colleagues that led to a renewed interest in RNA processing in these organisms. Since then, much of the research on RNA editing in trypanosomatids has concentrated on mRNA editing in mitochondria. Indeed, great strides have been made towards understanding this important mechanism of mRNA processing, and many of the factors involved have now been identified. Owing to the novelty and serendipitous nature of its discovery, mRNA editing was widely accepted as the rule in trypanosomatids, perhaps to the exclusion of other types of editing. The more recent discovery of C to U editing in non-coding RNAs, both in the cytosol and in the mitochondria, then expanded the variety of editing mechanisms in trypanosomatids, highlighting the connection between editing and modification in these medically important organisms.

Rather than concentrating on a particular editing mechanism, we will discuss a number of independent examples of editing and/or modification of various RNAs, highlighting how, by affecting RNA structure, editing and modification may act alone or in concert to modulate RNA function. Unlike in other systems (animal cells, yeast, and bacteria), little is known about how editing and modification are specified in trypanosomatids, and neither the enzymes, the factors, nor the actual mechanism(s) have been elucidated. To date, there are only two examples of editing of non-coding RNAs in trypanosomatids: editing of the 7SL RNA in the cytosol and the C to U editing of tRNATrp in mitochondria. As we will further elaborate in this chapter, in the former example the connection between editing and structural reshaping is clear, while in the latter example editing directly affects decoding but may still impart subtle changes that affect tRNA anticodon structure.

This chapter will cover four specific examples in trypanosomatids where editing and/or modification may play a role in affecting the structure and function of non-coding RNAs: 1) the role of modifications in trans-splicing, 2) the possible role of editing in protein secretion, 3) the role of modification in tRNA trafficking, and 4) the role of editing and modification in mitochondrial tRNA function.

2 RNA modification and trans-splicing In trypanosomatids, most, if not all, protein-coding genes in the nucleus are transcribed into long poly-cistronic pre-mRNAs that contain multiple open reading frames (Muhich and Boothroyd 1988). Individual mRNAs are then posttranscriptionally processed into defined units by polyadenylation and transsplicing (Campbell et al. 1984; Parsons et al. 1984; Walder et al. 1986; Ullu and Tschudi 1990; Laird et al. 1985). In trans-splicing, a 39-41 nucleotides-long leader is attached to the 5' end of every nucleus-encoded mRNA creating a mature transcript (Parsons et al. 1984). Thus, every mature nucleus-encoded mRNA in these cells has precisely the same 5' end. Prior to trans-splicing, the 5' end of the spliced leader RNA (SL RNA) is heavily modified to form the cap structure (Laird et al. 1985; Lenardo et al. 1985; Freistadt et al. 1987; Perry et al. 1987). In this sense, trans-splicing serves two main functions: it provides a mature 5' UTR to mRNAs and it also adds the cap structure (Fig. 1), which is a trademark of translation for most eukaryotic mRNAs. Among eukaryotes the structure of the 5' end cap is highly conserved. Synthesis of 7-methylguanosine (m7G) is preceded by the non-templated addition of GMP to the 5' end of mRNAs followed by the methylation of the newly added guanosine, at the N7 position, to form m7G. This process creates the minimal cap structure, m7GpppN, where N is any nucleotide (Quiocho et al. 2000). Cap structure formation in trypanosomatids differs from that of other eukaryotes in both the number and type of post-transcriptional modifications it contains (Bangs et al. 1992). Like

Editing and modification in trypanosomatids: the reshaping of non-coding RNAs 73

Fig. 1. The cap structure of nucleus-encoded mRNAs in trypanosomes. A) the cap-4 structure unique to trypanosomatids, numbers refer to nucleotide positions at the 5’ end of nucleus-encoded mRNAs following the non-templated addition of m7G (not numbered). B) the structure of the two highly modified nucleotides only found in nature at the 5’ end of trypanosomatid mRNAs as part of the cap-4 structure.

in most eukaryotes, in trypanosomatids the 5' most nucleotide is a posttranscriptionally added m7G that requires a guanylyltransferase activity (Gunzl et al. 2000). Different from most eukaryotes, however, cap formation occurs on the SL RNA prior to trans-splicing. In addition, the 4 nucleotides following the m7G get methylated both at the ribose and the base to form the mature cap structure (cap-4 structure) that thus far is unique to trypanosomatids (Fig. 1A). Bangs et al. showed that this structure also includes two additional modifications: N6, N6 dimethyladenosine to form (m62Am) at the second position and N3 methyluridine (m3Um). Both of these modifications are unique to trypanosomatids and not found elsewhere in nature (Bangs et al. 1992) (Fig. 1B). This unusual structure together with the fact that the cap is formed on the spliced-leader RNA, and not directly onto the pre-mRNAs, raised the question of a possible function for cap formation in trans-splicing. Initial studies in Trypanosoma brucei, showed that in permeabilized cells cap-4 formation was co-transcriptional and proceeded in a general 5’-3’ direction (Ullu and Tschudi 1990, 1991). Following m7G addition, modification of the first two encoded nucleotides (A1 and A2) could be observed in partial transcripts of the SL RNA. These modifications were present even before the transcripts reached the conserved sm-like sequence of the SL RNA, suggesting that complete folding of a mature SL RNA was not required for these modifications. On the contrary, modifications of position 3 and 4 (C3 and U4) could only be observed when a nearly mature SL RNA was synthesized. This observation implies that either the structure

74 Mary Anne T. Rubio and Juan D. Alfonzo

of a full-length RNA or the length itself might be important for cap-4 formation. Whichever the case may be, there appears to be interdependence between modifications at the first two and the last two positions of the SL RNA. It is possible that the enzymes that modify the third and fourth positions of the RNA have a different specificity than the enzymes that modify the first two positions. As previously proposed by Grosjean and co-workers for tRNA modifying enzymes (Grosjean et al. 1996), modification of the first two positions may be architecture independent, whereas modification of the next two positions requires a well-defined structure on the SL RNA. Along these lines, Mair et al. also noted, in studies with T. brucei, that only transcripts that contained a full-length SL RNA, and inherently a mature cap-4, could associate with core SL RNP proteins and become available for trans-splicing (Mair et al. 2000). This observation again raises the question of whether the cap-4 structure is required for trans-splicing or is needed only for some other function (e.g. translation). More recently, Michaeli and co-workers demonstrated that in Leptomonas collosoma (a close relative of T. brucei) deletion of the cap-4 nucleotides abrogated trans-splicing (Mandelboim et al. 2002). Similar results were obtained by Bindereif and co-workers in Leptomonas seymouri (Lucke et al. 1996). Additionally, mutations of the conserved Sm-like sequence led to reduced levels of cap-4 modification and trans-splicing, indicating that the lack of modifications observed in T. brucei transcripts that lacked an Sm-like sequence is indeed due to a requirement of this specific sequence for cap-4 formation rather than a simple matter of sequence length. Consequently, in these systems cap-4 formation appears to play a role in trans-splicing.
However, conflicting results were obtained when a similar analysis was performed with Leishmania tarentolae (yet another close relative of T. brucei). Campbell and co-workers mutagenized different regions of the SL RNA and analyzed the in vivo pattern of both cap-4 modifications and trans-splicing with the different mutants (Sturm and Campbell 1999; Zeiner et al. 2003). They found that different mutations affected the levels of cap-4 modification differently: mutations in the 28-39 region of the SL RNA had no effect, 20-29 region mutants were undermodified and 10-19 region mutants were not modified (Fig. 2). However, regardless of the mutation and/or the level of cap-4 modification, all the mutants trans-spliced normally. Rather than an effect on trans-splicing, these authors found that defects in cap-4 formation correlated with defects in the association of trans-spliced RNAs with polysomes (Zeiner et al. 2003), indicating that, as in most eukaryotes, cap formation is important in translation. From the studies above, we have learned that although several trypanosomatid systems have been analyzed for cap-4 formation, no unifying principle exists that explains what role the cap-4 structure might play in SL RNA trans-splicing. However, it is evident that in T. brucei and Leptomonas a correlation exists between cap-4 formation and the efficiency of trans-splicing. It could well be that the observed discrepancies between different systems just reflect the state of the current methodology used to analyze cap-4 modification: most laboratories use primer extension type assays, which are rather indirect and do not specifically reveal the chemical identity of the modified nucleotides in the different mutants.

Editing and modification in trypanosomatids: the reshaping of non-coding RNAs 75

Fig. 2. Secondary structure of the SL RNA from L. tarentolae. Arrows highlight the effect that mutations of a given region of the RNA have on modifications in vivo. Asterisks denote the modified nucleotides in the cap-4 structure. Nucleotides in boldface indicate the sequence that is spliced onto the mRNA. The arrowhead indicates the boundary between the 5’ site and the mature SL RNA.

Indeed, earlier studies by Ullu and Tschudi (1991) showed that in T. brucei inhibition of cap-4 methylation by S-adenosyl homocysteine (SAH), a SAM analog, led to impairment of trans-splicing. Similar studies by McNally and Agabian (1992) using sinefungin (a different SAM analog) confirmed these results. Both of these groups utilized in vivo labeling approaches combined with thin-layer chromatography analysis to quantify the level of modification of the SL RNA under various growth conditions, namely in the presence and absence of the methylation inhibitors. It could be argued that methylation is so broadly distributed amongst various non-coding cellular RNAs that inhibitor studies could simply reflect more global and indirect effects on trans-splicing due to a primary effect on a different RNA molecule (e.g. rRNA or tRNA), which leads to destabilization of the SL RNA. However, both groups could also show that the impairment of trans-splicing was not the result of effects on the synthesis, intracellular localization or stability of the SL RNA. Unfortunately, neither an in vitro trans-splicing system nor the enzymes involved in cap-4 modification are yet available in trypanosomatids. Clarification of the specific role cap-4 modifications play in the trans-splicing process will thus await future experimentation. It is appealing to propose that, at least in the T. brucei and Leptomonas systems, the lack of trans-splicing in mutants devoid of a mature cap-4 reflects a necessity for a proper structure imparted on the SL RNA by cap modification. In this sense, cap-4 modification may modulate SL RNA


utilization by causing small changes in the SL RNA local structure that may cause changes in its ability to associate with RNP particles required for trans-splicing. Evidently, it is also possible that structural modulation by cap-4 is gene-specific and will thus require a case-by-case analysis to properly explain the discrepancies observed in the different trypanosomatid systems.

3 RNA editing and protein secretion

In both bacterial and eukaryotic cells, a subset of nascent pre-secretory and membrane proteins possess a hydrophobic signal peptide at the amino-terminal end (pre-protein) as it emerges from the ribosome. In all organisms, the signal recognition particle (SRP) recognizes and binds the signal peptide, which triggers a transient pausing of translation, termed "elongation arrest" (Althoff et al. 1994; Weichenrieder et al. 2000). SRP then delivers the nascent chain-ribosome complex to the eukaryotic endoplasmic reticulum (ER) membrane, bacterial plasma membrane, or chloroplast thylakoid membranes by docking with a membrane-bound SRP receptor. Once SRP has successfully coupled nascent protein synthesis to translocation across the appropriate membrane, the signal sequence is cleaved by proteolysis (Althoff et al. 1994). SRP is a cytoplasmic ribonucleoprotein complex composed of a single RNA (7SL or SRP RNA) and six proteins in yeast and mammals; in the canine particle these comprise two heterodimers and two monomeric proteins (Althoff et al. 1994). Compared to all known SRP complexes, the trypanosomatid SRP is uniquely composed of not one, but two RNA molecules: the 7SL RNA and a tRNA-like molecule, sRNA-76 in T. brucei (Beja et al. 1993) and sRNA-85 in L. collosoma (Liu et al. 2003). Although these sRNAs do not have similar sequences, both are tRNA-like molecules that deviate from the canonical tRNA structure. The T. brucei sRNA-76 possesses a leucine anticodon and has highest sequence similarity to yeast suppressor tRNAGly (69.5%), mouse and rat tRNAAsp(GUC) (69%) and mouse tRNAGly(UUC) (68.1%) (Beja et al. 1993). The L. collosoma sRNA-85 resembles tRNAHis and tRNAAsn, containing an asparagine anticodon and a 3'-end that is nearly identical to the Drosophila tRNAHis(GUG) (95%) and the H. volcanii tRNAAsn(GUU) (100%) (Liu et al. 2003). The D-arm and anticodon stem (nt 26-42) region also harbors a putative 7SL RNA binding site.
No significant sequence similarity is observed at the 5’ end of any of the snRNAs from any organism. Comparison of the SRP complexes found in T. brucei and L. collosoma suggests that the unique composition of the SRP is conserved in the trypanosomatid family. The interaction of sRNA with 7SL RNA by base-pairing and their co-existence in a single complex have been confirmed by conventional chromatography, affinity selection using an antisense oligonucleotide to sRNA, and in vivo cross-linking (Liu et al. 2003). Together, the sRNA and 7SL RNA comprise a small ribonucleoprotein complex that sediments at 14S in L. collosoma and 11S in T. brucei and transiently binds ribosomes. In situ hybridization suggests that both 7SL RNA and


Fig. 3. Secondary structure of the L. collosoma 7SL RNA and its associated tRNA-like structure (sRNA-85). Dashed lines denote the proposed interaction between the two RNAs. The arrow marks the position of the C to U editing event (C133). Adapted from Liu et al. (2003).

sRNA-85 are mainly confined to the cytoplasm but also exist in the nucleolus (Liu et al. 2003). Mutations that disrupted the potential base-pairing between sRNA-85 and 7SL RNA in L. collosoma completely abolished interaction with 7SL RNA and/or ribosomes. These mutations led to aberrant cellular distribution of the SRP complex, with accumulation of these complexes both inside and around the nucleus. Other than C to U editing, the 7SL RNA does not undergo post-transcriptional modification, and what role the presumed post-transcriptional modifications present in the sRNA play in secretion is unclear. It is appealing to think that post-transcriptional modifications in the tRNA-like structure of the sRNA might play a role in modulating complex formation and/or elongation arrest. A plausible connection between post-transcriptional changes and SRP function in trypanosomatids has been provided by the recent discovery of a single C to U editing event in the 7SL RNA of L. collosoma (Ben-Shlomo et al. 1999). The single-copy 7SL RNA gene codes for a C at position 133, while two versions of the RNA are found in nature: one containing the DNA-encoded C133 (7SL II) and a second version containing U133 resulting from RNA editing (7SL I). Thus, 7SL RNA exists in two stable conformations that can be separated by electrophoresis under denaturing conditions: a fast-migrating form, 7SL I, and a slow-migrating form, 7SL II. In actively growing


cells, the predominant form is 7SL I. 7SL I is mainly bound to ribosomes, whereas 7SL II is more abundant in ribosome-free particles. Both species are structurally stable, and no interconversion between them was detected. Furthermore, cell fractionation studies showed that the edited form (7SL I) was found predominantly in the cytoplasm, and the pre-edited form (7SL II) in the nucleus. The 7SL RNA is one of the few small RNAs that possess no modified nucleotides (other than the edited cytosine), as demonstrated by LC/MS of the L. collosoma 7SL RNA (Ben-Shlomo et al. 1999). The mechanism by which the 7SL RNA undergoes the conformational change has yet to be determined. One possibility involves the alteration of a protein-binding site (SRP19 binding), since the edited nucleotide is located in a small region of Domain III important for SRP19 recognition in other systems. An alternative model proposes that the editing event may affect putative long-range RNA tertiary interaction(s). However, the editing event is not the only factor that may determine whether 7SL RNA will undergo structural changes, since 7SL RNA mutants altered in regions outside the editing site have also been found in a single conformation. Nonetheless, these observations point to a novel role for RNA editing in controlling the conformation of the 7SL RNA in vivo. Furthermore, one may envisage a situation where the conformational change imparted by editing on the 7SL RNA alters its interaction with sRNA-85. In this scenario editing may modulate not only ribonucleoprotein assembly, but also RNA-RNA interaction.

4 Role of modifications on sub-cellular localization of tRNAs

Most organisms encode a full set of tRNAs for the translation of nuclear genes, and a separate set is encoded in the mitochondria and/or chloroplasts for organellar translation. However, there is an ever-increasing number of organisms that do not encode all the tRNAs necessary for organellar protein synthesis in either their mitochondrial or chloroplast genomes (Schneider and Marechal-Drouard 2000). These organisms have instead evolved a mechanism to import nucleus-encoded tRNAs from the cytoplasm into the organelles. The number of imported tRNAs described so far varies from one in yeast, to a few in plants, to more than half of the tRNAs in Tetrahymena (Schneider and Marechal-Drouard 2000). Most recently the import of a single tRNA (tRNALys(UUU)) has been described in marsupials (Dorner et al. 2001). Trypanosomatids represent an extreme case where no tRNA genes are encoded in the mitochondria and a full complement of tRNAs has to be imported from the cytosol for mitochondrial translation (Simpson et al. 1989). Due to the presence of such a robust import pathway, all tRNAs in trypanosomatids have a “split personality”: with the exception of initiator tRNAMet(CAU) (Tan et al. 2002), the same set of tRNAs that are utilized in the cytoplasm are also used in the mitochondria. The question has been raised as to how eukaryotic tRNAs have managed to function in the bacteria-like translation system of the mitochondria. An ample number of examples show inter-kingdom differences in how


tRNAs interact with and are recognized by various translation components including synthetases, translation factors, and ribosomes. The solution to the “schizophrenic” tRNA conundrum may well lie in the number and nature of the post-transcriptional modifications present in tRNAs that are specific for various compartments. Earlier work by Schneider and co-workers (Schneider et al. 1994b) indeed demonstrated differences in the modification set of at least three tRNAs in Trypanosoma brucei (tRNALys(CUU), tRNALeu(CAA) and tRNATyr(GUA)). These authors labeled tRNAs, isolated from both the cytoplasm and mitochondria of T. brucei, with radioactive nucleotides using a “splint” labeling technique. With this technique a particular tRNA can be specifically labeled at the 3'-terminus using an oligonucleotide that hybridizes to the very 3'-end of the tRNA but leaves a stretch of three Gs as a 5'-overhang. This oligonucleotide then permits addition of up to three radioactive cytidines to the 3'-end of the tRNA by DNA polymerase, using the overhanging Gs in the oligonucleotide as a template (Hausner et al. 1990). The labeled tRNA is then purified by denaturing acrylamide electrophoresis, followed by determination of the tRNA sequence by enzymatic digestion. Using this technique, the authors were able to show that differences in the anticodon sequence patterns of the same tRNA could be observed when tRNAs purified from the cytosol and mitochondria were compared. These mitochondria-specific modifications were observed at position 32 of tRNALeu(CAA) and tRNALys(CUU). In addition, modification of U33 was also observed in tRNATyr(GUA) following mitochondrial import. Although modifications at position 32 occur commonly in all organisms, prior to this example position 33 had never been seen modified in any naturally occurring tRNA from any organism.
Although the splint-labeling technique has its merits in that it allowed the identification of sequence differences in native tRNAs with very limited sample size, the chemical nature of the modifications could not be directly disclosed. Despite these limitations, the observation of mitochondria-specific modifications raised immediate questions about a possible role for modifications as import signals. However, Schneider et al. had previously shown that certain mutations in the intron sequence of tRNATyr(GUA) led to accumulation of unspliced tRNA within cells (Schneider et al. 1994a). They also showed that a portion of the unspliced tRNA could still be imported into mitochondria in vivo. When this pre-tRNA was analyzed for modifications by the “splint-labeling” method described above, no differences were observed between the unspliced tRNA purified from mitochondria and that purified from the cytoplasm. This indicated that the mitochondria-specific modifications were not part of the import signal and in fact must be there for a different function. However, these authors could not “formally exclude that the tRNAs are modified in the cytosol and immediately imported into mitochondria”. Of course, the above observations also do not rule out the possibility that some modifications may serve as positive or negative determinants of import in other tRNAs. In a more recent study, Suzuki and co-workers (Kaneko et al. 2003) compared the sub-cellular localization of tRNAs in Leishmania tarentolae and found differences in the modification pattern of the anticodon of tRNAGlu(UUC). These authors utilized mass spectrometry to compare the modification set of cytoplasmic
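The design logic of the splint oligonucleotide described above can be sketched in a few lines of Python. This is only an illustration of the base-pairing arithmetic: the function names `design_splint` and `templated_extension` and the example 3'-end sequence are hypothetical and are not taken from Schneider et al. (1994b) or Hausner et al. (1990).

```python
# Illustrative sketch only; names and the example sequence are hypothetical.

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def design_splint(trna_3prime_end, overhang="GGG"):
    """Return a DNA splint oligonucleotide, written 5'->3'.

    The 3' part of the oligonucleotide is the reverse complement of the
    tRNA 3'-terminal segment; the unpaired 5' overhang serves as template.
    """
    rna = trna_3prime_end.upper()
    annealing = "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))
    return overhang + annealing


def templated_extension(overhang="GGG"):
    """Nucleotides (5'->3') added to the tRNA 3' end when the overhang is copied."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(overhang))


if __name__ == "__main__":
    example_end = "GGUUCGAAUCCACCA"  # made-up tRNA 3' end terminating in CCA
    print(design_splint(example_end))
    print(templated_extension())
```

With a three-G overhang the extension is three cytidines, which is where the radioactive label is incorporated; the choice of how many 3'-terminal nucleotides to anneal is left to the caller because the source does not specify it.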


Fig. 4. A model for the control of tRNA import by post-transcriptional modification. Thiolation of U34 leads to cytoplasmic retention and only the un-thiolated fraction is a substrate for mitochondrial import (adapted from Kaneko et al. 2003).

and mitochondrial tRNAGlu(UUC) and tRNAGln(UUG). Both tRNAs show a similar steady-state distribution when equal amounts of total sub-cellular RNA from each compartment are separated by 2D-gel electrophoresis, followed by northern blot analysis with tRNA-specific probes. Unlike the earlier work, these authors could specifically determine the chemical nature of the modification differences. Both tRNAGlu(UUC) and tRNAGln(UUG) contained a common mcmU34 at the wobble nucleotide (position 34) regardless of their intracellular localization. In addition, these tRNAs contain 2-thiouridine in the cytosol (cytoplasm-specific modification) and 2’-O-methyluridine in the mitochondria (mitochondria-specific modification). In contrast, tRNALys, which also contains U34 as the wobble nucleotide, was not thiolated in either compartment but was ribose methylated at position 32 (Cm32) following mitochondrial import, as previously suggested by Schneider et al. In trypanosomatids, tRNAs have been divided into three groups according to their intracellular distribution: mainly cytosolic (group I), mainly mitochondrial (group II) and equally shared between the two compartments (group III). Suzuki and co-workers noted that both thiolated tRNAs above belong to group III while tRNALys(UUU) is a group II tRNA. This led to a model for cytoplasmic retention of tRNAs based on the presence of cytoplasm-specific modifications (Fig. 4). In this model, U34 thiolation serves as a negative determinant of import, that is, as a signal for cytosolic retention, as previously proposed by Rusconi and Cech to explain tRNA distribution in Tetrahymena. The fraction of tRNA that is imported into mitochondria escapes thiolation, and its


negative effect, by an as yet unknown mechanism. The observation above was supported by experiments with a reconstituted mitochondrial import system, whereby radioactive tRNAs could be efficiently imported in vitro into isolated mitochondria. This assay had previously been shown to reproduce faithfully the in vivo situation for the import of tRNAIle(UAU) and tRNAGln(UUG). Surprisingly, when the import of native tRNAGlu(UUC) was compared to that of its synthetic equivalent, the native tRNA was imported poorly, suggesting that some of the modifications present in the native tRNA were responsible for the discrepancy in in vitro import behavior. When similar experiments were performed with native and synthetic tRNALys(UUU), no difference in import efficiency was observed. Taken together, these data support the view that modifications may serve as import determinants for some, but not all, tRNAs. Several mechanisms could account for thiolation-mediated cytoplasmic retention of tRNAs: a factor may bind thiolated tRNAs in the cytoplasm and prevent their import, the mitochondrial import machinery may be unable to import thiolated tRNAs, or both may act in concert to regulate the distribution of tRNAGlu(UUC) and tRNAGln(UUG). It is worth noting, however, that thiolation might not by itself impart cytoplasmic retention of a particular tRNA; rather, the combination of thiolation of the wobble nucleotide with some other feature of a particular tRNA structure might be required. In this regard, different combinations of modified nucleotides and primary sequence determinants may cause similar folding of a particular tRNA, leading to similar localization behavior. Regardless of what factors are responsible for the retention phenotype, modifications may clearly induce subtle yet significant structural changes in a tRNA that in turn impact its localization and function in this system.

5 The story of tRNATrp in trypanosomatids: where editing meets modification

To date, tRNATrp is by far the best-characterized tRNA in trypanosomatids. As in many other organisms, the trypanosomatid mitochondrial genetic code deviates from the universal code, and some codons have been reassigned through evolution to have a new meaning. Perhaps the most prominent example of codon reassignment is that of tryptophan codons. For nuclear genes, UGG is used as a codon for tryptophan and UGA is a canonical stop codon (opal codon). Most mitochondrial genomes, with the exception of those of plants, read both UGG and UGA codons as tryptophan, and in fact UGA is never used as a stop codon (Jukes et al. 1987; Osawa and Jukes 1989; Jukes and Osawa 1990). To decode both codons, mitochondrial genomes usually encode a tRNATrp with a UCA anticodon, which may decode both UGG and UGA codons if wobbling is considered. T. brucei and L. tarentolae (and presumably all other trypanosomatids) are then faced with a decoding problem: due to the lack of mitochondria-encoded tRNAs, tRNATrp has to be imported from the cytoplasm. This raised the question as to how these cells avoid suppression of cytoplasmic UGA stop codons as the nucleus-encoded


tRNATrp(UCA) transits through the cytoplasm. It was discovered that the solution to this decoding problem is rather simple: trypanosomatids encode a single tRNATrp with anticodon CCA, which following mitochondrial import is edited to form the new anticodon UCA for UGA decoding (Alfonzo et al. 1999). It is presumably through compartmentalization of the editing activity that these cells avoid the decoding roadblock. Due to a predominance of UGA codons in the mitochondria (where almost every gene has at least one UGA codon), it is inferred that tRNATrp editing is essential for mitochondrial biogenesis and cell viability. As in other mitochondria, it is assumed that, following editing, the UCA anticodon may decode both UGA by canonical base pairing and UGG by wobbling. Interestingly, under steady-state conditions only ~50% of the imported tRNATrp gets edited (Alfonzo et al. 1999). One wonders whether the two tRNAs are in fact not redundant but dedicated to different tryptophan codons, such that tRNATrp(CCA) decodes only UGG and tRNATrp(UCA) is used exclusively for UGA decoding. Lending credence to this proposal is the presence of an unprecedented number of post-transcriptional modifications in the anticodon arm of the imported tRNATrp (Crain et al. 2002). Following import, tRNATrp is heavily modified at the anticodon loop. Furthermore, the modification set differs between the edited and unedited tRNAs. As previously proposed by Schneider and most recently by Crain et al., many of these mitochondria-specific modifications might be used to structurally readapt the tRNA, enabling it to function in a bacteria-like translation system. The observed modification differences between the two tRNAs also raised questions as to what role modifications play in determining editing specificity. We have proposed that, following import, modifications are used to ensure specific recognition of tRNATrp by the editing enzyme (Fig. 5).
We have proposed an interdependence model for anticodon modification and editing, where some, if not all, of the anticodon modifications in tRNATrp do not occur randomly but follow a specific sequence. In our model, following import, tRNATrp gets ribose methylated at position 32 (Cm32); the cell then has to make a decision. It either thiolates position 33 (s2U33), which in turn signals anticodon editing followed by methylation to form Um34, or it methylates C34 (Cm34), which serves as a negative determinant of editing. In this model, ribose methylation at C34 ensures that not all of the tRNA is converted to UCA by editing, while s2U33 prevents C34 methylation and thus creates the proper editing substrate. This proposal suggests that minor changes in the anticodon structure imparted on the tRNA by different modifications are key determinants of editing specificity. In fact, Crain et al. (2002) showed that nucleotide changes at the base of the stem were sufficient to alter editing efficiency in vivo. Whether this effect was due to alterations in the modification pattern of the tRNA or to editing directly remains a topic of investigation in our laboratory. The editing and modification interdependence model considers that editing of a tRNA other than tRNATrp, or at a different anticodon position, may lead to undesirable decoding errors. However, this model does not rule out the possibility that C to U editing has been used multiple times to change or expand the decoding capacity of other tRNAs, provided that tRNA editing is regulated and does not cause deleterious decoding defects.
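The decoding argument above can be made concrete with a small sketch that lists the codons an anticodon can read. It assumes the simplified textbook wobble rules for unmodified nucleotides at position 34 and deliberately ignores the anticodon-loop modifications discussed in this section, which may restrict decoding; the function name `codons_read` is hypothetical.

```python
# Simplified wobble rules for UNMODIFIED anticodon bases (an assumption of this sketch).
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases that each anticodon base at position 34 can pair with.
WOBBLE_34 = {"U": ("A", "G"), "C": ("G",), "G": ("C", "U"), "A": ("U",)}


def codons_read(anticodon):
    """Return codons (5'->3') readable by an anticodon written 5'->3' (positions 34-36)."""
    n34, n35, n36 = anticodon.upper()
    first = WATSON_CRICK[n36]   # codon position 1 pairs with anticodon position 36
    second = WATSON_CRICK[n35]  # codon position 2 pairs with anticodon position 35
    return [first + second + third for third in WOBBLE_34[n34]]


if __name__ == "__main__":
    for anticodon in ("CCA", "UCA"):
        print(anticodon, codons_read(anticodon))
```

Under these rules the unedited CCA anticodon reads only UGG, whereas the edited UCA anticodon can in principle read both UGA and UGG, which is why editing in the cytoplasm would risk suppression of UGA stop codons.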


Fig. 5. The “interdependence” model for editing and modification. In this model thiolation of U33 (2-4) provides the substrate for further editing and modification. C34 methylation (5) acts as a negative determinant of editing. In this model the CCA and UCA anticodons are not redundant and exclusively decode the mitochondrial UGG and UGA codons, respectively.
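The ordered steps of the interdependence model can be written out as a small decision function. This is a schematic restatement of the proposal only, not a mechanistic simulation: the function name `anticodon_fate` and the step labels are hypothetical, and the exclusive codon assignment follows the hypothesis stated in the text rather than any demonstrated result.

```python
def anticodon_fate(thiolate_u33):
    """Follow one imported tRNATrp through the proposed order of events.

    thiolate_u33: True if position 33 is thiolated after Cm32 formation,
    False if C34 is ribose methylated instead.
    """
    steps = ["import", "Cm32"]  # ribose methylation at position 32 comes first
    if thiolate_u33:
        # s2U33 signals editing, which is followed by ribose methylation (Um34)
        steps += ["s2U33", "C34->U34 editing", "Um34"]
        anticodon = "UCA"
        codons = ["UGA"]
    else:
        # Cm34 acts as a negative determinant of editing in the model
        steps += ["Cm34"]
        anticodon = "CCA"
        codons = ["UGG"]
    return {"steps": steps, "anticodon": anticodon, "codons": codons}


if __name__ == "__main__":
    for choice in (True, False):
        print(anticodon_fate(choice))
```

Writing the model this way makes its central claim explicit: the two branches are mutually exclusive, so a given tRNA molecule ends up either edited and thiolated or unedited and methylated at C34.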

6 Concluding remarks

This chapter has attempted to review what is currently known about the editing and modification of non-coding RNAs in trypanosomatids, although the examples described are unrelated to one another. Each of the examples illustrates how a possible interdependence between different modifications may help change, in a stepwise fashion, the structural landscape of a given RNA substrate, providing the specificity that a given system demands. In the specific case of tRNATrp editing and modification in Leishmania mitochondria, we suggest that the interrelation between these two processes ensures that only tRNATrp and no other C34-containing tRNA is edited, thus maintaining decoding fidelity. To date, none of the enzymes involved in editing and/or modification of non-coding RNAs in trypanosomatids have been identified. Only with the discovery of the different enzymes can the mechanism, the basis of substrate specificity and the hypothesis of interdependence between both processes be tested further. We have cited a number of examples in trypanosomatids that indicate a possible role for post-transcriptional RNA changes in reshaping both structural and genetic information. These examples are by no means either conclusive or exhaustive. As is often the case, and especially with RNA editing, these discoveries are serendipitous, depending on an unusual observation followed by intuitive curiosity


on the part of the investigator. With recent developments in bioinformatics and the growing number of sequenced genomes, it is anticipated that the discovery of new examples of RNA editing will become routine. These approaches, combined with detailed structural studies, should illuminate how changes at the RNA level ultimately lead to genetic reshaping.

Acknowledgements

M.A.T.R. and J.D.A. are supported by a grant to J.D.A. from the American Heart Association Ohio affiliate.

References

Alfonzo JD, Blanc V, Estevez AM, Rubio MA, Simpson L (1999) C to U editing of the anticodon of imported mitochondrial tRNA(Trp) allows decoding of the UGA stop codon in Leishmania tarentolae. EMBO J 18:7056-7062
Althoff S, Selinger D, Wise JA (1994) Molecular evolution of SRP cycle components: functional implications. Nucleic Acids Res 22:1933-1947
Bangs JD, Crain PF, Hashizume T, McCloskey JA, Boothroyd JC (1992) Mass spectrometry of mRNA cap 4 from trypanosomatids reveals two novel nucleosides. J Biol Chem 267:9805-9815
Beja O, Ullu E, Michaeli S (1993) Identification of a tRNA-like molecule that copurifies with the 7SL RNA of Trypanosoma brucei. Mol Biochem Parasitol 57:223-229
Ben-Shlomo H, Levitan A, Shay NE, Goncharov I, Michaeli S (1999) RNA editing associated with the generation of two distinct conformations of the trypanosomatid Leptomonas collosoma 7SL RNA. J Biol Chem 274:25642-25650
Campbell DA, Thornton DA, Boothroyd JC (1984) Apparent discontinuous transcription of Trypanosoma brucei variant surface antigen genes. Nature 311:350-355
Crain PF, Alfonzo JD, Rozenski J, Kapushoc ST, McCloskey JA, Simpson L (2002) Modification of the universally unmodified uridine-33 in a mitochondria-imported edited tRNA and the role of the anticodon arm structure on editing efficiency. RNA 8:752-761
Dorner M, Altmann M, Paabo S, Morl M (2001) Evidence for import of a lysyl-tRNA into marsupial mitochondria. Mol Biol Cell 12:2688-2698
Freistadt MS, Cross GA, Branch AD, Robertson HD (1987) Direct analysis of the miniexon donor RNA of Trypanosoma brucei: detection of a novel cap structure also present in messenger RNA. Nucleic Acids Res 15:9861-9879
Grosjean H, Edqvist J, Straby KB, Giege R (1996) Enzymatic formation of modified nucleosides in tRNA: dependence on tRNA architecture. J Mol Biol 255:67-85
Gunzl A, Bindereif A, Ullu E, Tschudi C (2000) Determinants for cap trimethylation of the U2 small nuclear RNA are not conserved between Trypanosoma brucei and higher eukaryotic organisms. Nucleic Acids Res 28:3702-3709
Hausner TP, Giglio LM, Weiner AM (1990) Evidence for base-pairing between mammalian U2 and U6 small nuclear ribonucleoprotein particles. Genes Dev 4:2146-2156
Jukes TH, Osawa S (1990) The genetic code in mitochondria and chloroplasts. Experientia 46:1117-1126
Jukes TH, Osawa S, Muto A, Lehman N (1987) Evolution of anticodons: variations in the genetic code. Cold Spring Harb Symp Quant Biol 52:769-776
Kaneko T, Suzuki T, Kapushoc ST, Rubio MA, Ghazvini J, Watanabe K, Simpson L (2003) Wobble modification differences and subcellular localization of tRNAs in Leishmania tarentolae: implication for tRNA sorting mechanism. EMBO J 22:657-667
Laird PW, Kooter JM, Loosbroek N, Borst P (1985) Mature mRNAs of Trypanosoma brucei possess a 5' cap acquired by discontinuous RNA synthesis. Nucleic Acids Res 13:4253-4266
Lenardo MJ, Dorfman DM, Donelson JE (1985) The spliced leader sequence of Trypanosoma brucei has a potential role as a cap donor structure. Mol Cell Biol 5:2487-2490
Liang XH, Xu YX, Michaeli S (2002) The spliced leader-associated RNA is a trypanosome-specific sn(o) RNA that has the potential to guide pseudouridine formation on the SL RNA. RNA 8:237-246
Liu L, Ben-Shlomo H, Xu YX, Stern MZ, Goncharov I, Zhang Y, Michaeli S (2003) The trypanosomatid signal recognition particle consists of two RNA molecules, a 7SL RNA homologue and a novel tRNA-like molecule. J Biol Chem 278:18271-18280
Lucke S, Xu GL, Palfi Z, Cross M, Bellofatto V, Bindereif A (1996) Spliced leader RNA of trypanosomes: in vivo mutational analysis reveals extensive and distinct requirements for trans splicing and cap4 formation. EMBO J 15:4380-4391
Mair G, Ullu E, Tschudi C (2000) Cotranscriptional cap 4 formation on the Trypanosoma brucei spliced leader RNA. J Biol Chem 275:28994-28999
Mandelboim M, Estrano CL, Tschudi C, Ullu E, Michaeli S (2002) On the role of exon and intron sequences in trans-splicing utilization and cap 4 modification of the trypanosomatid Leptomonas collosoma SL RNA. J Biol Chem 277:35210-35218
McNally KP, Agabian N (1992) Trypanosoma brucei spliced-leader RNA methylations are required for trans splicing in vivo. Mol Cell Biol 12:4844-4851
Muhich ML, Boothroyd JC (1988) Polycistronic transcripts in trypanosomes and their accumulation during heat shock: evidence for a precursor role in mRNA synthesis. Mol Cell Biol 8:3837-3846
Osawa S, Jukes TH (1989) Codon reassignment (codon capture) in evolution. J Mol Evol 28:271-278
Parsons M, Nelson RG, Watkins KP, Agabian N (1984) Trypanosome mRNAs share a common 5' spliced leader sequence. Cell 38:309-316
Perry KL, Watkins KP, Agabian N (1987) Trypanosome mRNAs have unusual "cap 4" structures acquired by addition of a spliced leader. Proc Natl Acad Sci U S A 84:8190-8194
Quiocho FA, Hu G, Gershon PD (2000) Structural basis of mRNA cap recognition by proteins. Curr Opin Struct Biol 10:78-86
Schneider A, Marechal-Drouard L (2000) Mitochondrial tRNA import: are there distinct mechanisms? Trends Cell Biol 10:509-513
Schneider A, Martin J, Agabian N (1994a) A nuclear encoded tRNA of Trypanosoma brucei is imported into mitochondria. Mol Cell Biol 14:2317-2322
Schneider A, McNally KP, Agabian N (1994b) Nuclear-encoded mitochondrial tRNAs of Trypanosoma brucei have a modified cytidine in the anticodon loop. Nucleic Acids Res 22:3699-3705
Simpson AM, Suyama Y, Dewes H, Campbell DA, Simpson L (1989) Kinetoplastid mitochondria contain functional tRNAs, which are encoded in nuclear DNA and also contain small minicircle and maxicircle transcripts of unknown function. Nucleic Acids Res 17:5427-5445
Sturm NR, Campbell DA (1999) The role of intron structures in trans-splicing and cap 4 formation for the Leishmania spliced leader RNA. J Biol Chem 274:19361-19367
Tan TH, Bochud-Allemann N, Horn EK, Schneider A (2002) Eukaryotic-type elongator tRNAMet of Trypanosoma brucei becomes formylated after import into mitochondria. Proc Natl Acad Sci U S A 99:1152-1157
Ullu E, Tschudi C (1990) Permeable trypanosome cells as a model system for transcription and trans-splicing. Nucleic Acids Res 18:3319-3326
Ullu E, Tschudi C (1991) Trans splicing in trypanosomes requires methylation of the 5' end of the spliced leader RNA. Proc Natl Acad Sci U S A 88:10074-10078
Walder JA, Eder PS, Engman DM, Brentano ST, Walder RY, Knutzon DS, Dorfman DM, Donelson JE (1986) The 35-nucleotide spliced leader sequence is common to all trypanosome messenger RNA's. Science 233:569-571
Weichenrieder O, Wild K, Strub K, Cusack S (2000) Structure and assembly of the Alu domain of the mammalian signal recognition particle. Nature 408:167-173
Zeiner GM, Sturm NR, Campbell DA (2003) The Leishmania tarentolae spliced leader contains determinants for association with polysomes. J Biol Chem 278:38269-38275

Alfonzo, Juan D. Department of Microbiology and the Ohio State Biochemistry Program, The Ohio State University, 484 West 12th Avenue, Columbus, Ohio 43210, USA.
Rubio, Mary Anne T. Department of Microbiology, The Ohio State University, 484 West 12th Avenue, Columbus, Ohio 43210, USA.

Transfer RNA modifications and modifying enzymes in Saccharomyces cerevisiae Marcus J.O. Johansson and Anders S. Byström

Abstract Transfer RNAs are adaptor molecules that decode mRNA into protein and thereby play a central role in gene expression. During the maturation of a primary tRNA transcript, specific subsets of the four normal nucleosides adenosine, cytidine, guanosine, and uridine are modified. The formation of a modified nucleoside can require more than one gene product and may involve several enzymatic steps. In the last few years, the number of identified gene products required for formation of modified nucleosides in tRNA has increased dramatically. In this review, proteins involved in modification of cytoplasmic tRNAs in Saccharomyces cerevisiae are described, emphasizing phenotypic characteristics of modification-deficient strains and genetic approaches used to determine the in vivo role of modified nucleosides and modifying enzymes.

1 Introduction Transfer RNAs are characterized by a variety of different modified nucleosides, which are derivatives of adenosine (A), cytidine (C), guanosine (G), and uridine (U). In total, 86 different modified nucleosides have been described in tRNA from organisms within the three domains of life, Archaea, Bacteria, and Eukarya (Rozenski et al. 1999). Some modified nucleosides are found in all three domains and some are even found in identical positions of the tRNA, suggesting a conserved function (Björk 1986). In this review, we will discuss modified nucleosides and their modifying enzymes in the budding yeast Saccharomyces cerevisiae, but where appropriate, also include descriptions from other organisms. The discussion is focused on modification of cytoplasmic tRNAs.

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b105814 / Published online: 7 January 2005 © Springer-Verlag Berlin Heidelberg 2005

2 tRNA maturation In eukaryotic cells, tRNA genes are transcribed by RNA polymerase III, generating precursor forms with a 5´ leader, a U-rich 3´ trailer, and sometimes an intervening sequence; these precursors undergo a series of processing events to yield mature functional tRNAs. The enzyme responsible for endonucleolytic cleavage of the 5´ leader is Ribonuclease P (RNase P), a ribonucleoprotein found in all domains of life (Altman et al. 1995; Xiao et al. 2002; Hartmann and Hartmann 2003). One tRNA species (tRNAHis) has an unusual 5´ end with an extra 5´ GMP residue (G-1), which is added after transcription and RNase P cleavage by a guanylyltransferase (Cooley et al. 1982; Pande et al. 1991; Gu et al. 2003). The removal of the 3´ trailer of pre-tRNA in eukaryotes occurs by either an endo- or exonucleolytic mechanism (Papadimitriou and Gross 1996). A conserved endoribonuclease capable of processing the 3´ end has been identified in several organisms, including yeast (Schiffer et al. 2002; Takaku et al. 2003). The primary pathway of 3´ processing is endonucleolytic due to the presence of the Lhp1 (La) protein, which binds to 3´-terminal U residues of the pre-tRNA and protects the end from exonucleolytic digestion (Yoo and Wolin 1994, 1997). The Lhp1 protein also promotes correct folding of pre-tRNAs (Chakshusmathi et al. 2003). The 3´ CCA end present in mature tRNA is not encoded in eukaryotic tRNA genes and has to be added posttranscriptionally by the ATP(CTP):tRNA nucleotidyl transferase (Aebi et al. 1990; Chen et al. 1990). In S. cerevisiae, genes for ten different tRNA species contain introns (Hani and Feldmann 1998) that are always located one nucleotide 3´ of the anticodon. Splicing of the pre-tRNAs requires three different enzymes: a tRNA splicing endonuclease that excises the intron, a tRNA ligase that joins the two tRNA half molecules, and a phosphotransferase that removes a 2´ phosphate at the splice junction (Abelson et al. 1998). Steps in the maturation of tRNA are ordered. The removal of the 5´ leader normally precedes processing of the 3´ end, and in S. cerevisiae splicing usually follows end maturation (O'Connor and Peebles 1991). However, a specific order of maturation events is not obligatory, as removal of the 3´ trailer can occur before 5´-end processing (Kufel and Tollervey 2003) and some pre-tRNAs can be spliced before end maturation (O'Connor and Peebles 1991). RNase P and some pre-tRNAs are localized primarily to the nucleolus, indicating that 5´-end processing is performed in this compartment (Bertrand et al. 1998). In contrast, the tRNA splicing endonuclease is predominantly cytoplasmic and associated with the mitochondrial surface (Huh et al. 2003; Yoshihisa et al. 2003). Thus, the trafficking of pre-tRNA through various cellular compartments is likely to influence the order of processing events. A nuclear system that senses the status of tRNA processing and transmits a signal to the translational machinery has been proposed (Qiu et al. 2000). This model is based on the evidence that mutations or conditions that interfered with 5´- and 3´-end tRNA processing derepressed translation of the GCN4 mRNA (Qiu et al. 2000). Recent evidence suggests the presence of a nuclear mechanism to degrade aberrant pre-tRNAs (Kadaba et al. 2004). For a detailed discussion of nuclear surveillance, see the chapter by Anderson and Droogmans in this volume.


Fig. 1. Location of modified nucleosides in cytoplasmic tRNAs from S. cerevisiae. Abbreviations: I, inosine; m1I, 1-methylinosine; m1A, 1-methyladenosine; t6A, N6-threonylcarbamoyladenosine; i6A, N6-isopentenyladenosine; Ar(p), 2´-O-ribosyladenosine (phosphate); Am, 2´-O-methyladenosine; m5C, 5-methylcytidine; ac4C, N4-acetylcytidine; m3C, 3-methylcytidine; Cm, 2´-O-methylcytidine; m1G, 1-methylguanosine; m2G, N2-methylguanosine; m22G, N2,N2-dimethylguanosine; Gm, 2´-O-methylguanosine; m7G, 7-methylguanosine; yW, wybutosine; Ψ, pseudouridine; D, dihydrouridine; m5U, 5-methyluridine; Um, 2´-O-methyluridine; mcm5U, 5-methoxycarbonylmethyluridine; mcm5s2U, 5-methoxycarbonylmethyl-2-thiouridine; ncm5U, 5-carbamoylmethyluridine; ncm5Um, 5-carbamoylmethyl-2´-O-methyluridine.

3 Modified nucleosides in tRNA Modified nucleosides are found throughout the tRNA molecule (Fig. 1) and their formation occurs posttranscriptionally, concurrently with the other processing events. In S. cerevisiae, 274 nucleus-encoded tRNA genes code for the 42 different cytoplasmic tRNA species (1 initiator and 41 elongator tRNAs) that are responsible for reading the 61 sense codons (Percudani et al. 1997; Hani and Feldmann 1998; Marck and Grosjean 2002). Of the 50 different modified nucleosides found in eukaryotic tRNAs (Rozenski et al. 1999), 25 have been identified in cytoplasmic tRNAs from S. cerevisiae (Table 1, Fig. 2). The numbers given for occurrence of a modified nucleoside in Table 1 are based on the RNA sequences of 30 of the 42 tRNA species (Table 2). Table 2 shows the distribution of modified nucleosides in these 30 tRNAs and the number of genes encoding each species. RNA sequences are from either baker's or brewer's yeast, and some are not complete or contain uncertainties. A few modified nucleosides are present in essentially all tRNA species, whereas others are present in specific tRNAs or subgroups of tRNAs (Table 2). In addition, certain modified nucleosides (Ψ, D, m1G, Cm, Gm, m5C, m2G) are found in more than one position of the tRNA, and the formation of the same modified nucleoside at several different positions could involve multisite, region-specific and site-specific modification enzymes or a combination thereof (see Section 4). Some nucleoside modifications are introduced early into the precursor tRNA and are sometimes dependent on the presence of an intron, whereas others occur after splicing and end maturation (Grosjean et al. 1997). Enzymes modifying cytosolic tRNAs can localize to different subnuclear compartments and to the cytoplasm, suggesting that sub-cellular localization influences the succession of modifications (Hopper and Phizicky 2003). However, a specific order of modification is not obligatory, as mutations in a gene encoding a tRNA modifying enzyme, causing a lack of one modified nucleoside, do not influence other modifications. The presence of certain modified nucleosides in bacterial tRNA varies depending on growth rate, growth phase, or stress conditions (Björk 1995). Similarly, developmental changes, cell cycle progression, and rate of cell proliferation are known to affect the relative level of some modified nucleosides in eukaryotic tRNAs (Björk 1995). Thus, the modification status of a tRNA may vary with different cellular conditions.

Table 1. Modified nucleosides in S. cerevisiae cytoplasmic tRNAs and gene products required for their formation.

Modified nucleoside | Position (occurrence of modified nucleoside)a | Genes required for modification
I | 34 (6) | TAD2, TAD3
m1I | 37 (1) | TAD1, TRM5
m1A | 58 (20) | TRM6, TRM61
t6A | 37 (9) | -b
i6A | 37 (5) | MOD5
Ar(p) | 64 (1) | RIT1
Am | 4 (1) | -
m5C | 34 (1), 40 (1), 48 (15), 49 (10) | TRM4
ac4C | 12 (6) | TAN1
m3C | 32 (1) | -
Cm | 32 (3), 34 (1) | TRM7
Cm | 4 (3) | -
m1G | 9 (9) | TRM10
m1G | 37 (6) | TRM5
m2G | 10 (18) | TRM11c
m2G | 26 (1) | -d
m22G | 26 (18) | TRM1
Gm | 18 (9) | TRM3
Gm | 34 (1) | TRM7
m7G | 46 (11) | TRM8, TRM82
yW | 37 (1) | TRM5
Ψ | 26 (1), 27 (12), 28 (3), 34 (1), 36 (1), 65 (1), 67 (2) | PUS1
Ψ | 38 (3), 39 (16) | PUS3
Ψ | 55 (29) | PUS4
Ψ | 31 (1) | PUS6
Ψ | 13 (8), 35 (1) | PUS7
Ψ | 32 (11) | PUS8, PUS9
Ψ | 1 (2) | -
D | 16 (27), 17 (6) | DUS1
D | 20 (24) | DUS2
D | 47 (16) | DUS3
D | 20A (14), 20B (4) | DUS4
m5U | 54 (29) | TRM2
Um | 44 (3) | -
mcm5U | 34 (1) | TRM9
mcm5s2U | 34 (2) | TRM9, NFS1
ncm5U | 34 (2) | -
ncm5Um | 34 (1) | -

a Based on the sequence of 30 of the 42 tRNA species (Table 2). b Indicates that no gene product has been described. c B. Lapeyre, personal communication. d The tRNAVal(CAC) species has m2G rather than m22G at position 26 (Gorbulev et al. 1977); the former is presumably also catalyzed by Trm1p.
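As a rough illustration of the point that one modified nucleoside can be formed at several positions by either multisite or site-specific enzymes, the Ψ and D entries of Table 1 can be written as a small lookup table. The Python sketch below is not part of the original chapter: the names SITES and split_by_site_count are hypothetical, the data are limited to the Ψ and D rows of Table 1, and counting positions per gene product is only a crude proxy for enzyme specificity.

```python
# Positions at which each Ψ- or D-forming gene product acts in cytoplasmic
# tRNA, transcribed from the Ψ and D rows of Table 1 (illustrative subset).
SITES = {
    "PUS1": ["26", "27", "28", "34", "36", "65", "67"],
    "PUS3": ["38", "39"],
    "PUS4": ["55"],
    "PUS6": ["31"],
    "PUS7": ["13", "35"],
    "PUS8, PUS9": ["32"],
    "DUS1": ["16", "17"],
    "DUS2": ["20"],
    "DUS3": ["47"],
    "DUS4": ["20A", "20B"],
}


def split_by_site_count(sites):
    """Return (single_site, multi_site) lists of gene names.

    The grouping is an assumption made for illustration only: it counts
    listed positions and says nothing about substrate recognition.
    """
    single = sorted(gene for gene, pos in sites.items() if len(pos) == 1)
    multi = sorted(gene for gene, pos in sites.items() if len(pos) > 1)
    return single, multi


single_site, multi_site = split_by_site_count(SITES)
print("one position:", single_site)
print("several positions:", multi_site)
```

The same structure could be extended to the methyltransferase rows of Table 1; positions are kept as strings because labels such as 20A and 20B are not plain integers.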

4 Genes required for formation of modified nucleosides in tRNA To date, 31 gene products involved in formation of modified nucleosides in S. cerevisiae cytosolic tRNAs are known (Table 1). For synthesis of the complex nucleosides yW, mcm5U, and mcm5s2U, some of the gene products have been identified but others remain to be discovered. This section outlines how the genes required for modification of cytoplasmic tRNAs were identified and whether each gene product catalyzes formation of the modified nucleoside. Several gene products (Pus3p, Pus4p, Pus6p, Pus9p, Trm1p, Trm2p, Nfs1p, and Mod5p) involved in modification of cytoplasmic tRNA species are also required for modification of mitochondrial tRNAs (see below). In addition to these proteins, the MTO1, MSS1, and MTU1 gene products are known to specifically affect a mitochondrial tRNA modification (cmnm5s2U, see Chapter 1 by T. Suzuki).


4.1 Pseudouridine (Ψ) Pseudouridine (Ψ) is the most widely distributed modified nucleoside and is found at 15 different positions in cytoplasmic tRNAs from S. cerevisiae (Fig. 1). Genes encoding Ψ-synthases required for the isomerization of U to Ψ at 14 positions are known (Table 1). The enzyme catalyzing formation of Ψ at position 1 in cytoplasmic tRNAs remains to be discovered. A mutant allele of the nonessential PUS1 gene, required for formation of Ψ at positions 26-28, 34, 36, 65, and 67 in tRNA and at position 44 in U2 snRNA, was identified in a screen for mutations lethal in combination with a temperature-sensitive (Ts) allele of NSP1, which encodes a nuclear pore protein (Simos et al. 1996; Motorin et al. 1998; Massenet et al. 1999). In addition to the pus1 allele, the strain also contained a mutation in the LOS1 gene, encoding a protein involved in export of tRNA from the nucleus to the cytoplasm (Hurt et al. 1987; Hellmuth et al. 1998; Sarkar and Hopper 1998). By using a set of T7-transcribed tRNAs and purified recombinant Pus1p, this protein was shown to catalyze formation of Ψ at positions 27-28 and 34-36 in vitro (Simos et al. 1996; Motorin et al. 1998). Although the purified Pus1p catalyzed Ψ35 formation in pre-tRNATyr(GUA) (mature form is tRNATyr(GΨA)), a Ψ35-synthase activity was still present in an extract from a PUS1 disrupted strain. Furthermore, tRNATyr isolated from the pus1 mutant contained Ψ35, indicating that another Ψ-synthase modified the position in vivo (Motorin et al. 1998). Sequencing of tRNAs from a pus1 mutant revealed that the strain lacked Ψ residues at positions 27, 28, 34, and 36. In addition, positions 26, 65, and 67 were not modified, suggesting that Pus1p is also involved in modification of these residues (Motorin et al. 1998). However, Pus1p did not modify positions 26, 65, and 67 in the in vitro reaction (Motorin et al. 1998).
The PUS3 (DEG1) gene encoding the tRNA:Ψ38-39-synthase was originally identified as a gene important for normal growth, since disruption of this gene caused a slow growth phenotype (Carbone et al. 1991). Pus3p showed homology to the Escherichia coli tRNA:Ψ38-40-synthase (TruAp) as well as to the yeast Pus1p (Carbone et al. 1991; Simos et al. 1996). Recombinant Pus3 protein purified from E. coli or a tagged protein purified from yeast catalyzed Ψ formation at position 38 in tRNAAla(AGC) (tRNAAla(IGC)) and at position 39 in tRNAPhe(GAA) (tRNAPhe(GmAA)) using T7 transcripts as the substrate (Lecointe et al. 1998). In contrast to the wild type, an extract from a PUS3 disrupted strain showed no Ψ-synthase activity towards U38 and U39. Moreover, position 38 in cytoplasmic tRNAGly(GCC) and position 39 in mitochondrial tRNAArg(ACG) were not modified in the pus3 mutant (Lecointe et al. 1998).


Fig. 2. Structures of modified nucleosides found in S. cerevisiae cytoplasmic tRNAs. R represents ribose, X any base, and Nm symbolizes a 2´-O methylated nucleoside.


The yeast tRNA:Ψ55-synthase (Pus4p) was identified based on homology to the corresponding enzyme in E. coli (TruBp, Becker et al. 1997). Pus4 protein purified from E. coli catalyzed Ψ55 formation in vitro using T7-transcribed tRNAPhe(GAA) (tRNAPhe(GmAA)) or tRNAAsp(GUC) as substrates. Extracts from a PUS4 disrupted strain contained no Ψ55-synthase activity, and both cytoplasmic tRNAArg(ICG) and mitochondrial tRNAArg(ACG) isolated from the pus4 mutant lacked Ψ55 (Becker et al. 1997).

The Pus6 protein, which acts at position 31 in tRNA, was also identified based on homology to known Ψ-synthases (Ansmant et al. 2001). Recombinant Pus6 protein purified from E. coli catalyzed formation of Ψ31 in cytoplasmic and mitochondrial tRNAMet(m) isolated from a pus6∆ mutant. Moreover, cytoplasmic tRNAMet(m) as well as mitochondrial tRNAMet(m), tRNALys(cmnm5s2UUU) and tRNASer(GCU) lacked Ψ31 in the pus6∆ strain (Ansmant et al. 2001).

The PUS7 gene coding for the Ψ-synthase, which catalyzes formation of Ψ at position 35 in U2 snRNA and positions 13 and 35 in tRNA, was identified by a biochemical genomic approach using 6144 yeast strains each expressing a unique glutathione S-transferase-ORF (GST-ORF) fusion protein (Martzen et al. 1999; Behm-Ansmant et al. 2003; Ma et al. 2003). By screening pools and subsequently subpools of purified yeast GST-ORF fusions, using single-site radiolabelled U2 snRNA as the substrate, the Pus7 protein was shown to be responsible for Ψ35 formation in U2 snRNA (Ma et al. 2003). As Pus1p is a multisubstrate specific enzyme, it was considered whether Pus7p could also use tRNA as a substrate. Accordingly, a deletion of the PUS7 gene was found to cause the loss of Ψ13 in tRNAAsp(GUC) and tRNAGlu(mcm5s2UUC) (Behm-Ansmant et al. 2003). Pus7 protein purified from E. coli catalyzed formation of Ψ at position 13 in tRNAAsp(GUC) and at position 35 in pre-tRNATyr(GUA) (tRNATyr(GΨA)) using in vitro transcripts as substrates. Extracts from the wild type, but not the pus7 deletion strain, catalyzed the U to Ψ conversion at position 13 in tRNAAsp(GUC) and tRNAHis(GUG), and at position 35 in pre-tRNATyr(GUA) (Behm-Ansmant et al. 2003). The absence of Ψ35-synthase activity in the pus7 mutant suggests that the in vitro activity by Pus1p towards this position is not relevant in vivo (Behm-Ansmant et al. 2003).

The two different genes PUS8 (RIB2) and PUS9 encoding tRNA:Ψ32-synthases were identified based on homology to known Ψ-synthases (Behm-Ansmant et al. 2004). Pus9p modifies both cytoplasmic and mitochondrial tRNAs, whereas the Pus8p activity is exclusively cytoplasmic. The PUS8 (RIB2) gene product was earlier implicated in riboflavin biosynthesis and contains a domain with sequence similarity to the RibD family of deaminases (Oltmanns et al. 1969; Oltmanns and Bacher 1972; Behm-Ansmant et al. 2004). This deaminase domain is not required for the tRNA:Ψ32-synthase activity (Behm-Ansmant et al. 2004). Recombinant Pus8p and Pus9p catalyzed formation of Ψ at position 32 in in vitro transcribed tRNAAsp(GUC). Mitochondrial tRNASer(GCU) and tRNATrp(UCA) isolated from a pus9∆ strain lacked Ψ32. In contrast, cytoplasmic tRNAGly(GCC) from either a pus8∆ or a pus9∆ strain contained Ψ32. No U32 to Ψ32 conversion in cytoplasmic tRNAGly(GCC) and mitochondrial tRNASer(GCU) was detected in a pus8∆ pus9∆ strain (Behm-Ansmant et al. 2004).

4.2 Dihydrouridine (D)

Dihydrouridine is the second most widely distributed modified nucleoside and is found at 6 different positions in tRNA (Fig. 1). The proteins required for formation of D in these positions are Dus1p (D16, D17), Dus2p (D20), Dus3p (D47), and Dus4p (D20A, D20B) (Xing et al. 2002, 2004). The Dus1 protein was identified using the genomic collection of purified GST-ORF fusion proteins and T7-transcribed radiolabelled pre-tRNAPhe(GAA) (tRNAPhe(GmAA)) as the substrate, in which positions 16 and 17 should be modified to D (Xing et al. 2002). By a BLAST search using Dus1p as a query, Dus2p, Dus3p and Dus4p were identified (Xing et al. 2002). Transfer RNA isolated from a strain with all four genes deleted (DUS1-DUS4) showed no detectable D nucleoside, suggesting that Dus1p-Dus4p are required for all the D residues present in tRNA (Xing et al. 2004). By using different tRNA transcripts as substrates, the coenzymes NADH/NADPH and FAD, and Dus1 or Dus2 proteins purified from E. coli, the D-synthase activity of these proteins was confirmed (Xing et al. 2002). The activity of the Dus1 protein in the in vitro reaction using pre-tRNAPhe(GAA) as the substrate was toward position 17. By analysing tRNA from a strain deleted for the DUS1 gene, it was shown that a dus1 mutant lacked both D16 and D17. Similarly, D20 was absent in tRNA from a dus2∆, D47 from a dus3∆, and D20A,20B from a dus4∆ strain (Xing et al. 2004).

4.3 5-methyluridine (m5U)

An m5U residue (also known as ribothymidine (rT)) at position 54 is likely to be present in all yeast tRNAs, except in the A54-containing initiator methionine tRNA (tRNAMet(i), Table 2). A trm2 mutant defective in formation of m5U in both cytoplasmic and mitochondrial tRNA was identified based on a slightly aberrant migration pattern of tRNA on denaturing polyacrylamide gels (Hopper et al. 1982). The TRM2 gene was identified in a BLAST homology search using the E. coli tRNA(m5U54) methyltransferase (TrmAp) as a query (Nordlund et al. 2000). Introduction of the TRM2 gene into the original trm2 or an E. coli trmA mutant restored the ability to form m5U in tRNA. Total tRNA isolated from a trm2 null strain lacked m5U, and a recombinant Trm2 protein purified from E. coli catalyzed the formation of m5U54 in total tRNA from the trm2 null strain using S-adenosylmethionine (AdoMet) as the methyl donor (Nordlund et al. 2000). Others had previously identified the TRM2 gene as the RNC1/NUD1 gene (Chow et al. 1992; Van Vliet-Reedijk and Planta 1993), which was proposed to encode an endo-exonuclease (Chow et al. 1992; Sadekova and Chow 1996; Asefa et al. 1998). However, several lines of experimental evidence contradict an endo-exonuclease activity for Trm2p (Nordlund et al. 2000). Moreover, studies using the rnc1/nud1 (trm2) null strain (Asefa et al. 1998) showed that the disruption was mistargeted and that the strain contains a wild type TRM2 locus (M. E. Nordlund and A. S. Byström, unpublished data).

Table 2. Distribution of modified nucleosides in yeast cytoplasmic tRNAs.

Footnotes to Table 2: In S. cerevisiae, 274 nucleus-encoded tRNA genes code for 42 different cytoplasmic tRNA species (Percudani et al. 1997; Hani and Feldmann 1998; Marck and Grosjean 2002). Of the 42 tRNA species, 30 RNA sequences derived from either brewer's or baker's yeast are known. 1 (Holley et al. 1965; Penswick et al. 1975), 2 (Weissenbach et al. 1975), 3 (Kuntzel et al. 1972; Kuntzel et al. 1975), 4 (Keith and Pixa 1984), 5 (Gangloff et al. 1972), 6 (Holness and Atfield 1976), 7 (Kobayashi et al. 1974), 8 (Yoshida 1973), 9 (Mendenhall et al. 1987), 10 (Keith et al. 1983), 11 (Pixa et al. 1984), 12 (Szweykowska-Kulinska et al. 1994), 13 (Chang et al. 1973), 14 (Randerath et al. 1979), 15 (el Adlouni et al. 1991; Glasser et al. 1992), 16 (Smith et al. 1973), 17 (Madison et al. 1972), 18 (Simsek and RajBhandary 1972; Keith et al. 1990b), 19 (Gruhl and Feldmann 1976; Koiwai and Miyazaki 1976), 20 (RajBhandary et al. 1967), 21 (Keith et al. 1983; Winey et al. 1986; Keith et al. 1990a), 22 (Zachau et al. 1966), 23,24 (Piper 1978; Etcheverry et al. 1979), 25 (Weissenbach et al. 1977), 26 (Keith et al. 1972), 27 (Madison and Kung 1967), 28 (Bonnet et al. 1974), 29 (Gorbulev et al. 1977), 30 (Axel'rod et al. 1974; Yamamoto et al. 1985). For twelve tRNA species, tRNAAla(UGC) (5), tRNAArg(CCG) (1), tRNAArg(CCU) (1), tRNAGln(UUG) (9), tRNAGln(CUG) (1), tRNAGlu(CUC) (2), tRNAGly(CCC) (2), tRNALeu(GAG) (1), tRNAPro(AGG) (2), tRNASer(GCU) (4), tRNAThr(UGU) (4), and tRNAThr(CGU) (1), the RNA sequence is not known. Indicated within parentheses is the number of genes for each species. a In standard S. cerevisiae laboratory strains, such as S288C, D273-10B and A364A, the number of genes encoding a tRNA species can vary (Byström et al. 1993). b Two subspecies of tRNAHis(GUG) were detected in brewer's yeast; one contains C39 and the other Ψ39. The latter subspecies is most likely not present in S. cerevisiae S288C. c m5C is present in tRNAPro(ncm5UGG), but the positions are not known.

4.4 5-methoxycarbonylmethyluridine (mcm5U) and 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U)

In S. cerevisiae, mcm5U34 is present in tRNAArg(mcm5UCU), and mcm5s2U34 is present in tRNAGlu(mcm5s2UUC) and tRNALys(mcm5s2UUU) (Table 2). The esterified methyl constituent of the mcm5 group is transferred by the mcm5U/mcm5s2U tRNA carboxyl methyltransferase, Trm9p, using AdoMet as the donor (Kalhor and Clarke 2003). The TRM9 gene was identified in a search for yeast proteins that contained putative AdoMet binding motifs (Niewmierzycka and Clarke 1999; Kalhor and Clarke 2003). A strain deleted for the TRM9 gene did not contain mcm5U or mcm5s2U in total tRNA, and lacked methyl esterified nucleosides in purified tRNAArg and tRNAGlu. Extracts from wild type but not the mutant catalyzed significant incorporation of methyl esters into saponified tRNA. Incorporation into nonsaponified tRNAs was significantly lower, indicating that the carboxyl group of the cm5 side chain is the substrate for Trm9p. Recombinant Trm9 protein purified from E. coli catalyzed the formation of methyl esters in both mcm5U and mcm5s2U using AdoMet as the donor and saponified total tRNA as the substrate (Kalhor and Clarke 2003). The enzyme catalyzing formation of the 2-thio group in mcm5s2U is not known. However, the yeast IscS homolog Nfs1p is required for this modification (Muhlenhoff et al. 2004; Nakai et al. 2004). The bacterial IscS protein is a cysteine desulfurase involved in the distribution of sulfur in the cell (Mihara and Esaki 2002). IscS is required for the synthesis of all thiolated nucleosides in tRNA of Salmonella enterica and E. coli and is involved in the transfer of sulfur to tRNA modifying enzymes (Kambampati and Lauhon 2000; Lauhon 2002; Nilsson et al. 2002; Kambampati and Lauhon 2003). The essential yeast Nfs1 protein also serves as a sulfur supplier (Strain et al. 1998; Kispal et al. 1999; Li et al. 1999; Muhlenhoff et al. 2004), and depletion of Nfs1p resulted in a decrease of thiomodified nucleosides in both mitochondrial and cytoplasmic tRNAs (Nakai et al. 2004). Furthermore, Nfs1-depleted cells showed reduced amounts of mcm5s2U and increased amounts of mcm5U, suggesting that Nfs1p affects the 2-thiomodification (Nakai et al. 2004).

4.5 7-methylguanosine (m7G)

The two-subunit tRNA(m7G46) methyltransferase (Trm8p/Trm82p) was identified by screening a genomic set of purified GST-ORF fusion proteins for tRNA(m7G) methyltransferase activity using in vitro transcribed pre-tRNAPhe(GAA) (tRNAPhe(GmAA)) as the substrate and AdoMet as the methyl donor (Alexandrov et al. 2002). Two GST-ORF preparations (Trm8p and Trm82p) were associated with the activity. Extracts from either a trm8∆ or a trm82∆ mutant lacked m7G methyltransferase activity, and low molecular weight RNA from either strain contained no detectable m7G. Thus, both proteins were required for the activity, and it was shown that the Trm8 and Trm82 proteins co-purified and functioned as a complex (Alexandrov et al. 2002). Recombinant Trm8/Trm82 complex purified from E. coli catalyzed m7G46 formation in pre-tRNAPhe(GAA) using AdoMet as the methyl donor (Alexandrov et al. 2002).

4.6 1-methylguanosine (m1G)

The modified nucleoside m1G is found at both position 9 and position 37 in tRNAs (Fig. 1, Table 2). The identification of the yeast TRM5 gene encoding the tRNA(m1G37) methyltransferase was accomplished by first identifying an archaeal homologue. A Methanococcus vannielii genomic library was used to complement the temperature sensitivity and the m1G deficiency phenotypes of an S. enterica trmD mutant (Björk et al. 2001). The Methanococcus jannaschii homologue to the complementing open reading frame was used as a query in a BLAST search against the S. cerevisiae genome and a candidate tRNA(m1G37) methyltransferase (Trm5p) was identified. The TRM5 gene complemented the temperature sensitive phenotype of the S. enterica trmD mutant and raised the levels of m1G in total tRNA, suggesting that the TRM5 gene codes for the yeast tRNA(m1G37) methyltransferase (Björk et al. 2001). Analysis of purified tRNAs from a trm5∆ mutant showed that the strain lacked m1G37 in the 6 tRNA species known to contain the modification (Björk et al. 2001). In addition, the trm5∆ strain lacked m1I37 in tRNAAla(IGC) and yW37 in tRNAPhe(GmAA), suggesting a role for Trm5p in formation of these modified nucleosides (see Sections 4.7 and 4.12). The TRM10 gene coding for the tRNA(m1G9) methyltransferase was identified by screening a genomic set of purified GST-ORF fusion proteins for modification at position 9 of in vitro transcribed tRNAGly(GCC) in the presence of AdoMet (Jackman et al. 2003). Recombinant Trm10p purified from E. coli catalyzed m1G9 formation in tRNAGly(GCC), and extracts from a strain with a trm10 null allele contained no tRNA(m1G9) methyltransferase activity. A strain with a trm10 deletion lacked m1G9 in 7 cytoplasmic tRNA species, suggesting that Trm10p is the only tRNA(m1G9) methyltransferase in the cell (Jackman et al. 2003).

4.7 Wybutosine (yW)

The tricyclic nucleoside yW (Y-base) is found at position 37 exclusively in tRNAPhe(GmAA) (Table 2). The first step in synthesis of this complex nucleoside is the formation of m1G37 (Droogmans and Grosjean 1987). Analysis of tRNAPhe(GmAA) isolated from a strain deleted for the TRM5 gene, coding for the tRNA(m1G37) methyltransferase, revealed that yW was missing (Björk et al. 2001). Thus, Trm5p is required for formation of yW37 and is most likely involved in the first step of the synthesis (Björk et al. 2001).

4.8 N2,N2-dimethylguanosine (m22G)

In yeast tRNAs, the m22G nucleoside is present at position 26 in a subset of tRNA species (Table 2). Mutants lacking m22G in tRNA were identified as strains containing tRNA susceptible to methylation by enzyme extracts from wild type strains in the presence of AdoMet (Phillips and Kjellin-Stråby 1967). The mutant allele causing m22G deficiency in both cytoplasmic and mitochondrial tRNAs was named trm1, and the wild type TRM1 gene was identified by complementing the tRNA(m22G) methyltransferase defect (Hopper et al. 1982; Ellis et al. 1986). Introduction of the yeast TRM1 gene into E. coli caused formation of m22G in the tRNA, which does not normally contain m22G, suggesting that TRM1 is the structural gene for tRNA(m22G) methyltransferase (Ellis et al. 1986). Recombinant Trm1p purified from E. coli was later shown to catalyze formation of m22G in the presence of AdoMet (Liu et al. 1998).

4.9 5-methylcytidine (m5C)

The modified nucleoside m5C is present at positions 34, 40, 48, and 49 (Fig. 1, Table 2). Identification of the multisite-specific Trm4p responsible for their formation was based on amino acid sequence homology with the E. coli 16S rRNA(m5C967) methyltransferase (RsmBp; Motorin and Grosjean 1999). The TRM4 gene had previously been named NCL1 (nuclear protein 1) and was identified based on homology with the Nop2 protein, which is involved in rRNA biogenesis (Wu et al. 1998). Trm4p, which was expressed and purified from E. coli, catalyzed formation of m5C at positions 34, 40, 48, and 49 in appropriate T7-transcribed tRNAs, using AdoMet as the methyl donor (Motorin and Grosjean 1999). Total tRNA isolated from a TRM4 disrupted strain lacked m5C and extracts from the mutant contained no tRNA(m5C) methyltransferase activity, suggesting that Trm4p is responsible for formation of all m5C residues in tRNA (Motorin and Grosjean 1999).

4.10 N4-acetylcytidine (ac4C)

The modified nucleoside ac4C is present at position 12 in tRNAs specific for serine and leucine (Table 2). Mutant alleles of the TAN1 gene, encoding a protein required for formation of ac4C in tRNA, were identified in a genetic screen for mutations that are lethal in combination with a sup61 allele coding for a mutant form of tRNA Ser CGA (Johansson and Byström 2004, see Section 6). Total tRNA isolated from a tan1 null mutant lacked ac4C and the TAN1 gene product was required to stabilise the mutant form of tRNA Ser CGA . Although no acetyltransferase activity of a recombinant Tan1 protein was detected, the protein was shown to interact with tRNA, suggesting a direct role in the modification (Johansson and Byström 2004). It is possible that Tan1p functions as the tRNA binding component of a modification enzyme consisting of more than one subunit. Consistent with this hypothesis, two additional gene products affecting formation of ac4C in tRNA have been identified (M. J. O. Johansson and A. S. Byström, unpublished data).

4.11 Inosine (I)

The adenosine deaminase that forms I at the wobble position (position 34) in tRNA consists of two subunits, Tad2p and Tad3p (Gerber and Keller 1999). The TAD2 gene was identified in a search of the S. cerevisiae genome for putative deaminases. Upon purification of Tad2p from S. cerevisiae, the Tad3 protein copurified (Gerber and Keller 1999). The Tad3p also showed homology to deaminases, including the Tad2 protein. Recombinant Tad2 and Tad3 proteins were purified from E. coli, and the combination of Tad2p and Tad3p, but neither protein alone, converted A34 to I34 in yeast tRNA Ala AGC and tRNA Ser AGA (Gerber and Keller 1999). Strains with null alleles of TAD2 or TAD3 are nonviable and extracts from a strain with a tad2-1 Ts allele lacked A34 deaminase activity and tRNAAla from the mutant contained unmodified A34. Similarly, a mutant with a tad3-1 Ts allele lacked the modification activity (Gerber and Keller 1999).

4.12 1-methylinosine (m1I)

The synthesis of m1I at position 37 occurs in two steps, the formation of I37 by the Tad1p followed by a methylation probably catalyzed by Trm5p (Grosjean et al. 1996; Gerber et al. 1998; Björk et al. 2001). The adenosine deaminase Tad1p was identified based on its homology to the mammalian RNA editing enzymes ADAR1 and ADAR2 (Gerber et al. 1998). Recombinant Tad1p purified from Pichia pastoris converted A37 to I37 in synthetic yeast tRNA Ala AGC (tRNA Ala IGC), which normally contains m1I37. A strain with the TAD1 gene disrupted contained unmodified tRNA Ala IGC and an extract of the mutant was unable to form I37 (Gerber et al. 1998). Analysis of tRNA Ala IGC isolated from a strain deleted for the TRM5 gene, coding for the tRNA(m1G37) methyltransferase, revealed that m1I was missing (Björk et al. 2001). Thus, Trm5p is likely to catalyze the methylation of I37 (Björk et al. 2001).

4.13 1-methyladenosine (m1A)

The two subunits of the tRNA(m1A58) methyltransferase are encoded by the essential TRM6 and TRM61 genes (formerly GCD10 and GCD14; Anderson et al. 1998, 2000; Calvo et al. 1999; Kadaba et al. 2004). Mutant trm6 and trm61 alleles were first identified by their ability to cause derepression of GCN4 translation and of its target genes in the histidine biosynthetic pathway (Harashima and Hinnebusch 1986; Cuesta et al. 1998). By complementing this phenotype, the wild type TRM6 and TRM61 genes were identified (Garcia-Barrio et al. 1995; Calvo et al. 1999). The lethal phenotype caused by trm6 and trm61 alleles is suppressed by increased dosage of an initiator methionine tRNA (IMT) gene, making it possible to analyze tRNA from the mutants (Anderson et al. 1998; Calvo et al. 1999, see Section 5). Total tRNA isolated from a trm6 or a trm61 deletion mutant, rescued by increased IMT4 gene dosage, was devoid of m1A (Anderson et al. 1998, 2000). The Trm6p/Trm61p complex purified from yeast bound tRNA and catalyzed incorporation of methyl groups from AdoMet into total tRNA and purified tRNA iMet from a trm6 mutant (Anderson et al. 2000). Analysis of the tRNA iMet from the in vitro reaction showed that the product was m1A (Anderson et al. 2000).

4.14 N6-isopentenyladenosine (i6A)

The modified nucleoside i6A is present at position 37 in a subset of tRNA species (Table 2). A mod5 mutant defective in formation of i6A in both cytoplasmic and mitochondrial tRNAs was identified as a strain in which a suppressor tRNATyr was unable to read ochre stop codons (Laten et al. 1978; Martin and Hopper 1982). Extracts from the mod5 mutant were deficient in the transfer of isopentenyl groups to tRNA using dimethylallyl pyrophosphate (∆2-isopentenylpyrophosphate) as the donor (Laten et al. 1985). A plasmid carrying the MOD5 gene was identified based on restored ochre suppression and formation of i6A in the mod5 mutant (Dihanich et al. 1987). Introduction of the MOD5 gene into an E. coli strain with a mutation in the miaA gene, encoding the bacterial tRNA isopentenyltransferase, restored transfer of isopentenyl groups to adenosines, suggesting that MOD5 encodes the yeast enzyme (Dihanich et al. 1987).

4.15 2'-O-ribosyladenosine (phosphate) (Ar(p))

In addition to base modifications, the 2'-O position of the ribose can be modified. A 2'-O-ribosyl-phosphate modification at position 64 is a unique feature of tRNA iMet in fungi and plants (Keith et al. 1990b; Glasser et al. 1991). Removal of this modification from tRNA iMet caused an increased binding to eukaryotic elongation factor 1A (eEF1A) and allowed the tRNA to read start as well as internal AUG codons in vitro (Kiesewetter et al. 1990; Förster et al. 1993). A strain with a mutation in the RIT1 gene encoding the S. cerevisiae 2'-O-ribosyl-phosphate transferase was isolated in a genetic screen for mutants that grow in the absence of elongator methionine tRNA (tRNA Met m) genes (Åström and Byström 1994). Thus, in a rit1 mutant tRNA iMet acts both as an initiator and an elongator tRNAMet. Total tRNA isolated from a rit1 mutant lacked the 2'-O-ribosyl-phosphate modification. Extracts from E. coli cells expressing the yeast Rit1 protein contained 2'-O-ribosyl-phosphate transferase activity towards T7-transcribed tRNA iMet and required 5'-phospho-1'-ribosyl-pyrophosphate as the phosphoribosyl donor (Åström and Byström 1994).

4.16 2’-O-methylations

Ribose methylations are found at positions 4, 18, 32, 34, and 44 in tRNA (Fig. 1, Table 2). At position 4, the ribose-methylated nucleoside is either a 2'-O-methyladenosine (Am) or a 2'-O-methylcytidine (Cm). The ribose-methylated nucleoside at position 18 is always a 2'-O-methylguanosine (Gm) and at position 32 it is always a Cm. Ribose methylations are also found at position 34 as Gm, Cm, or 5-carbamoylmethyl-2'-O-methyluridine (ncm5Um). At position 44, the ribose-methylated nucleoside is always a 2'-O-methyluridine (Um). To date, genes required for formation of Gm18, Cm32, Cm34, and Gm34 are known.

The TRM3 gene required for formation of Gm18 in tRNA was identified based on amino acid sequence similarity to known RNA 2'-O-methyltransferases (Cavaille et al. 1999). A strain with a deletion of the TRM3 gene lacked Gm18 in total tRNA. Moreover, extracts from the wild type but not from the trm3 mutant catalyzed formation of Gm18 in T7-transcribed tRNA Ser AGA (tRNA Ser IGA) in the presence of AdoMet as the methyl donor (Cavaille et al. 1999).

The TRM7 gene coding for the 2'-O-methyltransferase involved in formation of Cm at positions 32 and 34 and Gm at position 34 was identified based on homology to a bacterial rRNA 2'-O-methyltransferase (RrmJ; Pintard et al. 2002). The Trm7 protein bound AdoMet. In addition, a wild type extract or a purified Trm7p from yeast, but not an extract from a trm7 mutant, catalyzed formation of Cm32 and Gm34 in T7-transcribed tRNA Phe GAA (tRNA Phe GmAA). Analysis of tRNA from a trm7∆ mutant revealed that the strain lacked Cm34 in tRNATrp, Gm34 in tRNAPhe, and Cm32 in tRNAPhe, tRNATrp and tRNALeu (Pintard et al. 2002).

5 Phenotypes of tRNA modification mutants

Only a few of the gene products required for formation of modified nucleosides in tRNA are important for growth under normal laboratory conditions. When phenotypes are observed in a mutant, the absence of a modified nucleoside either generates less functional tRNAs with respect to decoding capacity or decreases the amount of tRNAs available for translation. However, even if a growth defect correlates with the absence of a modified nucleoside in tRNA, it cannot be excluded that the tRNA modifying enzyme has other targets or functions. It is known that the Pus1 and Pus7 proteins, in addition to tRNA, have U2 snRNA as a substrate (Massenet et al. 1999; Ma et al. 2003). Recently, the mammalian Pus1p (mPus1p) was identified as a co-activator for retinoic acid receptor mRARγ-dependent RNA polymerase II transcription (Zhao et al. 2004). In this process, mPus1p is a component of the co-activator complex and catalyzes pseudouridylation of the RNA component, Steroid Receptor RNA Activator (Zhao et al. 2004).

Some modifying enzymes have a function independent of their known catalytic activity. An E. coli strain with a deletion of truB encoding the Ψ55-synthase competes poorly for growth in the presence of wild type E. coli (Gutgsell et al. 2000). However, introducing a plasmid carrying a mutant truB gene whose product cannot catalyze Ψ55 formation could relieve this growth disadvantage, suggesting two functions for the TruB protein (Gutgsell et al. 2000). Moreover, the E. coli trmA gene encoding the tRNA(m5U54) methyltransferase is essential whereas the methyltransferase activity is not, indicating dual functions of the protein (Persson et al. 1992). It is not known whether this second function of the TruB and TrmA proteins is related to tRNA biogenesis or another cellular process.

Modification mutants exhibiting growth phenotypes are usually mutated in genes required for formation of modified nucleosides in the anticodon region. Modified nucleosides in this region are involved primarily in the decoding process of mRNA, such as reading frame maintenance and/or restriction or improvement of codon-anticodon interactions (Björk 1995; Yokoyama and Nishimura 1995). Sometimes the modified nucleosides act as identity elements in aminoacyl-tRNA synthetase recognition (Giege et al. 1998). This section will discuss phenotypes of S. cerevisiae strains deficient in formation of modified nucleosides in tRNA.

The TAD2 and TAD3 genes encoding the two subunits of the adenosine deaminase that forms inosine at the wobble position are essential for cell viability (Gerber and Keller 1999). An I34-containing tRNA is predicted to read U-, C-, and A-ending codons (Crick 1966; Yokoyama and Nishimura 1995). The presence of an unmodified A34 in a tRNA is likely to alter the wobble capacity and at least generate a decreased decoding of A-ending codons (Yokoyama and Nishimura 1995). In addition, I34 in tRNA Ile IAU is an important positive determinant for the isoleucyl-tRNA synthetase (Senger et al. 1997), suggesting that presence of I34 affects tRNA function by two different mechanisms.

The 2'-O ribose methyltransferase Trm7p involved in formation of Cm at positions 32 and 34 and Gm at position 34 is not essential (Pintard et al. 2002).
However, a trm7∆ mutant shows a growth rate reduction. In correlation with this growth defect, a reduced polysome to monosome ratio in polysome profiles was observed as well as an increased sensitivity to the aminoglycoside antibiotic paromomycin, which affects translational fidelity (Pintard et al. 2002). Thus, translation seems to be impaired in a trm7∆ strain.

A deletion of the TRM5 gene encoding a methyltransferase involved in the m1G37, m1I37 and yW37 modifications in a subset of tRNAs is not lethal, but causes extremely slow growth (Björk et al. 2001). As the presence of m1G37 prevents frameshifting in bacteria (Björk et al. 1989; Urbonavicius et al. 2001), the slow growth of the yeast trm5∆ mutant is presumably caused by an inability to maintain reading frame during translation (Björk et al. 2001).

Mutations in the MOD5 gene encoding the tRNA(i6A37) isopentenyltransferase prevent a suppressor tRNATyr encoded by the SUP7 gene from reading ochre stop codons (Laten et al. 1978). The suppressor tRNATyr likely contains i6A37 and absence of the modification apparently destabilises the codon-anticodon interaction (Laten et al. 1978). However, a strain deficient in formation of i6A has no obvious growth defect, indicating that the suppressor tRNATyr is particularly sensitive to the lack of i6A37 (Laten et al. 1978). The Mod5p catalyzes transfer of the isopentenyl group to A37 in tRNA from dimethylallyl pyrophosphate (Dihanich et al. 1987). The Erg20 protein is a farnesyl pyrophosphate synthetase that catalyzes the sequential condensation of dimethylallyl pyrophosphate and geranyl pyrophosphate with isopentenylpyrophosphate (Anderson et al. 1989). Increased expression of Erg20p in a mod5 mutant, which contained 60% of the wild type level of i6A in cytoplasmic tRNA, caused a further reduction of the i6A level. Correlating with this reduction, the SUP7 encoded suppressor tRNATyr had a decreased ability to decode ochre stop codons (Benko et al. 2000). Possibly, a limited pool of dimethylallyl pyrophosphate is available in the cell and the balance between Mod5p and Erg20p could affect translation by influencing i6A content in tRNA and/or synthesis of metabolites originating from farnesyl pyrophosphate (Benko et al. 2000). These metabolites include sterols, prenylated proteins, ubiquinone, dolichol, and heme-A.

A deletion of the NFS1 gene involved in formation of the 2-thio group in mcm5s2U is lethal (Li et al. 1999; Nakai et al. 2004). However, the lethality of an NFS1 deletion is not necessarily caused by lack of the 2-thio group in tRNAs as Nfs1p supplies sulfur to many cellular processes. Interestingly, a dominant NFS1 allele was identified in a genetic screen for S. cerevisiae mutants that increase translational suppression by an altered form of a Schizosaccharomyces pombe UGA suppressor tRNASer (Kolman and Söll 1993). The mutation in the suppressor tRNASer gene caused decreased amounts of mature suppressor tRNA and accumulation of intron-containing precursor. The dominant NFS1 allele suppressed the processing defect and generated increased levels of the mature suppressor tRNASer (Kolman and Söll 1993). However, whether the suppression is linked to thiolation of the suppressor tRNA or whether it is caused by an indirect effect is unclear.

Inactivation of the PUS3 gene encoding the enzyme catalysing formation of Ψ at position 38 or 39 in tRNAs gives a reduction in growth (Carbone et al. 1991; Lecointe et al. 1998).
By using a set of reporter plasmids, a deletion of PUS3 was shown to cause a reduction in +1 frameshifting, in certain contexts, and naturally occurring nonsense suppression (Lecointe et al. 2002). Similarly, a mutation in the truA (hisT) gene in S. enterica encoding the Ψ38-40-synthase prevents a suppressor tRNA from reading an amber stop codon (Bossi and Roth 1980). A reduced translational elongation rate was observed in a truA mutant (Palmer et al. 1983) and a pus3 mutant may have a similar defect. Formation of m1A at position 58 is catalyzed by the essential tRNA(m1A58) methyltransferase, composed of the Trm6 and Trm61 proteins (Anderson et al. 2000, see Chapter 4 by J. T. Anderson and L. Droogmans). Although 20 tRNA species contain m1A58, the lethal phenotypes caused by trm6 or trm61 null alleles are suppressed by increased dosage of an IMT gene, coding for tRNA iMet , or LHP1, encoding a protein involved in pre-tRNA maturation. Moreover, a deletion of the LHP1 gene exacerbates the slow growth phenotype of strains having Ts trm61 alleles (Calvo et al. 1999). In a trm6 or a trm61 mutant, a lower steady-state level of tRNA iMet was observed and in the trm6 mutant, this decrease was due to an increased turnover of tRNA iMet (Anderson et al. 1998; Calvo et al. 1999). Absence of Lhp1p in a Ts trm61 mutant caused a further decrease of tRNA iMet levels whereas over-expression led to an increased steady-state level (Calvo et al. 1999).

These data suggest that tRNA iMet is destabilised in the absence of a functional tRNA(m1A58) methyltransferase and that tRNA iMet has a unique requirement for the modification (Anderson et al. 1998, 2000; Calvo et al. 1999). A disruption of the TRM3 or the TRM4 gene encoding the putative tRNA(Gm18) methyltransferase and the tRNA(m5C) methyltransferase, respectively, does not affect growth, but generates a slightly increased sensitivity to paromomycin (Wu et al. 1998; Cavaille et al. 1999). Paromomycin sensitivity has also been reported, but at an elevated temperature, for a strain with a null allele of the TRM9 gene, coding for the mcm5U/mcm5s2U tRNA carboxyl methyltransferase (Kalhor and Clarke 2003). Although the cause of these phenotypes is not clear, the Trm3, Trm4, and Trm9 proteins are important when translation is compromised.
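
The wobble prediction cited earlier in this section (an I34-containing tRNA reading U-, C-, and A-ending codons; Crick 1966) can be illustrated with a minimal sketch. The pairing table below encodes only the classical wobble rules and ignores the effects of other modified nucleosides; the names `WOBBLE`, `WATSON_CRICK`, and `codons_read` are hypothetical and chosen for illustration only.

```python
# Classical wobble rules (Crick 1966): base at anticodon position 34
# mapped to the codon third-position bases it is predicted to pair with.
# This is a simplified illustration, not a model of real decoding efficiency.
WOBBLE = {
    "A": "U",
    "C": "G",
    "G": "CU",
    "U": "AG",
    "I": "UCA",  # inosine
}

# Standard Watson-Crick pairs used for anticodon positions 35 and 36.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """Return the codons (5'->3') predicted to be read by an anticodon.

    `anticodon` is written 5'->3' as positions 34, 35, 36, e.g. "IAU".
    Position 36 pairs with the first codon base, position 35 with the
    second, and position 34 (wobble) with the third.
    """
    n34, n35, n36 = anticodon
    first = WATSON_CRICK[n36]
    second = WATSON_CRICK[n35]
    return [first + second + third for third in WOBBLE[n34]]


print(codons_read("IAU"))  # ['AUU', 'AUC', 'AUA']
print(codons_read("AAU"))  # ['AUU']
```

Under these simplified rules, the anticodon IAU covers all three isoleucine codons, whereas the unmodified AAU anticodon covers only AUU, in line with the expectation that loss of I34 reduces decoding of A-ending codons.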

6 Genetic approaches to study function of modified nucleosides and their modifying enzymes

Deletion of a gene encoding a tRNA modification enzyme often results in a strain with no obvious growth phenotype. Additional genetic approaches have provided some clues about the in vivo function of some modified nucleosides/modifying enzymes. These approaches involve genetic screens for mutants where a tRNA modifying enzyme is essential for growth. Alternatively, growth defects are observed in direct tests where a mutant allele of a gene coding for a tRNA or a protein involved in translation/tRNA biogenesis is combined with a deletion of a tRNA modification gene.

The 2'-O-ribosyl-phosphate modification present at position 64 in tRNA iMet prevents participation in elongation of translation (Kiesewetter et al. 1990; Förster et al. 1993; Åström and Byström 1994). Introduction of a null allele of the RIT1 gene, encoding the tRNA 2'-O-ribosyl phosphate transferase, allows a strain to grow in the absence of tRNA Met m, as tRNA iMet functions both as initiator and elongator tRNA (Åström and Byström 1994). A deletion of the RIT1 gene in a wild type background generates a strain with no apparent growth defect (Åström et al. 1999). However, a rit1 null allele generated a synergistic growth defect if it was combined with mutations in genes encoding any of the three subunits of eukaryotic initiation factor 2 (eIF2) or with a reduced number of IMT genes, encoding tRNA iMet (Åström et al. 1999). Increased IMT gene dosage alleviated this growth defect. In contrast, increased gene dosage of the TEF2 gene, encoding eEF1A, exacerbated the growth defect, presumably by sequestering Met-tRNA iMet to elongation of translation. These results suggest that the presence of the 2'-O-ribosyl-phosphate modification in tRNA iMet is important under conditions where the components of the Met-tRNA iMet·GTP·eIF2 ternary complex become limiting or have reduced function (Åström et al. 1999).

The PUS1 gene encodes a nuclear Ψ-synthase involved in modification of positions 26-28, 34, 36, 65, and 67 in tRNA, and position 44 in U2 snRNA (Simos et al. 1996; Motorin et al. 1998; Massenet et al. 1999). A pus1 los1 mutant was identified in a screen for mutations lethal in combination with a Ts allele of the NSP1 gene, encoding a nuclear pore protein (Simos et al. 1996). The LOS1 gene encodes a nonessential tRNA exportin (Hurt et al. 1987; Hellmuth et al. 1998; Sarkar and Hopper 1998). Although the nsp1 pus1 los1 triple mutant was nonviable, no apparent growth defect was observed for nsp1 pus1 or nsp1 los1 double mutants. However, a pus1 los1 double mutant was nonviable at elevated temperatures (Simos et al. 1996). By screening for mutants requiring the PUS1 gene for growth, strains with a mutation in either the PUS4 or the sup70+ (CDC65) gene were identified (Grosshans et al. 2001). The PUS4 gene codes for the Ψ55-synthase and sup70+ is an essential gene that encodes tRNA Gln CUG (Lin et al. 1986; Weiss and Friedberg 1986; Becker et al. 1997). The mechanism for synthetic interaction of the pus1 and pus4 alleles is not clear (Grosshans et al. 2001). In the sup70 mutant, the altered form of tRNA Gln CUG, which is probably modified by Pus1p at positions 26-28, was destabilised and accumulated in the nucleus. Although the destabilised tRNA Gln CUG could be further destabilised in the absence of Ψ26-28, a model where Pus1-dependent tRNA modification would be important for nuclear export of tRNAs was favoured (Grosshans et al. 2001). Further support for a nuclear export defect comes from the observation that the single pus1 mutant accumulated the spliced form of the intron-containing tRNA Ile UAU (tRNA Ile ΨAΨ) in the nucleus (Grosshans et al. 2001).
This is consistent with the observation that the Los1p homologue in higher eukaryotes, exportin-t, binds a fully modified tRNA with higher affinity than the corresponding T7-transcript (Lipowsky et al. 1999). However, the tRNA splicing endonuclease is predominantly cytoplasmic (Huh et al. 2003; Yoshihisa et al. 2003), making it difficult to interpret the nuclear accumulation of spliced tRNA Ile UAU in the pus1 mutant.

A yeast strain deleted for the TRM2 gene, encoding the tRNA(m5U54) methyltransferase, is viable with no obvious growth defects (Nordlund et al. 2000). This contrasts with the phenotype resulting from a defect in the corresponding gene in E. coli (trmA), which is essential for viability, although the methyltransferase activity is not (Persson et al. 1992). Four different sup61 alleles were identified in a screen for mutants requiring the TRM2 gene for growth (Johansson and Byström 2002). The intron-containing essential sup61 gene codes for tRNA Ser CGA, which is the only serine isoacceptor in S. cerevisiae able to decode UCG codons (Etcheverry et al. 1982). Absence of Trm2p in the sup61 mutants generated lethality that correlated with decreased levels of partially processed and mature tRNA Ser CGA. The level of primary transcripts was unchanged, suggesting that the methylation of U54 or the Trm2 protein per se stabilised the mutant forms of tRNA Ser CGA after removal of the 5' leader by RNase P (Johansson and Byström 2002). Interestingly, the growth defect and the reduction in tRNA Ser CGA levels caused by the trm2∆ allele in one sup61 mutant were complemented by alleles of TRM2 encoding catalytically inactive proteins. This indicates that Trm2p has a function in tRNA maturation distinct from its methyltransferase activity, possibly in stabilisation and/or protection of tRNAs from degradation (Johansson and Byström 2002).

Null alleles of PUS4 (Becker et al. 1997), TRM1 (Ellis et al. 1986), TRM3 (Cavaille et al. 1999), and LHP1 (Yoo and Wolin 1997) were each combined with the sup61 alleles (Johansson and Byström 2002). The Ψ55, m22G26, and Gm18 nucleosides are all present in tRNA Ser CGA and Lhp1p has been shown to interact with pre-tRNA Ser CGA (Etcheverry et al. 1979; Yoo and Wolin 1997). When these null alleles were introduced into the sup61 mutants, growth defects were observed that correlated with a reduced level of tRNA Ser CGA, suggesting that the gene products are required for tRNA stability (Johansson and Byström 2002). In contrast, introduction of a deletion of RIT1, which encodes a tRNA iMet-specific modifying enzyme (Åström and Byström 1994), did not affect the growth of the sup61 mutants. Thus, a strain with a mutant form of tRNA Ser CGA represents a sensitised system to investigate the role of proteins that interact with or modify the tRNA.

A screen for mutations that are lethal in combination with a sup61 allele identified strains representing 12 different complementation groups (Johansson 2003; Johansson and Byström 2004). One complementation group contained strains with a mutation in the nonessential TAN1 gene required for formation of ac4C (Johansson and Byström 2004), which is present in tRNA Ser CGA (Etcheverry et al. 1979).
Characterization of two of the remaining complementation groups identified strains with mutations in the DUS2 or MOD5 genes encoding the enzymes required for formation of D20 and i6A37 in tRNA, including tRNA Ser CGA (Etcheverry et al. 1979; Dihanich et al. 1987; Xing et al. 2002, 2004; Johansson 2003). The absence of Tan1p or Dus2p in the sup61 mutant influenced the stability of the mutant tRNA Ser CGA , similar to the effect caused by a deletion of LHP1, TRM1, TRM2, or PUS4 (Johansson and Byström 2002, 2004; Johansson 2003). The effect of the Mod5p on the mutant form of tRNA Ser CGA has not been investigated.

7 Concluding remarks and future prospects

The last few years have seen a dramatic increase in the identification of gene products required for formation of modified nucleosides in tRNA. This has been accomplished by a combination of bioinformatic, biochemical genomic, and genetic approaches. We expect that essentially all gene products in S. cerevisiae directly involved in tRNA modification will be identified in the near future. However, the function of many modified nucleosides in tRNA and their modifying enzymes is poorly characterized and it is likely that sensitised genetic systems have to be utilised to study their in vivo function. In addition to the systems described in Section 6, direct tests for synthetic interactions between mutant alleles of genes coding for modifying enzymes could be performed. Synthetic growth defects have been observed when a trm8∆ allele was combined with mutations in DUS3, TRM4, or TRM10 (E. M. Phizicky, personal communication). Genome-wide synthetic genetic array analysis using the yeast knockout collection could also identify strains where a modifying enzyme is required for growth (Tong et al. 2001). So far, mutations showing synthetic interactions with trm4, trm10, pus3, and pus7 null alleles have been identified (Tong et al. 2004). These types of analyses have the potential to bring insights into the function of tRNA modifications/modifying enzymes in tRNA biogenesis. In addition, pathways sensitive to hypomodified tRNA and as yet unidentified targets of the modifying enzymes may be uncovered. Many of the modified nucleosides present in S. cerevisiae tRNAs are found at the corresponding position in tRNAs from other eukaryotes (Sprinzl et al. 1998), making S. cerevisiae an excellent model organism for functional studies.

Acknowledgements: Dr. G.R. Björk, Dr. T.G. Hagervall, and Dr. D.L. Milton are acknowledged for valuable comments on the manuscript. We thank Dr. B. Lapeyre and Dr. E. M. Phizicky for communicating results prior to publication. This work was financially supported by the Swedish Research Council (621-2001-1890) and the Swedish Cancer Society (3516-B03-10XAB).

References

Abelson J, Trotta CR, Li H (1998) tRNA splicing. J Biol Chem 273:12685-12688
Aebi M, Kirchner G, Chen JY, Vijayraghavan U, Jacobson A, Martin NC, Abelson J (1990) Isolation of a temperature-sensitive mutant with an altered tRNA nucleotidyltransferase and cloning of the gene encoding tRNA nucleotidyltransferase in the yeast Saccharomyces cerevisiae. J Biol Chem 265:16216-16220
Alexandrov A, Martzen MR, Phizicky EM (2002) Two proteins that form a complex are required for 7-methylguanosine modification of yeast tRNA. RNA 8:1253-1266
Altman S, Kirsebom L, Talbot S (1995) Recent studies of RNase P. In: Söll D, RajBhandary U (eds) tRNA: structure, biosynthesis, and function. ASM Press, Washington, D.C., pp 67-78
Anderson J, Phan L, Cuesta R, Carlson BA, Pak M, Asano K, Björk GR, Tamame M, Hinnebusch AG (1998) The essential Gcd10p-Gcd14p nuclear complex is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA. Genes Dev 12:3650-3662
Anderson J, Phan L, Hinnebusch AG (2000) The Gcd10p/Gcd14p complex is the essential two-subunit tRNA(1-methyladenosine) methyltransferase of Saccharomyces cerevisiae. Proc Natl Acad Sci USA 97:5173-5178
Anderson MS, Yarger JG, Burck CL, Poulter CD (1989) Farnesyl diphosphate synthetase. Molecular cloning, sequence, and expression of an essential gene from Saccharomyces cerevisiae. J Biol Chem 264:19176-19184
Ansmant I, Motorin Y, Massenet S, Grosjean H, Branlant C (2001) Identification and characterization of the tRNA:Psi 31-synthase (Pus6p) of Saccharomyces cerevisiae. J Biol Chem 276:34934-34940
Asefa B, Kauler P, Cournoyer D, Lehnert S, Chow TY (1998) Genetic analysis of the yeast NUD1 endo-exonuclease: a role in the repair of DNA double-strand breaks. Curr Genet 34:360-367
Axel'rod VD, Kryukov VM, Isaenko SN, Bayev AA (1974) Nucleotide sequence in tRNA Val-2a from baker's yeast. FEBS Lett 45:333-336
Becker HF, Motorin Y, Planta RJ, Grosjean H (1997) The yeast gene YNL292w encodes a pseudouridine synthase (Pus4) catalyzing the formation of psi55 in both mitochondrial and cytoplasmic tRNAs. Nucleic Acids Res 25:4493-4499
Behm-Ansmant I, Grosjean H, Massenet S, Motorin Y, Branlant C (2004) Pseudouridylation at position 32 of mitochondrial and cytoplasmic tRNAs requires two distinct enzymes in Saccharomyces cerevisiae. J Biol Chem: in press
Behm-Ansmant I, Urban A, Ma X, Yu YT, Motorin Y, Branlant C (2003) The Saccharomyces cerevisiae U2 snRNA:pseudouridine-synthase Pus7p is a novel multisite-multisubstrate RNA:Psi-synthase also acting on tRNAs. RNA 9:1371-1382
Benko AL, Vaduva G, Martin NC, Hopper AK (2000) Competition between a sterol biosynthetic enzyme and tRNA modification in addition to changes in the protein synthesis machinery causes altered nonsense suppression. Proc Natl Acad Sci USA 97:61-66
Bertrand E, Houser-Scott F, Kendall A, Singer RH, Engelke DR (1998) Nucleolar localization of early tRNA processing. Genes Dev 12:2463-2468
Björk GR (1986) Transfer RNA modifications in different organism. Chemica Scripta 26B:91-95
Björk GR (1995) Biosynthesis and function of modified nucleosides. In: Söll D, RajBhandary U (eds) tRNA: structure, biosynthesis, and function. ASM Press, Washington, D.C., pp 165-205
Björk GR, Jacobsson K, Nilsson K, Johansson MJO, Byström AS, Persson OP (2001) A primordial tRNA modification required for the evolution of life? EMBO J 20:231-239
Björk GR, Wikström PM, Byström AS (1989) Prevention of translational frameshifting by the modified nucleoside 1-methylguanosine. Science 244:986-989
Bonnet J, Ebel JP, Shershneva LP, Krutilina AI, Venkstern TV, Bayev AA, Dirheimer G (1974) The corrected nucleotide sequence of valine tRNA from baker's yeast. Biochimie 56:1211-1213
Bossi L, Roth JR (1980) The influence of codon context on genetic code translation. Nature 286:123-127
Byström AS, von Pawel-Rammingen U, Åström SU (1993) Genetic systems in yeast for analysis of initiator/elongator tRNA specificity. In: Nierhaus KH (ed) The translational apparatus. Plenum Press, New York, pp 35-45
Calvo O, Cuesta R, Anderson J, Gutierrez N, Garcia-Barrio MT, Hinnebusch AG, Tamame M (1999) GCD14p, a repressor of GCN4 translation, cooperates with Gcd10p and Lhp1p in the maturation of initiator methionyl-tRNA in Saccharomyces cerevisiae. Mol Cell Biol 19:4167-4181
Carbone ML, Solinas M, Sora S, Panzeri L (1991) A gene tightly linked to CEN6 is important for growth of Saccharomyces cerevisiae. Curr Genet 19:1-8
Cavaille J, Chetouani F, Bachellerie JP (1999) The yeast Saccharomyces cerevisiae YDL112w ORF encodes the putative 2'-O-ribose methyltransferase catalyzing the formation of Gm18 in tRNAs. RNA 5:66-81
Chakshusmathi G, Kim SD, Rubinson DA, Wolin SL (2003) A La protein requirement for efficient pre-tRNA folding. EMBO J 22:6562-6572
Chang SH, Kuo S, Hawkins E, Miller NR (1973) The corrected nucleotide sequence of yeast leucine transfer ribonucleic acid. Biochem Biophys Res Commun 51:951-955
Chen JY, Kirchner G, Aebi M, Martin NC (1990) Purification and properties of yeast ATP (CTP):tRNA nucleotidyltransferase from wild type and overproducing cells. J Biol Chem 265:16221-16224
Chow TY, Perkins EL, Resnick MA (1992) Yeast RNC1 encodes a chimeric protein, RhoNUC, with a human rho motif and deoxyribonuclease activity. Nucleic Acids Res 20:5215-5221
Cooley L, Appel B, Söll D (1982) Post-transcriptional nucleotide addition is responsible for the formation of the 5' terminus of histidine tRNA. Proc Natl Acad Sci USA 79:6475-6479
Crick FH (1966) Codon--anticodon pairing: the wobble hypothesis. J Mol Biol 19:548-555
Cuesta R, Hinnebusch AG, Tamame M (1998) Identification of GCD14 and GCD15, novel genes required for translational repression of GCN4 mRNA in Saccharomyces cerevisiae. Genetics 148:1007-1020
Dihanich ME, Najarian D, Clark R, Gillman EC, Martin NC, Hopper AK (1987) Isolation and characterization of MOD5, a gene required for isopentenylation of cytoplasmic and mitochondrial tRNAs of Saccharomyces cerevisiae. Mol Cell Biol 7:177-184
Droogmans L, Grosjean H (1987) Enzymatic conversion of guanosine 3' adjacent to the anticodon of yeast tRNAPhe to N1-methylguanosine and the wye nucleoside: dependence on the anticodon sequence. EMBO J 6:477-483
el Adlouni C, Desgres J, Dirheimer G, Keith G (1991) Sequence of a new tRNA(Leu)(U*AA) from brewer's yeast. Biochimie 73:1355-1360
Ellis SR, Morales MJ, Li JM, Hopper AK, Martin NC (1986) Isolation and characterization of the TRM1 locus, a gene essential for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNA in Saccharomyces cerevisiae. J Biol Chem 261:9703-9709
Etcheverry T, Colby D, Guthrie C (1979) A precursor to a minor species of yeast tRNASer contains an intervening sequence. Cell 18:11-26
Etcheverry T, Salvato M, Guthrie C (1982) Recessive lethality of yeast strains carrying the SUP61 suppressor results from loss of a transfer RNA with a unique decoding function. J Mol Biol 158:599-618
Förster C, Chakraburtty K, Sprinzl M (1993) Discrimination between initiation and elongation of protein biosynthesis in yeast: identity assured by a nucleotide modification in the initiator tRNA. Nucleic Acids Res 21:5679-5683
Gangloff J, Keith G, Ebel JP, Dirheimer G (1972) The primary structure of aspartate transfer ribonucleic acid from brewer's yeast. II. Partial digestions with pancreatic ribonuclease and T1 ribonuclease and derivation of complete sequence. Biochim Biophys Acta 259:210-222

Garcia-Barrio MT, Naranda T, Vazquez de Aldana CR, Cuesta R, Hinnebusch AG, Hershey JW, Tamame M (1995) GCD10, a translational repressor of GCN4, is the RNA-binding subunit of eukaryotic translation initiation factor-3. Genes Dev 9:1781-1796
Gerber A, Grosjean H, Melcher T, Keller W (1998) Tad1p, a yeast tRNA-specific adenosine deaminase, is related to the mammalian pre-mRNA editing enzymes ADAR1 and ADAR2. EMBO J 17:4780-4789
Gerber AP, Keller W (1999) An adenosine deaminase that generates inosine at the wobble position of tRNAs. Science 286:1146-1149
Giege R, Sissler M, Florentz C (1998) Universal rules and idiosyncratic features in tRNA identity. Nucleic Acids Res 26:5017-5035
Glasser AL, Desgres J, Heitzler J, Gehrke CW, Keith G (1991) O-ribosyl-phosphate purine as a constant modified nucleotide located at position 64 in cytoplasmic initiator tRNAs(Met) of yeasts. Nucleic Acids Res 19:5199-5203
Glasser AL, el Adlouni C, Keith G, Sochacka E, Malkiewicz A, Santos M, Tuite MF, Desgres J (1992) Presence and coding properties of 2'-O-methyl-5-carbamoylmethyluridine (ncm5Um) in the wobble position of the anticodon of tRNA(Leu) (U*AA) from brewer's yeast. FEBS Lett 314:381-385
Gorbulev VG, Axel'rod VD, Bayev AA (1977) Primary structure of baker's yeast tRNAVal 2b. Nucleic Acids Res 4:3239-3258
Grosjean H, Auxilien S, Constantinesco F, Simon C, Corda Y, Becker HF, Foiret D, Morin A, Jin YX, Fournier M, Fourrey JL (1996) Enzymatic conversion of adenosine to inosine and to N1-methylinosine in transfer RNAs: a review. Biochimie 78:488-501
Grosjean H, Szweykowska-Kulinska Z, Motorin Y, Fasiolo F, Simos G (1997) Intron-dependent enzymatic formation of modified nucleosides in eukaryotic tRNAs: a review. Biochimie 79:293-302
Grosshans H, Lecointe F, Grosjean H, Hurt E, Simos G (2001) Pus1p-dependent tRNA pseudouridinylation becomes essential when tRNA biogenesis is compromised in yeast. J Biol Chem 276:46333-46339
Gruhl H, Feldmann H (1976) The primary structure of a non-initiating methionine-specific tRNA from brewer's yeast. Eur J Biochem 68:209-217
Gu W, Jackman JE, Lohan AJ, Gray MW, Phizicky EM (2003) tRNAHis maturation: an essential yeast protein catalyzes addition of a guanine nucleotide to the 5' end of tRNAHis. Genes Dev 17:2889-2901
Gutgsell N, Englund N, Niu L, Kaya Y, Lane BG, Ofengand J (2000) Deletion of the Escherichia coli pseudouridine synthase gene truB blocks formation of pseudouridine 55 in tRNA in vivo, does not affect exponential growth, but confers a strong selective disadvantage in competition with wild-type cells. RNA 6:1870-1881
Hani J, Feldmann H (1998) tRNA genes and retroelements in the yeast genome. Nucleic Acids Res 26:689-696
Harashima S, Hinnebusch AG (1986) Multiple GCD genes required for repression of GCN4, a transcriptional activator of amino acid biosynthetic genes in Saccharomyces cerevisiae. Mol Cell Biol 6:3990-3998
Hartmann E, Hartmann RK (2003) The enigma of ribonuclease P evolution. Trends Genet 19:561-569
Hellmuth K, Lau DM, Bischoff FR, Kunzler M, Hurt E, Simos G (1998) Yeast Los1p has properties of an exportin-like nucleocytoplasmic transport factor for tRNA. Mol Cell Biol 18:6374-6386

Holley RW, Apgar J, Everett GA, Madison JT, Marquisee M, Merrill SH, Penswick JR, Zamir A (1965) Structure of a ribonucleic acid. Science 147:1462-1465
Holness NJ, Atfield G (1976) The nucleotide sequence of cysteine transfer ribonucleic acid from baker's yeast. Identification of the products from partial degradation of the molecule and derivation of the complete sequence. Biochem J 153:447-454
Hopper AK, Furukawa AH, Pham HD, Martin NC (1982) Defects in modification of cytoplasmic and mitochondrial transfer RNAs are caused by single nuclear mutations. Cell 28:543-550
Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. Genes Dev 17:162-180
Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK (2003) Global analysis of protein localization in budding yeast. Nature 425:686-691
Hurt DJ, Wang SS, Lin YH, Hopper AK (1987) Cloning and characterization of LOS1, a Saccharomyces cerevisiae gene that affects tRNA splicing. Mol Cell Biol 7:1208-1216
Jackman JE, Montange RK, Malik HS, Phizicky EM (2003) Identification of the yeast gene encoding the tRNA m1G methyltransferase responsible for modification at position 9. RNA 9:574-585
Johansson MJO (2003) Transfer RNA biogenesis in Saccharomyces cerevisiae. PhD thesis. Department of Molecular Biology, Umeå University, Umeå, Sweden
Johansson MJO, Byström AS (2002) Dual function of the tRNA(m5U54)methyltransferase in tRNA maturation. RNA 8:324-335
Johansson MJO, Byström AS (2004) The Saccharomyces cerevisiae TAN1 gene is required for N4-acetylcytidine formation in tRNA. RNA 10:712-719
Kadaba S, Krueger A, Trice T, Krecic AM, Hinnebusch AG, Anderson J (2004) Nuclear surveillance and degradation of hypomodified initiator tRNAMet in S. cerevisiae. Genes Dev 18:1227-1240
Kalhor HR, Clarke S (2003) Novel methyltransferase for modified uridine residues at the wobble position of tRNA. Mol Cell Biol 23:9283-9292
Kambampati R, Lauhon CT (2000) Evidence for the transfer of sulfane sulfur from IscS to ThiI during the in vitro biosynthesis of 4-thiouridine in Escherichia coli tRNA. J Biol Chem 275:10727-10730
Kambampati R, Lauhon CT (2003) MnmA and IscS are required for in vitro 2-thiouridine biosynthesis in Escherichia coli. Biochemistry 42:1109-1117
Keith G, Desgres J, Pochart P, Heyman T, Kuo KC, Gehrke CW (1990a) Eukaryotic tRNAs(Pro): primary structure of the anticodon loop; presence of 5-carbamoylmethyluridine or inosine as the first nucleoside of the anticodon. Biochim Biophys Acta 1049:255-260
Keith G, Glasser AL, Desgres J, Kuo KC, Gehrke CW (1990b) Identification and structural characterization of O-beta-ribosyl-(1"-2')-adenosine-5"-phosphate in yeast methionine initiator tRNA. Nucleic Acids Res 18:5989-5993
Keith G, Pixa G (1984) The nucleotide sequence of asparagine tRNA from brewer's yeast. Biochimie 66:639-643
Keith G, Pixa G, Fix C, Dirheimer G (1983) Primary structure of three tRNAs from brewer's yeast: tRNAPro2, tRNAHis1 and tRNAHis2. Biochimie 65:661-672
Keith G, Roy A, Ebel JP, Dirheimer G (1972) The primary structure of tryptophan transfer ribonucleic acid from brewer's yeast. II. Partial digestion with pancreatic ribonuclease and derivation of complete sequence. Biochimie 54:1417-1426

Kiesewetter S, Ott G, Sprinzl M (1990) The role of modified purine 64 in initiator/elongator discrimination of tRNA(iMet) from yeast and wheat germ. Nucleic Acids Res 18:4677-4682
Kispal G, Csere P, Prohl C, Lill R (1999) The mitochondrial proteins Atm1p and Nfs1p are essential for biogenesis of cytosolic Fe/S proteins. EMBO J 18:3981-3989
Kobayashi T, Irie T, Yoshida M, Takeishi K, Ukita T (1974) The primary structure of yeast glutamic acid tRNA specific to the GAA codon. Biochim Biophys Acta 366:168-181
Koiwai O, Miyazaki M (1976) The primary structure of non-initiator methionine transfer ribonucleic acid from Bakers' yeast. II. Partial digestion with ribonuclease T1 and derivation of the complete sequence. J Biochem (Tokyo) 80:951-959
Kolman C, Söll D (1993) SPL1-1, a Saccharomyces cerevisiae mutation affecting tRNA splicing. J Bacteriol 175:1433-1442
Kufel J, Tollervey D (2003) 3'-processing of yeast tRNA(Trp) precedes 5'-processing. RNA 9:202-208
Kuntzel B, Weissenbach J, Dirheimer G (1972) The sequences of nucleotides in tRNA(Arg)(III) from brewer's yeast. FEBS Lett 25:189-191
Kuntzel B, Weissenbach J, Wolff RE, Tumaitis-Kennedy TD, Lane BG, Dirheimer G (1975) Presence of the methylester of 5-carboxymethyl uridine in the wobble position of the anticodon of tRNAIII Arg from brewer's yeast. Biochimie 57:61-70
Laten H, Gorman J, Bock RM (1978) Isopentenyladenosine deficient tRNA from an antisuppressor mutant of Saccharomyces cerevisiae. Nucleic Acids Res 5:4329-4342
Laten HM, Timmons RM, Suid S (1985) An antisuppressor mutant of Saccharomyces cerevisiae deficient in isopentenylated tRNA has reduced delta 2-isopentenylpyrophosphate: tRNA-delta 2-isopentenyl transferase activity. FEBS Lett 179:307-310
Lauhon CT (2002) Requirement for IscS in biosynthesis of all thionucleosides in Escherichia coli. J Bacteriol 184:6820-6829
Lecointe F, Namy O, Hatin I, Simos G, Rousset JP, Grosjean H (2002) Lack of pseudouridine 38/39 in the anticodon arm of yeast cytoplasmic tRNA decreases in vivo recoding efficiency. J Biol Chem 277:30445-30453
Lecointe F, Simos G, Sauer A, Hurt EC, Motorin Y, Grosjean H (1998) Characterization of yeast protein Deg1 as pseudouridine synthase (Pus3) catalyzing the formation of psi 38 and psi 39 in tRNA anticodon loop. J Biol Chem 273:1316-1323
Li J, Kogan M, Knight SA, Pain D, Dancis A (1999) Yeast mitochondrial protein, Nfs1p, coordinately regulates iron-sulfur cluster proteins, cellular iron uptake, and iron distribution. J Biol Chem 274:33025-33034
Lin JP, Aker M, Sitney KC, Mortimer RK (1986) First position wobble in codon-anticodon pairing: amber suppression by a yeast glutamine tRNA. Gene 49:383-388
Lipowsky G, Bischoff FR, Izaurralde E, Kutay U, Schafer S, Gross HJ, Beier H, Gorlich D (1999) Coordination of tRNA nuclear export with processing of tRNA. RNA 5:539-549
Liu J, Liu J, Stråby KB (1998) Point and deletion mutations eliminate one or both methyl group transfers catalysed by the yeast TRM1 encoded tRNA (m22G26)dimethyl transferase. Nucleic Acids Res 26:5102-5108
Ma X, Zhao X, Yu YT (2003) Pseudouridylation (Psi) of U2 snRNA in S. cerevisiae is catalyzed by an RNA-independent mechanism. EMBO J 22:1889-1897
Madison JT, Boguslawski SJ, Teetor GH (1972) Nucleotide sequence of a lysine transfer ribonucleic acid from bakers yeast. Science 176:687-689

Madison JT, Kung HK (1967) Large oligonucleotides isolated from yeast tyrosine transfer ribonucleic acid after partial digestion with ribonuclease T1. J Biol Chem 242:1324-1330
Marck C, Grosjean H (2002) tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA 8:1189-1232
Martin NC, Hopper AK (1982) Isopentenylation of both cytoplasmic and mitochondrial tRNA is affected by a single nuclear mutation. J Biol Chem 257:10562-10565
Martzen MR, McCraith SM, Spinelli SL, Torres FM, Fields S, Grayhack EJ, Phizicky EM (1999) A biochemical genomics approach for identifying genes by the activity of their products. Science 286:1153-1155
Massenet S, Motorin Y, Lafontaine DL, Hurt EC, Grosjean H, Branlant C (1999) Pseudouridine mapping in the Saccharomyces cerevisiae spliceosomal U small nuclear RNAs (snRNAs) reveals that pseudouridine synthase Pus1p exhibits a dual substrate specificity for U2 snRNA and tRNA. Mol Cell Biol 19:2142-2154
Mendenhall MD, Leeds P, Fen H, Mathison L, Zwick M, Sleiziz C, Culbertson MR (1987) Frameshift suppressor mutations affecting the major glycine transfer RNAs of Saccharomyces cerevisiae. J Mol Biol 194:41-58
Mihara H, Esaki N (2002) Bacterial cysteine desulfurases: their function and mechanisms. Appl Microbiol Biotechnol 60:12-23
Motorin Y, Grosjean H (1999) Multisite-specific tRNA:m5C-methyltransferase (Trm4) in yeast Saccharomyces cerevisiae: identification of the gene and substrate specificity of the enzyme. RNA 5:1105-1118
Motorin Y, Keith G, Simon C, Foiret D, Simos G, Hurt E, Grosjean H (1998) The yeast tRNA:pseudouridine synthase Pus1p displays a multisite substrate specificity. RNA 4:856-869
Muhlenhoff U, Balk J, Richhardt N, Kaiser JT, Sipos K, Kispal G, Lill R (2004) Functional characterisation of the eukaryotic cysteine desulfurase Nfs1p from S. cerevisiae. J Biol Chem 279:36906-36915
Nakai Y, Umeda N, Suzuki T, Nakai M, Hayashi H, Watanabe K, Kagamiyama H (2004) Yeast Nfs1p is involved in thio-modification of both mitochondrial and cytoplasmic tRNAs. J Biol Chem 279:12363-12368
Niewmierzycka A, Clarke S (1999) S-Adenosylmethionine-dependent methylation in Saccharomyces cerevisiae. Identification of a novel protein arginine methyltransferase. J Biol Chem 274:814-824
Nilsson K, Lundgren HK, Hagervall TG, Björk GR (2002) The cysteine desulfurase IscS is required for synthesis of all five thiolated nucleosides present in tRNA from Salmonella enterica serovar typhimurium. J Bacteriol 184:6830-6835
Nordlund ME, Johansson JOM, von Pawel-Rammingen U, Byström AS (2000) Identification of the TRM2 gene encoding the tRNA(m5U54)methyltransferase of Saccharomyces cerevisiae. RNA 6:844-860
O'Connor JP, Peebles CL (1991) In vivo pre-tRNA processing in Saccharomyces cerevisiae. Mol Cell Biol 11:425-439
Oltmanns O, Bacher A (1972) Biosynthesis of riboflavine in Saccharomyces cerevisiae: the role of genes rib1 and rib7. J Bacteriol 110:818-822
Oltmanns O, Bacher A, Lingens F, Zimmermann FK (1969) Biochemical and genetic classification of riboflavine deficient mutants of Saccharomyces cerevisiae. Mol Gen Genet 105:306-313

Palmer DT, Blum PH, Artz SW (1983) Effects of the hisT mutation of Salmonella typhimurium on translation elongation rate. J Bacteriol 153:357-363
Pande S, Jahn D, Söll D (1991) Histidine tRNA guanylyltransferase from Saccharomyces cerevisiae. I. Purification and physical properties. J Biol Chem 266:22826-22831
Papadimitriou A, Gross HJ (1996) Pre-tRNA 3'-processing in Saccharomyces cerevisiae. Purification and characterization of exo- and endoribonucleases. Eur J Biochem 242:747-759
Penswick JR, Martin R, Dirheimer G (1975) Evidence supporting a revised sequence for yeast alanine tRNA. FEBS Lett 50:28-31
Percudani R, Pavesi A, Ottonello S (1997) Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae. J Mol Biol 268:322-330
Persson BC, Gustafsson C, Berg DE, Björk GR (1992) The gene for a tRNA modifying enzyme, m5U54-methyltransferase, is essential for viability in Escherichia coli. Proc Natl Acad Sci USA 89:3995-3998
Phillips JH, Kjellin-Stråby K (1967) Studies on microbial ribonucleic acid. IV. Two mutants of Saccharomyces cerevisiae lacking N-2-dimethylguanine in soluble ribonucleic acid. J Mol Biol 26:509-518
Pintard L, Lecointe F, Bujnicki JM, Bonnerot C, Grosjean H, Lapeyre B (2002) Trm7p catalyses the formation of two 2'-O-methylriboses in yeast tRNA anticodon loop. EMBO J 21:1811-1820
Piper PW (1978) A correlation between a recessive lethal amber suppressor mutation in Saccharomyces cerevisiae and an anticodon change in a minor serine transfer RNA. J Mol Biol 122:217-235
Pixa G, Dirheimer G, Keith G (1984) Sequence of tRNA Ile IAU from brewer's yeast. Biochem Biophys Res Commun 119:905-912
Qiu H, Hu C, Anderson J, Björk GR, Sarkar S, Hopper AK, Hinnebusch AG (2000) Defects in tRNA processing and nuclear export induce GCN4 translation independently of phosphorylation of the alpha subunit of eukaryotic translation initiation factor 2. Mol Cell Biol 20:2505-2516
RajBhandary UL, Chang SH, Stuart A, Faulkner RD, Hoskinson RM, Khorana HG (1967) Studies on polynucleotides LXVIII. The primary structure of yeast phenylalanine transfer RNA. Proc Natl Acad Sci USA 57:751-758
Randerath E, Gupta RC, Chia LL, Chang SH, Randerath K (1979) Yeast tRNA Leu UAG. Purification, properties and determination of the nucleotide sequence by radioactive derivative methods. Eur J Biochem 93:79-94
Rozenski J, Crain PF, McCloskey JA (1999) The RNA Modification Database: 1999 update. Nucleic Acids Res 27:196-197
Sadekova S, Chow TY (1996) Over-expression of the NUD1-coded endo-exonuclease in Saccharomyces cerevisiae enhances DNA recombination and repair. Curr Genet 30:50-55
Sarkar S, Hopper AK (1998) tRNA nuclear export in Saccharomyces cerevisiae: in situ hybridization analysis. Mol Biol Cell 9:3041-3055
Schiffer S, Rosch S, Marchfelder A (2002) Assigning a function to a conserved group of proteins: the tRNA 3'-processing enzymes. EMBO J 21:2769-2777
Senger B, Auxilien S, Englisch U, Cramer F, Fasiolo F (1997) The modified wobble base inosine in yeast tRNAIle is a positive determinant for aminoacylation by isoleucyl-tRNA synthetase. Biochemistry 36:8269-8275

Simos G, Tekotte H, Grosjean H, Segref A, Sharma K, Tollervey D, Hurt EC (1996) Nuclear pore proteins are involved in the biogenesis of functional tRNA. EMBO J 15:2270-2284
Simsek M, RajBhandary UL (1972) The primary structure of yeast initiator transfer ribonucleic acid. Biochem Biophys Res Commun 49:508-515
Smith CJ, Teh HS, Ley AN, D'Obrenan P (1973) The nucleotide sequences and coding properties of the major and minor lysine transfer ribonucleic acids from the haploid yeast Saccharomyces cerevisiae S288C. J Biol Chem 248:4475-4485
Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153
Strain J, Lorenz CR, Bode J, Garland S, Smolen GA, Ta DT, Vickery LE, Culotta VC (1998) Suppressors of superoxide dismutase (SOD1) deficiency in Saccharomyces cerevisiae. Identification of proteins predicted to mediate iron-sulfur cluster assembly. J Biol Chem 273:31138-31144
Szweykowska-Kulinska Z, Senger B, Keith G, Fasiolo F, Grosjean H (1994) Intron-dependent formation of pseudouridines in the anticodon of Saccharomyces cerevisiae minor tRNA (Ile). EMBO J 13:4636-4644
Takaku H, Minagawa A, Takagi M, Nashimoto M (2003) A candidate prostate cancer susceptibility gene encodes tRNA 3' processing endoribonuclease. Nucleic Acids Res 31:2272-2278
Tong AH, Evangelista M, Parsons AB, Xu H, Bader GD, Page N, Robinson M, Raghibizadeh S, Hogue CW, Bussey H, Andrews B, Tyers M, Boone C (2001) Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294:2364-2368
Tong AH, Lesage G, Bader GD, Ding H, Xu H, Xin X, Young J, Berriz GF, Brost RL, Chang M, Chen Y, Cheng X, Chua G, Friesen H, Goldberg DS, Haynes J, Humphries C, He G, Hussein S, Ke L, Krogan N, Li Z, Levinson JN, Lu H, Menard P, Munyana C, Parsons AB, Ryan O, Tonikian R, Roberts T, Sdicu AM, Shapiro J, Sheikh B, Suter B, Wong SL, Zhang LV, Zhu H, Burd CG, Munro S, Sander C, Rine J, Greenblatt J, Peter M, Bretscher A, Bell G, Roth FP, Brown GW, Andrews B, Bussey H, Boone C (2004) Global mapping of the yeast genetic interaction network. Science 303:808-813
Urbonavicius J, Qian Q, Durand JM, Hagervall TG, Björk GR (2001) Improvement of reading frame maintenance is a common function for several tRNA modifications. EMBO J 20:4863-4873
Van Vliet-Reedijk JC, Planta RJ (1993) The RHO4a and NUD1 genes on Saccharomyces cerevisiae chromosome XI. Yeast 9:1139-1147
Weiss WA, Friedberg EC (1986) Normal yeast tRNA(CAGGln) can suppress amber codons and is encoded by an essential gene. J Mol Biol 192:725-735
Weissenbach J, Kiraly I, Dirheimer G (1977) Primary structure of tRNA Thr 1a and b from brewer's yeast. Biochimie 59:381-391
Weissenbach J, Martin R, Dirheimer G (1975) The primary structure of tRNAIIArg from brewers' yeast. 2. Partial digestion with ribonuclease T1 and derivation of the complete sequence. Eur J Biochem 56:527-532
Winey M, Mendenhall MD, Cummins CM, Culbertson MR, Knapp G (1986) Splicing of a yeast proline tRNA containing a novel suppressor mutation in the anticodon stem. J Mol Biol 192:49-63
Wu P, Brockenbrough JS, Paddy MR, Aris JP (1998) NCL1, a novel gene for a nonessential nuclear protein in Saccharomyces cerevisiae. Gene 220:109-117

Xiao S, Scott F, Fierke CA, Engelke DR (2002) Eukaryotic ribonuclease P: A plurality of ribonucleoprotein enzymes. Annu Rev Biochem 71:165-189
Xing F, Hiley SL, Hughes TR, Phizicky EM (2004) The specificities of four yeast dihydrouridine synthases for cytoplasmic tRNAs. J Biol Chem 279:17850-17860
Xing F, Martzen MR, Phizicky EM (2002) A conserved family of Saccharomyces cerevisiae synthases effects dihydrouridine modification of tRNA. RNA 8:370-381
Yamamoto N, Yamaizumi Z, Yokoyama S, Miyazawa T, Nishimura S (1985) Modified nucleoside, 5-carbamoylmethyluridine, located in the first position of the anticodon of yeast valine tRNA. J Biochem (Tokyo) 97:361-364
Yokoyama S, Nishimura S (1995) Modified nucleosides and codon recognition. In: Söll D, RajBhandary U (eds) tRNA: structure, biosynthesis, and function. ASM Press, Washington, D.C., pp 207-223
Yoo CJ, Wolin SL (1994) La proteins from Drosophila melanogaster and Saccharomyces cerevisiae: a yeast homolog of the La autoantigen is dispensable for growth. Mol Cell Biol 14:5412-5424
Yoo CJ, Wolin SL (1997) The yeast La protein is required for the 3' endonucleolytic cleavage that matures tRNA precursors. Cell 89:393-402
Yoshida M (1973) The nucleotide sequence of tRNAGly from yeast. Biochem Biophys Res Commun 50:779-784
Yoshihisa T, Yunoki-Esaki K, Ohshima C, Tanaka N, Endo T (2003) Possibility of cytoplasmic pre-tRNA splicing: the yeast tRNA splicing endonuclease mainly localizes on the mitochondria. Mol Biol Cell 14:3266-3279
Zachau HG, Dutting D, Feldmann H (1966) The structures of two serine transfer ribonucleic acids. Hoppe Seylers Z Physiol Chem 347:212-235
Zhao X, Patton JR, Davis SL, Florence B, Ames SJ, Spanjaard RA (2004) Regulation of nuclear receptor activity by a pseudouridine synthase through posttranscriptional modification of steroid receptor RNA activator. Mol Cell 15:549-558
Åström SU, Byström AS (1994) Rit1, a tRNA backbone-modifying enzyme that mediates initiator and elongator tRNA discrimination. Cell 79:535-546
Åström SU, Nordlund ME, Erickson FL, Hannig EM, Byström AS (1999) Genetic interactions between a null allele of the RIT1 gene encoding an initiator tRNA-specific modification enzyme and genes encoding translation factors in Saccharomyces cerevisiae. Mol Gen Genet 261:967-976

Abbreviations
I: inosine
m1I: 1-methylinosine
m1A: 1-methyladenosine
t6A: N6-threonylcarbamoyladenosine
i6A: N6-isopentenyladenosine
Ar(p): 2'-O-ribosyladenosine (phosphate)
Am: 2'-O-methyladenosine
m5C: 5-methylcytidine
ac4C: N4-acetylcytidine
m3C: 3-methylcytidine
Cm: 2'-O-methylcytidine
m1G: 1-methylguanosine
m2G: N2-methylguanosine
m22G: N2,N2-dimethylguanosine
Gm: 2'-O-methylguanosine
m7G: 7-methylguanosine
yW: wybutosine
Ψ: pseudouridine
D: dihydrouridine
m5U: 5-methyluridine
Um: 2'-O-methyluridine
mcm5U: 5-methoxycarbonylmethyluridine
mcm5s2U: 5-methoxycarbonylmethyl-2-thiouridine
ncm5U: 5-carbamoylmethyluridine
ncm5Um: 5-carbamoylmethyl-2'-O-methyluridine

Byström, Anders S. Department of Molecular Biology, Umeå University, 901 87 Umeå, Sweden [email protected]
Johansson, Marcus J.O. Department of Molecular Biology, Umeå University, 901 87 Umeå, Sweden

Biosynthesis and function of 1-methyladenosine in transfer RNA James T. Anderson and Louis Droogmans

Abstract
Determining the function of single nucleotide modifications in tRNA has been elusive because so many tRNA modification enzymes are not essential for cell viability, making it difficult to do functional studies in vivo. The enzyme that catalyzes the formation of 1-methyladenosine modification at position 58 (m1A58) in most yeast tRNAs is essential for yeast cell viability, which has made it possible to explore the role of this single modification in tRNA structure and function. In addition to reviewing the role of m1A in tRNAs from prokaryotes to eukaryotes and mitochondria to cytoplasm, this chapter discusses the importance of m1A58 in maintaining the 3-dimensional structure of yeast initiator tRNAMet. Exploiting the genetics available in yeast, it has been discovered that initiator tRNAMet lacking m1A58 is eliminated from cells by 3’ polyadenylation and 3’ to 5’ exonuclease degradation.

1 Introduction
The free base, 1-methyladenine, was identified 43 years ago (Dunn 1961), and shortly following its discovery, the 1-methyladenosine mononucleotide was purified from RNA (Dunn 1963). The first demonstration of 1-methyladenosine (m1A) in tRNA came after purification and sequence determination of tRNAPhe from yeast (RajBhandary et al. 1966); nearly four decades later the number of tRNAs known to possess m1A stands at 264 out of 564 tRNA sequences (Sprinzl et al. 1998). The reported positions of m1A in tRNA are 9, 14, 22, 57 (transiently), and 58, with the major proportion of the 264 tRNAs containing m1A at position 58 in the TΨC loop (Fig. 1). The significance of m1A modification at this position is underscored by its occurrence in tRNAs from the three domains of life (Bacteria, Archaea, and Eukaryota). The postulate which has emerged from this knowledge is that m1A is a primordial RNA modification that plays a significant role in tRNA function and/or structure (Björk 1995). In support of this, m1A is one of a few methylated nucleosides bearing a positive electrostatic charge (together with 7-methylguanosine and 3-methylcytidine), which indicates that it could make a significant contribution to the stability of tRNA structure through an electrochemical interaction (reviewed in Agris 1996). The goal of this review is to summarize what is known

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106364 / Published online: 7 January 2005 © Springer-Verlag Berlin Heidelberg 2005


Fig. 1. Location of 1-methyladenosine (m1A) in tRNA from the three domains of life. A. Formula of m1A and 1-methylinosine (m1I), the deamination product of m1A. R represents ribose. B. Cloverleaf structure of tRNA with the occurrence and positions of m1A. The letters A, B, and E refer to the three domains of life (A, Archaea; B, Bacteria; E, Eukaryota), while M refers to mitochondria. m1A has not been detected in any sequenced chloroplast tRNAs. m1A57 is the intermediate in m1I57 formation in archaeal tRNAs.

about the function of m1A in tRNA and to highlight interesting questions that remain unanswered about m1A and its role in tRNA function with particular attention being paid to recent genetic studies in yeast.


2 The m1A methyltransferases (MTases)
The first characterizations of tRNA m1A58 MTases were performed using partially purified protein fractions from bovine liver, Thermus thermophilus and Tetrahymena pyriformis that exhibited m1A MTase activity in vitro against tRNA from E. coli, which lacks m1A (Glick and Leboy 1977; Yamazaki et al. 1992; Sengupta et al. 2000). The purification and characterization of the m1A58 MTase from Saccharomyces cerevisiae (Anderson et al. 1998, 2000) has made it possible to clone cDNAs encoding the human m1A58 MTase (J. Anderson unpublished data), and to identify other m1A58 MTases from Thermus thermophilus (Droogmans et al. 2003), Pyrococcus abyssi (Roovers et al. 2004), and Mycobacterium tuberculosis (Varshney et al. 2004). The two essential genes GCD10 and GCD14 (renamed TRM6 and TRM61) encode the two nonidentical subunits of the S. cerevisiae m1A58 MTase (54 kDa and 44 kDa, respectively) (Anderson et al. 1998). The purified enzyme behaves as a tetramer upon gel filtration, with an apparent molecular weight of 200 kDa (Anderson et al. 1998). Detailed analysis, both in vitro and in vivo, showed that Trm61p is responsible for AdoMet binding and presumably for catalysis of the methyl transfer reaction, while Trm6p appears to be essential for tRNA binding (Anderson et al. 2000). Trm61p was found to be closely related to a group of prokaryotic proteins that share not only the AdoMet-binding site but also other strongly conserved motifs and invariant residues within Trm61p. Hence, it was hypothesized that the prokaryotic proteins correspond to true m1A58 MTases (Bujnicki 2001). This hypothesis was verified experimentally for the Thermus thermophilus, Pyrococcus abyssi, and Mycobacterium tuberculosis homologs (Gupta et al. 2001; Droogmans et al. 2003; Roovers et al. 2004; Varshney et al. 2004). These proteins are homotetramers of a Trm61p-like polypeptide (called TrmI). On the other hand, orthologs of the Trm6p protein were found only in Eukaryota.
Interestingly, protein fold-recognition analysis revealed that despite the absence of the characteristic MTase motifs, Trm6p is structurally and evolutionarily related to Trm61p (Bujnicki 2001). This suggests that the eukaryotic m1A58 MTase evolved by gene duplication and subfunctionalization to produce a heterooligomeric protein complex, while its prokaryotic orthologs remained homooligomers encoded by a single gene (Bujnicki 2001). Substrate recognition by Xenopus laevis m1A58 MTase was studied using T7 runoff transcripts of tRNA genes injected into oocytes. These studies showed that mutations affecting tertiary interactions in the tRNA core do not affect m1A58 formation. Thus, the X. laevis m1A58 MTase does not recognize the intact 3D structure of tRNA (Grosjean et al. 1996; Morin et al. 1998). Moreover, the m1A58 MTase from the archaeon Pyrococcus abyssi is active on fragmented tRNAs as substrates. In particular, a minisubstrate corresponding to domain I of tRNA (acceptor stem extended by the TΨC stem-loop) was found to be efficiently methylated (Constantinesco et al. 1999). The synthesis of m1I57 in archaeal tRNA occurs by a two-step process: formation of 1-methyladenosine, followed by its deamination to m1I57 (Grosjean et al. 1995).

James T. Anderson and Louis Droogmans

During the course of characterizing the P. abyssi m1A58 MTase, it was determined that TrmI methylates adenosine 57 and 58, indicating that the archaeal m1A MTase exhibits region specificity (Roovers et al. 2004). Since none of the cloned and characterized m1A MTases have been shown to modify adenosines in human mitochondrial tRNA at position 9 or bacterial tRNA at position 22, it is presumed that these modification enzymes are distinct from the m1A58 MTases and still await discovery. A human candidate cDNA (AK000635) that may encode the mitochondrial m1A9 MTase has been identified by homology searches with yeast Trm61p (J. Anderson and J. Bujnicki unpublished results), but it is still unclear if the protein of this gene FLJ20628 represents the m1A9 MTase enzyme. The activity of the B. subtilis m1A22 enzyme has been characterized using cell-free extracts and bacterial or eukaryotic tRNAs as substrates (Raettig et al. 1977). Thus, it should be possible to identify that polypeptide by further fractionation of B. subtilis extracts using the activity assay to enrich for the enzyme. It is possible that sequence similarity searches of the translated genome of Bacillus subtilis using known MTases as queries will uncover open reading frames that may encode the m1A22 MTase. However, as yet the searches using TrmI or Trm6p/Trm61p sequences as queries have been unsuccessful, which suggests that the m1A22 MTase could be very remotely related or even unrelated to m1A58 MTases. The existence of m1A14 is rare having only been reported in cytoplasmic tRNAPhe from several mammals (Sprinzl et al. 1998). It is not yet known whether m1A14 in tRNAPhe is formed by the same enzyme that modifies tRNAs at position 58 in mammals or some other yet uncharacterized protein.

3 m1A influences tRNA structure Of the fifteen modified nucleosides in tRNA that are found in representative organisms from the three domains of life, only nine are found at the corresponding position in tRNA from each domain. Of those nine, six are part of or near the anticodon stem and loop, which demonstrates the need for a modified nucleoside to maintain key molecular interactions while decoding mRNAs (Söll and RajBhandary 1995; Yokoyama 1995; Grosjean and Benne 1998). The other three modified nucleosides Ψ13, Ψ55 and m1A58 are predicted to play important roles in maintaining the general tertiary structure of most tRNAs. The need for Ψ55 for tRNA structural integrity is supported by the observation that more than 90% of tRNAs with uridine at position 55 have Ψ at this position in the mature tRNA (Sprinzl et al. 1998). The presence of Ψ13 is less well conserved than Ψ55; only 20-30% of the tRNAs have uridine at position 13. Of those possessing uridine at position 13, 60-70% are modified to Ψ13. Adenosine at position 58 in tRNA is extremely conserved being found in more than 4000 tRNA DNA sequences (Marck and Grosjean 2002) and most tRNAs sequenced, and when position 58 is adenosine, it is modified to m1A ~25% of the time.
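The percentages quoted above combine multiplicatively. As a rough illustration only (the function name `modified_fraction` is a hypothetical helper, and the input ranges are the approximate figures given in the text), the share of all tRNAs that carry Ψ13 could be estimated as follows:

```python
def modified_fraction(has_base, modified_given_base):
    """Combine two (low, high) percentage ranges into one (low, high) range.

    has_base: percent of tRNAs with the unmodified base at the position.
    modified_given_base: percent of those tRNAs in which the base is modified.
    """
    low = has_base[0] * modified_given_base[0] / 100
    high = has_base[1] * modified_given_base[1] / 100
    return low, high

# Figures from the text: 20-30% of tRNAs have U13, and 60-70% of those carry Ψ13.
psi13_low, psi13_high = modified_fraction((20, 30), (60, 70))
print(f"Estimated share of tRNAs with Ψ13: {psi13_low:.0f}-{psi13_high:.0f}%")
```

Under these assumptions, roughly 12-21% of all tRNAs would carry Ψ13, which is why it is described as less well conserved than Ψ55.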

Biosynthesis and function of 1-methyladenosine in transfer RNA

Fig. 2. Hydrogen bonds make a significant contribution to the tertiary structure of yeast tRNAiMet. The cloverleaf structure of yeast initiator tRNAMet is shown with adenosines 20, 54, 58, and 60 depicted as boxed A. The hydrogen bonds between adenosines 20, 54, 58, and 60 that form a substructure unique to initiator tRNAs are shown as dotted lines. The figure is reprinted with permission from the Nature Publishing Group and was originally published in Basavappa and Sigler (1991).

3.1 m1A58 and tRNA structure The idea that m1A58 might play a significant part in tRNA structure was first suggested in a report describing the three-dimensional structure of initiator tRNAMet (tRNAiMet) from S. cerevisiae (Basavappa and Sigler 1991). The authors characterized a unique “substructure” in yeast tRNAiMet that was stabilized by hydrogen bonds between adenosines A20, A54, and A60 (Fig. 2). They predicted that this unique substructure would be found in all eukaryotic initiator tRNAs due to the presence of A20, A54, and A60 in those tRNAs. They also noted that m1A58 makes a significant contribution to this substructure by hydrogen bonding with A54 and A60. Since a loss of m1A58 would change the hydrogen bonding capabilities of A58 and eliminate the positive charge of hydrogen-bonded m1A, it seemed feasible that this single alteration in tRNAiMet could be responsible for disrupting the structure of tRNAiMet while having little to no effect on elongator tRNAs. This idea is supported by the three-dimensional structures of elongator tRNAs, where in most cases U54 is converted to T54, which then forms a reverse-Hoogsteen base pair with adenosine 58 through interaction of T54 N3, O4, or O2 with A58 N6, N7. Initiator tRNAs form a non-canonical A54 N6, N3, A58 N7, N6 base pair that in the absence of N1-methyl would create the possibility for A54 and A58 to interact through N1, N6 and N6, N1 hydrogen bonds, respectively. This prediction would require that one or both adenosines in initiator tRNA adopt an altered conformation to better accommodate the alternative hydrogen bonding interaction. It is reasonable to assume that fluctuation between two equally feasible base-pairing possibilities could create a subset of aberrantly structured initiator tRNA that would possess an intrinsic instability. If indeed loss of m1A58 results in changes to tRNAiMet structure, establishing how this occurs will require structure determination of the native tRNAiMet lacking only this modified nucleoside. Certainly, the presence of m1A58 in all eukaryotic initiator tRNAs sequenced to date strongly supports the assertion that m1A58 plays an important role in maintaining the tertiary structure of these tRNAs.

The structural stabilization of ribonucleic acids in thermophilic organisms is particularly important in tRNA, where there is a requirement for the maintenance of a complex three-dimensional structure in the absence of other macromolecular associations. Studies performed on Thermus thermophilus tRNA have shown that the thiolation of ribothymidine 54 (s2T54, in the TΨC loop) stabilizes the tRNA at high temperature (Horie et al. 1985). Interestingly, a mutant of T. thermophilus in which the trmI gene encoding m1A58 MTase has been inactivated by the insertion of an antibiotic resistance cassette displays a temperature-sensitive phenotype (Droogmans et al.
2003), suggesting that the methylation of A58 (involved in a reverse-Hoogsteen interaction with s2T54) plays an important role in the stabilization of the tRNA structure at high temperature. Independently of nucleoside modifications, the importance of the reverse-Hoogsteen interaction between residues 54 and 58 for tRNA function was recently demonstrated using combinatorial libraries of an E. coli suppressor tRNA gene (Zagryadskaya et al. 2003, 2004).

3.2 m1A9 in mitochondrial tRNA structure and function Greater than 50% of tRNAs sequenced have adenosine at position 9, and of those, 10-15% are modified to m1A9. Thus far, all tRNAs containing m1A9 have been isolated from eukaryotic mitochondria. Sequence determination of the mitochondrial tRNALys (mt-tRNALys) from human (Helm et al. 1998), rat (Randerath et al. 1981), hamster (HsuChen et al. 1983), and cow (Krzyzosiak et al. 1988) has revealed that they all contain m1A9, and the human mt-tRNALys serves as the prototype for understanding the contribution of m1A9 in stabilizing mt-tRNALys structure. After determining the sequence of mt-tRNALys, Florentz and co-workers surprisingly discovered structural differences between the native, fully modified,

Fig. 3. 1-methyladenosine at position 9 of mitochondrial tRNALys is required for cloverleaf structure formation. A) The predicted cloverleaf structure of mitochondrial tRNALys based on chemical and enzymatic probing (Helm et al. 1998). The positions of 1-methyladenosine 9 (m1A) and the other modified nucleosides found in human mt-tRNALys are shown. B) One of two alternative structures of unmodified mt-tRNALys as determined by chemical and enzymatic probing (Helm et al. 1998). The nucleotides that extend the acceptor stem by forming base pairs between the D (AAG) and T (CUU) arms of mt-tRNALys are shown. The unmodified mt-tRNALys adopts an extended acceptor stem through interactions between D and T stem nucleotides (grey circles).

mt-tRNALys and the mt-tRNALys made in vitro using T7 RNA polymerase (Helm et al. 1998). After exhaustive chemical and nuclease probing of the different tRNAs, they proposed two alternative structures for mt-tRNALys lacking any modified nucleosides. The main feature of each proposed tRNA structure is a three-base-pair extension of the acceptor stem, which involves A8-U65, A9-U64, and G10-C63 (Fig. 3). The placement of A9 in the center of the three-base-pair extension provided an important clue and led to the hypothesis that the presence of m1A9 precludes the formation of these alternative structures and favors the formation of the normal cloverleaf structure. This idea was further investigated by making mutations in mt-tRNALys that disrupted base-pairing between positions 9 and 64, which restored formation of the cloverleaf structure to the unmodified T7-synthesized mt-tRNALys (Helm et al. 1998). Remarkably, introduction of only the m1A9 modification into an otherwise unmodified mt-tRNALys restored a cloverleaf structure comparable to that of the fully modified native mt-tRNALys (Helm et al. 1999). These data clearly demonstrated that m1A9 blocks formation of a standard Watson-Crick base pair between A9 and U64, which, when allowed to form, extends the acceptor stem and leads to stable formation of an alternative non-cloverleaf mt-tRNALys. It is not known whether m1A9 is also important in the formation of the native structure for all mt-tRNAs possessing this modified nucleoside, although a survey of the existing mt-tRNA sequences in the database may establish what proportion of mt-tRNAs could potentially form an extended acceptor stem in the absence of this base modification.

There is at least one case where m1A9 in mt-tRNA appears not to affect the tRNA structure, but instead appears to be important for its function in translation elongation. In the nematode Ascaris suum the presence of m1A9 in tRNAMet, tRNAPhe and tRNASer(UCU) has been reported (Watanabe et al. 1994). In studies stemming from their original observations, Watanabe and co-workers created synthetic tRNAMet by T7 transcription or by ligating together synthetic oligos bearing m1A at position 9. The unmodified, fully modified, and m1A9-containing synthetic mt-tRNAMet species all had similar structures by chemical and enzymatic probing (Ohtsuki et al. 1996). Even though the structures of these tRNAs were not obviously different from each other and the Km values of the unmodified and m1A9 synthetic mt-tRNAMet for methionyl-tRNA synthetase were indistinguishable, the unmodified and m1A9-containing synthetic tRNAs exhibited Km values ~50 times greater than that of the native fully modified mt-tRNAMet (Ohtsuki et al. 1996). From these data, it was concluded that one of the five other modified nucleosides found in the acceptor stem and/or anticodon stem and loop of mt-tRNAMet is a positive determinant for the synthetase and that the presence of m1A9 had no influence on tRNA aminoacylation.
In contrast, the presence of m1A9 did have a notably positive effect on mt-tRNA binding by the translation elongation factor EF-Tu (Sakurai M 2001), suggesting that m1A9 may be an important determinant during translation elongation. In the cases of human and nematode mt-tRNA, m1A9 plays an important role in the function of these mt-tRNAs in vitro, either by helping to maintain the tertiary structure or by having a positive effect on translation. Determining if either of these findings is true in vivo will probably require identification of the m1A9 MTase and genetic depletion of the enzyme to assess the ability of mt-tRNA to function when lacking m1A9.
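The proposed role of m1A9 in human mt-tRNALys can be pictured with a toy model. The sketch below is purely illustrative: it considers only canonical Watson-Crick pairing, treats m1A as unable to pair because the N1 methyl group sits on the Watson-Crick face of adenosine, and uses hypothetical helper names (`can_pair`, `stem_extension`). The three position pairs are those named in the text; nothing else about the tRNA is modelled.

```python
# Canonical Watson-Crick pairs in RNA (G-U wobble pairs are deliberately ignored).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def can_pair(base_a, base_b):
    """Return True if two residues can form a canonical Watson-Crick pair.

    The label "m1A" never appears in WATSON_CRICK, which encodes the
    assumption that N1 methylation blocks the pairing face of adenosine.
    """
    return (base_a, base_b) in WATSON_CRICK

def stem_extension(residues, pairs):
    """Return True only if every listed pair of positions can base-pair."""
    return all(can_pair(residues[i], residues[j]) for i, j in pairs)

# Pairs proposed for the extended acceptor stem: A8-U65, A9-U64, G10-C63.
PAIRS = [(8, 65), (9, 64), (10, 63)]
unmodified = {8: "A", 9: "A", 10: "G", 63: "C", 64: "U", 65: "U"}
modified = {**unmodified, 9: "m1A"}

print("unmodified transcript can extend stem:", stem_extension(unmodified, PAIRS))
print("m1A9-containing tRNA can extend stem:", stem_extension(modified, PAIRS))
```

In this simplified picture the single methyl group at position 9 is enough to break the central pair of the extension, consistent with the observation that m1A9 alone restores the cloverleaf.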

4 m1A58 and HIV replication The role of modified nucleosides and tRNA in HIV replication is dealt with in more detail in the chapter by Marquet and Dardel of this volume. A summary of what is known about m1A58 in human tRNA3Lys and HIV replication is included here to stress the diversity of 1-methyladenosine function. A pathological consequence of m1A58 is its role in HIV replication. The HIV-1 genome (accession NC_001802) contains an 18-nucleotide sequence complementary to the last 18 nucleotides of human tRNA3Lys. This so-called primer binding site (PBS) is used by the virus to form a hybrid with tRNA3Lys and prime synthesis of minus-strand cDNA using reverse transcriptase (RT). To faithfully reproduce only the 18 nucleotides of tRNA3Lys in the mature cDNA, the RT must terminate synthesis of plus-strand cDNA after copying the last 18 nucleotides of tRNA3Lys. The termination of RT-dependent plus-strand cDNA synthesis was first hypothesized to occur at the nucleotide prior to m1A58 in tRNAPro when it was observed that a plus-strand strong-stop cDNA made in detergent-solubilized Moloney murine leukemia virions was extended ~20 nucleotides beyond the expected end of plus-strand cDNA (Gilboa et al. 1979). The conclusion from this study was that m1A58 would be pivotal in terminating reverse transcription. This has been shown to be true for HIV using an in vitro reconstituted replication system and either fully modified or completely unmodified tRNA3Lys incorporated into minus-strand DNA and an oligonucleotide to prime synthesis of plus-strand cDNA using HIV RT (Ben-Artzi et al. 1996; Burnett and McHenry 1997). The results of both studies led to the conclusion that m1A58 is important for reverse transcription termination and the faithful reproduction of the exact PBS in HIV cDNA. The model of m1A58 blocking extension of the cDNA beyond the PBS was supported by results from a study aimed at addressing the in vivo requirement of m1A58 in tRNA3Lys for HIV replication. A mutant tRNA3Lys was created in which adenosine 58 was replaced by uracil (T in DNA sequence) (Renda et al. 2001). Established cell lines expressing the tRNA3Lys A58U variant were challenged with HIV-1 virus and it was discovered that cell lines containing the mutant tRNA3Lys were infected by HIV-1 at a markedly reduced level. Moreover, when infected, the cell line expressing the tRNA3Lys A58U variant exhibited a significant delay in HIV replication (Renda et al. 2001).
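The relationship between the PBS, the 3' end of the primer tRNA and the position of m1A58 can be sketched in a few lines. This is a minimal illustration, not a model of reverse transcription: the sequence `toy_trna` is an arbitrary placeholder (not the real tRNA3Lys sequence), the function names are hypothetical, and the calculation assumes the standard tRNA numbering in which the 3'-terminal residue is position 76.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def primer_binding_site(trna, length=18):
    """Sequence complementary to the last `length` nucleotides of the tRNA."""
    return reverse_complement(trna[-length:])

def copied_length(trna_length=76, block_position=58):
    """Number of tRNA nucleotides copied when RT starts at the 3' end
    and stops just before the residue at `block_position`."""
    return trna_length - block_position

# Placeholder sequence for illustration only; NOT the real tRNA3Lys sequence.
toy_trna = "GGGAAAUUUCCCGGGAAAUUUCCCACCA"
pbs = primer_binding_site(toy_trna)

print("PBS-like sequence:", pbs)
print("nucleotides copied before position 58:", copied_length())
```

With 76 positions and a block at position 58, positions 59 to 76 are copied, which gives the 18 nucleotides that match the length of the PBS described above.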

5 m1A58 function in stabilizing tRNAiMet from S. cerevisiae The yeast tRNA m1A58 MTase was identified when mutations in either of two genes, GCD10 (TRM6) and GCD14 (TRM61), were found to cause a reduced steady-state level of tRNAiMet and a loss of m1A formation in total tRNA (Anderson et al. 1998; Calvo et al. 1999). The reduction of tRNAiMet in trm6-504 mutant strains was found to be a result of increased tRNAiMet turnover (Anderson et al. 1998), and not of a reduction in tRNAiMet gene transcription (J. Anderson unpublished data). A hypothesis was formulated that tRNAiMet, but not other tRNAs, was degraded in the absence of m1A58 because of an alteration in the tRNAiMet structure or its ability to function in translation. Inconsistent with the idea that tRNAiMet lacking m1A58 adopts an aberrant structure is the observation that no other tRNAs normally containing m1A58 were rendered unstable in the absence of m1A58 (Anderson et al. 1998; Calvo et al. 1999). However, given the unique details of the three-dimensional structure of yeast tRNAiMet reported (Basavappa and Sigler 1991) and described above, it remains possible that the three-dimensional structure of only tRNAiMet is sensitive to a loss of m1A58. This model, while attractive, is as yet untested, and verification will require structure determination of the tRNAiMet lacking only m1A58, along with a comprehensive assessment of the steady-state level of elongator tRNAs from yeast cells lacking a functional m1A58 MTase.

The destabilization of tRNAiMet in the absence of m1A58 suggested that a mechanism for degrading aberrantly structured or misfolded tRNA exists in S. cerevisiae. To identify components of such a tRNA degradation pathway, extragenic suppressors of a trm6-504 temperature-sensitive (Ts-) phenotype were identified and characterized. So far, this work has led to the identification of three complementation groups which are required for the degradation of tRNAiMet lacking m1A58 (Kadaba et al. 2004). Initially, trm6-504 suppressors were characterized as single-gene recessive mutations that stabilized hypomodified tRNAiMet without simultaneously restoring m1A formation. The wild type genes of two out of three complementation groups were cloned and shown to be TRF4 and RRP44. The third gene has not yet been definitively identified, although preliminary results indicate that it is MTR4 (J. Anderson and X. Wang unpublished result). Mtr4p is a member of the family of DEAH-box RNA helicases and has been shown to be important in mRNA trafficking and in rRNA processing, which involves the exoribonuclease complex termed the exosome (Kadowaki et al. 1994; Liang et al. 1996; de la Cruz et al. 1998). Trf4p is a member of the nucleotidyltransferase family of proteins, and purified recombinant Trf4p or Trf4p-HA purified from yeast possess in vitro DNA and poly(A) polymerase activities, respectively (Wang et al. 2000; Saitoh et al. 2002). Rrp44p is a member of the RNR superfamily of exoribonucleases (Zuo and Deutscher 2001), and it is one of ten core components of the nuclear and cytoplasmic exosome (Mitchell et al. 1997) (reviewed in van Hoof and Parker 1999; Mitchell and Tollervey 2000; Butler 2002).
The discovery that a deletion of RRP6, the nonessential exoribonuclease unique to the nuclear exosome, suppresses the Ts- phenotype of a trm6-504 mutant and stabilizes tRNAiMet placed the degradation of hypomodified tRNAiMet in the nucleus. This conclusion is consistent with the localization of Trf4p to the nucleus (Walowsky et al. 1999) and with our initial hypothesis that pre-tRNAiMet is the target for degradation (Anderson et al. 1998), and that once mature tRNAiMet reaches the cytoplasm it is relatively stable. This could mean that m1A58 or the MTase is only required to stabilize pre-tRNAiMet during nuclear processing, or it could imply that degradation of mature tRNAiMet lacking m1A58 in the cytoplasm is inefficient. In this regard, inactivation of Ski2p, a component of the cytoplasmic Ski complex of proteins (Brown et al. 2000) that is involved in exosome-mediated degradation of mRNA (Jacobs et al. 1998; van Hoof et al. 2000, 2002; Araki et al. 2001), leads to modest but reproducible increases in the steady-state mature tRNAiMet level in a trm6-504 strain at a permissive temperature for growth (Kadaba et al. 2004), suggesting that a low level of degradation of tRNAiMet lacking m1A58 occurs in the cytoplasm. Further support that the mature tRNAiMet lacking m1A58 possesses abnormal properties and may be subject to degradation in the cytoplasm comes from the observation that tRNAiMet from the trm6-504 mutant is not as efficiently aminoacylated as tRNAiMet from a wild type cell (J. Anderson unpublished data). Even though there is probably some degradation of tRNAiMet lacking m1A58 in the cytoplasm, it would only represent a very small proportion of the overall degradation of the mutant tRNAiMet occurring in the cell, and what triggers degradation in the cytoplasm is unknown.

Fig. 4. Pre-tRNAiMet lacking m1A58 is polyadenylated. RNA isolated from the indicated strains was subjected to Northern blot analysis using total RNA (lanes 1-4, 5 µg and lane 10, 1 µg) or poly(A)+ RNA (lanes 5 and 6, 1 µg, lanes 7-9, 2 µg, and lanes 11-13, 1 µg) after separation by denaturing polyacrylamide gel electrophoresis and transfer to a membrane. Detection of tRNAiMet was accomplished by hybridization with a radioactive probe that recognizes only tRNAiMet, and the tRNAiMet was visualized by autoradiography. Pretreatment of 1 µg of poly(A)+ RNA from the trm6-504 rrp6∆ +hcTRF4 strain with oligo(dT)18 and RNaseH was done to eliminate the poly(A) tails prior to separation and detection of tRNAiMet.

The fact that a putative poly(A) polymerase and the exosome are required for hypomodified tRNAiMet degradation is conspicuously reminiscent of a mechanism for degrading tRNA in E. coli, where Pap1 (poly(A) polymerase) and PNPase (polynucleotide phosphorylase, the core component of an exosome-like complex called the degradasome) are required for degradation of a mutant tRNATrp bearing a point mutation in the acceptor stem (Li et al. 2002). In E. coli strains lacking Pap1 or PNPase or both proteins, the mutant tRNATrp accumulated in its precursor form. Subsequently it was shown that the mutant tRNATrp contained 3’ nontemplated adenylates whose presence is dependent upon pap1 expression. The authors postulated that the adenylates serve as a tag to target mutant tRNATrp for degradation by the degradasome. Given the identity and proposed function of the trm6-504 suppressors, it seemed plausible that the degradation of yeast hypomodified tRNAiMet occurred by a mechanism similar to the one in E. coli. This hypothesis was tested by selecting poly(A)+ RNA from a trm6-504 rrp6∆ double mutant strain using oligo(dT) chromatography. Northern blot analysis of the poly(A)+ RNAs from trm6-504 rrp6∆ revealed that tRNAiMet is polyadenylated and that polyadenylation is dependent upon trm6-504 (Fig. 4). Furthermore, analysis of the poly(A)+ tRNAiMet after pretreatment with oligo(dT) and RNaseH demonstrated that pre-tRNAiMet is the substrate for polyadenylation (Fig. 4) (Kadaba et al. 2004). Moreover, the failure to detect poly(A)+ tRNAiMet in total RNA from trm6-504 trf4-20 or trm6-504 trf4∆ mutant strains indicates that a functional Trf4p protein is required for polyadenylation of the hypomodified tRNAiMet (S. Kadaba and J. Anderson unpublished data).

A working model to explain how hypomodified tRNAiMet might be degraded in yeast has been developed and is shown in Figure 5. A subset of precursor initiator tRNAMet molecules that lack m1A58 and possess an aberrant structure are subjected to polyadenylation by Trf4p. The polyadenylated pre-tRNAiMet is then degraded by the nuclear exosome. It is likely that Rrp6p is important in the degradation for two reasons: (i) deletion of RRP6 suppresses trm6-504 and stabilizes hypomodified tRNAiMet, and (ii) deletion of RRP6 causes several non-protein-encoding stable RNAs to accumulate as polyadenylated RNAs, implying that Rrp6p may possess an intrinsic deadenylase activity. It stands to reason then that degradation of the polyadenylated tRNAiMet could occur in two steps: first, rapid deadenylation by Rrp6p, then degradation of the tRNAiMet body by the core exosome. The preliminary evidence for the involvement of Mtr4p, a putative DEAH box RNA helicase, in tRNAiMet degradation suggests that the exosome requires tRNA unfolding or remodeling for complete degradation. It is also possible that Mtr4p and Trf4p in forming a complex (Ho et al. 2002; Krogan et al.
2004) orchestrate the degradation of hypomodified tRNAiMet through repeated rounds of polyadenylation and incremental stages of degradation by the exosome. In this regard, Trf4p, Trf5p, and Air2p have been affinity purified with tandem affinity purification (TAP) tagged Mtr4p (J. LaCava and D. Tollervey personal communication). Trf5p is a close (55% identity) homolog of Trf4p and it has been reported that Trf4p and Trf5p have redundant functions in DNA synthesis (Sadoff et al. 1995; Castano et al. 1996; Wang et al. 2000). Interestingly, Air2p and its homolog, Air1p (45% identity) each possess a novel RING finger domain and physically interact with Hmt1p, an hnRNP methyltransferase protein in vivo and in vitro (Inoue et al. 2000). Incubation of the TAP-Mtr4p complexes with tRNA, purified exosome, and ATP leads to polyadenylation of the tRNA and its subsequent degradation to mono and dinucleotides. If incubated with the exosome alone or if the tRNA is prepolyadenylated prior to incubation with the exosome, no tRNA degradation is observed (J. LaCava and D. Tollervey personal communication). It is not known if all normally stable RNAs are subjected to this surveillance degradation pathway. However, the observation that an extremely unstable mutant U6 snRNA (Eschenlauer et al. 1993) is partially stabilized in strains lacking Rrp6p or Trf4p (J. Anderson unpublished data) and other RNA substrates can be polyadenylated

Fig. 5. Hypomodified tRNAiMet degradation requires polyadenylation and the nuclear exosome. tRNAiMet from a trm6-504 m1A MTase mutant lacks the m1A58 modification. Trf4p alone or in complex with Mtr4p polyadenylates the hypomodified tRNAiMet, which targets it for degradation by the nuclear exosome in a 3’ to 5’ direction. Degradation of tRNAiMet may require multiple rounds of adenylation, RNA helicase-dependent unwinding and exonucleolytic digestion. Thus, Trf4p in a complex with Mtr4p and Air2p may continually adenylate tRNAiMet and remodel it for complete degradation to mononucleotides.

and degraded in vitro using purified components (J. LaCava and D. Tollervey personal communication) suggests that the mechanism of degrading hypomodified tRNAiMet is more general and includes other stable RNAs.

6 Conclusions and perspectives The recent cloning and characterization of m1A MTases from Bacteria and Archaea will make it possible to determine if m1A58 plays an important role in tRNA structure or function in these organisms, as has been established for yeast initiator tRNAMet. The instability of initiator tRNAMet lacking m1A58 in yeast is easily explained by the existing data, but detailed structural determinations of the hypomodified tRNAiMet must take place before these conclusions can be fully accepted. Interestingly, m1A58 appears to play an important role in the tertiary structure of tRNAiMet, while m1A9 is important for maintaining a stable secondary structure in mt-tRNALys. These two findings indicate the versatility of m1A in influencing tRNA structure. On the one hand, m1A58 forms stable interactions with sister nucleotides in tRNAiMet, and loss of the methyl group probably disrupts those interactions. On the other hand, the presence of m1A9 impedes the formation of an undesired stable Watson-Crick base pair in mt-tRNALys, which when formed causes a dramatic structural change in the tRNA. Now that the role of m1A in tRNA structure has been clearly established, future research should focus on the role played by m1A in tRNA function. The discovery of a nuclear pathway that initiates degradation of a hypomodified tRNA through a Trf4p-dependent polyadenylation event demonstrates that poly(A) tails could target aberrant nuclear RNAs for degradation. This is antithetic to the role of poly(A) tails in mRNA stability in the cytoplasm, implying strongly that nuclear and cytoplasmic exosome complexes or their accessory factors possess distinct mechanisms of substrate recognition.
Since the core components found in the nuclear and cytoplasmic exosome complexes are, as far as we know, identical, it is more likely that substrate selection in each compartment is determined by the accessory components or Rrp6p, the unique nuclear limited exoribonuclease of the exosome. Examining the molecular interactions between purified tRNA, Mtr4p-Trf4p-Air2p and the exosome in vitro will be a critical step toward a complete understanding of nuclear tRNA degradation.

Acknowledgements L. D. is a Research Associate of the F.N.R.S. (Fonds National de la Recherche Scientifique). This work was supported in the USA by a National Institutes of Health grant to J. Anderson, R15 GM066791, and in Belgium by grants from the F.R.F.C. (Fonds pour la Recherche Fondamentale Collective), from the French Community of Belgium (Actions de Recherches Concertées) and from the Université Libre de Bruxelles (Fonds E. Defay). We wish to thank John LaCava and David Tollervey for generously sharing unpublished results. We also thank Janusz Bujnicki and Henri Grosjean for helpful comments during the writing of this chapter.

References
Agris PF (1996) The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function. Prog Nucleic Acid Res Mol Biol 53:79-129
Anderson J, Phan L, Hinnebusch AG (2000) The Gcd10p/Gcd14p complex is the essential two-subunit tRNA(1-methyladenosine) methyltransferase of Saccharomyces cerevisiae. Proc Natl Acad Sci USA 97:5173-5178
Anderson J, Phan L, Cuesta R, Carlson BA, Pak M, Asano K, Bjork GR, Tamame M, Hinnebusch AG (1998) The essential Gcd10p-Gcd14p nuclear complex is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA. Genes Dev 12:3650-3662
Araki Y, Takahashi S, Kobayashi T, Kajiho H, Hoshino S, Katada T (2001) Ski7p G protein interacts with the exosome and the Ski complex for 3'-to-5' mRNA decay in yeast. EMBO J 20:4684-4693
Basavappa R, Sigler PB (1991) The 3 Å crystal structure of yeast initiator tRNA: functional implications in initiator/elongator discrimination. EMBO J 10:3105-3111
Ben-Artzi H, Shemesh J, Zeelon E, Amit B, Kleiman L, Gorecki M, Panet A (1996) Molecular analysis of the second template switch during reverse transcription of the HIV RNA template. Biochemistry 35:10549-10557
Björk GR (1995) Biosynthesis and function of modified nucleosides. In: Söll D, RajBhandary UL (eds) tRNA: Structure Biosynthesis and Function. ASM Press, pp 165-206
Brown JT, Bai X, Johnson AW (2000) The yeast antiviral proteins Ski2p, Ski3p, and Ski8p exist as a complex in vivo. RNA 6:449-457
Bujnicki JM (2001) In silico analysis of the tRNA:m1A58 methyltransferase family: homology-based fold prediction and identification of new members from Eubacteria and Archaea. FEBS Lett 507:123-127
Burnett BP, McHenry CS (1997) Posttranscriptional modification of retroviral primers is required for late stages of DNA replication. Proc Natl Acad Sci USA 94:7210-7215
Butler JS (2002) The yin and yang of the exosome. Trends Cell Biol 12:90-96
Calvo O, Cuesta R, Anderson J, Gutierrez N, Garcia-Barrio MT, Hinnebusch AG, Tamame M (1999) GCD14p, a repressor of GCN4 translation, cooperates with Gcd10p and Lhp1p in the maturation of initiator methionyl-tRNA in Saccharomyces cerevisiae. Mol Cell Biol 19:4167-4181
Castano IB, Heath-Pagliuso S, Sadoff BU, Fitzhugh DJ, Christman MF (1996) A novel family of TRF (DNA topoisomerase I-related function) genes required for proper nuclear segregation. Nucleic Acids Res 24:2404-2410
Constantinesco F, Motorin Y, Grosjean H (1999) Transfer RNA modification enzymes from Pyrococcus furiosus: detection of the enzymatic activities in vitro. Nucleic Acids Res 27:1308-1315

de la Cruz J, Kressler D, Tollervey D, Linder P (1998) Dob1p (Mtr4p) is a putative ATP-dependent RNA helicase required for the 3' end formation of 5.8S rRNA in Saccharomyces cerevisiae. EMBO J 17:1128-1140
Droogmans L, Roovers M, Bujnicki JM, Tricot C, Hartsch T, Stalon V, Grosjean H (2003) Cloning and characterization of tRNA (m1A58) methyltransferase (TrmI) from Thermus thermophilus HB27, a protein required for cell growth at extreme temperatures. Nucleic Acids Res 31:2148-2156
Dunn DB (1961) The occurrence of 1-methyladenine in ribonucleic acid. Biochim Biophys Acta 46:198-200
Dunn DB (1963) The isolation of 1-methyladenylic acid and 7-methylguanylic acid from ribonucleic acid. Biochem J 86:14P-15P
Eschenlauer JB, Kaiser MW, Gerlach VL, Brow DA (1993) Architecture of a yeast U6 RNA gene promoter. Mol Cell Biol 13:3015-3026
Gilboa E, Mitra SW, Goff S, Baltimore D (1979) A detailed model of reverse transcription and tests of crucial aspects. Cell 18:93-100
Glick JM, Leboy PS (1977) Purification and properties of tRNA(adenine-1)methyltransferase from rat liver. J Biol Chem 252:4790-4795
Grosjean H, Benne R (1998) Modification and editing of RNA. ASM Press, Washington, DC
Grosjean H, Constantinesco F, Foiret D, Benachenhou N (1995) A novel enzymatic pathway leading to 1-methylinosine modification in Haloferax volcanii tRNA. Nucleic Acids Res 23:4312-4319
Grosjean H, Edqvist J, Straby KB, Giege R (1996) Enzymatic formation of modified nucleosides in tRNA: dependence on tRNA architecture. J Mol Biol 255:67-85
Gupta A, Kumar PH, Dineshkumar TK, Varshney U, Subramanya HS (2001) Crystal structure of Rv2118c: an AdoMet-dependent methyltransferase from Mycobacterium tuberculosis H37Rv. J Mol Biol 312:381-391
Helm M, Brule H, Degoul F, Cepanec C, Leroux JP, Giege R, Florentz C (1998) The presence of modified nucleotides is required for cloverleaf folding of a human mitochondrial tRNA. Nucleic Acids Res 26:1636-1643
Helm M, Giege R, Florentz C (1999) A Watson-Crick base-pair-disrupting methyl group (m1A9) is sufficient for cloverleaf folding of human mitochondrial tRNALys. Biochemistry 38:13338-13346
Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, Yang L, Wolting C, Donaldson I, Schandorff S, Shewnarane J, Vo M, Taggart J, Goudreault M, Muskat B, Alfarano C, Dewar D, Lin Z, Michalickova K, Willems AR, Sassi H, Nielsen PA, Rasmussen KJ, Andersen JR, Johansen LE, Hansen LH, Jespersen H, Podtelejnikov A, Nielsen E, Crawford J, Poulsen V, Sorensen BD, Matthiesen J, Hendrickson RC, Gleeson F, Pawson T, Moran MF, Durocher D, Mann M, Hogue CW, Figeys D, Tyers M (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415:180-183
HsuChen CC, Cleaves GR, Dubin DT (1983) A major lysine tRNA with a CUU anticodon in insect mitochondria. Nucleic Acids Res 11:8659-8662
Inoue K, Mizuno T, Wada K, Hagiwara M (2000) Novel RING finger proteins, Air1p and Air2p, interact with Hmt1p and inhibit the arginine methylation of Npl3p. J Biol Chem 275:32793-32799

Biosynthesis and function of 1-methyladenosine in transfer RNA


Jacobs JS, Anderson AR, Parker RP (1998) The 3' to 5' degradation of yeast mRNAs is a general mechanism for mRNA turnover that requires the SKI2 DEVH box protein and 3' to 5' exonucleases of the exosome complex. EMBO J 17:1497-1506
Kadaba S, Krueger A, Trice T, Krecic AM, Hinnebusch AG, Anderson J (2004) Nuclear surveillance and degradation of hypomodified initiator tRNAMet in S. cerevisiae. Genes Dev 18:1227-1240
Kadowaki T, Chen S, Hitomi M, Jacobs E, Kumagai C, Liang S, Schneiter R, Singleton D, Wisniewska J, Tartakoff AM (1994) Isolation and characterization of Saccharomyces cerevisiae mRNA transport-defective (mtr) mutants. J Cell Biol 126:649-659
Krogan NJ, Peng WT, Cagney G, Robinson MD, Haw R, Zhong G, Guo X, Zhang X, Canadien V, Richards DP, Beattie BK, Lalev A, Zhang W, Davierwala AP, Mnaimneh S, Starostine A, Tikuisis AP, Grigull J, Datta N, Bray JE, Hughes TR, Emili A, Greenblatt JF (2004) High-definition macromolecular composition of yeast RNA-processing complexes. Mol Cell 13:225-239
Krzyzosiak WJ, Marciniec T, Wiewiorowski M, Romby P, Ebel JP, Giege R (1988) Characterization of the lead(II)-induced cleavages in tRNAs in solution and effect of the Y-base removal in yeast tRNAPhe. Biochemistry 27:5771-5777
Li Z, Reimers S, Pandit S, Deutscher MP (2002) RNA quality control: degradation of defective transfer RNA. EMBO J 21:1132-1138
Liang S, Hitomi M, Hu YH, Liu Y, Tartakoff AM (1996) A DEAD-box-family protein is required for nucleocytoplasmic transport of yeast mRNA. Mol Cell Biol 16:5139-5146
Marck C, Grosjean H (2002) tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA 8:1189-1232
Mitchell P, Petfalski E, Shevchenko A, Mann M, Tollervey D (1997) The exosome: a conserved eukaryotic RNA processing complex containing multiple 3'-->5' exoribonucleases. Cell 91:457-466
Mitchell P, Tollervey D (2000) Musing on the structural organization of the exosome complex. Nat Struct Biol 7:843-846
Morin A, Auxilien S, Senger B, Tewari R, Grosjean H (1998) Structural requirements for enzymatic formation of threonylcarbamoyladenosine (t6A) in tRNA: an in vivo study with Xenopus laevis oocytes. RNA 4:24-37
Ohtsuki T, Kawai G, Watanabe Y, Kita K, Nishikawa K, Watanabe K (1996) Preparation of biologically active Ascaris suum mitochondrial tRNAMet with a TV-replacement loop by ligation of chemically synthesized RNA fragments. Nucleic Acids Res 24:662-667
Raettig R, Kersten H, Weissenbach J, Dirheimer G (1977) Methylation of an adenosine in the D-loop of specific transfer RNAs from yeast by a procaryotic tRNA (adenine-1) methyltransferase. Nucleic Acids Res 4:1769-1782
RajBhandary UL, Stuart A, Faulkner RD, Chang SH, Khorana HG (1966) Nucleotide sequence studies on yeast phenylalanine sRNA. Cold Spring Harb Symp Quant Biol 31:425-434
Randerath E, Agrawal HP, Randerath K (1981) Rat liver mitochondrial lysine tRNA (anticodon U*UU) contains a rudimentary D-arm and 2 hypermodified nucleotides in its anticodon loop. Biochem Biophys Res Commun 103:739-744
Renda MJ, Rosenblatt JD, Klimatcheva E, Demeter LM, Bambara RA, Planelles V (2001) Mutation of the methylated tRNA(Lys)(3) residue A58 disrupts reverse transcription and inhibits replication of human immunodeficiency virus type 1. J Virol 75:9671-9678


Roovers M, Wouters J, Bujnicki JM, Tricot C, Stalon V, Grosjean H, Droogmans L (2004) A primordial RNA modification enzyme: the case of tRNA (m1A) methyltransferase. Nucleic Acids Res 32:465-476
Sadoff BU, Heath-Pagliuso S, Castano IB, Zhu Y, Kieff FS, Christman MF (1995) Isolation of mutants of Saccharomyces cerevisiae requiring DNA topoisomerase I. Genetics 141:465-479
Saitoh S, Chabes A, McDonald WH, Thelander L, Yates JR, Russell P (2002) Cid13 is a cytoplasmic poly(A) polymerase that regulates ribonucleotide reductase mRNA. Cell 109:563-573
Sakurai M OT, Watanabe Y, Watanabe K (2001) Requirement of modified residue m1A9 for EF-Tu binding to nematode mitochondrial tRNA lacking the T arm. Nucleic Acids Res Suppl 1:237-238
Sengupta R, Vainauskas S, Yarian C, Sochacka E, Malkiewicz A, Guenther RH, Koshlap KM, Agris PF (2000) Modified constructs of the tRNA TPsiC domain to probe substrate conformational requirements of m(1)A(58) and m(5)U(54) tRNA methyltransferases. Nucleic Acids Res 28:1374-1380
Söll D, RajBhandary U (1995) tRNA structure, biosynthesis, and function. ASM Press, Washington, DC
Sprinzl M, Horn C, Brown M, Loudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153
van Hoof A, Frischmeyer PA, Dietz HC, Parker R (2002) Exosome-mediated recognition and degradation of mRNAs lacking a termination codon. Science 295:2262-2264
van Hoof A, Parker R (1999) The exosome: a proteasome for RNA? Cell 99:347-350
van Hoof A, Staples RR, Baker RE, Parker R (2000) Function of the ski4p (Csl4p) and Ski7p proteins in 3'-to-5' degradation of mRNA. Mol Cell Biol 20:8230-8243
Varshney U, Ramesh V, Madabushi A, Gaur R, Subramanya HS, RajBhandary UL (2004) Mycobacterium tuberculosis Rv2118c codes for a single-component homotetrameric m1A58 tRNA methyltransferase. Nucleic Acids Res 32:1018-1027
Walowsky C, Fitzhugh DJ, Castano IB, Ju JY, Levin NA, Christman MF (1999) The topoisomerase-related function gene TRF4 affects cellular sensitivity to the antitumor agent camptothecin. J Biol Chem 274:7302-7308
Wang Z, Castano IB, De Las Penas A, Adams C, Christman MF (2000) Pol kappa: A DNA polymerase required for sister chromatid cohesion. Science 289:774-779
Watanabe Y, Tsurui H, Ueda T, Furushima R, Takamiya S, Kita K, Nishikawa K, Watanabe K (1994) Primary and higher order structures of nematode (Ascaris suum) mitochondrial tRNAs lacking either the T or D stem. J Biol Chem 269:22902-22906
Yamazaki N, Hori H, Ozawa K, Nakanishi S, Ueda T, Kumagai I, Watanabe K, Nishikawa K (1992) Purification and characterization of tRNA(adenosine-1-)-methyltransferase from Thermus thermophilus HB27. Nucleic Acids Symp Ser 27:141-142
Yokoyama SN (1995) Modified Nucleosides and Codon Recognition. In: Söll D, RajBhandary UL (eds) tRNA Structure, Biosynthesis, and Function. ASM Press, pp 207-223
Zagryadskaya EI, Doyon FR, Steinberg SV (2003) Importance of the reverse Hoogsteen base pair 54-58 for tRNA function. Nucleic Acids Res 31:3946-3953
Zagryadskaya EI, Kotlova N, Steinberg SV (2004) Key elements in maintenance of the tRNA L-shape. J Mol Biol 340:435-444
Zuo Y, Deutscher MP (2001) Exoribonuclease superfamilies: structural analysis and phylogenetic distribution. Nucleic Acids Res 29:1017-1026


Anderson, James T.
Department of Biological Sciences, Marquette University, Milwaukee, WI USA

Droogmans, Louis
Laboratoire de Microbiologie, Université Libre de Bruxelles, Avenue E. Gryson 1, B-1070 Bruxelles, Belgium

The biosynthesis and functional roles of methylated nucleosides in eukaryotic mRNA
Joseph A. Bokar

Abstract Modified nucleosides are present in mRNA of all eukaryotes, albeit at much lower levels than in other RNA species such as rRNA, tRNA, and snRNA. Modification by methylation occurs on the terminal guanosine of the cap (N7-methylguanosine) and on the first two encoded nucleosides (2’-O-methylnucleosides) in most higher eukaryotes. Additional modifications of cap nucleosides occur in special cases where the cap is derived by trans-splicing, as in nematodes and kinetoplastids. Modification by methylation also occurs at internal adenosine residues in many species (N6-methyladenosine). Modification by deamination occurs at specific adenosine residues (forming inosine) and cytidine residues (forming uridine) in very specific cases, leading to post-transcriptional editing. Numerous studies have shown the importance of the cap N7-methylguanosine in translation, splicing, transport, and mRNA stability. The role of the 2’-O-methylnucleosides is not as well understood, but there is evidence that these modifications play some role in translation efficiency. The role of internal N6-methyladenosine residues is the least understood, and is the focus of this review. The formation of N6-methyladenosine is catalyzed by a complex enzyme containing a subunit (MT-A70) that co-localizes with nuclear speckles and appears to be widely expressed in all higher eukaryotes. Loss of this enzyme leads to a sporulation defect in yeast and to apoptosis in mammalian cells, although the exact mechanism by which these effects occur remains obscure.

Topics in Current Genetics, Vol. 12
H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing
DOI 10.1007/b106365 / Published online: 7 January 2005
© Springer-Verlag Berlin Heidelberg 2005

1 Introduction

Post-transcriptional modifications of eukaryotic mRNA include the methylation of a small subset of nucleosides, both within the 5’-terminal cap structure and within the body of the mRNA. The 5’-terminal cap guanine residue, the first and second transcribed nucleosides, and specific internal residues can be methylated at positions either on the base or on the ribose moiety. These methylations lead to the formation of N7-methylguanine (m7G), 2’-O-methylnucleosides (Nm), and N6-methyladenosine (m6A). These methylated nucleosides represent nearly the entire repertoire of modified nucleosides in eukaryotic mRNA, a much more limited set than is found in other eukaryotic RNA species. (The only others are inosine and uridine, which will only be briefly considered in this review but are covered in detail in this volume in separate chapters by Hoopengardner et al. and Smith et al.). As is the case for many nucleoside modifications in rRNA, tRNA, and snRNA, the functions of these methylated nucleosides are not completely understood. This chapter will briefly review the biogenesis, the structure, and the function of the methylated cap, and will then focus in detail on internal m6A residues, the enzyme complex that catalyzes the sequence-specific methylation of internal adenosine residues, and their function. Interesting features of mRNA N6-methyladenosine methyltransferase include its multicomponent nature, its co-localization with splicing factors in nuclear speckles, and the prediction that it is a prototypical member of a conserved group of RNA methyltransferases that most closely resemble the β class of methyltransferases, a class previously known to include only prokaryotic DNA:m6A and DNA:m4C methyltransferases. This methyltransferase activity appears to be essential for important gene regulatory functions, as it is necessary for induction of sporulation in yeast, and its absence leads to apoptosis in a human cell line.

2 Methylated nucleosides present in eukaryotic mRNA

2.1 The 5’-terminal cap structure

The general structure of the mRNA cap is m7GpppN1(m)pN2(m)pN..., where the terminal guanine is invariably methylated at the N7-position on the purine ring. N1, which can be any of the four nucleosides, is generally methylated on the 2’-O-position of the ribose ring, and N2 can also be methylated in the same fashion (Rottman et al. 1974; Wei et al. 1975; Shatkin 1976; Banerjee 1980; and reviewed by Reddy et al. 1992). Cap structures containing only 7-methylguanine are designated Cap 0, those containing a 2’-O-methylnucleoside at position N1 are designated Cap 1, and if both N1 and N2 are 2’-O-methylnucleosides the structure is designated Cap 2. If N1 is 2’-O-methyladenosine, it can also be further methylated at the N6-position of the purine ring. Cap 1 and Cap 2 structures predominate in mRNAs of higher eukaryotes, with Cap 1 found five times as frequently as Cap 2 in mammalian mRNA (Perry and Kelley 1976). An exception to this generality is found in eukaryotic organisms whose mRNA 5’-termini are formed by trans-splicing of a spliced leader from the 5’-end of an SL RNA. In the kinetoplastids Trypanosoma brucei and Crithidia fasciculata, trans-splicing of a 39-41 nucleotide spliced leader yields a Cap 4 structure, m7G(5’)ppp(5’)m26AmpAmpCmpm3Ump (Bangs et al. 1992). This unique cap has been shown to be synthesized co-transcriptionally on the 140 nucleotide SL RNA prior to trans-splicing to the mRNA (Mair et al. 2000). In nematodes (e.g. C. elegans and A. lumbricoides), a 2,2,7-trimethylguanosine-capped spliced leader of 22 nucleotides is added to some mRNAs via trans-splicing from an ~100 nucleotide Sm snRNP (Liou and Blumenthal 1990; van Doren and Hirsh 1990; Maroney et al. 1995). Collectively, these cap methylation events account for the presence of most of the nucleoside modifications in eukaryotic mRNA, namely 7-methylguanine (m7G), all four 2’-O-methylnucleosides (Nm), and (N6-,2’-O-)-dimethyladenosine (m6Am), along with the special cases in the kinetoplastids and nematodes noted above.

2.2 Biological function of methylated nucleosides within the cap structure

The m7G cap is important for a number of biochemical processes involving mRNA. The most studied of these is the effect of the cap on translation initiation. The m7G cap structure enhances mRNA translation in Xenopus oocytes, in in vitro translation systems, and in cells directly transfected with mRNAs (for reviews see Merrick and Hershey 1996 and von der Haar et al. 2004). Although translation can occur in the absence of a cap, the efficiency is markedly diminished.

2.2.1 Cap effects on translation

The m7G cap serves as an anchoring point for a cap-binding complex that is involved in the recruitment of the small ribosomal subunit. In fact, binding of eIF4E and eIF4G to the cap is essential for translation both in vivo and in vitro. The molecular contacts between eIF4E and the methylated guanosine have been determined by a number of groups. The eIF4E molecule contains a cavity that surrounds the cap, and the guanosine moiety is stacked between two tryptophan residues within this cavity. A mechanism has been proposed for the effect of the N-7-substitution, based upon studies using substituted cap analogues as inhibitors of translation. The positive charge on the imidazole of the N-7-alkylated purine ring interacts with the phosphate oxygens (Adams et al. 1978; Rhoads et al. 1983). The resulting conformer can then interact more efficiently with the cap-binding protein because of the increased stability of the stacking interaction (Niedzwiecka et al. 2002). Further evidence that the methyl group contributes significantly to the binding interaction comes from equilibrium studies of eIF4E with methylated and unmethylated GTP substrates.
The equilibrium binding constant for the methylated moieties is approximately five orders of magnitude greater than that for the unmethylated nucleosides. This is important, as the high intracellular concentration of GTP could otherwise lead to competition of free GTP with mRNA for eIF4E binding. In addition to the intermolecular interactions between eIF4E and the terminal nucleotide, there are additional contacts between its carboxyl-terminal portion and downstream nucleotides, although these have been more difficult to quantify. Alterations in the sequence at the first two transcribed nucleotides can produce up to fourfold changes in equilibrium affinity (Carberry et al. 1992; von der Haar et al. 2000). Direct evidence that these downstream interactions are affected by the presence or absence of ribose methylations is lacking. However, there is ample evidence to suggest that these modifications do affect translation efficiency in some instances. Kuge and Richter (1995) have addressed the role of cap ribose methylation in regulation of translation, focusing on the control of maternal mRNA translation in Xenopus oocytes. Using histone B4 mRNA as a model, they showed that after injection of cap 0 (m7GpppN)-containing mRNA into oocytes, the first two nucleotides become 2’-O-methylated, and this methylation is dependent upon polyadenylation of the transcripts. Inhibition of the methylations resulted in a significant decrease in the translational activation of the message. This implies that 2’-O-methylation is important for protein-RNA interaction with the cap-binding complex. Kuge et al. (1998) extended these observations by studying the effects of cap-ribose methylation on translational activation of a luciferase reporter gene and of c-mos mRNA. They reported several important findings. Cap 0- and cap 1-containing mRNAs were prepared in vitro and injected into oocytes. The ribose-methylated (cap 1) transcripts were translated 4.4-fold more efficiently than their unmethylated counterparts. Inhibition of cap-ribose methylation led to a block in the translational activation of endogenous c-mos mRNA. They showed that the stimulation of translation by message polyadenylation was mediated in large part by polyadenylation-dependent cap-ribose methylation. It is also interesting to note that in vaccinia virus, the poly(A) polymerase and 2’-O-ribose methyltransferase activities are both contained in a single heterodimeric enzyme, suggesting that polyadenylation and cap ribose methylation may be functionally linked in other systems as well (Schnierle et al. 1992). In contrast to the results with c-mos transcripts, Gillian-Daniel et al. (1998) report that not all translational recruitment by polyadenylation in an oocyte model is dependent on cap-ribose methylation. Specifically, ribose methylation of a reporter mRNA bearing a cyclin B1 3’-UTR is not required for poly(A)’s stimulation of that mRNA’s translation.
Together, these studies illustrate the important physiologic regulatory potential of this nucleoside modification at the level of mRNA translation.

2.2.2 Cap effects on mRNA transport

The monomethylated cap has also been shown to be important for transport of mRNA from the nucleus to the cytoplasm (Hamm and Mattaj 1990). In contrast, di- and tri-methylguanosine caps (as in small nuclear RNAs) prevent transport of RNA from the nucleus. Numerous studies have shown that nuclear-to-cytoplasmic translocation of mRNA is dependent upon a nuclear cap-binding complex (CBC) and, for many mRNAs, also upon cap-binding of eIF4E. Therefore, the dependence of the affinity of the protein-cap interactions upon the base methylation and ribose methylations of the cap nucleosides is also likely to translate to an important function for these same modifications in regulation of export. In mammals, the CBC is composed of two proteins, CBP20 and CBP80. There is a high degree of conservation of these proteins in other organisms including yeast and insects. Visa et al. (1996) showed in an insect model that the CBC binds to the cap structure co-transcriptionally, and then “leads” the mRNA molecule through the nuclear pore complex (NPC). At some point, probably on the cytoplasmic side of the NPC, cap-bound CBC is exchanged for eIF4E to complete the translocation process. One very interesting example of regulation of gene expression at the level of eIF4E-dependent nuclear transport comes from the cyclin D1 model. Cyclin D1 over-expression is one mechanism for oncogenic cell transformation. Cohen et al. (2001) showed that a site-directed mutant of eIF4E that cannot function in translation remains active in cyclin D1 mRNA export. Overexpression of this mutant protein leads to cellular transformation, just as overexpression of the wild-type protein does. This provides evidence that its transforming properties may be primarily due to the mRNA transport function of eIF4E, which is in turn dependent upon the interaction of this protein and its modified nucleoside-containing binding partner(s).

2.2.3 Cap effects on mRNA splicing

The m7G cap is also important for splicing. Konarska et al. (1984) showed that pre-mRNA splicing occurred efficiently in HeLa whole cell extracts only when the substrate RNA was capped. Splicing could be inhibited in these reactions by the addition of the cap analogues m7GpppG and m7Gppp. Subsequent work showed that this cap-dependence was pronounced only for the 5’-terminal intron. The current consensus is that the cap functionally replaces the polypyrimidine tract of an upstream intron in the definition of an adjacent exon. Izaurralde et al. (1994) have characterized a nuclear cap-binding complex (CBC) that is required for efficient splicing of adenoviral pre-mRNA in vitro, and the CBC has been shown to mediate the effect of the methylated cap on transport and on the efficiency of cap-proximal intron splicing in mammals (Izaurralde et al. 1994; Lewis et al. 1996) and yeast (Colot et al. 1996; Fresco et al. 1996; Schwer and Shuman 1996). It appears that the cap-CBC complex functions by altering the kinetics of binding of the U1 snRNP to the cap-proximal 5’-splice site, but more importantly it has a dramatic effect on the replacement of the bound U1 with U6 snRNP at the 5’-splice site. Inhibition of splicing by cap analogue correlated with the loss of the U6 interaction but not the U1 interaction (O’Mullane and Eperon 1998).
Whether this CBC is identical to that which has been shown to be important for mRNA transport remains uncertain.

2.2.4 Cap effects on mRNA degradation

The methylated cap also serves to protect mRNA from 5’-exoribonuclease activity and, at the same time, it is a necessary determinant for recognition by decapping enzymes. There are two pathways for mRNA decay. In the first, mRNA is deadenylated at the 3’-end, followed by 3’-5’-hydrolysis. The remaining cap structure is then hydrolyzed by DcpS in humans and Dcs1p in yeast (Liu et al. 2002). An alternate pathway in humans, which is most likely regulated, involves decapping by hDcp2 immediately after removal of the poly(A) tail (Piccirillo et al. 2003). A similar pathway exists in yeast and involves the Dcp1p/Dcp2p complex. Following deadenylation and loss of PABP1-binding, decapping by the Dcp1p/Dcp2p complex is initiated, followed by 5’-3’ exonuclease digestion by Xrn1p (Beelman et al. 1996; Dunckley and Parker 1999). Methylation of the terminal guanine has been shown to be an essential determinant for recognition by Dcp2. Numerous additional interactions have been identified between the decapping machinery and other components of the various mRNA degradation pathways. These include binding of the translation complex, primarily via eIF4E, Lsm proteins, Mrt1p/Pat1p, and Hsp70p (reviewed in Tucker and Parker 2000).

2.3 Enzymes involved in cap methylation

The cellular enzymes that catalyze methylation of the nucleosides comprising the cap structure were initially partially purified and characterized, mainly by B. Moss and co-workers. Four separate activities were identified in HeLa cell extracts: RNA (guanine-7-)-methyltransferase (m7G methyltransferase), cap 1- and cap 2-specific 2’-O-methyltransferases (Nm methyltransferases), and 2’-O-methyladenosine-N6-methyltransferase (m6Am methyltransferase) (for review see Mizumoto and Kaziro 1987). RNA (guanine-7-)-methyltransferase was purified 165-fold and had an apparent Mr of 56 kDa (Ensinger and Moss 1976). The enzyme was specific for the guanosine involved in the terminal 5’-dinucleoside triphosphate in capped RNA substrates, and could also methylate G(5’)pppG, but not GTP, GDP, or G(3’)pppG. Subsequently, Shuman’s group showed that cap biogenesis involves a series of three enzymatic reactions: hydrolysis of the nascent mRNA 5’-triphosphate by an RNA triphosphatase, addition of GMP by GTP:RNA guanylyl-transferase, and methylation by RNA (guanine-N7-)-methyltransferase (Shuman 1995). In S. cerevisiae, these three reactions are accomplished by three separate polypeptides: a triphosphatase (Cet1p), a guanylyl-transferase (Ceg1p) (Tsukamoto et al. 1997; Ho et al. 1998; Yamada-Okabe et al. 1998), and a cap methyltransferase (Abd1p) (Ho et al. 1998b). In metazoans, the triphosphatase and guanylyl-transferase activities are contained in separate domains of a single polypeptide (Mce1p in mammals) (Ho et al. 1998b; Wen et al. 1998). As with S. cerevisiae Abd1p, the methyltransferase activity is separately encoded in metazoans (Hcm1p in humans) (Saha et al. 1999).
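
The order of the three cap-forming reactions described above (triphosphatase, then guanylyl-transferase, then N7-methyltransferase) can be illustrated with a toy string model of the RNA 5’-end. This is only a sketch under stated assumptions: the function names and the string notation (`pppN...`, `GpppN...`) are hypothetical conveniences chosen here, not an established bioinformatics API, and the model ignores all kinetics and cofactors.

```python
# Toy model of cap 0 biogenesis on the 5'-end of a nascent transcript.
# All function names and the string encoding are illustrative assumptions.

def rna_triphosphatase(five_prime: str) -> str:
    """Hydrolyze the 5'-triphosphate to a diphosphate: pppN... -> ppN..."""
    if not five_prime.startswith("ppp"):
        raise ValueError("expected a 5'-triphosphate end (pppN...)")
    return five_prime[1:]


def guanylyltransferase(five_prime: str) -> str:
    """Add GMP to the 5'-diphosphate end: ppN... -> GpppN..."""
    if not five_prime.startswith("pp") or five_prime.startswith("ppp"):
        raise ValueError("expected a 5'-diphosphate end (ppN...)")
    return "Gp" + five_prime


def cap_methyltransferase(five_prime: str) -> str:
    """Methylate the cap guanine at N7: GpppN... -> m7GpppN..."""
    if not five_prime.startswith("Gppp"):
        raise ValueError("expected an unmethylated cap (GpppN...)")
    return "m7" + five_prime


def cap_biogenesis(five_prime: str) -> str:
    """Apply the three reactions in the order given in the text."""
    for step in (rna_triphosphatase, guanylyltransferase, cap_methyltransferase):
        five_prime = step(five_prime)
    return five_prime


if __name__ == "__main__":
    print(cap_biogenesis("pppApG"))  # m7GpppApG
```

Each step checks its input, so calling the methyltransferase on an uncapped end raises an error; this mirrors the statement that the enzyme methylates G(5’)pppG-type ends but not free GTP or uncapped RNA, without claiming anything about the real enzymes' recognition mechanisms.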
Mutational studies of cap methyltransferases from vaccinia virus, yeast, and human showed a high degree of structural conservation.

2’-O-methylation of N1 and N2 occurs subsequent to N7-methylation of guanine. Two separate activities were partially purified which catalyze the 2’-O-methylation of the first (cap 1) and second (cap 2) transcribed nucleosides (Langberg and Moss 1981). These were not extensively characterized, but the data suggest that cap 1 methyltransferase can utilize m7G(5’)pppN or G(5’)pppN as a substrate with equivalent efficiency in vitro. The enzyme appears to fractionate as a nuclear protein during the early steps of purification. The cap 2 methyltransferase activity, which appears to fractionate as a cytoplasmic enzyme, methylates N2 only when N1 is already methylated.

The fourth cap methyltransferase, also described by Moss and colleagues, is (2’-O-methyladenosine-N6)-methyltransferase (Keith et al. 1978). This enzyme was purified >340-fold from HeLa cytoplasmic extracts and had an Mr of 65 kDa. This partially purified enzyme fraction had a marked preference for m7G(5’)pppAm, and was much less efficient at methylating m7G(5’)pppApN or G(5’)pppAmpN. This enzyme preparation was not able to methylate internal adenosine residues, suggesting that it is distinct from the internal mRNA N6-adenosine methyltransferase (m6A-MT). The genes encoding these methyltransferases have not yet been identified and reported.

Compared to the cellular cap methyltransferases, the vaccinia virus capping enzyme and cap methyltransferases have been studied in greater detail (Martin et al. 1975; Higman et al. 1992, 1994a, 1994b; Cong and Shuman 1992; Myette and Niles 1996). The mRNA triphosphatase, guanylyltransferase, and (guanine-N7-)-methyltransferase activities were shown to be contained within a 130 kDa heterodimeric protein. This complex is composed of a small (D1R) and large (D12L) subunit. The methyltransferase activity is contained within the small subunit, while RNA-binding activity resides primarily within the large subunit. Components of both subunits are necessary for efficient cap guanine-7 methylation. 2’-O-methylation of the first transcribed nucleotide is catalyzed by the vaccinia virus protein VP39. This bifunctional protein acts both as a cap modification enzyme at the 5’-terminus of mRNA and as a 3’-terminus modifying enzyme that serves as a poly(A)-polymerase stimulatory protein (Schnierle et al. 1992). A summary of genes and enzymes that act as cap-specific methyltransferases is provided in Table 1.

Table 1. Cap methyltransferases from selected organisms

Organism | Enzyme | Properties | Reference
Cap N7-guanine methyltransferases
Human | Hcm1 | | Saha et al. 1999
S. cerevisiae | Abd1 | | Ho et al. 1998b
E. cuniculi | Ecm1 | | Fabrega et al. 2004
Vaccinia virus | D1R/D12L | | Niles and Christen 1993
Reovirus | λ2 | Bifunctional: N7-G- and (2’-O-)-methyltransferase | Luongo et al. 1998
Cap (2’-O-)-methyltransferases
Human | Gene(s) not identified | Cap 1 (2’-O-)-methyltransferase: nuclear protein | Langberg and Moss 1981
Human | Gene(s) not identified | Cap 2 (2’-O-)-methyltransferase: cytoplasmic protein | Langberg and Moss 1981
Human | Gene(s) not identified | (2’-O-methyladenosine-N6)-methyltransferase | Keith et al. 1978
Reovirus | λ2 | Bifunctional: N7-G- and (2’-O-)-methyltransferase | Luongo et al. 1998
Vaccinia virus | VP39 | Bifunctional: (2’-O-)-methyltransferase and poly(A) polymerase stimulatory protein | Cong and Shuman 1992
Flavivirus | NS5 | Bifunctional: (2’-O-)-methyltransferase and RNA polymerase | Egloff et al. 2002

The three-dimensional structure of the yeast cap 0 methyltransferase, ABD1, was modeled by Bujnicki et al. (2001). The predicted structure revealed an unexpected homology to glycine N-methyltransferase, and included a beta-sheet subdomain that forms a “lid” over the active site. The first X-ray structure of a cap m7G methyltransferase was solved by Reinisch and Harrison (2000), who determined the structure of the reovirus core at 3.6 Å. In the λ2 protein, they identified a GTPase domain and two methyltransferase domains (MT1 and MT2). Based upon the mutual spatial arrangement of these domains and the assumption that the order of reactions is the same as in vaccinia virus (capping, then m7G methylation, then 2’-O-ribose methylation), they predicted the MT1 and MT2 domains to be cap 0 and cap 1 methyltransferases, respectively. However, using bioinformatics techniques, Bujnicki and Rychlewski (2001) found that the functional assignments made by the crystallographers were inconsistent with the presence of conserved residues in the putative active sites of both domains. They suggested that MT1 was most similar to the vaccinia VP39 2’-O-methyltransferase, and contained the K-D-K signature of catalytic residues characteristic of 2’-O-methyltransferases (Hager et al. 2002; Feder et al. 2003). MT2 possesses the beta-sheet “lid” subdomain found initially in glycine N-methyltransferase, predicted for yeast ABD1, and finally determined in the crystal structure of the cap methyltransferase Ecm1. Sequence comparison of this “minimal” cap methyltransferase from the microsporidian parasite E. cuniculi suggested that it too resembles the cap methyltransferases identified in other organisms. Fabrega et al. (2004) determined the X-ray crystal structure of this model enzyme bound to its substrate, AdoMet. By their analysis, the structure is compatible with a reaction model in which an SN2-type displacement is promoted by optimal alignment of the N7 atom of guanine (the attacking nucleophile) and the methyl carbon of AdoMet (the leaving group) without direct contacts with the enzyme. The reaction chemistry does not appear to involve direct transition state stabilization, nucleophile activation, or expulsion of the leaving group by catalytic groups on the protein.
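
As a compact restatement of the Cap 0, Cap 1, and Cap 2 designations from section 2.1, the following sketch classifies a cap from three yes/no methylation states. The class and function names are hypothetical and introduced only for illustration. The case in which N2 is methylated but N1 is not is flagged separately, because the nomenclature in the text does not name it and the cap 2 methyltransferase is reported to act only when N1 is already methylated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapState:
    """Methylation state of an mRNA cap (illustrative model only)."""
    m7g: bool            # terminal guanine methylated at N7
    n1_2o_methyl: bool   # first transcribed nucleoside 2'-O-methylated
    n2_2o_methyl: bool   # second transcribed nucleoside 2'-O-methylated


def classify_cap(cap: CapState) -> str:
    """Return the cap designation used in section 2.1."""
    if not cap.m7g:
        return "no m7G cap"
    if cap.n1_2o_methyl and cap.n2_2o_methyl:
        return "Cap 2"
    if cap.n1_2o_methyl:
        return "Cap 1"
    if cap.n2_2o_methyl:
        # Not covered by the Cap 0/1/2 nomenclature described in the text.
        return "undefined (N2 methylated without N1)"
    return "Cap 0"


if __name__ == "__main__":
    print(classify_cap(CapState(m7g=True, n1_2o_methyl=True, n2_2o_methyl=False)))  # Cap 1
```

The kinetoplastid Cap 4 and the nematode trimethylguanosine cap are deliberately left out of this sketch, since they carry additional modifications that three booleans cannot represent.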

3 Modified nucleosides at internal positions in eukaryotic mRNA

In addition to the methylated nucleosides found within the cap structure described above, four additional nucleoside modifications have been reported within internal regions of eukaryotic mRNA: two due to deamination reactions (C to U, A to I), and two due to methylation (cytidine to 5-methylcytidine (m5C), and adenosine to N6-methyladenosine (m6A)).

3.1 Nucleoside modification by deamination

Inosine is formed post-transcriptionally by the deamination of specific adenosine residues, catalyzed by the RNA adenosine deaminases ADAR1 and ADAR2. Deamination of adenosine residues to form inosines alters the coding sequence of the mRNA and has been studied in detail for the mRNAs encoding subunits of glutamate receptor channels in mammalian brain and for the hepatitis delta virus antigenome RNA (Melcher et al. 1995; Polson et al. 1995).
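
How a single deamination can change the coding sequence is easy to show on one codon. The sketch below assumes that inosine is read as guanosine during translation, which is the generally accepted behavior, and uses a four-entry subset of the standard genetic code. The function name, the codon subset, and the example codons are illustrative choices made here, not data taken from the studies cited above.

```python
# Subset of the standard genetic code, just large enough for the examples.
CODONS = {"CAA": "Gln", "CAG": "Gln", "CGG": "Arg", "UAA": "Stop"}

# Base expected at the edited position, and the base it is decoded as afterwards.
EDITS = {
    "A-to-I": ("A", "G"),  # inosine is read as guanosine (assumption stated above)
    "C-to-U": ("C", "U"),
}


def deaminate(codon: str, position: int, kind: str) -> str:
    """Return the codon as decoded after one deamination at `position` (0-based)."""
    expected, read_as = EDITS[kind]
    if codon[position] != expected:
        raise ValueError(f"{kind} editing needs {expected} at position {position}")
    return codon[:position] + read_as + codon[position + 1:]


if __name__ == "__main__":
    for codon, pos, kind in (("CAG", 1, "A-to-I"), ("CAA", 0, "C-to-U")):
        edited = deaminate(codon, pos, kind)
        print(f"{kind}: {codon} ({CODONS[codon]}) -> {edited} ({CODONS[edited]})")
```

In this toy example the A-to-I edit turns a glutamine codon into one decoded as arginine, while the C-to-U edit turns a glutamine codon into a stop codon, so one event recodes an amino acid and the other truncates the protein.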

Biosynthesis and functional roles of methylated nucleosides 149

ADAR1 is an essential gene whose loss of function results in apoptosis in a wide variety of cells in mouse embryos. ADAR2 is important for editing of glutamate receptor mRNAs and is essential for neural cell development. This topic is reviewed in detail by Hoopengardner et al. in this volume. In a somewhat analogous fashion, specific cytidine residues can also undergo deamination to yield uridine residues. This has been studied mainly for the apolipoprotein B mRNA, and controls the switch between expression of the ApoB-100 and ApoB-48 isoforms (Chen et al. 1987; Powell et al. 1987). Deamination of C6666 to U creates a UAA stop codon and yields an mRNA that encodes ApoB-48, a truncated version of ApoB-100, in a tissue-specific fashion. The deaminase is a complex containing a 27 kDa protein, Apobec-1, which is essential for this process (Hirano et al. 1996). Subsequent studies have identified a family of genes that encode putative deaminases capable of mRNA editing, including Apobec-2, which is expressed only in cardiac muscle, and activation-induced deaminase (AID), which is involved in class-switch regulation of the immunoglobulin gene cluster. However, these latter two examples may primarily target DNA, and not RNA, deamination (Xie et al. 2004). The topic of C to U editing is also reviewed in depth by Smith et al. in this volume.

3.2 5-methylcytidine (m5C)

The presence of small amounts of m5C in Sindbis viral mRNA, Adenovirus 2 mRNA, and in cellular mRNA from cultured hamster (BHK-21) cells was reported in early studies (Dubin et al. 1975a, 1975b, 1977). Interestingly, m5C was not detected in mRNA from Novikoff hepatoma cells (Desrosiers et al. 1974, 1975), HeLa (Wei et al. 1976, 1977), or mouse L cells (Perry et al. 1975). The significance of this discrepancy is unclear, and further studies of m5C content in mRNA from various sources have not been reported.
The possibility that the m5C detected represented contamination of the mRNA fraction with other RNA species cannot be ruled out. It is also possible that low levels of m5C are present in mRNA as a result of the activity of tRNA m5C methyltransferase. Grosjean and coworkers have shown that enzymes capable of catalyzing m5C40 and m5C49 formation in yeast tRNAPhe are independent of the three-dimensional structure of the substrate and can act on fragments of the tRNA (Jiang et al. 1997; Grosjean et al. 1996, 1997). It might well be that the same methyltransferase(s) act on mRNA, either by design, or by accident.

3.3 N6-methyladenosine

In contrast to the apparently infrequent occurrence of m5C, inosine, and deaminated cytidine in cellular mRNA, m6A is present in easily detectable amounts in mRNA isolated from all higher eukaryotes tested including plants (Nichols 1979), mammals (Desrosiers et al. 1975; Perry et al. 1975; Perry and Scherrer 1975; Furuichi et al. 1975; Wei et al. 1976; Adams and Corey 1975), and Drosophila (Levis and Penman 1978).

150 Joseph A. Bokar

mRNAs isolated from lower eukaryotes such as S. cerevisiae (Sripati et al. 1976), Dictyostelium (Dottin et al. 1976), and Neurospora (Seidel and Somberg 1978) were traditionally thought not to contain detectable amounts of m6A, although it is now known that m6A is present in S. cerevisiae mRNA isolated from yeast growing meiotically, but not mitotically (this will be discussed in detail below). This modification is present in viral RNAs that replicate in the nucleus (adenovirus, SV40, herpes viruses, Rous sarcoma virus, influenza) (Somner et al. 1976; Aloni et al. 1979; Canaani et al. 1979; Bartuski and Roizman 1978; Beemon and Keith 1977; Narayan et al. 1987), but not in the RNA from viruses that do not have a nuclear phase in their life cycle (Sindbis virus, vaccinia, reovirus, vesicular stomatitis virus) (for review see Narayan and Rottman 1992). Analysis of the methylated constituents of mouse L cell, Novikoff hepatoma cell, and HeLa cell mRNA has revealed that approximately 50% of incorporated methyl label is in m6A (Fig. 1), making it the most abundant modified nucleoside in mRNA. The average content of m6A has been estimated to be 3-5 residues per mRNA in Novikoff, HeLa, mouse L cell, and Chinese hamster ovary (CHO) cell mRNA. The number of m6A residues in viral RNAs ranges from 1-15 per RNA, depending on the virus. Although m6A is present at an average of 3-5 residues per mRNA in mammalian cells when total polyadenylated RNA is analyzed, its distribution among individual mRNAs is not uniform. Examples of m6A abundance in specific cellular mRNAs are approximately one residue in bovine prolactin mRNA (Horowitz et al. 1984) and three residues in mouse dihydrofolate reductase mRNA (Rana and Tuck 1990). Some cellular mRNAs, including bovine growth hormone (Horowitz et al. 1984) and mouse globin (Perry and Scherrer 1975), appear not to contain m6A, indicating that this modification is not an absolute requirement for mature mRNA function.
3.3.1 Sequence specificity of m6A in vivo

The distribution of m6A within an RNA molecule is non-random, and it is not found within the poly(A) tract (Desrosiers et al. 1975; Perry et al. 1975a). m6A occurs only within sequences matching PuAC (where A is the methylated residue) in all organisms tested including plants and mammals (Schibler et al. 1977; Wei and Moss 1976; Nichols 1979; Bartkuski and Roizman 1978; Dimock and Stoltzfus 1977). Using a variety of specific ribonucleases in combination with column and thin layer chromatography, Schibler et al. (1977) analyzed [3H-methyl]labeled mRNA from mouse L cells. They demonstrated that there is a preferred five-nucleotide consensus sequence, PuPuACH (where H=A,C,U) for methylation. Based upon the frequency with which this degenerate sequence occurs and the estimated m6A content of mRNA, it is clear that only a minority of sequences matching the consensus actually are methylated. This suggests that there are other constraints in addition to the primary sequence that affect m6A formation. Specific m6A sites have been mapped in only two individual mRNAs. Rous sarcoma virus (RSV) virion RNA contains 12 m6A residues per RNA molecule of 9500 nucleotides (Beemon and Keith 1977). A transformation-deficient mutant of RSV, lacking the src sequence, contains only 7 m6A residues, suggesting that 5 m6A residues are contained within the src sequence. In contrast, the gag and pol

Fig. 1. HPLC of HeLa RNA. HeLa cells grown in methionine-deficient medium were labeled with [3H-methyl]methionine. Polyadenylated RNA was prepared, digested with ribonuclease P1 and nucleoside pyrophosphatase, and dephosphorylated with alkaline phosphatase. Nucleosides were separated by HPLC on a Supelcosil LC-18-S column eluted isocratically with 7.5% methanol/30 mM sodium phosphate pH 5.3.

sequences did not contain m6A. The individual sites were further localized by hybrid selection of RNA fragments, followed by T1 RNase digestion and two-dimensional fingerprinting. Thirteen m6A sites were precisely mapped (Kane and Beemon 1985; Csepany et al. 1990). All of these sites matched the sequence PuGACU, in agreement with the consensus sequence suggested by the analysis of heterogeneous mRNA. An important feature of m6A occurrence in viral and cellular mRNA was also documented in these studies: the modification occurs nonstoichiometrically, i.e. individual sites are methylated on only a portion of the mRNA molecules, ranging from 20-90% for these RSV sites. The only cellular mRNA for which individual m6A sites have been mapped is that encoding bovine prolactin (bPRL) (Horowitz et al. 1984; Narayan et al. 1994). Unlike viral mRNA, labeling steady-state mRNA populations in eukaryotic cells with 32P-orthophosphate or [3H-methyl]methionine results in the incorporation of low amounts of radioactivity into individual mRNAs. Therefore, for these studies, unlabeled bPRL mRNA from pituitary was first purified by hybrid selection, followed by T1 nuclease digestion. Individual T1 fragments were then isolated and digested to 3’-mononucleotides, and the resulting nucleotides were then 5’-labeled in vitro with [γ-32P]ATP and T4 polynucleotide kinase. The 3’ phosphates were removed using P1 nuclease and the resulting nucleosides were analyzed for the presence of m6A by TLC. Hybrid selection of fragments of the bPRL mRNA localized the predominant m6A site to the 3’-terminal 130 nucleotide

fragment. By digestion of this hybrid-selected fragment with T1 ribonuclease and analysis of each T1 fragment for the presence of m6A, the m6A site was localized to an AGACU consensus sequence in the 3’-untranslated region. The extent of methylation at this site was estimated to be approximately 20%, also illustrating the lack of stoichiometry of m6A formation at individual sites.

3.3.2 Sequence specificity of m6A in vitro

The development of a cell-free system capable of accurately catalyzing N6-adenosine methylation made it possible to study further the methyltransferase activity and the RNA features required for an efficient methylation substrate (Narayan and Rottman 1988). Briefly, nuclear extracts prepared from cultured HeLa cells were used as an enzyme source. The substrates were synthetic RNA fragments derived from the bPRL mRNA sequence and [3H-methyl]AdoMet. Using this system, the same adenosine (within the AGACU consensus) was methylated on the synthetic transcript as was shown to be methylated in vivo. Harper et al. (1990) examined short (20 nucleotide) synthetic RNA substrates containing single or multiple consensus methylation sequences, or single nucleotide mutations derived from the consensus sequence. This study confirmed that the in vitro specificity closely paralleled the frequency with which sequences are methylated in vivo as originally described by Schibler et al. (1977). From the localization studies in total cellular mRNA and individual mRNA and the results of the in vitro studies described above, it appears that the five-nucleotide consensus sequence is the primary determinant for m6A formation. However, because the expected frequency of the degenerate five-nucleotide consensus methylation sequence far exceeds the m6A content of mRNA, and because identical sequences are methylated to varying degrees in vivo (e.g. in RSV), the effects of RNA context were further explored. Two regions of PRL mRNA were examined in vitro (Narayan et al. 
1994): the major methylation site (AGACU) found in the 3’-untranslated region and the 5’-terminal 192 nucleotides of bPRL mRNA, which are normally unmethylated. Mutation of the normally strong AGACU sequence in the 3’-untranslated region of PRL mRNA to the generally weaker AAACU sequence surprisingly resulted in enhanced methylation at this site, whereas mutation of the wild type AGACU to several non-consensus sequences resulted in methylation of a nearby cryptic AAACA site, located adjacent to the normal site of methylation, to levels similar to the wild type sequence. The same series of substrate sequences was placed in a different context, namely the 5’ 192 nucleotide bPRL fragment. Interestingly, the relative order of efficiency with which the individual sequences were methylated differed between the context of the 5’-fragment and that of the normal 3’-site. Finally, a synthetic 20 nucleotide fragment of RNA complementary (antisense) to the 3’-terminal PRL methylation site was used in the in vitro system to demonstrate that a consensus m6A site present in duplex structure was incapable of being methylated. This series of experiments demonstrates a number of interesting points regarding the RNA substrate: The nucleotide sequence of the substrate is degenerate, as was expected. The efficiency with which a given sequence is methylated appears

to depend on both the primary sequence and the context within which the sequence occurs. A sequence which is normally methylated efficiently cannot be methylated when the consensus region is in a stable RNA duplex. The abolishment of a methylation site by mutation of the potentially modified adenosine residue can lead to the enhanced methylation of adjacent consensus sites that are otherwise not methylated. These data suggest that methylation of a strong site somehow inhibits additional methylation of the adjacent weaker sites. The mechanism by which this occurs is unknown, but may explain how the loss of a methylation site by site-directed mutagenesis may lead to compensatory use of an adjacent cryptic site. This phenomenon has been observed for bPRL mRNA sequences both in vitro (as above) and in vivo, as discussed below. There are no published reports of studies directly testing the hypothesis that RNA secondary structure is a determinant for an efficient m6A site. However, examination of the sequence surrounding the major m6A site in bPRL mRNA (Fig. 2A) and two sites in RSV RNA (Fig. 2B, 2C) has suggested that each of these methylation sites may be displayed on the loop of a stem-loop structure. Regions of RNA sequence surrounding these sites were analyzed for potential secondary structures by computer using the method of Zuker (Jaeger et al. 1989a, 1989b). In each instance evaluated, the m6A site is predicted to lie within the loop of a stem-loop structure. Although it is important to note that this prediction has not been experimentally verified, such structural features could contribute to the context effects described above.

3.4 Function of m6A in mRNA

Despite three decades of study of m6A in mRNA, the biological function of this ubiquitous post-transcriptional modification remains unclear. Historically, two general approaches were used by several groups in an effort to define the role of m6A in mRNA biogenesis and function. 
First, mutations of the major methylation sites within RSV and bPRL derived expression constructs have been studied to determine the effects on mRNA biogenesis. Second, cells grown in culture have been treated with methylation inhibitors to explore the effects of mRNA undermethylation. These experiments suggest that m6A affects the efficiency of pre-mRNA splicing or transport of mRNA from the nucleus to the cytoplasm. However, for reasons to be discussed, inherent limitations of both lines of experimentation have led to somewhat inconclusive results. Finally, genetic manipulation of yeast and mammalian cells to “knock out” expression of the methyltransferase gene illustrates that loss of m6A leads to dramatic cellular changes, but the downstream molecular mechanisms by which this occurs remain obscure.

3.4.1 Mutation of m6A sites

RSV RNA contains a cluster of seven m6A sites in the region of the src and env genes (Beemon and Keith 1977). Kane and Beemon (1987) generated a mutant virus containing point mutations at two of these sites located just downstream of the

Fig. 2. Potential secondary structure of mRNAs containing m6A sites. Partial sequences of mRNA from bPRL and RSV surrounding selected m6A sites were analyzed for potential secondary structures using the method of Zuker (Jaeger et al. 1989a, 1989b). The methylation consensus sequences are highlighted in gray; the methylated adenosyl residue is shown in larger type. A. 62 nucleotides of sequence corresponding to the 3’-terminus of the bPRL mRNA. This is also the sequence of the RNA used in in vitro assays for the purification of m6A-MT (Bokar et al. 1994, 1997). B. 105 nucleotides of sequence corresponding to the first methylation site in the RSV src gene at position 7414 (Kane and Beemon 1985). C. 42 nucleotides of sequence corresponding to the methylation site at position 6718 in the RSV env gene (Kane and Beemon 1985).

src splice acceptor site, and just upstream of a potential or “cryptic” splice acceptor site. Mutation of two GAC sites to GAU sequences rendered them incapable of being methylated. These mutations did not lead to any differences in steady-state levels of src mRNA in mutant vs. wild type virus in infected CEF cells. No differences in the levels of src mRNA in the nuclear fraction were detected either, and no unusual spliced products were observed. Furthermore, there was no difference in the translation of the src protein, packaging of RNA into virions, or infectivity of mutant virus compared to wild type. Csepany et al. (1990) extended this study by evaluating an additional two m6A sites in a similar fashion. Mutation of the four major m6A sites in the src gene did not have an appreciable effect on steady-state levels of viral RNA nor on viral infectivity. Several explanations for the lack of an effect of these methylation mutations have been proposed. If methylation affects the kinetics of processing or transport, it may be necessary to study only newly synthesized mRNA by pulse-labeling studies in order to observe the effect. Alternatively, the multiple methylation sites may be redundant for function; therefore, all would need to be removed for an effect to be observed. This level of mutation was not studied. Finally, compensatory increases in methylation at nonmutated sites may have occurred in the mutant viral mRNA, but were not detected. Carroll et al. (1990) documented the presence of m6A within an intron-encoded region of an RNA transcribed from a bPRL mini-gene. This mini-gene contained the last two exons and intervening intron and was expressed in stably transfected CHO cells. Following hybrid selection of this RNA, analysis of ribonuclease T1-generated oligonucleotides from the intron-encoded region revealed the presence of m6A in three oligonucleotides containing sites matching the five nucleotide

consensus sequence. In subsequent studies, these methylation sites were mutated to determine if any alteration in splicing pattern or efficiency would result from hypomethylation of the intron. Surprisingly, these mutations led to an apparently compensatory increase in methylation of nearby consensus sequences that were normally not methylated (T. Kienzle, F. Rottman, unpublished results). Therefore, it was impossible to study the effect of hypomethylation of intron D using this approach.

3.4.2 Studies utilizing methylation inhibitors

A series of studies evaluating the effects of methylation inhibitors on mRNA synthesis, translation, and stability are described in the literature. These studies collectively suggest that undermethylation affects the efficiency of pre-mRNA splicing and/or transport. However, these studies uniformly suffer from the limitation that the observed effects may be due to inhibition of methylation events other than m6A in mRNA. In addition, they do not allow the discrimination of effects mediated by inhibition of mRNA cap vs. internal methylation, even though partial selectivity for m6A inhibition over cap methylation is reported with some inhibitors. Bachellerie et al. (1978) studied the effect of cycloleucine treatment on CHO cell mRNA formation. Cycloleucine is a competitive inhibitor in vitro of methionine adenosyltransferase, and rapidly blocks the synthesis of AdoMet in vivo. It acts as a potent and reversible inhibitor of nucleic acid methylation. In this study, high concentrations of cycloleucine were used during a 20 min. pulse labeling with [3H-methyl]methionine or [3H]uridine. This treatment resulted in a 96% decrease in 3H-methyl incorporation into polyadenylated RNA, and the decay of this hypomethylated polyadenylated RNA from the nucleus was dramatically prolonged during a subsequent chase period in the absence of the inhibitor. 
This suggested that methylation of mRNA affects the efficiency of processing or transport, but does not discriminate between the effect of inhibiting cap methylation vs. internal methylation. Also, the possibility that the effect reflects inhibition of methylation of molecules other than the polyadenylated RNA (e.g. proteins, snRNAs) that may be important for mRNA processing could not be ruled out. The short window of exposure to the methylation inhibitor, however, makes it unlikely that the effects are mediated via inhibition of DNA, rRNA, or snRNA methylation, because of the presumed relatively long half-lives of these molecules. Stoltzfus and Dane (1982) studied the effect of cycloleucine treatment on B77 avian sarcoma virus RNA. The concentration of cycloleucine used in these studies significantly inhibited m6A and cap-associated 2’-O-methylation, but not m7G formation. A striking difference was observed in the size distribution of the virus-specific RNA when compared with RNA from the untreated control cells. Subgenomic env RNA did not accumulate, whereas genome-length RNA accumulated in larger than normal amounts. An increase in synthesis of a protein encoded by the genome-length RNA (Pr76gag) and a decrease in synthesis of a protein encoded by a spliced sub-genomic RNA (gPr92env) were also observed, consistent with the notion that alterations in RNA methylation can ultimately result in altered gene expression by altering the efficiency of pre-mRNA splicing.

In order to dissect the effect of internal methylation from cap methylation, Finkel and Groner (1983) utilized lower concentrations of cycloleucine that decreased m6A content by >90% with minimal effect on the extent of cap methylation. In infected BSC-1 cells, hypomethylated SV40 late mRNA accumulated in the cytoplasm at much lower levels. There was no measurable difference in the stability of the cytoplasmic mRNA; therefore, this study supported an effect of m6A on processing or transport. Camper et al. (1984) utilized another inhibitor, S-tubercidinylhomocysteine (STH), which is a structural analog of AdoMet and a potent inhibitor of AdoMet-dependent methyltransferases. In this study, the effects of the inhibitor on cap methylation and internal methylation were closely monitored. At an inhibitor concentration where internal m6A was inhibited by 80%, and cap methylation was inhibited by 50-60%, there was no measurable change in the half-life of HeLa mRNA, but STH caused a significant lag in the time of cytoplasmic appearance of newly synthesized polyadenylated RNA (Fig. 3A). Carroll et al. (1990) extended this result by studying the effect of another methylation inhibitor, neplanocin, on the processing of a specific mRNA transcribed from a transfected mini-gene. A stable cell line was generated that expressed a transfected bPRL mini-gene, which contains a single intron. Treatment of these cells with the inhibitor resulted in increased levels of unspliced mini-gene pre-mRNA in the nucleus relative to untreated cells (Fig. 3B). The effects of the inhibitor on cap methylation relative to internal adenosine methylation were also determined, and found to be nearly identical to those reported by Camper et al. (1984) using STH. In summary, these studies, utilizing different methylation inhibitors in a variety of cell lines, yielded fairly consistent results, i.e. inhibition of methylation leads to an apparent accumulation of unspliced pre-mRNA in the nucleus. 
The studies do not differentiate between a block in splicing only, or in both splicing and transport, and they do not definitively differentiate between effects at the level of internal methylation vs. cap methylation. Furthermore, each of these studies is also subject to the criticism that the observed effects could be mediated by hypomethylation of factors (protein or nucleic acid) other than mRNA, although the short time of exposure to the inhibitor in several of the studies (20-90 min.) partially decreases this concern. It has been apparent for a number of years that further elucidation of the role of m6A in mRNA depended upon developing new tools or techniques that would allow targeted disruption of m6A formation in mRNA. For this reason the focus shifted toward development of a cellular or animal system in which m6A is specifically targeted for disruption using genetic techniques.
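
The argument made in sections 3.3.1 and 3.3.2, that matches to the degenerate PuPuACH consensus far outnumber the m6A residues actually found, can be illustrated with a short calculation. The following Python sketch is only an illustration and not a method used in any of the studies cited: the function names, the toy sequence, and the assumption of equal frequencies for the four bases are all hypothetical choices made here. It locates candidate adenosines in a sequence and estimates how many consensus matches would be expected by chance in an RNA of a given length.

```python
import re

# Consensus from the text: Pu = A or G, H = A, C or U.
# A lookahead is used so that overlapping matches are all reported.
CONSENSUS = re.compile(r"(?=([AG][AG]AC[ACU]))")


def find_consensus_sites(rna):
    """Return 0-based positions of the candidate methylated A
    (the third base of every PuPuACH match)."""
    rna = rna.upper().replace("T", "U")
    return [m.start() + 2 for m in CONSENSUS.finditer(rna)]


def expected_sites(length, p_base=0.25):
    """Expected number of PuPuACH matches in a random sequence.

    Assumes independent positions and equal base frequencies
    (an illustrative assumption, not a measured composition).
    """
    p_match = (2 * p_base) * (2 * p_base) * p_base * p_base * (3 * p_base)
    windows = max(length - 5 + 1, 0)
    return windows * p_match


# Hypothetical toy sequence containing GGACU and AAACA.
print(find_consensus_sites("GGACUAAACA"))  # [2, 7]

# An RNA of 9500 nucleotides, the length quoted for RSV virion RNA.
print(round(expected_sites(9500)))  # 111
```

Under these simplifying assumptions roughly one position in 85 matches the consensus, so an RNA the size of RSV virion RNA would be expected to carry on the order of a hundred matching sites, compared with the 12 m6A residues reported for it. The estimate is crude, but it shows why primary sequence alone cannot account for site selection.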

4 Characterization and purification of HeLa mRNA N6-adenosine methyltransferase

A rapid assay for m6A formation and the development of an in vitro system for studying m6A formation in nuclear extracts facilitated the study of this methyltransferase (Narayan and Rottman 1988; Narayan et al. 1994; Bokar et al. 1994).

Fig. 3. Effects of methylation inhibitors on the cytoplasmic appearance of newly synthesized RNA and on nuclear splicing of a bPRL derived pre-mRNA. A (left). HeLa cells were prelabeled with a low level of [14C]uridine (0.48 mCi/mmol) for 12 h, treated with 500 µM STH (open circles) or no STH (closed circles) for 90 min., and then labeled with [3H]uridine (40 mCi/mmol). Cytoplasmic poly(A+) RNA and non-polyadenylated RNA were prepared at various times. The ratio of [3H]uridine to [14C]uridine in cytoplasmic poly(A+) RNA and non-polyadenylated RNA is shown. Appearance of newly synthesized polyadenylated RNA was delayed in STH treated cells as compared to control cells. No difference was seen for non-polyadenylated RNA (inset). (Reproduced from Camper et al. 1984, with permission.) B (right). Quantitative S1 nuclease mapping of nuclear bPRL precursor and mature-form RNA in stably transfected cells treated with the methylation inhibitor neplanocin (NPC). An autoradiogram showing the DNA fragments protected by 5 µg of total nuclear RNA isolated from cells treated with 10 µM NPC for 8 h (+NPC) and untreated cells (control). The lane labeled mock contains probe that was hybridized in the absence of RNA. (Reprinted from Carroll et al. 1990, with permission).

Briefly, a synthetic RNA comprising 60 nucleotides of the 3’-terminus of bovine prolactin mRNA was used as the substrate for methyl transfer from [3H-methyl]AdoMet. HeLa nuclear extract was used as an enzyme source. All of the incorporated 3H-cpm was shown to be in m6A by HPLC, and at the expected adenosine residue by T1 fingerprinting. An unanticipated result upon initial attempts at purification of the enzyme was the finding that multiple separable components are required for m6A-MT activity in vitro (Bokar et al. 1994). Fractionation of HeLa nuclear extract on a DEAE Sepharose column led to a flow-through and a bound fraction, neither of which contained significant m6A-MT activity when assayed individually. Full activity was restored when an aliquot of each was included in the reaction mixture. The

factor or component that is initially in the flow-through fraction has been termed MT-A, while the factor that binds to the DEAE column has been named MT-B.

4.1 Purification and cDNA cloning of the AdoMet-binding subunit

Based upon its sedimentation in glycerol gradients, and its mobility on a size exclusion column, MT-A has an apparent Mr of 200 kDa (Bokar et al. 1994). Aliquots of column fractions that contained MT-A activity were treated by UV-crosslinking with [3H-methyl]AdoMet and then analyzed by SDS-PAGE and fluorography. A labeled protein with Mr of 70 kDa was observed only in the fractions that contained MT-A activity, and its elution paralleled that of the enzymatic activity on multiple columns. This protein, termed MT-A70, was, therefore, a likely candidate for the AdoMet-binding subunit of the m6A-MT. MT-A70 was purified on a preparative scale using FPLC by following enzymatic activity and crosslinking activity, and then purified to homogeneity by SDS-PAGE. After identification of a full-length cDNA encoding MT-A70, recombinant protein was expressed and used to generate specific antisera. These antisera can deplete m6A-MT activity from HeLa nuclear extract (Bokar et al. 1997), and recognize an immunoreactive protein on Western blots of column fractions that co-elutes precisely with MT-A activity. This series of findings has led to a model in which MT-A70, a 70 kDa protein that contains an AdoMet-binding site, is a subunit of the 200 kDa factor MT-A. MT-A is necessary, but is not sufficient, for m6A-MT activity, as the presence of the very large MT-B component is also always required. These results indicate that sequence specific methylation of internal adenosine residues in mRNA occurs in a large multicomponent complex, not unlike many of the other nucleoside modifications and other post-transcriptional modifications of RNA detailed throughout this volume. 
4.2 Further characterization of MT-B

MT-B has not been as extensively characterized as MT-A. By glycerol gradient sedimentation and size exclusion chromatography it has an apparent Mr of 875 kDa (Bokar et al. 1994). Unlike MT-A, MT-B binds to single-stranded DNA cellulose, suggesting that it may participate in RNA substrate binding of the enzyme complex. The large size of MT-B and its chromatographic properties led to the suspicion that it may itself contain an RNA component, such as a guide RNA. However, treatment of nuclear extract with micrococcal nuclease and immunoprecipitation with an anti-trimethylguanosine antibody did not lead to any change in m6A-MT activity, or in the apparent size of MT-B as judged by its elution on the size exclusion column (unpublished result). Nevertheless, the involvement of snRNAs and snoRNAs in splicing, rRNA methylation, pseudouridine formation, and most recently in snRNA nucleoside modification, continues to make the involvement of a small RNA cofactor an intriguing possibility. If an RNA component is required, it must be inaccessible to micrococcal nuclease and either does

not have a trimethylguanosine cap, or the cap is inaccessible to antibody. Clearly there are precedents for these possibilities. More direct experiments aimed at detecting an RNA cofactor will need to be performed to test this hypothesis.

4.3 Subnuclear localization of MT-A70 in HeLa cells

With the use of antibodies and specific nucleic acid probes, it has become increasingly apparent that many nucleoplasmic components are non-randomly located throughout the nucleus (Fakan 1994; Misteli and Spector 1996). Of recent interest is the observation that apparently disparate components of the mRNA transcription and processing machinery can be co-localized and specifically interact. For example, the large subunit of RNA polymerase II has been shown to co-localize with the splicing protein SC35 in nuclear “speckles” (interchromatin granule clusters) and the physical interaction of these components has been documented by immunoprecipitation (McCracken et al. 1997). The TRM1 gene product (yeast tRNA (N2,N2-dimethylguanosine26)-methyltransferase) localizes to the yeast nuclear periphery with a ring-like appearance by immunofluorescence (Rose et al. 1995). In an experiment to explore the possible distribution of MT-A70 among discrete structural domains of HeLa cell nuclei, the anti-MT-A70 antiserum was used to probe paraformaldehyde-fixed HeLa cells. The results shown (Fig. 4) demonstrate a striking localization of MT-A70 to speckles in these human interphase nuclei. Co-localization of U2B″ protein, which is an snRNP protein and is used as a marker for speckles, clearly demonstrates that these two proteins are found in identical regions of the nucleus. This distribution is similar to that found for the large subunit of RNA polymerase II and splicing factor SC35, and is in contrast to the pattern seen with the TRM1 gene product. This is an exciting result which suggests that mRNA m6A-MT may indeed be associated with nuclear pre-mRNA splicing components. 
4.4 MT-A70 is the prototype of a previously undescribed class of putative RNA adenosine methyltransferases in a wide variety of organisms

Bujnicki et al. (2002) subjected MT-A70 to extensive bioinformatics analysis in silico. They showed that MT-A70 is a prototypical member of a family that comprises four subfamilies of predicted orthologs (Fig. 5). The predicted consensus structure is characterized by a circularly permuted sequence arrangement that has previously been identified only in prokaryotic DNA:m6A and DNA:m4C methyltransferases (Malone et al. 1995). The family of related proteins contains not only probable methyltransferases, but also other proteins that appear to have mRNA regulatory functions through mechanisms that apparently do not involve methylation. The most striking relationship that was identified prior to this analysis was the similarity of human MT-A70 to S. cerevisiae IME4 (Bokar et al. 1997), despite the fact that early work indicated that S. cerevisiae mRNA did not contain

160 Joseph A. Bokar

Fig. 4. MT-A70 localizes to speckled domains in human interphase nuclei. Double labeling of anti-MT-A70 and anti-U2 B” (a U2 specific snRNP protein) was performed on paraformaldehyde-fixed HeLa cell nuclei. MT-A70 was detected using polyclonal rabbit antiserum 12622 and FITC-conjugated anti-rabbit IgG (panel A). U2 B” protein was detected with monoclonal antibody 4G3 and Texas Red-conjugated anti-mouse IgG (panel B).

Fig. 5. Phylogenetic tree of MT-A70 orthologs. Bujnicki et al. (2002) subjected the sequence of MT-A70 to extensive in silico analysis to identify orthologous and paralogous proteins. Four subfamilies were identified, including putative RNA N6-adenosine methyltransferases in a variety of eukaryotic organisms (shown here), and three other subfamilies with varying degrees of interrelatedness (not shown). All four subfamilies have a predicted consensus fold that is characterized by a permuted topology previously found only in the β class of DNA methyltransferases.

m6A. IME4 is a key gene in the highly regulated pathway that leads to meiosis and sporulation in S. cerevisiae (Shah and Clancy 1992). This sequence similarity raised the possibility that yeast mRNA does contain m6A, but only under nutritional conditions conducive to high expression of the IME4 gene (see below).


A common feature of the members of this predicted family of proteins is the presence of all of the methyltransferase motifs in the conserved carboxyl termini, whereas the amino termini are highly diverse and are conserved only within individual subfamilies. A similar relationship holds for bacterial and mammalian DNA:m5C methyltransferases, in which the mammalian members differ from their bacterial counterparts by containing large amino-proximal regions that play regulatory roles (Robertson 2001). The exception is that the MT-A70 orthologs also contain a conserved, bipartite nuclear localization signal within their otherwise non-conserved amino termini. This is functionally consistent with the experimental finding that MT-A70 is a nuclear protein (Fig. 4), and with the observation that only mammalian viral RNAs that have a nuclear phase in their life cycle contain m6A (Narayan and Rottman 1992). The fold-recognition alignment of MT-A70 and its orthologs yields testable structure-function predictions. The side chains of F533, D376, E531, and H543 should participate in AdoMet binding, while the conserved DPPW397 motif should play a role in catalysis. This hypothesis is tested indirectly in the experiments described below.

5 IME4 is the S. cerevisiae ortholog of MT-A70

Older data suggested that m6A was not present in the mRNA of the yeast S. cerevisiae (Sripati et al. 1976). Prior to the widespread availability of genomic data, IME4 was the only known gene with any extensive homology to MT-A70. This sequence similarity raised the possibility that yeast mRNA does contain m6A, but only under nutritional conditions conducive to high expression of the IME4 gene. IME4 is a key gene in the highly regulated pathway that leads to meiosis and sporulation in S. cerevisiae (Shah and Clancy 1992). IME4 mRNA levels are very low in exponentially growing cells and are greatly elevated during sporulation and in stationary phase cells (Gasch et al. 2000). Sporulation occurs only in the diploid MATa/MATα cell type, and requires nitrogen starvation and a respirable carbon source. Under these conditions, expression and activation of the IME1 gene and its protein product lead to the initiation of a complex cascade of gene expression (Chu et al. 1998) that culminates in the formation of four haploid ascospores within the confines of the original mother cell. The IME4 gene is important for IME1 transcript accumulation as well as for downstream events, and is essential for meiosis and sporulation in standard laboratory strains. The mechanism by which Ime4p exerts its effects was previously not known, although it did not appear to function as a transcriptional activator or repressor in tethering assays (B. Jursic, Y. Wang, M. Clancy, unpublished), despite early reports that it was a transcriptional activator.


Fig. 6. Methylated nucleoside content of total RNA isolated from sporulating and non-sporulating yeast. A. Yeast grown in methionine-deficient medium were labeled with [methyl-3H]methionine. Total RNA was prepared and analyzed by HPLC as in Figure 1. The positions of the predominant 2’-O-methylribonucleosides (Nm) are shown, as is the position of the much smaller m6A peak. The data shown represent the mean of at least nine separate experiments. (For experimental details, see Clancy et al. 2002). B. The same data are replotted on an expanded scale to better illustrate the m6A peak. C. Methylated nucleoside content of polyadenylated RNA isolated from sporulating and non-sporulating yeast. Total RNA was prepared as before, but the preparations were scaled up tenfold to yield approximately 100 µg of 3H-labeled RNA. From this, approximately 1 µg of polyadenylated RNA was isolated, containing roughly 1000 cpm 3H, and was analyzed as above. All data were standardized to account for differences in the total counts loaded onto the column. The curves represent the mean values from three independent experiments.

5.1 m6A is present in mRNA isolated from sporulating yeast

The close sequence similarity between the IME4 gene and MT-A70 led to the hypothesis that Ime4p activates meiosis and spore formation by a mechanism that involves the formation of m6A in yeast RNA, possibly mRNA (Clancy et al. 2002). To test this hypothesis, yeast cells were grown in methionine-free SD medium or in sporulation medium and were labeled using L-[methyl-3H]methionine. Labeled total RNA was then treated with nuclease P1, nucleotide pyrophosphatase, and alkaline phosphatase. This treatment would be expected to yield primarily mononucleosides. Any resistant polynucleotides would not interfere with the detection of m6A, as these charged moieties should elute very quickly from the Supelcosil LC-18-S column. No difference was seen in the amount of radioactivity in the peaks corresponding to the predominant methylated nucleosides (Cm, Um, Gm, and Am, Fig. 6A). Most of the less common methylated nucleosides would be expected to elute before ten minutes, and would be indistinguishable from the Cm, Um, and Gm peaks. The m6A peak, as expected, represented a tiny fraction of the total 3H cpm, but was easily distinguished from the predominant methylated nucleosides by its relatively long retention time. There was in fact an increase in the amount of m6A present in the RNA prepared from the sporulating yeast cells as compared to the non-sporulating cells (Fig. 6B). Although the absolute amount of the modified nucleoside was small compared to the others, the relative difference in the amount of m6A was significant (1.6-fold).

An identical analysis was performed for polyadenylated RNA. m6A, albeit in low amounts, was clearly present in the polyadenylated RNA isolated from sporulating cells, while no m6A peak was detected in polyadenylated RNA prepared from cells growing under non-sporulating conditions (Fig. 6C). The ratio of the m6A peak to the Am peak provides evidence that the m6A seen in mRNA prepared from sporulating yeast was not simply carryover from contaminating rRNA. In total RNA, the m6A peak is approximately 0.01 times the height of the Am peak, whereas in the polyadenylated samples, the m6A peak and the Am peak are roughly equivalent. These results support the hypothesis that induction of IME4, by the appropriate genetic and nutritional signals, leads to N6-adenosine methylation in some or all polyadenylated RNA.

An additional, unanticipated finding was also evident in these experiments. The amounts of the 2’-O-methylated nucleosides Um, Gm, and Am were substantially and consistently reduced in the polyadenylated RNA fraction from sporulated cells as compared to the control cells. It is likely that the amount of Cm was also reduced, although the Cm peak is not distinguishable from the m7G peak. This suggests that, in addition to the appearance of m6A in the mRNA of yeast during sporulation, there is a significant decrease in the 2’-O-methylation of other nucleosides. However, Sripati et al. (1976) reported that there are no other methylated nucleosides in yeast mRNA, so the significance of this finding is not known.

5.2 IME4 is necessary for m6A formation in sporulating yeast

To test whether the appearance of m6A depended on IME4, a similar experiment was done using SK1 cells containing an ime4 null mutation. SK1-ime4 cells were labeled with L-[methyl-3H]methionine exactly as in the experiments above, in methionine-free SD medium and in sporulation medium. No m6A peak was detected in the mRNA from the SK1-ime4 cells, either in control medium or in sporulation medium. Interestingly, the changes in the peaks corresponding to the 2’-O-methylnucleosides were observed just as they were for the wild type SK1 cells (Fig. 7). This suggests that IME4 is necessary for N6-adenosine methylation in the mRNA of sporulating yeast, and also indicates that the putative changes in cap methylation are independent of Ime4p activity and of the loss of m6A.
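The carryover argument of section 5.1 can be made concrete with a small calculation. The sketch below is illustrative only: it uses the approximate peak-height ratios quoted in the text (m6A/Am of about 0.01 in total RNA and about 1 in polyadenylated RNA), and the function names and arbitrary peak-height units are assumptions introduced here, not part of the original analysis.

```python
def m6a_to_am_ratio(m6a_peak: float, am_peak: float) -> float:
    """Return the m6A/Am peak-height ratio for one HPLC profile."""
    if am_peak <= 0:
        raise ValueError("Am peak height must be positive")
    return m6a_peak / am_peak


def fold_enrichment(ratio_polya: float, ratio_total: float) -> float:
    """Fold change of the m6A/Am ratio in poly(A) RNA relative to total RNA."""
    if ratio_total <= 0:
        raise ValueError("total-RNA ratio must be positive")
    return ratio_polya / ratio_total


# Approximate values quoted in the text, in arbitrary peak-height units.
ratio_total = m6a_to_am_ratio(m6a_peak=0.01, am_peak=1.0)
ratio_polya = m6a_to_am_ratio(m6a_peak=1.0, am_peak=1.0)

enrichment = fold_enrichment(ratio_polya, ratio_total)
print(f"m6A/Am enrichment in poly(A) RNA: about {enrichment:.0f}-fold")
```

If the m6A in the polyadenylated fraction were simply carried over with contaminating rRNA, the m6A/Am ratio would be expected to stay close to the total-RNA value; a shift of roughly two orders of magnitude is what argues against carryover.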


Fig. 7. Methylated nucleoside content of polyadenylated RNA isolated from SK1-ime4 (null mutant). A. Comparison of methylated nucleosides from SK1-ime4 polyadenylated RNA prepared from the null-mutant yeast labeled in control medium and sporulation medium. B. Comparison of methylated nucleosides in polyadenylated RNA prepared from SK1-ime4 cells and wild type SK1 cells, both labeled in sporulation medium.

Table 2. Sporulation kinetics of strains containing motif IV mutations

                     Time after shift to sporulation medium (h)
Gene on plasmid (a)     3      8     12     29     48     72    144
Wild type             0.0    0.0   13.5   50.5   58.2   58.2   61.5
None                  0.0    0.0    0.0    0.0    0.0    0.0    0.0
D348A                 0.0    0.0    0.0    4.3    5.1    5.1    4.9
W351A                 0.0    0.0    0.0    0.2    0.0    0.1    0.0

Transformants of the ime4-deficient MC309 strain were grown in pre-sporulation medium and shifted to sporulation medium. The percentage of the cells that had formed visible spores was determined at intervals by bright-field microscopy. At least 600 cells were counted for each determination shown. For the W351A and D348A mutants, at least 900 cells were counted.
(a) Wild type and mutant genes were carried on the CEN-URA3 vector, pRS316.
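To compare how well each allele in Table 2 complements the ime4 deficiency, the final sporulation percentages can be normalized to the wild type value. The following is a minimal sketch using the 144 h values from the table; the dictionary layout and the function name are assumptions made for illustration.

```python
# Percentage of cells with visible spores at 144 h (values from Table 2).
final_sporulation = {
    "Wild type": 61.5,
    "None": 0.0,
    "D348A": 4.9,
    "W351A": 0.0,
}


def relative_to_wild_type(percentages: dict[str, float]) -> dict[str, float]:
    """Express each allele's sporulation as a fraction of the wild type value."""
    wild_type = percentages["Wild type"]
    if wild_type <= 0:
        raise ValueError("wild type sporulation must be positive")
    return {allele: value / wild_type for allele, value in percentages.items()}


for allele, fraction in relative_to_wild_type(final_sporulation).items():
    print(f"{allele:<10} {fraction:6.1%} of wild type")
```

On these numbers the D348A allele reaches roughly 8% of the wild type level and W351A is indistinguishable from the vector-only control.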

Fig. 8. RNA interference of MT-A70 expression and its effect on HeLa cell growth. Tandem transfections of HeLa cells with a pool of four RNAi duplexes directed against the MT-A70 mRNA were performed over seven days. At days 3, 5, and 7, aliquots of transfected cells were removed and analyzed for MT-A70 mRNA and protein levels. A. RNase protection assay showing reduction in the MT-A70-specific 169-nucleotide protected fragment. B. Western blot analysis of cellular protein showing disappearance of the MT-A70-specific band by day 7. The identity of the 100 kDa immunoreactive band is not known. C. Growth curves of cells transfected with the pooled MT-A70 RNAi, transfection reagent alone (mock) or a luciferase RNAi (luc). Data represent the mean +/- S.D. from at least six independent experiments.


These experiments do not address the possibility that IME4 could exert its regulatory effect on sporulation primarily through some other function that does not depend on its methyltransferase activity. Therefore, plasmids encoding either wild type IME4 or single amino acid substitutions within the consensus methyltransferase motif IV were transferred into another ime4-deficient yeast strain (MC309), and the ability of the plasmids to complement the sporulation deficiency of this strain was assessed. Table 2 shows the results of an experiment in which the kinetics of spore formation were compared in MC309-ime4 null mutant cells carrying plasmid-borne mutant D348A or W351A alleles vs. wild type and vector-only controls. Cells carrying the wild type allele began to form visible spores between eight and twelve hours after the shift to sporulation medium, as assessed by bright-field microscopy. By 24 hours, approximately 50% of the cells had sporulated, with final levels achieved by 48 hours. The mutant alleles, by contrast, were severely compromised in their ability to promote sporulation of the ime4-deficient strain.

How might mRNA methylation in yeast, catalyzed by Ime4p, lead to induction of the sporulation pathway? Microarray studies show that more than 500 mRNAs are present at elevated levels in sporulating cells, including some encoding proteins involved in general aspects of mRNA biogenesis, as well as many that promote sporulation-specific processes (Chu et al. 1998). The above experiments suggest that IME4 activity governs part of this global response to nutritional and cell type signals by facilitating a general process such as splicing, 3'-end formation, nuclear pre-mRNA turnover, export or localization of mature mRNAs, or translation efficiency. The discovery that m6A is induced in the mRNA of sporulating yeast, together with the finding that mutations within the putative catalytic methyltransferase motifs in IME4 severely attenuate sporulation, provides the strongest evidence yet that this nucleoside modification has a critical role in a gene regulatory pathway.

6 MT-A70 is critical for viability of mammalian cell lines

6.1 RNA interference transfection

The major impediment to determining whether MT-A70 (and m6A itself) has some critical function in metazoan cells has been the lack of a null system. This problem was finally overcome with the advent of RNA interference (RNAi) techniques. We showed that three tandem transfections over seven days with four pooled RNAi duplexes (SmartPool, Dharmacon Inc.) reduced MT-A70 to undetectable levels (Shambaugh and Bokar, submitted). A key to the success of this approach was to keep transfecting and passaging the cells so that they remained subconfluent throughout the seven days. With this approach, measured MT-A70 mRNA levels fell within three days of RNAi transfection, and were undetectable at day 7 (Fig. 8A). MT-A70 protein was slightly decreased at day 3, was decreased by about 75% by day 5, and was absent by day 7 (Fig. 8B). The significant lag between the decrease in mRNA level and the decrease in protein level suggests that the MT-A70 protein is relatively stable, although direct measurements of MT-A70 protein stability were not performed.

During the course of these RNAi transfection experiments, it became apparent that the cells transfected with the MT-A70 SmartPool RNAi began to grow more slowly toward the end of the experiment than control cells that were transfected with a negative control RNAi, or with the transfection agent alone. The growth curves start to separate at day 5, and there are half as many cells in the MT-A70 RNAi-transfected sample at day 7 compared to the mock-transfected or luciferase RNAi-transfected samples (Fig. 8C). These changes appear to be coincident with the disappearance of MT-A70 protein. In separate experiments performed to test the concentration dependence of the effect of the RNAi on cell growth, significant effects on cell growth were seen only under conditions in which MT-A70 expression was completely inhibited, indicating that cells can tolerate a significant reduction in MT-A70 level, but not a complete absence of MT-A70, without a significant change in growth rate.

6.2 Loss of MT-A70 leads to HeLa cell apoptosis

In order to determine whether the loss of MT-A70 led to cell cycle arrest, apoptosis, necrosis, or a combination of these, flow cytometry was performed. The proportions of cells in each sample that fell into sub-G1, G0/G1, S-phase, and G2 were estimated from the PI DNA content profiles of three independent experiments using ModFit LT (Verity Software House). There was no significant difference in the proportions of cells in G0/G1, S-phase, and G2 (not shown); however, the proportion of cells that had subdiploid DNA content was consistently two- to threefold higher in MT-A70 RNAi-transfected cells at all three time points compared to mock-transfected or luciferase RNAi-transfected cells (Fig. 9A).
This result indicates that the loss of MT-A70 does not lead to cell cycle arrest, but instead causes cell death.

Fig. 9. Inhibition of MT-A70 expression results in HeLa cell apoptosis. In order to determine whether the loss of MT-A70 led to cell cycle arrest, apoptosis, necrosis, or a combination of these, flow cytometry was performed. A. and C. Aliquots of cells were removed prior to transfection on days 3, 5, and 7, fixed in MeOH and stained with propidium iodide for analysis of DNA content by flow cytometry. There was no significant difference in the proportions of cells in G0/G1, S-phase, and G2 (not shown); however, the proportion of cells that had sub-diploid DNA content was consistently two- to threefold higher in MT-A70 RNAi-transfected cells at all three time points compared to mock-transfected or luciferase RNAi-transfected cells. B. and D. Double fluorescent labeling of unfixed transfected or control cells was analyzed by flow cytometry. Healthy cells stain for neither label (quadrant a); dead cells stain for both (quadrant c) or are excluded from the analysis by the flow cytometry light scatter windows. Cells undergoing apoptosis but still maintaining membrane integrity stain for Annexin V only (quadrant b). There is a significant increase in apoptotic cells in the sample transfected with RNAi targeting MT-A70, compared to mock-transfected cells or control cells transfected with RNAi targeting luciferase mRNA (an mRNA not found in HeLa cells).


PI/Annexin V double staining of unfixed cells was performed to distinguish between cell necrosis and apoptosis (Fig. 9B, 9C). An early event in apoptosis is translocation of the phospholipid phosphatidylserine from the inner face of the plasma membrane to the surface. Annexin V is a serum protein that has a high affinity for phosphatidylserine. Cell surface binding of fluorescent-labeled Annexin V-PE (Biovision), followed by fluorescent detection with flow cytometry, therefore serves as a sensitive assay for cells undergoing apoptosis. Additionally, propidium iodide staining of unfixed cells is a useful tool to discriminate viable cells (which exclude PI) from non-viable cells (which do not). Unaffected cells (Annexin V-/PI-), cells undergoing early apoptosis (Annexin V+/PI-), cells in late apoptosis (Annexin V+/PI+), and dead cells (Annexin V-/PI+) can thus be differentiated. No significant increase in early apoptotic cells was apparent at day 3, when MT-A70 levels remain at roughly 50% of normal. By day 5, there is about a twofold increase, and at day 7 a threefold increase, in cells undergoing early apoptosis, coincident with the disappearance of MT-A70 protein (Fig. 9D). There is also an approximately twofold increase in non-viable cells (those staining with PI) in the MT-A70 RNAi-transfected cells (not shown). The appearance of Annexin V+/PI- cells is a strong indicator that the mechanism of cell death due to MT-A70 depletion is apoptosis, not necrosis. The mechanism by which the loss of MT-A70 leads to apoptosis remains under investigation.
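The four staining classes described above amount to a simple two-marker decision rule. The sketch below illustrates that rule only; it assumes that fluorescence thresholds have already been applied so that each event is reduced to two booleans, and the function name and example events are hypothetical rather than taken from the original analysis.

```python
from collections import Counter


def classify_cell(annexin_v_positive: bool, pi_positive: bool) -> str:
    """Assign a cell to one of the four Annexin V / PI staining classes."""
    if annexin_v_positive and pi_positive:
        return "late apoptosis"   # Annexin V+/PI+
    if annexin_v_positive:
        return "early apoptosis"  # Annexin V+/PI-
    if pi_positive:
        return "dead"             # Annexin V-/PI+
    return "unaffected"           # Annexin V-/PI-


# Hypothetical gated events: (Annexin V positive?, PI positive?)
events = [(False, False), (True, False), (True, True), (False, True), (True, False)]

counts = Counter(classify_cell(annexin, pi) for annexin, pi in events)
for label, n in counts.items():
    print(f"{label}: {n}")
```

In practice the thresholds that define "positive" are set from unstained and single-stained controls, which is why the sketch takes booleans rather than raw intensities.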

7 Conclusion

m6A has been known to be a common feature of eukaryotic mRNA for many years. Mutational analyses of m6A sites in a few viral and cellular mRNAs failed to establish its function. Indirect assessments, which relied on methylation inhibitors with pleiotropic effects, suggested that m6A somehow affects the efficiency of pre-mRNA splicing or transport. Unfortunately, it was impossible to firmly establish that these effects were due exclusively to inhibition of m6A formation, and not to inhibition of cap methylation or of methylation of other RNAs or proteins involved in pre-mRNA processing. Surprisingly, the first direct evidence that m6A plays a gene regulatory role at a post-transcriptional level came from the yeast S. cerevisiae, which had been thought not to contain any m6A in its mRNA. Abolition of m6A via inactivation of the IME4 gene led to a block in the switch from mitotic growth to meiotic growth. In mammalian cells, m6A appears to be even more critical: complete inhibition of MT-A70 expression via RNAi leads to cell death by apoptosis. Both of these experiments provide strong support for the view that m6A in mRNA participates in critical gene regulatory functions, but the specific mechanisms by which it does so remain obscure. Additional studies will be necessary to establish whether these effects are mediated by m6A’s participation in global mRNA metabolism, such as splicing or transport, or whether it has more selective effects on only a subset of mRNAs. In conjunction with the well-established roles of m7G and the 2’-O-methylnucleosides contained within the mRNA cap structure, these observations illustrate the important functional roles that modified nucleosides in mRNA play in the post-transcriptional regulation of gene expression in eukaryotic organisms.


References

Adams BL, Morgan M, Muthukrishnan S, Hecht SM, Shatkin AJ (1978) The effect of “cap” analogs on reovirus mRNA binding to wheat germ ribosomes. J Biol Chem 253:2589-2595
Adams JM, Cory S (1975) Modified nucleosides and bizarre 5’-terminus of HeLa cell messenger RNA. Nature 255:28-33
Aloni Y, Dhar R, Khoury G (1979) Methylation of nuclear simian virus 40 RNAs. J Virol 32:52-60
Bachellerie JP, Amalric F, Caboche M (1978) Biosynthesis and utilization of extensively undermethylated poly (A)+ RNA in CHO cells during a cycloleucine treatment. Nucleic Acids Res 5:2927-2943
Banerjee AK (1980) 5'-terminal cap structure in eucaryotic messenger ribonucleic acids. Microbiological Reviews 44:175-205
Bangs JD, Crain PF, Hashizume T, McCloskey JA, Boothroyd JC (1992) Mass spectrometry of mRNA cap 4 from trypanosomatids reveals two novel nucleosides. J Biol Chem 267:9805-9815
Bartkoski MJ, Roizman B (1978) Regulation of herpesvirus macromolecular synthesis VII Inhibition of internal methylation of mRNA late in infection. Virology 85:146-156
Beelman CA, Stevens A, Caponigro G, Lagrandeur TE, Hatfield L, Fortner DM, Parker R (1996) An essential component of the decapping enzyme required for normal rates of mRNA turnover. Nature 382:642-646
Beemon K, Keith J (1977) Localization of N6-methyladenosine in the Rous sarcoma virus genome. J Mol Biol 113:165-179
Bokar JA, Rath-Shambaugh ME, Ludwiczak RL, Narayan P, Rottman FM (1994) Characterization and partial purification of mRNA N6-adenosine methyltransferase from HeLa cell nuclei. J Biol Chem 269:17697-17704
Bokar JA, Shambaugh ME, Polayes D, Matera AG, Rottman FM (1997) Purification and cDNA cloning of the AdoMet-binding subunit of the human mRNA (N6-adenosine)methyltransferase. RNA 3:1233-1247
Bujnicki JM, Rychlewski L (2001) Reassignment of specificities of two cap methyltransferase domains in the reovirus lambda 2 protein. Genome Biol 2:9
Bujnicki JM, Feder M, Radlinska M, Rychlewski L (2001) mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships. BMC Bioinformatics 2:2
Bujnicki JM, Feder M, Radlinska M, Blumenthal RM (2002) Structure prediction and phylogenetic analysis of a functionally diverse family of proteins homologous to the MT-A70 subunit of the human mRNA:m6A methyltransferase. J Mol Evol 55:431-444
Camper SA, Albers RJ, Coward JK, Rottman FM (1984) Effect of undermethylation on mRNA cytoplasmic appearance and half-life. Mol Cell Biol 4:538-543
Canaani D, Kahana C, Lavi S, Groner Y (1979) Identification and mapping of N6-methyladenosine containing sequences in simian virus 40 RNA. Nucleic Acids Res 6:2879-2899
Carberry SE, Friedland DE, Rhoads RE, Goss DJ (1992) Binding of protein synthesis initiation factor 4E to oligoribonucleotides: effects of cap accessibility and secondary structure. Biochemistry 31:1427-1432

Carroll SM, Narayan P, Rottman FM (1990) N6-methyladenosine residues in an intron-specific region of prolactin pre-mRNA. Mol Cell Biol 10:4456-4465
Chen SN, Habib G, Yang CY, Gu ZW, Lee BR, Weng S, Silberman SR, Cai SJ, Deslypere JP, Rosseneu M, Gotto AM, Li WG, Can L (1987) Apolipoprotein B-48 is the product of a messenger RNA with an organ-specific in-frame stop codon. Science 238:363-366
Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, Brown PO, Herskowitz I (1998) The transcriptional program of sporulation in budding yeast. Science 282:699-705
Clancy MJ, Shambaugh ME, Timpte CS, Bokar JA (2002) Induction of sporulation in Saccharomyces cerevisiae leads to the formation of N6-methyladenosine in mRNA: a potential mechanism for the activity of the IME4 gene. Nucleic Acids Res 30:4509-4518
Cohen N, Sharma M, Kentsis A, Perez JM, Strudwick S, Borden KL (2001) PML RING suppresses oncogenic transformation by reducing the affinity of eIF4E for mRNA. EMBO J 20:4547-4559
Colot HV, Stutz F, Rosbash M (1996) The yeast splicing factor mud13p is a commitment complex component and corresponds to CBP20, the small subunit of the nuclear cap-binding complex. Genes Dev 10:1699-1708
Cong P, Shuman S (1992) Methyltransferase and subunit association domains of vaccinia virus mRNA capping enzyme. J Biol Chem 267:16424-16429
Csepany T, Lin A, Baldick CJ, Beemon K (1990) Sequence specificity of mRNA N6-adenosine methyltransferase. J Biol Chem 265:20117-20122
Desrosiers RC, Friderici KH, Rottman FM (1974) Identification of methylated nucleosides in messenger RNA from Novikoff hepatoma cells. Proc Natl Acad Sci USA 71:3971-3975
Desrosiers RC, Friderici KH, Rottman FM (1975) Characterization of Novikoff hepatoma mRNA methylation and heterogeneity in the methylated 5' terminus. Biochemistry 14:4367-4374
Dimock K, Stoltzfus CM (1977) Sequence specificity of internal methylation in B77 avian sarcoma virus RNA subunits. Biochemistry 16:471-478
Dottin RP, Weiner AM, Lodish HF (1976) 5’-Terminal nucleotide sequences of the messenger RNAs of Dictyostelium discoideum. Cell 8:233-244
Dubin DT, Stollar V (1975) Methylation of Sindbis virus "26S" messenger RNA. Biochem Biophys Res Comm 66:1373-1379
Dubin DT, Taylor RH (1975) The methylation state of poly A-containing-messenger RNA from cultured hamster cells. Nucleic Acids Res 2:1653-1668
Dubin DT, Stollar V, Hsuchen CC, Timko K, Guild GM (1977) Sindbis virus messenger RNA: The 5' termini and methylated residues of 26 and 42 S RNA. J Virology 77:457-470
Dunckley T, Parker R (1999) The DCP2 protein is required for mRNA decapping in Saccharomyces cerevisiae and contains a functional MutT motif. EMBO J 18:5411-5422
Ensinger MJ, Moss B (1976) Modification of the 5’ terminus of mRNA by an RNA (Guanine-7-)-methyltransferase from HeLa cells. J Biol Chem 251:5283-5291
Feder M, Pas J, Wyrwicz LS, Bujnicki JM (2003) Molecular phylogenetics of the RrmJ/fibrillarin superfamily of ribose 2'-O-methyltransferases. Gene 302:129-138
Fabrega C, Hausmann S, Shen V, Shuman S, Lima CD (2004) Structure and mechanism of mRNA cap (guanine-N7) methyltransferase. Mol Cell 13:77-89
Fakan S (1994) Perichromatin fibrils are in situ forms of nascent transcripts. Trends in Cell Biology 4:86-90

Finkel D, Groner Y (1983) Methylations of adenosine residues (m6A) in pre-mRNA are important for formation of late simian virus 40 mRNAs. J Virology 131:409-425
Fresco LD, Buratowski S (1996) Conditional mutants of the yeast messenger RNA capping enzyme show that the cap enhances, but is not required for, messenger RNA splicing. RNA 2:584-596
Furuichi Y, Morgan MA, Shatkin AJ, Jelinek W, Salditt-Georgieff M, Darnell JE (1975) Methylated, blocked 5’ termini in HeLa cell mRNA. Proc Natl Acad Sci USA 72:1904-1908
Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11:4241-4257
Gillian-Daniel DL, Gray NK, Astrom J, Barkoff A, Wickens M (1998) Modifications of the 5’ cap of mRNAs during Xenopus oocyte maturation: Independence from changes in poly(A) length and impact on translation. Mol Cell Biol 18:6152-6163
Grosjean H, Edqvist J, Straby KB, Giege R (1996) Enzymatic formation of modified nucleosides in tRNA: dependence on tRNA architecture. J Mol Biol 255:67-85
Grosjean H, Szweykowska-Kulinska Z, Motorin Y, Fasiolo F, Simos G (1997) Intron-dependent enzymatic formation of modified nucleosides in eukaryotic tRNAs: a review. Biochimie 79:293-302
Hager J, Staker BL, Bugl H, Jakob U (2002) Active site in RrmJ, a heat shock-induced methyltransferase. J Biol Chem 277:41978-41986
Hamm J, Mattaj IW (1990) Monomethylated cap structures facilitate RNA export from the nucleus. Cell 63:109-118
Harper JE, Miceli SM, Roberts RJ, Manley JL (1990) Sequence specificity of the human mRNA N6-adenosine methylase in vitro. Nucleic Acids Res 18:5735-5741
Higman MA, Bourgeois N, Niles EG (1992) The vaccinia virus mRNA (guanine-N7-)methyltransferase requires both subunits of the mRNA capping enzyme for activity. J Biol Chem 267:16430-16437
Higman MA, Niles EG (1994) Location of the S-adenosyl-L-methionine binding region of the vaccinia virus mRNA (guanine-7-) methyltransferase. J Biol Chem 269:14982-14987
Higman MA, Christen LA, Niles EG (1994) The mRNA (guanine-7-) methyltransferase domain of the vaccinia virus mRNA capping enzyme: expression in Escherichia coli and structural and kinetic comparison to the intact capping enzyme. J Biol Chem 269:14974-14981
Hirano K, Young SG, Farese RV Jr, Ng J, Sande E, Warburton C, Powell-Braxton LM, Davidson NO (1996) Targeted disruption of the mouse apobec-1 gene abolishes apolipoprotein B mRNA editing and eliminates apolipoprotein B48. J Biol Chem 271:9887-9890
Ho CK, Schwer B, Shuman S (1998) Genetic, physical, and functional interactions between the triphosphatase and guanylyltransferase components of the yeast mRNA capping apparatus. Mol Cell Biol 18:5189-5198
Ho CK, Sriskanda V, McCracken S, Bentley D, Schwer B, Shuman S (1998) The guanylyltransferase domain of mammalian mRNA capping enzyme binds to the phosphorylated carboxyl-terminal domain of RNA polymerase II. J Biol Chem 273:9577-9585
Horowitz S, Horowitz A, Nilsen TW, Munns TW, Rottman FM (1984) Mapping of N6-methyladenosine residues in bovine prolactin mRNA. Proc Natl Acad Sci USA 81:5667-5671

Izaurralde E, Lewis J, McGuigan M, Jankowska E, Darzynkiewicz E, Mattaj IW (1994) A nuclear cap binding protein complex involved in pre-mRNA splicing. Cell 78:657-668
Jaeger JA, Turner DH, Zuker M (1989) Improved predictions of secondary structure for RNA. Proc Natl Acad Sci USA 86:7706-7710
Jaeger JA, Turner DH, Zuker M (1989) Predicting optimal and suboptimal secondary structure for RNA. In "Molecular Evolution: Computer Analysis of Protein and Nucleic Acid Sequences", RF Doolittle ed, Methods in Enzymology 183:281-306
Jiang HQ, Motorin Y, Jin X, Grosjean H (1997) Pleiotropic effects of intron removal on base modification pattern of yeast tRNAPhe: an in vitro study. Nucleic Acids Res 25:2694-2701
Kane SE, Beemon K (1985) Precise localization of m6A in Rous sarcoma virus RNA reveals clustering of methylation sites: implications for RNA processing. Mol Cell Biol 5:2298-2306
Kane SE, Beemon K (1987) Inhibition of methylation at two internal N6-methyladenosine sites caused by GAC to GAU mutations. J Biol Chem 262:3422-3427
Keith JM, Ensinger MJ, Moss B (1978) HeLa cell RNA (2’-O-methyladenosine-N6-)methyltransferase specific for the capped 5’-end of messenger RNA. J Biol Chem 253:5033-5041
Konarska MM, Padgett RA, Sharp PA (1984) Recognition of cap structure in splicing in vitro of mRNA precursors. Cell 38:731-736
Kuge H, Richter JD (1995) Cytoplasmic 3' poly (A) addition induces 5' cap ribose methylation: implications for translational control of maternal RNA. EMBO J 14:6301-6310
Kuge H, Brownlee GG, Gershon PD, Richter JD (1998) Cap ribose methylation of c-mos mRNA stimulates translation and oocyte maturation in Xenopus laevis. Nucleic Acids Res 26:3208-3214
Langberg SR, Moss B (1981) Post-transcriptional modifications of mRNA: purification and characterization of cap 1 and cap 2 RNA (nucleoside-2’-)-methyltransferases from HeLa cells. J Biol Chem 256:10054-10060
Levis R, Penman S (1978) 5’-terminal structures of poly (A)+ cytoplasmic messenger RNA and of poly (A)+ and poly (A)- heterogeneous nuclear RNA of cells of the dipteran Drosophila melanogaster. J Mol Biol 120:487-515
Lewis JD, Gorlich D, Mattaj IW (1996) A yeast cap binding protein complex (yCBC) acts at an early step in pre-mRNA splicing. Nucleic Acids Res 24:3332-3336
Liou RF, Blumenthal T (1990) mRNAs with trimethylguanosine caps result from trans-splicing in Caenorhabditis elegans. Mol Cell Biol 10:1764-1768
Liu H, Rodgers ND, Jiao X, Kiledjian M (2002) The scavenger mRNA decapping enzyme DcpS is a member of the HIT family of pyrophosphatases. EMBO J 21:4699-4708
Malone T, Blumenthal RM, Cheng X (1995) Structure-guided analysis reveals nine sequence motifs conserved among DNA amino-methyltransferases, and suggests a catalytic mechanism for these enzymes. J Mol Biol 253:618-632
Mair G, Ullu E, Tschudi C (2000) Cotranscriptional cap 4 formation on the Trypanosoma brucei spliced leader RNA. J Biol Chem 275:28994-28999
Maroney PA, Denker JA, Darzynkiewicz E, Laneve R, Nilsen TW (1995) Most mRNAs in the nematode Ascaris lumbricoides are trans-spliced: A role for spliced leader addition in translational efficiency. RNA 1:714-723
Martin SA, Paoletti E, Moss B (1975) Purification of mRNA guanylyltransferase and mRNA (guanine-7-)methyltransferase from vaccinia virions. J Biol Chem 250:9322-9329

McCracken S, Fong N, Yankulov K, Ballantyne S, Pan G, Greenblatt J, Patterson SD, Wickens M, Bentley DL (1997) The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385:357-361
Melcher T, Maas S, Higuchi M, Keller W, Seeburg PH (1995) Editing of α-amino-3-hydroxy-5-methylisoxazole-4-propionic acid receptor GluR-B pre-mRNA in vitro reveals site-selective adenosine to inosine conversion. J Biol Chem 270:8566-8570
Merrick WC, Hershey JWB, in Translational Control, JWB Hershey, MB Mathews, N Sonenberg, Eds. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1996) pp. 31-69; RJ Jackson, ibid., pp. 71-112; VM Pain, Eur J Biochem 236, 747 (1996)
Misteli T, Spector DL (1996) Serine/threonine phosphatase 1 modulates the subnuclear distribution of pre-mRNA splicing factors. Mol Biol Cell 7:1559-1572
Mizumoto K, Kaziro Y (1987) Messenger RNA capping enzymes from eukaryotic cells. Prog in Nucleic Acid Res 34:1-28
Myette JR, Niles EG (1996) Characterization of the vaccinia virus RNA 5’-triphosphatase and nucleoside triphosphate phosphohydrolase activities: demonstration that both activities are carried out at the same active site. J Biol Chem 271:11945-11952
Narayan P, Ayers DF, Rottman FM, Maroney PA, Nilsen TW (1987) Unequal distribution of N6-methyladenosine in influenza virus mRNAs. Mol Cell Biol 7:1572-1575
Narayan P, Rottman FM (1988) An in vitro system for accurate methylation of internal adenosine residues in messenger RNA. Science 242:1159-1162
Narayan P, Rottman FM (1992) Methylation of mRNA. In: Advances in Enzymology and Related Areas of Molecular Biology, A Meister, ed, John Wiley and Sons, Inc pp 255-285
Narayan P, Ludwiczak RL, Goodwin E, Rottman FM (1994) Context effects of N6-adenosine methylation sites in prolactin mRNA. Nucleic Acids Res 22:419-426
Nichols JL (1979) “Cap” structures in maize poly(A)-containing RNA. Biochim Biophys Acta 563:490-495
Niedzwiecka A, Marcotrigiano J, Stepinski J, Jankowska-Anyszka M, Wyslouch-Cieszynska A, Dadlez M, Gingras AC, Mak P, Darzynkiewicz E, Sonenberg N, Burley SK, Stolarski R (2002) Biophysical studies of eIF4E cap-binding protein: recognition of mRNA 5’ cap structure and synthetic fragments of eIF4G and 4E-BP1 proteins. J Mol Biol 319:615-635
O’Mullane L, Eperon IC (1998) The pre-mRNA 5’ cap determines whether U6 small nuclear RNA succeeds U1 small nuclear ribonucleoprotein particle at 5’ splice sites. Mol Cell Biol 18:7510-7520
Perry RP, Kelley DE, Fridirici K, Rottman FM (1975) The methylated constituents of L cell messenger RNA: Evidence for an unusual cluster at the 5'-terminus. Cell 4:387-394
Perry RP, Scherrer K (1975) Methylated constituents of globin mRNA. FEBS Lett 57:73-78
Perry RP, Kelley DE (1976) Kinetics of formation of 5’ terminal caps in mRNA. Cell 8:433-442
Piccirillo C, Khanna R, Kiledjian M (2003) Functional characterization of the mammalian mRNA decapping enzyme hDcp2. RNA 9:1138-1147
Polson AG, Bass BL, Casey JL (1995) RNA editing of hepatitis delta virus antigenome by dsRNA-adenosine deaminase. Nature 380:454-456

Powell LM, Wallis SC, Pease RJ, Edwards YH, Knott TJ, Scott J (1987) A novel form of tissue-specific RNA processing produces apolipoprotein-B48 in intestine. Cell 50:831-840
Rana AP, Tuck MT (1990) Analysis and in vitro localization of internal methylated adenine residues in dihydrofolate reductase mRNA. Nucleic Acids Res 18:4803-4807
Reddy R, Singh R, Shimba S (1992) Methylated cap structures in eukaryotic RNAs: structure, synthesis and functions. Pharm Therapeutics 54:249-267
Reinisch KM, Nibert ML, Harrison SC (2000) Structure of the reovirus core at 3.6 Å resolution. Nature 404:960-967
Rhoads RE, Hellmann GM, Remy P, Ebel JP (1983) Translational recognition of messenger ribonucleic acid caps as a function of pH. Biochem 22:6084-6088
Robertson KD (2001) DNA methylation, methyltransferases and cancer. Oncogene 20:3139-3155
Rose AM, Belford HG, Shen WC, Greer CL, Hopper AK, Martin NC (1995) Location of N2,N2-dimethylguanosine-specific tRNA methyltransferase. Biochimie 77:45-53
Rottman FM, Shatkin AJ, Perry RP (1974) Sequences containing methylated nucleotides at the 5'-termini of messenger RNAs: Possible implications for processing. Cell 3:197-199
Saha N, Schwer B, Shuman S (1999) Characterization of human, Schizosaccharomyces pombe, and Candida albicans mRNA cap methyltransferases and complete replacement of the yeast capping apparatus by mammalian enzymes. J Biol Chem 274:16553-16562
Schibler U, Kelley DE, Perry RP (1977) Comparison of methylated sequences in messenger RNA and heterogeneous nuclear RNA from mouse L cells. J Mol Biol 115:695-714
Schnierle BS, Gershon PD, Moss B (1992) Cap-specific mRNA (nucleoside-O2'-)methyltransferase and poly(A) polymerase stimulatory activities of vaccinia virus are mediated by a single protein. Proc Natl Acad Sci USA 89:2897-2901
Schwer B, Shuman S (1996) Conditional inactivation of messenger RNA capping enzyme affects yeast pre-messenger RNA splicing in vivo. RNA 2:574-583
Seidel BL, Somberg EW (1978) Characterization of Neurospora crassa polyadenylated messenger ribonucleic acid structure of the 5’ terminus. Biochem Biophys Res Commun 187:108-112
Shah JC, Clancy MJ (1992) IME4, a gene that mediates MAT and nutritional control of meiosis in Saccharomyces cerevisiae. Mol Cell Biol 12:1078-1086
Shambaugh ME, Bokar JA Loss of the mRNA N6-methyladenosine methyltransferase, MTA70, leads to apoptosis in HeLa cells. (submitted)
Shatkin AJ (1976) Capping of eukaryotic mRNAs. Cell 9:645-653
Shuman S (1995) Capping enzyme in eukaryotic mRNA synthesis. Prog Nucleic Acid Res Mol Biol 50:101-129
Somner S, Salditt-Georgieff M, Bachenheimer S, Darnell JE, Furuichi Y, Morgan M, Shatkin AJ (1976) The methylation of adenovirus-specific nuclear and cytoplasmic RNA. Nucleic Acids Res 3:749-765
Sripati CE, Groner Y, Warner JR (1976) Methylated, blocked 5’ termini of yeast mRNA. J Biol Chem 251:2898-2904
Stoltzfus CM, Dane RW (1982) Accumulation of spliced avian retrovirus mRNA is inhibited in S-adenosylmethionine-depleted chicken embryo fibroblasts. J Virology 42:918-931

Tsukamoto T, Shibagaki Y, Imajoh-Ohmi S, Murakoshi T, Suzuki M, Nakamura A, Gotoh H, Mizumoto K (1997) Isolation and characterization of the yeast mRNA capping enzyme beta subunit gene encoding RNA 5'-triphosphatase, which is essential for cell viability. Biochem Biophys Res Commun 239:116-122
Tucker M, Parker R (2000) Mechanisms and control of mRNA decapping in Saccharomyces cerevisiae. Annu Rev Biochem 69:571-595
van Doren K, Hirsch D (1990) mRNAs that mature through trans-splicing in Caenorhabditis elegans have a trimethylguanosine cap at their 5’ terminus. Mol Cell Biol 10:1769-1772
Visa N, Alzhanova-Ericsson AT, Sun X, Kiseleva E, Bjorkroth B, Wurtz T, Daneholt B (1996) A pre-mRNA-binding protein accompanies the RNA from the gene through the nuclear pores and into polysomes. Cell 84:253-264
Von der Haar T, Ball PD, McCarthy JEG (2000) Stabilization of eukaryotic initiation factor 4E binding to the mRNA 5’ cap by domains of eIF4G. J Biol Chem 275:30551-30555
Von der Haar T, Gross JD, Wagner G, McCarthy JEG (2004) The mRNA cap-binding protein eIF4E in post-transcriptional gene expression. Nat Struct Mol Biol 11:503-511
Wei CM, Gershowitz A, Moss B (1975) Methylated nucleotides block 5' terminus of HeLa cell messenger RNA. Cell 4:379-386
Wei CM, Gershowitz A, Moss B (1976) 5'-terminal and internal methylated nucleotide sequences in HeLa cell mRNA. Biochem 15:397-401
Wei CM, Moss B (1977) Nucleotide sequences at the N6-methyladenosine sites of HeLa cell messenger ribonucleic acid. Biochem 16:1672-1676
Wen Y, Yue Z, Shatkin AJ (1998) Mammalian capping enzyme binds RNA and uses protein tyrosine phosphatase mechanism. Proc Natl Acad Sci USA 95:12226-12231
Xie K, Sowden MP, Dance GSC, Torelli AT, Smith HC, Wedekind JE (2004) The structure of a yeast RNA-editing deaminase provides insight into the fold and function of activation-induced deaminase and APOBEC-1. Proc Natl Acad Sci USA 101:8114-8119
Yamada-Okabe T, Mio T, Matsui M, Kashima Y, Arisawa M, Yamada-Okabe H (1998) Isolation and characterization of the Candida albicans gene for mRNA 5'-triphosphatase: association of mRNA 5'-triphosphatase and mRNA 5'-guanylyltransferase activities is essential for the function of mRNA 5'-capping enzyme in vivo. FEBS Lett 435:49-54

Bokar, Joseph A. Division of Hematology/Oncology, Department of Medicine and Center for RNA Molecular Biology, Case Western Reserve University, School of Medicine, Cleveland, OH 44106, USA

Role of the 5’-cap in the biogenesis of spliceosomal snRNPs Achim Dickmanns and Ralf Ficner

Abstract The biogenesis of spliceosomal UsnRNPs in higher eukaryotes involves a nucleocytoplasmic shuttling cycle. After transcription and processing in the nucleus, the m7G-cap-dependent export of the snRNAs U1, U2, U4, and U5 to the cytoplasm occurs. In the cytoplasm, these UsnRNAs specifically associate with seven Sm proteins and form a doughnut-shaped snRNP core structure. This assembly, mediated by the SMN complex, is a prerequisite for the hypermethylation of the m7G-cap to the 2,2,7-trimethylguanosine (m3G)-cap. Snurportin1 (SPN1) specifically recognises the m3G-cap and facilitates the nuclear import of UsnRNPs. The recently determined crystal structure of human SPN1 reveals a significantly different binding mode for the cap structure in comparison to that of the m7G-binding proteins CBC, eIF4E and VP39.

1 Introduction

In eukaryotic cells, mRNAs are generally transcribed as pre-mRNAs. One hallmark of the eukaryotic genome is the separation of genes into coding regions of variable length by intervening non-coding regions. Thus, the newly synthesized pre-mRNAs carry both coding sequences (exons) and non-coding regions (introns). Before export to and subsequent translation in the cytoplasm, the introns are removed by the spliceosome (see also Chapter 7), a large and highly dynamic ribonucleoprotein complex. Two forms of spliceosomes have been identified: the major spliceosome, which is responsible for the majority of pre-mRNA splicing events, namely splicing of the so-called U2-type introns, and the minor spliceosome, which processes a rare class of pre-mRNA introns, the U12 type (Burge et al. 1999). Essential components of the major spliceosome are the five uridyl-rich small nuclear ribonucleoprotein particles (UsnRNPs) U1, U2, U4, U5, and U6 and several non-snRNP proteins (Hartmuth et al. 2002; Makarov et al. 2002). Each UsnRNP is composed of a UsnRNA and a set of seven common proteins: the Sm proteins for U1, U2, U4, and U5, or the highly homologous Sm-like (LSm) proteins for U6. Additionally, each UsnRNP acquires a subset of particle-specific proteins (reviewed in Will and Lührmann 2001).

Topics in Current Genetics H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106799 / Published online: 27 January 2005 © Springer-Verlag Berlin Heidelberg 2005


2 snRNP biogenesis

During the biogenesis of the U1, U2, U4, and U5 snRNPs in metazoans, from snRNA to fully assembled particle, the snRNAs undergo one round of shuttling between the nuclear and cytoplasmic compartments before they return to the nucleus and localize in subnuclear compartments. After transcription, addition of an m7G-cap to the 5’ end, and 3’-end modifications, the RNA is exported into the cytoplasm, where the Sm proteins assemble on it. Interestingly, before reimport into the nucleus a second round of RNA modification takes place at both ends: the 3’ end is trimmed to its final length and the 5’ cap is hypermethylated to a 2,2,7-trimethylguanosine (m3G) cap. After reimport, the UsnRNA is modified a third time in order to allow assembly of additional, particle-specific proteins before the mature UsnRNP is formed. During both transport processes the structures of the RNA caps are important determinants mediating the transfer. In the following, we describe in more detail the biogenesis of mammalian UsnRNPs, focusing especially on the transport processes and the structural aspects of recognition.

2.1 Transcription of snRNAs

In eukaryotes, RNA transcription and maturation take place within the nucleus. Three RNA polymerases (RNA Pols) have been identified and their responsibilities assigned. All RNAs to be translated are transcribed exclusively by RNA Pol II, whereas the untranslated RNAs are transcribed by all three RNA Pols. RNA Pol I transcribes the 5.8S, 18S and 28S rRNAs in the nucleolus, and RNA Pol III is responsible for transcription of tRNAs and 5S rRNA (for survey, see Paule and White 2000). Like mRNAs, the U1, U2, U4, and U5 snRNAs are transcribed by RNA Pol II, but the fate of the transcribed RNA is already determined by its promoter, which has a simple structure comprising proximal and distal sequence elements (for a more detailed review, see Hernandez 2001). The primary transcripts are modified at both the 5’ and the 3’ end (for review, see Cougot et al. 2004).
A guanosine is attached to the first nucleotide at the 5’ end of the snRNA via a 5’-5’ triphosphate linkage, forming a GpppG/A-cap structure (Shatkin 1976; Coppola et al. 1983), in a reaction catalyzed by a triphosphatase and a guanylyltransferase. Subsequently, the cap guanosine is methylated at N7 by a methyltransferase (see also Chapter 5 for more details; for survey, see Shuman 2002). The three enzyme activities are part of the cellular capping apparatus. The addition and methylation of the cap occur co-transcriptionally (Salditt-Georgieff et al. 1980) on all RNA Pol II transcripts. The m7G-cap structure protects the RNA against 5’ exonuclease activity (Furuichi et al. 1977; Shimotohno et al. 1977; Murthy et al. 1991) and is an important determinant for UsnRNA export (Shatkin 1976). The m7G-cap of UsnRNAs as well as mRNAs (see Chapter 5) is specifically recognized within the nucleus by the cap binding complex (CBC; Ohno et al. 1990; Jarmolowski et al. 1994). CBC is a heterodimer of the CBP20 and CBP80 proteins, and dimer formation seems to be necessary for CBP20 to bind the RNA (Izaurralde et al. 1994). Furthermore, it has been shown that CBP20 interacts directly with the cap structure (Izaurralde et al. 1995). CBP20 belongs to the large family of ‘RNP motif’ or ‘RNA recognition motif’ containing proteins. The crystal structure of the complex of m7G-cap and CBP20 revealed a stable interaction with the m7G sandwiched between two tyrosines (see below for details).

UsnRNAs are transcribed as pre-UsnRNAs with 3’ extensions of various lengths, and the 3’ trimming of UsnRNAs requires nuclear and cytoplasmic processing steps. The pre-UsnRNAs are processed at the 3’ end depending on a 3’ box element downstream of the RNA coding sequence, in a process requiring the C-terminal domain (CTD) of the large subunit of RNA Pol II (Medlin et al. 2003; Uguen and Murphy 2003; Jacobs et al. 2004). The 5’ cap formation has been shown to enhance the 3’ end processing (Uguen and Murphy 2004), supporting the idea that binding of the CBC to the 5’ cap structure might be involved, as in mRNA 3’ end processing (Flaherty et al. 1997).

In contrast to the U1, U2, U4, and U5 snRNAs, the U6snRNA is transcribed by RNA Pol III (Kunkel et al. 1986; Krol et al. 1987; Reddy et al. 1987) and acquires a γ-monomethyl-triphosphate cap structure at its 5’ end. Subsequent assembly of the U6snRNP is thought to occur completely within the nucleus. U6snRNA assembles stably with a heptameric ring of Sm-like (LSm) proteins. Eight LSm proteins have been identified, and the localization differs depending on the composition of the heptameric ring. U6snRNA assembled with LSm proteins 2-8 is involved in pre-mRNA splicing, while U6snRNA associated with LSm proteins 1-7 has been shown to localize in foci in the cytoplasmic compartment. Co-staining experiments showed that the foci also contain the decapping enzyme hDcp1/2 and the exonuclease hXrn1, key factors in mRNA degradation, suggesting a role for the cytoplasmic form of U6snRNP in mRNA degradation (Ingelfinger et al. 2002).
2.2 m7G-dependent nuclear export of UsnRNAs

For further assembly, the m7G-capped UsnRNAs bound to CBC are transported into the cytoplasmic compartment (Fig. 1). Interestingly, the CBC does not directly mediate the interaction with the export receptor. Another protein, PHAX (phosphorylated adaptor for RNA export), is required as a bridging molecule between the CBC/UsnRNA complex and the actual export receptor CRM1 (chromosome region maintenance 1), also termed Xpo1 (Exportin1; Ohno et al. 2000). PHAX in its phosphorylated state binds to both the UsnRNA and the CBC, forming an export-competent pre-complex (Ohno et al. 2000; Segref et al. 2001). The leucine-rich export signal of PHAX is recognized by a complex comprising CRM1 and RanGTP (Ras-related nuclear protein; Ohno et al. 2000; Segref et al. 2001). CRM1 belongs to the importin-β superfamily, the major group of import and export receptors, which all share common properties such as NPC (nuclear pore complex), Ran and cargo binding (Fornerod et al. 1997; Görlich et al. 1997; Chaillan-Huntington et al. 2000).


Fig. 1. After UsnRNA transcription (bottom), the assembly of the UsnRNA export complex takes place. The translocation through the nuclear pore complex is followed by disassembly of receptor from the export complex (see text for details).

Directionality of import and export processes is maintained by the small GTPase Ran (Melchior et al. 1993a; Moore and Blobel 1993; Görlich et al. 1996b). In the nuclear compartment the concentration of RanGTP is high due to the Ran guanine-nucleotide exchange factor RCC1 (regulator of chromosome condensation 1; Ohtsubo et al. 1989; Bischoff and Ponstingl 1991a, 1991b). In the cytoplasmic compartment the GDP-bound form prevails due to the function of RanGAP (GTPase-activating protein), which increases the intrinsic GTPase activity of Ran (Hopper et al. 1990; Melchior et al. 1993b; Bischoff et al. 1995). An even higher GTPase activity of Ran is observed in the presence of the RanBPs (Ran-binding proteins) 1 and 2 (Coutavas et al. 1993; Lounsbury et al. 1994; Saitoh and Dasso 1995; Yokoyama et al. 1995).

Cross-competition experiments in Xenopus laevis oocytes demonstrated differences in the export pathways of UsnRNAs and mRNAs, suggesting altered requirements for the CBC and the m7G-cap (Jarmolowski et al. 1994). Subsequent experiments supported these observations by showing that UsnRNAs are exported into the cytoplasmic compartment via the mRNA export pathway when they carry an inserted mRNA exon sequence and thus exceed a critical length (Ohno et al. 2002; Masuyama et al. 2004). This suggests that proper export of mRNA needs additional proteins, such as the shuttling protein Y14 and the export factor REF, deposited on the exon sequences of the mRNA (Le Hir et al. 2000, 2001).

The hexameric export complex, comprising UsnRNA, the two CBC subunits, PHAX, CRM1, and RanGTP, is translocated through the NPC, a large supramolecular structure of approximately 125 MDa (Reichelt et al. 1990) that spans the inner and outer membranes of the nucleus. The NPC is composed of around 30 different proteins, albeit with high redundancies (Rout et al. 2000; Allen et al. 2001; Cronshaw et al. 2002; Stoffler et al. 2003). Recent experiments using a permeabilized cell-based assay revealed p15/NXT1 (NTF2-related export protein 1; Black et al. 1999; Katahira et al. 1999) as a factor that stimulates nuclear export of UsnRNA and is required for the release from the cytoplasmic side of the NPC (Ossareh-Nazari et al. 2000; Black et al. 2001). Hydrolysis of GTP by Ran, triggered by RanGAP in concert with RanBP1 or RanBP2, presumably leads to dissociation of CRM1 from the pre-complex. Further disassembly takes place upon PHAX dephosphorylation.
Thus, in addition to the hydrolysis of GTP by Ran, which is usually sufficient to trigger the release of CRM1, the dephosphorylation of PHAX is required as well. Curiously, PHAX and CBP20/80 stay bound to the m7G-cap of the UsnRNA (Fig. 1; see below).

2.3 Assembly of the snRNP core structure (SMN complex)

In the cytoplasm of mammalian cells, the Sm proteins assemble specifically with the UsnRNA. All Sm proteins have two sequence motifs in common, denoted Sm1 and Sm2. Crystallographic studies revealed that the Sm regions adopt a common fold of β-strands, the Sm-fold, which is required for the interaction between Sm proteins and the subsequent formation of a heptameric, doughnut-shaped structure that binds to the Sm-site of the UsnRNA (Kambach et al. 1999a, 1999b; Stark et al. 2001). Among the seven Sm proteins, three (B/B’, D1 and D3) carry a C-terminal extension characterized by arginine-glycine dipeptide (RG) repeats. Specific arginines in the RG-rich regions are modified into symmetrical dimethylarginines (sDMAs) (Brahms et al. 2000; Friesen et al. 2001a). The sDMA modifications are carried out by the methylosome (Fig. 2), a complex consisting of pICln (initially proposed to be a nucleotide-sensitive component of chloride channels), WD45/MEP50 (a WD repeat containing protein) and PRMT5 (or JBP1), the actual methyltransferase (Pu et al. 1999; Brahms et al. 2001; Friesen et al. 2001b, 2002; Meister et al. 2001b). The sDMAs are required for a strong interaction with the SMN (survival of motor neurons) complex and for the proper assembly of the Sm-core snRNP in vivo (Fischer et al. 1997).

Fig. 2. In the cytoplasm, the Sm proteins B, D1 and D3 are symmetrically dimethylated by the methylosome and assembled into the Sm core by the SMN complex, and CBC and PHAX are released. Subsequently the cap is hypermethylated (see text for details).


In vitro experiments showed that the Sm proteins form preassembled subcomplexes and that the binding of the protein subcomplexes to the RNA occurs in a well-ordered fashion. First, proteins E/F/G and D1/D2 bind to the Sm-site of the UsnRNA, forming a stable subcore which, upon binding of D3/B (or B’), reorganizes into the mature Sm-core (Raker et al. 1996). In vivo, this assembly is assumed to follow a similar pathway, but is strictly dependent on the presence of the SMN complex, which binds to both the Sm proteins and the UsnRNA (Fischer et al. 1997; Meister et al. 2001c; Yong et al. 2002). The subcomplexes of Sm proteins are presented on the SMN complex in a prearranged configuration. Upon binding of the snRNA, structural rearrangements induce closing of the ring around the Sm site, presumably similar to the clamp loading reaction observed for DNA polymerase (Ellison and Stillman 2001). The SMN complex itself has an approximate size of 1 MDa and consists of the SMN protein in an oligomerized form and at least 18 other proteins (Meister et al. 2001a). (For a more detailed survey of the SMN complex, refer to the following reviews: Meister et al. 2002; Paushkin et al. 2002; Gubitz et al. 2004; Yong et al. 2004.)

The great effort invested in the assembly of the UsnRNPs and the stringency of interactions during the assembly pathway support the notion of a temporally ordered release of the UsnRNA from the export complex, integration into the SMN complex, and later release. The release of CRM1 due to GTP hydrolysis by Ran, together with PHAX dephosphorylation, seems to trigger the addition of the SMN complex onto the pre-complex consisting of UsnRNA, CBC, and PHAX (Fig. 2). After the assembly of the core, the SMN complex remains on the UsnRNP, whereas PHAX and CBC are released (Massenet et al. 2002). This release is also modulated by the binding of the importin-α/β complex, which helps to displace the core UsnRNP-SMN complex from CBC and PHAX (Görlich et al. 1996a).
Interestingly, the presence of importin-α alone does not distort binding (Görlich et al. 1996a). CBC and PHAX are recycled into the nucleus by the importin-α/β heterodimer (Görlich et al. 1996a; Segref et al. 2001). RanGDP recycles with the help of NTF2 (Ribbeck et al. 1998), and CRM1 recycles by itself (Fig. 2). The resulting assembly, consisting of core UsnRNP and SMN complex, seems to be a prerequisite for the hypermethylation of the m7G-cap to the 2,2,7-trimethylguanosine (m3G) cap (Massenet et al. 2002).

2.4 m7G-cap hypermethylation

Initial studies on the cap hypermethylation of human U1snRNP showed that the snRNA-(guanosine-N2)-methyltransferase is a cytoplasmic, adenosylmethionine-dependent enzyme that is not stably associated with the snRNP, but binds to the SmB/B’ proteins (Plessel et al. 1994). Further in vitro reconstitution experiments examining the assembly process of human snRNAs and Sm proteins revealed that the presence of the SmB/B’ protein in the snRNP core complex is essential for the cap hypermethylation (Raker et al. 1996). Finally, the cap hypermethylase was identified in S. cerevisiae and denoted TGS1 (for Trimethyl-Guanosine Synthase; Mouaikel et al. 2002). Yeast TGS1 comprises 315 amino acids and its sequence contains the canonical signature of adenosylmethionine-dependent methyltransferases. A sequence database search revealed putative orthologs in several other eukaryotes. Interestingly, human and mouse TGS1 are about three times larger (about 850 amino acids), and their C-terminal domain harbours the conserved methyltransferase domain (Mouaikel et al. 2002). The methyltransferase domain of TGS1 is closely related to other RNA:m2G methyltransferases. The highest similarity was found for the putative methyltransferase Mj0882, which belongs to a family of bacterial rRNA:m2G methyltransferases. Based on the known crystal structure of Mj0882, a homology model of yeast TGS1 was generated, predicting the minimal size of the methyltransferase domain and the binding sites for AdoMet and m7Gppp (Mouaikel et al. 2003a). However, the minimal substrate requirements have not been characterized so far, and it is unknown whether the catalytic activity is coupled to the interaction of TGS1 with the Sm core RNP complex. TGS1 activity has been analyzed in vivo by means of yeast mutants, but no in vitro enzymatic activity assay has yet been established.

The interaction of TGS1 with the Sm proteins has been analyzed further. yTGS1 binds to the highly basic C-terminal domains of the SmB protein and, albeit much more weakly, of the SmD protein (Mouaikel et al. 2002). On the other hand, the N- and C-terminal tails of yTGS1 are not required for binding to these Sm proteins, suggesting that these interactions are mediated by the entire methyltransferase domain (residues 58-262) (Mouaikel et al. 2003a). These interactions are of particular interest with respect to the substrate variety, since TGS1 hypermethylates not only the m7G-cap of spliceosomal snRNAs but also the m7G-cap of distinct snoRNAs. TGS1 was shown to interact with the basic C-terminal tails of the common snoRNP proteins Nop58 and Cbf5 (Mouaikel et al. 2002).

The human TGS1 (hTGS1) shows some significant differences with respect to the yeast ortholog. It has a molecular weight of 90 kDa, as it additionally contains a large N-terminal domain of unknown function. It was originally identified in a yeast two-hybrid screen on the basis of its interaction with the nuclear receptor coactivator PRIP (peroxisome proliferator-activated receptor (PPAR)-interacting protein) and named PIMT (for PRIP-interacting protein with methyltransferase domain; Zhu et al. 2001). Recently, different isoforms of hTGS1 that localize either to the cytoplasm or to the nucleus have been reported (Enünlü et al. 2003). The larger, 90 kDa isoform is expressed especially in rat brain and testis and is localized in the nucleus, while a 55 kDa isoform is expressed ubiquitously and is mainly localized in the cytoplasm. Interestingly, hTGS1 was also found to interact with the SMN protein (Mouaikel et al. 2003b), a component of the SMN complex that mediates the assembly of snRNA and Sm proteins (see above, Fig. 2). A mutation in SMN that mimics the predominant isoform of the protein expressed in patients with the neurodegenerative disease spinal muscular atrophy disrupts the interaction between TGS1 and SMN (Mouaikel et al. 2003b). This suggests that, in addition to its role in the assembly of the Sm-core RNP, the SMN complex is also required for the recruitment of TGS1 and, in consequence, for the cap hypermethylation.


Cap hypermethylation precedes the proper 3’ end maturation of the core UsnRNPs. The additional nucleotides are removed by a yet unidentified exoribonuclease (Neuman de Vegvar and Dahlberg 1990). Experiments in yeast on U4snRNAs revealed that the 3’ extensions are removed by the exosome complex of 3’ exonucleases, a complex also found in humans within the nucleoplasm and cytoplasm (Allmang et al. 1999a, 1999b; van Hoof et al. 2000b). Remaining nucleotides may be cleaved off by a member of the RNase D family of exonucleases in order to obtain the final length of the U4snRNA (van Hoof et al. 2000a).

2.5 Nuclear import

The hypermethylated m3G-cap of the UsnRNP and the Sm-core RNP form a bipartite nuclear localization signal (NLS) required for import into the nucleus (Mattaj and De Robertis 1985; Hamm et al. 1990; Fischer et al. 1993). In vitro import experiments revealed that both import pathways, the m3G-cap-dependent and the Sm-core-dependent one, depend on importin-β as receptor, but importin-β by itself is not sufficient, suggesting the requirement of nuclear transport adaptor molecules (Palacios et al. 1997; Huber et al. 1998). Indeed, for the m3G-cap-dependent import pathway of UsnRNPs, the interaction with the cargo is not achieved directly, but requires an adaptor molecule, termed Snurportin1 (SPN1), that specifically recognizes the nuclear transport signal and bridges the binding to the receptor (Huber et al. 1998). Several other adaptors have been identified, such as importin-α for the basic-type NLS (Görlich et al. 1994; Moroianu et al. 1995; Weis et al. 1995) and Rip α for RPA (replication protein A; Jullien et al. 1999). These import adaptors generally interact with the receptor importin-β via their homologous N-terminal domain, the IBB (importin-β binding) domain, and with the cargo via their C-terminal domain, which may differ significantly. SPN1 specifically recognizes the hypermethylated 5’ cap of the UsnRNAs present in the snRNPs and discriminates between monomethylated m7G- and hypermethylated m3G-cap RNAs (Huber et al. 1998; see below).
The import complex of UsnRNP, SPN1 and importin-β translocates through the NPC and is released into the nucleoplasm. Upon binding of RanGTP to importin-β the complex is disassembled. Interestingly, disassembly does not occur at the nuclear basket, as it does for cargoes imported by the importin-α/β-dependent pathway, but seems to occur at a distinct nuclear site (see below). The importin-β/RanGTP complex is transferred back into the cytoplasmic compartment as such, whereas SPN1 requires an export receptor, Crm1, and RanGTP for export (Paraskeva et al. 1999). Crm1 binding and UsnRNP binding are mutually exclusive, which ensures that no cargo is re-exported into the cytoplasm. Upon GTP hydrolysis, stimulated by RanGAP together with RanBP1 or RanBP2, RanGDP dissociates from the trimeric complex. Interestingly, Crm1 stays bound to SPN1 and is released only when UsnRNPs bind to SPN1. Upon binding of importin-β another round of import can occur (Paraskeva et al. 1999).

10 Achim Dickmanns and Ralf Ficner

The mechanism of import depending on the Sm-core domain is still ill defined. Data from yeast (Bordonne 2000) show that the C-terminal regions of SmB and SmD1 are important for nuclear uptake: deletion of either one impairs import, and deletion of both C-termini is lethal. Sequence alignments of the Sm proteins B, D1 and D3 in mammals revealed that their C-terminal regions contain, besides the GR dipeptides required for sDMAs, multiple arginine and lysine residues. These regions loosely resemble the canonical nuclear import signals or, to a greater extent, the nuclear import signals found in ribosomal proteins (Girard et al. 2004). The latter consist of a number of basic residues dispersed over an entire region of the polypeptide and mediate import by multiple members of the importin-β superfamily (Jäkel and Görlich 1998). The crystal structure of Sm proteins (Kambach et al. 1999a, 1999b) and the three-dimensional model of the U1snRNP obtained by cryo-electron microscopy (Stark et al. 2001) suggest that the C-termini of the Sm proteins B, D1 and D3 form an accessible area for the interaction with an import adaptor (Girard et al. 2004). The SMN complex might act as an adaptor, as it has been shown to be part of a pre-import complex with SPN1 and importin-β (Massenet et al. 2002; Narayanan et al. 2002). Recent data show that the SMN complex or parts thereof are cotransported with the U snRNPs into the nucleus, or play an important role in mediating the interaction between Sm core proteins and importin-β (Narayanan et al. 2004). The actual extent of assembly of the snRNPs during import is still not fully understood. U1 and U2snRNP-specific proteins have been shown to carry an NLS and to be transported into the nucleus independently of their cognate UsnRNA (Kambach and Mattaj 1992; Kambach and Mattaj 1994; Romac et al. 1994; Hetzer and Mattaj 2000).
Furthermore, snRNA modifications, which occur in the nucleus, are a prerequisite for assembly of particle-specific proteins (see below).

2.6 Sub-nuclear localization

In vivo localization experiments revealed that, after import into the nucleus, core U snRNPs are initially observed in Cajal bodies (CBs; Fig. 3), suggesting a role for CBs as the final assembly site (Sleeman and Lamond 1999; Sleeman et al. 2001). The idea that CBs function as assembly sites is further supported by the following recent findings. Modifications such as pseudouridylation and 2’-O-methylation are a prerequisite for assembly of the specific proteins and for a fully functional particle, as shown for U2snRNPs (Yu et al. 1998; Zhao and Yu 2004). The discovery of the RNAs that guide the base modification in UsnRNAs and their localization to the CBs strongly suggests that the modifications actually occur in the CBs (Carmo-Fonseca 2002; Darzacq et al. 2002; Kiss et al. 2002; Jady et al. 2003). In order to form a functional spliceosome, U4 snRNPs have to anneal with U6snRNPs. The U4/U6snRNP assembly factor SART3 is imported independently of UsnRNPs and localizes in CBs (Stanek et al. 2003). The knockout of U4/U6- or U5-specific proteins leads to an accumulation of U4/U6 di-snRNPs in the CBs (Schaffert et al. 2004), while U5snRNPs accumulate in speckles.


Fig. 3. Recognition of the hypermethylated cap by Snurportin 1, assembly of import complex, and subsequent translocation through the NPC. In the nucleoplasm, the import complex is disassembled and the final maturation occurs (see text for details).

The SMN protein has also been identified as a component of the Cajal bodies, and in some cell lines it is found in subnuclear structures termed gems (Gemini of Cajal bodies; Liu and Dreyfuss 1996; Carvalho et al. 1999; Frey et al. 1999; Young et al. 2000; Frey and Matera 2001), where the SMN complex is thought to be involved in restructuring and metabolism of snRNPs (Pellizzoni et al. 1998). The findings that UsnRNPs are associated with the SMN complex throughout the cytoplasmic assembly pathway and even form a pre-import complex that might be transported into the nucleus (Massenet et al. 2002; Narayanan et al. 2002) raise the question why the import complex is released from the NPC prior to its disassembly (Huber et al. 2002). Are these sophisticated and energy-consuming assemblies and import processes necessary to ensure the proper final assembly in the nucleus, namely in the CBs? Subsequent to the final assembly, the UsnRNPs concentrate in nuclear speckles or interchromatin granule clusters (ICGs; Sleeman and Lamond 1999). ICGs are believed to function as sites where free snRNPs and splicing factors are transiently stored before reassembly of spliceosomal subcomplexes, like the U5.U4/U6 complex.

Table 1. Three-dimensional structures of m7G-cap binding proteins

Protein           Ligand          Reference
VP39              -               (Hodel et al. 1996)
                  m7GpppG         (Hodel et al. 1997)
                  m7Gpp           (Hodel et al. 1997)
                  m7Gp            (Hodel et al. 1997)
                  m7G             (Hodel et al. 1997)
                  m7GpppGAAAAA    (Hodel et al. 1998)
                  m3A             (Hu et al. 1999)
                  m1A             (Hu et al. 1999)
                  m3C             (Hu et al. 1999)
                  m1C             (Hu et al. 1999)
eIF4E (yeast)     m7Gpp           (Matsuo et al. 1997)
eIF4E (murine)    m7Gpp           (Marcotrigiano et al. 1997)
                  m7GpppG         (Niedzwiecka et al. 2002)
eIF4E (human)     m7Gppp          (Tomoo et al. 2002, 2003)
                  m7GpppA         (Tomoo et al. 2002, 2003)
CBC (human)       -               (Mazza et al. 2001)
                  m7GpppG         (Calero et al. 2002; Mazza et al. 2002)

3 Structural basis for m7G- and m3G-cap recognition by proteins

3.1 Three-dimensional structures of m7G-cap binding proteins

The three-dimensional structures of several m7G-cap binding proteins have been determined, namely those of the viral mRNA-cap-dependent nucleoside 2’-O-methyltransferase VP39, of the eukaryotic translation initiation factor eIF4E, and of the cap binding complex CBC involved in the nuclear export of RNA (see Table 1 and references therein). Extensive crystallographic studies on VP39, including crystal structures of various complexes with methylated nucleobases and m7G-cap oligonucleotides (Table 1), have been carried out (for review, see Quiocho et al. 2000). Similarly, several structures of eIF4E and CBC with bound m7G-ligands have been determined (Table 1). Although these three m7G-cap binding proteins exhibit no structural similarity, they share a common strategy for the recognition of the m7G-cap, as the methylated guanine always intercalates between two aromatic side chains of the protein. The nature of these aromatic side chain pairs, however, differs: it is Trp-Trp in eIF4E, Tyr-Phe in VP39 and Tyr-Tyr in CBP20 of the CBC (Fig. 4). The π-π stacking interaction of the bound m7G with the two sandwiching aromatic side chains is significantly enhanced by a strong cation-π interaction, since the m7G base carries a positive electric charge (Hu et al. 2003; Ruszczynska et al. 2003). In VP39, this cation-π sandwich interaction appears to provide the major contribution to the binding free energy of the overall interaction between the m7G cap and the protein, and it prevents binding of unmethylated GTP because the unmethylated guanine base lacks this charge (Hodel et al. 1997; Hu et al. 1999). The interaction between VP39 and the m7G-cap was characterized in detail by different biophysical methods and by site-directed mutagenesis, showing that polar side chains interacting with the m7G base have only a minor effect on the binding affinity and mainly define the orientation of the bound base (Hodel et al. 1998; Lockless et al. 1998; Hsu et al. 2000; Hu et al. 1999, 2002, 2003). Single and double substitutions of the sandwiching phenylalanine and tyrosine by tryptophan in VP39 increase the affinity for m7G by factors of 10 and 50, respectively (Hu et al. 2002). Significant structural differences between the apo form and the m7G-cap-bound form have been observed for CBP20 of the CBC, suggesting an induced fold within CBP20 upon ligand binding (Calero et al. 2002; Mazza et al. 2002). Molecular dynamics simulations of ligand-free eIF4E also hint at conformational flexibility of the m7G-cap binding pocket (Tomoo et al. 2003). In contrast, VP39 shows no significant structural changes upon m7G-cap binding (Quiocho et al. 2000).

Fig. 4. Comparison of the cap binding mode of the m7G-cap binding proteins CBP20, eIF4E, VP39, and of the m3G-cap binding snurportin1 (SPN1).

Fig. 5. The monomethylated m7GpppG cap of core UsnRNPs is dimethylated in the cytoplasm by PIMT/TGS1 into the m2,2,7-trimethyl (m3) GpppG cap. S-Adenosylmethionine (SAM) serves as donor for the methyl groups.

3.2 Three-dimensional structure of the m3G-cap binding domain of human snurportin1

The crystal structure of the m3G-cap binding domain of the nuclear import adaptor SPN1 was solved and refined at a resolution of 2.4 Å (A Strasser, A Dickmanns, R Lührmann, and R Ficner, manuscript in preparation). It comprises residues 97 to 300 and is composed of five α helices and 10 β strands forming two almost coplanar β sheets linked by two crossing β strands (Fig. 6). This fold exhibits no similarity to the known structures of m7G-cap binding proteins (see above). The m3G-cap binding pocket is located between the two β sheets, and several residues of strands β1, β3, β10 and adjacent loop regions interact with the entire m3G-cap dinucleotide (Fig. 6). The two bases of the bound m3GpppG are in a coplanar orientation 3.6 Å apart, but slightly displaced with respect to a perfect base stacking. The six-membered ring of the non-methylated guanine is almost perfectly centred on the di-methylated N2 atom. The base stack of the dinucleotide is continued by the side chain of Trp276 on the side of the tri-methylated guanine, and flanked by Leu104 on the side of the non-methylated guanine. Additionally, the m3G base is in hydrophobic contact with the side chain of Trp107, which is oriented almost perpendicular to the stack at a distance of 3.8 Å. Remarkably, the protein forms only two hydrogen bonds with the trimethylated guanine base, whereas the triphosphate and the two riboses form hydrogen bonds with several basic residues. This binding mode completely differs from that of m7G-cap binding proteins, like CBC, eIF4E, and VP39 (Fig. 4).

Fig. 6. Ribbon plot of the three-dimensional structure of human snurportin1 (residues 79-300). The bound m3GpppG dinucleotide is shown as stick representation.

During snRNP biogenesis the hypermethylation of the m7G-cap occurs after assembly of the Sm-core RNP and initiates the RNP import into the nucleus. Therefore, the mode of m3G-cap binding by SPN1 is of particular interest with regard to the discrimination against m7G-cap bearing snRNAs and mRNAs, which should prevent their accidental reimport into the nucleus. Previous studies on the specificity and fidelity of m3G-cap recognition by full-length SPN1 demonstrated that, in comparison to m3GpppG, the binding of m7GpppG to SPN1 is approximately 1000-fold less efficient (Huber et al. 1998), which obviously is related to the lack of the two methyl groups on N2. Since in the crystal structure of the m3GpppG-SPN1 complex only one of the two N2 methyl groups is in van der Waals (VDW) contact with Trp107, the difference in affinity cannot be caused exclusively by the contribution of this single VDW contact to the binding free energy. However, the hydration of the N2 amino group of the m7G cap represents an important difference to the m3G cap, as the latter does not have to be dehydrated upon binding. The presence of two methyl groups bound to N2 most likely changes the atomic charge distribution of m3G in comparison to m7G and therefore significantly affects the cation-π interaction of the hypermethylated guanine with Trp276 and Trp107.

The binding of the tetranucleotide m3GpppAmUmA, which corresponds to the 5’ end of most UsnRNAs, to full-length SPN1 is about 100-fold stronger than that of m3GpppG (Huber et al. 1998). This could be due to the 2’-O-methylated adenosine and/or to the additional nucleotides. Remarkably, the 2’-O-methylated adenosine could be harboured in the binding pocket as well, since there is no strong discrimination on the purine type by polar interactions. Inspection of the electrostatic surface potential of the m3G-cap binding domain gives no hint of binding sites for the additional nucleotides. However, the crystallized domain lacks the C-terminal 61 residues, which could extend the RNA-binding surface.

Surprisingly, despite the lack of sequence homology, the overall fold of the m3G-cap binding domain exhibits high similarity to a domain common to the mRNA-guanylyltransferase, the enzyme transferring GMP to the 5’ end of mRNAs via a 5’-5’ triphosphate bridge, and to DNA-ligases (Hakansson et al. 1997; Lee et al. 2000; Doherty and Suh 2000). The GTP-binding domain of the mRNA-guanylyltransferase and the ATP-binding domain of DNA-ligases likewise show high structural similarities, especially concerning the two β sheets forming the nucleotide binding pocket. In the case of mRNA-guanylyltransferase, which shares the highest structural similarity to SPN1, the GTP binding pocket is located at a position close to that of the m3G-cap binding site of SPN1, but the exact position of the nucleotides and the residues forming the binding pocket are quite different.
This unexpected structural homology raises the question whether the m3G-cap binding domain shares a common ancestor with the GTP-binding domain of the mRNA-guanylyltransferase and whether it was linked with an IBB domain by shuffling and fusion of the corresponding exons.
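The fold differences in binding quoted in this section (about 1000-fold between m7GpppG and m3GpppG, about 100-fold between m3GpppG and m3GpppAmUmA, and factors of 10 and 50 for the VP39 tryptophan substitutions) can be translated into approximate differences in binding free energy. The following is only a minimal sketch: it assumes, which the cited studies do not state here, that each fold value behaves like a ratio of dissociation constants and that the temperature is 25 °C, so that ΔΔG = RT ln(fold). The function name `fold_to_ddg` is hypothetical and used for illustration only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold, temperature_k=298.15):
    """Convert a fold difference in affinity into a free-energy difference (kcal/mol).

    Assumption (not from the text): the fold value equals a ratio of
    dissociation constants, so that ddG = R * T * ln(fold).
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return R_KCAL * temperature_k * math.log(fold)


if __name__ == "__main__":
    # Fold differences quoted in the text (Huber et al. 1998; Hu et al. 2002)
    quoted = [
        ("m7GpppG vs m3GpppG on SPN1", 1000),
        ("m3GpppAmUmA vs m3GpppG on SPN1", 100),
        ("VP39, single Trp substitution", 10),
        ("VP39, double Trp substitution", 50),
    ]
    for label, fold in quoted:
        print(f"{label}: {fold}-fold ~ {fold_to_ddg(fold):.1f} kcal/mol")
```

Under these assumptions, a 1000-fold difference corresponds to roughly 4.1 kcal/mol and a 100-fold difference to roughly 2.7 kcal/mol. The conversion is logarithmic, so the 50-fold effect of the double substitution (about 2.3 kcal/mol) is less than twice the 10-fold effect of the single substitution (about 1.4 kcal/mol).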

4 Conclusions and outlook

The Snurportin1 structure reveals an interesting mode of interaction with the hypermethylated m3G cap. In contrast to monomethylated caps, which are stacked between two aromatic side chains, the m3G cap is stacked between the adjacent base of the RNA and only one aromatic side chain, suggesting a model for the interaction with Snurportin1. The difference in hydration of the N2 amino group most likely makes an important contribution to the different affinities of Snurportin1 for m7G- and m3G-caps. In addition, the conformation of the RNA is presumably altered in solution upon modification, thus enabling entry and docking in the binding pocket. The changes in atomic charge distribution of the ring system alter the stacking properties of the m3G with the first base of the UsnRNA and the interaction with Snurportin1. The crystal structure of the full-length Snurportin1 should help to answer the question whether the N- and/or C-terminal residues are involved in cap binding, thus altering the strength of the binding.

Interestingly, a vast number of mRNAs from C. elegans carry an m3G cap, which is related to trans-splicing (Liou and Blumenthal 1990; Van Doren and Hirsh 1990; see also Chapter 5). Both m3G caps and m7G caps are recognized in the cytoplasm by the eukaryotic initiation factor eIF4E, named IFE in C. elegans. C. elegans possesses five isoforms of IFE, which discriminate to different extents between m7G- and m3G-caps (Jankowska-Anyszka et al. 1998; Keiper et al. 2000). One of them, IFE5, has been identified as the initiation factor that specifically binds to the m3G cap, in contrast to IFE3 and IFE4, which specifically bind the m7G-cap. The remaining two isoforms, IFE1 and IFE2, show an intermediate behaviour. Additionally, all five IFEs must have a way to distinguish between mRNAs and UsnRNAs. Modelling of the IFEs based on eIF4E from mouse and yeast (Marcotrigiano et al. 1997; Matsuo et al. 1997) suggests that the binding pockets vary in width and depth (Miyoshi et al. 2002). It will be interesting to compare the structures of the different isoforms with bound m3G- or m7G-cap with that of Snurportin1 complexed with the m3G cap. This might reveal a general underlying mechanism by which cap binding proteins discriminate between m3G and m7G caps.

Acknowledgements

The authors thank Anja Strasser for helpful comments and assistance with the figures.

References

Allen NP, Huang L, Burlingame A, Rexach M (2001) Proteomic analysis of nucleoporin interacting proteins. J Biol Chem 276:29268-29274
Allmang C, Kufel J, Chanfreau G, Mitchell P, Petfalski E, Tollervey D (1999a) Functions of the exosome in rRNA, snoRNA and snRNA synthesis. EMBO J 18:5399-5410
Allmang C, Petfalski E, Podtelejnikov A, Mann M, Tollervey D, Mitchell P (1999b) The yeast exosome and human PM-Scl are related complexes of 3'→5' exonucleases. Genes Dev 13:2148-2158
Bischoff FR, Krebber H, Kempf T, Hermes I, Ponstingl H (1995) Human RanGTPase-activating protein RanGAP1 is a homologue of yeast Rna1p involved in mRNA processing and transport. Proc Natl Acad Sci USA 92:1749-1753
Bischoff FR, Ponstingl H (1991a) Catalysis of guanine nucleotide exchange on Ran by the mitotic regulator RCC1. Nature 354:80-82
Bischoff FR, Ponstingl H (1991b) Mitotic regulator protein RCC1 is complexed with a nuclear ras-related polypeptide. Proc Natl Acad Sci USA 88:10830-10834
Black BE, Holaska JM, Levesque L, Ossareh-Nazari B, Gwizdek C, Dargemont C, Paschal BM (2001) NXT1 is necessary for the terminal step of Crm1-mediated nuclear export. J Cell Biol 152:141-155

Black BE, Levesque L, Holaska JM, Wood TC, Paschal BM (1999) Identification of an NTF2-related factor that binds Ran-GTP and regulates nuclear protein export. Mol Cell Biol 19:8616-8624
Bordonne R (2000) Functional characterization of nuclear localization signals in yeast Sm proteins. Mol Cell Biol 20:7943-7954
Brahms H, Meheus L, de Brabandere V, Fischer U, Lührmann R (2001) Symmetrical dimethylation of arginine residues in spliceosomal Sm protein B/B' and the Sm-like protein LSm4, and their interaction with the SMN protein. RNA 7:1531-1542
Brahms H, Raymackers J, Union A, de Keyser F, Meheus L, Lührmann R (2000) The C-terminal RG dipeptide repeats of the spliceosomal Sm proteins D1 and D3 contain symmetrical dimethylarginines, which form a major B-cell epitope for anti-Sm autoantibodies. J Biol Chem 275:17122-17129
Burge CB, Tuschl T, Sharp PA (1999) Splicing of precursors to mRNA by the spliceosomes. In: Gesteland RF, Cech TR, Atkins JF (eds) The RNA world. Cold Spring Harbor Laboratory Press, New York, pp 526-560
Calero G, Wilson KF, Ly T, Rios-Steiner JL, Clardy JC, Cerione RA (2002) Structural basis of m7GpppG binding to the nuclear cap-binding protein complex. Nat Struct Biol 9:912-917
Carmo-Fonseca M (2002) Understanding nuclear order. Trends Biochem Sci 27:332-334
Carvalho T, Almeida F, Calapez A, Lafarga M, Berciano MT, Carmo-Fonseca M (1999) The spinal muscular atrophy disease gene product, SMN: A link between snRNP biogenesis and the Cajal (coiled) body. J Cell Biol 147:715-728
Chaillan-Huntington C, Braslavsky CV, Kuhlmann J, Stewart M (2000) Dissecting the interactions between NTF2, RanGDP, and the nucleoporin XFXFG repeats. J Biol Chem 275:5874-5879
Coppola JA, Field AS, Luse DS (1983) Promoter-proximal pausing by RNA polymerase II in vitro: transcripts shorter than 20 nucleotides are not capped. Proc Natl Acad Sci USA 80:1251-1255
Cougot N, van Dijk E, Babajko S, Seraphin B (2004) 'Cap-tabolism'. Trends Biochem Sci 29:436-444
Coutavas E, Ren M, Oppenheim JD, D'Eustachio P, Rush MG (1993) Characterization of proteins that interact with the cell-cycle regulatory protein Ran/TC4. Nature 366:585-587
Cronshaw JM, Krutchinsky AN, Zhang W, Chait BT, Matunis MJ (2002) Proteomic analysis of the mammalian nuclear pore complex. J Cell Biol 158:915-927
Darzacq X, Jady BE, Verheggen C, Kiss AM, Bertrand E, Kiss T (2002) Cajal body-specific small nuclear RNAs: a novel class of 2'-O-methylation and pseudouridylation guide RNAs. EMBO J 21:2746-2756
Doherty A, Suh S (2000) Structural and mechanistic conservation in DNA ligases. Nucleic Acids Res 28:4051-4058
Ellison V, Stillman B (2001) Opening of the clamp: an intimate view of an ATP-driven biological machine. Cell 106:655-660
Enünlü I, Papai G, Cserpan I, Udvardy A, Jeang KT, Boros I (2003) Different isoforms of PRIP-interacting protein with methyltransferase domain/trimethylguanosine synthase localizes to the cytoplasm and nucleus. Biochem Biophys Res Commun 309:44-51
Fischer U, Liu Q, Dreyfuss G (1997) The SMN-SIP1 complex has an essential role in spliceosomal snRNP biogenesis. Cell 90:1023-1029

Fischer U, Sumpter V, Sekine M, Satoh T, Lührmann R (1993) Nucleo-cytoplasmic transport of U snRNPs: definition of a nuclear location signal in the Sm core domain that binds a transport receptor independently of the m3G cap. EMBO J 12:573-583
Flaherty SM, Fortes P, Izaurralde E, Mattaj IW, Gilmartin GM (1997) Participation of the nuclear cap binding complex in pre-mRNA 3' processing. Proc Natl Acad Sci USA 94:11893-11898
Fornerod M, Ohno M, Yoshida M, Mattaj IW (1997) CRM1 is an export receptor for leucine-rich nuclear export signals. Cell 90:1051-1060
Frey MR, Bailey AD, Weiner AM, Matera AG (1999) Association of snRNA genes with coiled bodies is mediated by nascent snRNA transcripts. Curr Biol 9:126-135
Frey MR, Matera AG (2001) RNA-mediated interaction of Cajal bodies and U2 snRNA genes. J Cell Biol 154:499-509
Friesen WJ, Massenet S, Paushkin S, Wyce A, Dreyfuss G (2001a) SMN, the product of the spinal muscular atrophy gene, binds preferentially to dimethylarginine-containing protein targets. Mol Cell 7:1111-1117
Friesen WJ, Paushkin S, Wyce A, Massenet S, Pesiridis GS, Van Duyne G, Rappsilber J, Mann M, Dreyfuss G (2001b) The methylosome, a 20S complex containing JBP1 and pICln, produces dimethylarginine-modified Sm proteins. Mol Cell Biol 21:8289-8300
Friesen WJ, Wyce A, Paushkin S, Abel L, Rappsilber J, Mann M, Dreyfuss G (2002) A novel WD repeat protein component of the methylosome binds Sm proteins. J Biol Chem 277:8243-8247
Furuichi Y, LaFiandra A, Shatkin AJ (1977) 5'-Terminal structure and mRNA stability. Nature 266:235-239
Girard C, Mouaikel J, Neel H, Bertrand E, Bordonne R (2004) Nuclear localization properties of a conserved protuberance in the Sm core complex. Exp Cell Res 299:199-208
Görlich D, Dabrowski M, Bischoff FR, Kutay U, Bork P, Hartmann E, Prehn S, Izaurralde E (1997) A novel class of RanGTP binding proteins. J Cell Biol 138:65-80
Görlich D, Kraft R, Kostka S, Vogel F, Hartmann E, Laskey RA, Mattaj IW, Izaurralde E (1996a) Importin provides a link between nuclear protein import and U snRNA export. Cell 87:21-32
Görlich D, Pante N, Kutay U, Aebi U, Bischoff FR (1996b) Identification of different roles for RanGDP and RanGTP in nuclear protein import. EMBO J 15:5584-5594
Görlich D, Prehn S, Laskey RA, Hartmann E (1994) Isolation of a protein that is essential for the first step of nuclear protein import. Cell 79:767-778
Gubitz AK, Feng W, Dreyfuss G (2004) The SMN complex. Exp Cell Res 296:51-56
Hakansson K, Doherty A, Shuman S, Wigley D (1997) X-ray crystallography reveals a large conformational change during guanyl transfer by mRNA capping enzymes. Cell 89:545-553
Hamm J, Darzynkiewicz E, Tahara SM, Mattaj IW (1990) The trimethylguanosine cap structure of U1 snRNA is a component of a bipartite nuclear targeting signal. Cell 62:569-577
Hartmuth K, Urlaub H, Vornlocher HP, Will CL, Gentzel M, Wilm M, Lührmann R (2002) Protein composition of human prespliceosomes isolated by a tobramycin affinity-selection method. Proc Natl Acad Sci USA 99:16719-16724
Hernandez N (2001) Small nuclear RNA genes: a model system to study fundamental mechanisms of transcription. J Biol Chem 276:26733-26736
Hetzer M, Mattaj IW (2000) An ATP-dependent, Ran-independent mechanism for nuclear import of the U1A and U2B" spliceosome proteins. J Cell Biol 148:293-303

Hodel AE, Gershon PD, Quiocho FA (1998) Structural basis for sequence-nonspecific recognition of 5'-capped mRNA by a cap-modifying enzyme. Mol Cell 1:443-447
Hodel AE, Gershon PD, Shi X, Quiocho FA (1996) The 1.85 Å structure of vaccinia protein VP39: a bifunctional enzyme that participates in the modification of both mRNA ends. Cell 85:247-256
Hodel AE, Gershon PD, Shi X, Wang SM, Quiocho FA (1997) Specific protein recognition of an mRNA cap through its alkylated base. Nat Struct Biol 4:350-354
Hopper AK, Traglia HM, Dunst RW (1990) The yeast RNA1 gene product necessary for RNA processing is located in the cytosol and apparently excluded from the nucleus. J Cell Biol 111:309-321
Hsu PC, Hodel MR, Thomas JW, Taylor LJ, Hagedorn CH, Hodel AE (2000) Structural requirements for the specific recognition of an m7G mRNA cap. Biochemistry 39:13730-13736
Hu G, Gershon PD, Hodel AE, Quiocho FA (1999) mRNA cap recognition: dominant role of enhanced stacking interactions between methylated bases and protein aromatic side chains. Proc Natl Acad Sci USA 96:7149-7154
Hu G, Oguro A, Li C, Gershon PD, Quiocho FA (2002) The "cap-binding slot" of an mRNA cap-binding protein: quantitative effects of aromatic side chain choice in the double-stacking sandwich with cap. Biochemistry 41:7677-7687
Hu G, Tsai AL, Quiocho FA (2003) Insertion of an N7-methylguanine mRNA cap between two coplanar aromatic residues of a cap-binding protein is fast and selective for a positively charged cap. J Biol Chem 278:51515-51520
Huber J, Cronshagen U, Kadokura M, Marshallsay C, Wada T, Sekine M, Lührmann R (1998) Snurportin1, an m3G-cap-specific nuclear import receptor with a novel domain structure. EMBO J 17:4114-4126
Huber J, Dickmanns A, Lührmann R (2002) The importin-beta binding domain of snurportin1 is responsible for the Ran- and energy-independent nuclear import of spliceosomal U snRNPs in vitro. J Cell Biol 156:467-479
Ingelfinger D, Arndt-Jovin DJ, Lührmann R, Achsel T (2002) The human LSm1-7 proteins colocalize with the mRNA-degrading enzymes Dcp1/2 and Xrn1 in distinct cytoplasmic foci. RNA 8:1489-1501
Izaurralde E, Lewis J, Gamberi C, Jarmolowski A, McGuigan C, Mattaj IW (1995) A cap-binding protein complex mediating U snRNA export. Nature 376:709-712
Izaurralde E, Lewis J, McGuigan C, Jankowska M, Darzynkiewicz E, Mattaj IW (1994) A nuclear cap binding protein complex involved in pre-mRNA splicing. Cell 78:657-668
Jacobs EY, Ogiwara I, Weiner AM (2004) Role of the C-terminal domain of RNA polymerase II in U2 snRNA transcription and 3' processing. Mol Cell Biol 24:846-855
Jady BE, Darzacq X, Tucker KE, Matera AG, Bertrand E, Kiss T (2003) Modification of Sm small nuclear RNAs occurs in the nucleoplasmic Cajal body following import from the cytoplasm. EMBO J 22:1878-1888
Jäkel S, Görlich D (1998) Importin beta, transportin, RanBP5 and RanBP7 mediate nuclear import of ribosomal proteins in mammalian cells. EMBO J 17:4491-4502
Jankowska-Anyszka M, Lamphear BJ, Aamodt EJ, Harrington T, Darzynkiewicz E, Stolarski R, Rhoads RE (1998) Multiple isoforms of eukaryotic protein synthesis initiation factor 4E in Caenorhabditis elegans can distinguish between mono- and trimethylated mRNA cap structures. J Biol Chem 273:10538-10542
Jarmolowski A, Boelens WC, Izaurralde E, Mattaj IW (1994) Nuclear export of different classes of RNA is mediated by specific factors. J Cell Biol 124:627-635

Jullien D, Görlich D, Laemmli UK, Adachi Y (1999) Nuclear import of RPA in Xenopus egg extracts requires a novel protein XRIPalpha but not importin alpha. EMBO J 18:4348-4358
Kambach C, Mattaj IW (1992) Intracellular distribution of the U1A protein depends on active transport and nuclear binding to U1 snRNA. J Cell Biol 118:11-21
Kambach C, Mattaj IW (1994) Nuclear transport of the U2 snRNP-specific U2B'' protein is mediated by both direct and indirect signalling mechanisms. J Cell Sci 107:1807-1816
Kambach C, Walke S, Nagai K (1999a) Structure and assembly of the spliceosomal small nuclear ribonucleoprotein particles. Curr Opin Struct Biol 9:222-230
Kambach C, Walke S, Young R, Avis JM, de la Fortelle E, Raker VA, Lührmann R, Li J, Nagai K (1999b) Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell 96:375-387
Katahira J, Strasser K, Podtelejnikov A, Mann M, Jung JU, Hurt E (1999) The Mex67p-mediated nuclear mRNA export pathway is conserved from yeast to human. EMBO J 18:2593-2609
Keiper BD, Lamphear BJ, Deshpande AM, Jankowska-Anyszka M, Aamodt EJ, Blumenthal T, Rhoads RE (2000) Caenorhabditis elegans. J Biol Chem 275:10590-10596
Kiss AM, Jady BE, Darzacq X, Verheggen C, Bertrand E, Kiss T (2002) A Cajal body-specific pseudouridylation guide RNA is composed of two box H/ACA snoRNA-like domains. Nucleic Acids Res 30:4643-4649
Krol A, Carbon P, Ebel JP, Appel B (1987) Xenopus tropicalis U6 snRNA genes transcribed by Pol III contain the upstream promoter elements used by Pol II dependent U snRNA genes. Nucleic Acids Res 15:2463-2478
Kunkel GR, Maser RL, Calvet JP, Pederson T (1986) U6 small nuclear RNA is transcribed by RNA polymerase III. Proc Natl Acad Sci USA 83:8575-8579
Le Hir H, Gatfield D, Izaurralde E, Moore MJ (2001) The exon-exon junction complex provides a binding platform for factors involved in mRNA export and nonsense-mediated mRNA decay. EMBO J 20:4987-4997
Le Hir H, Izaurralde E, Maquat LE, Moore MJ (2000) The spliceosome deposits multiple proteins 20-24 nucleotides upstream of mRNA exon-exon junctions. EMBO J 19:6860-6869
Lee JY, Chang C, Song H, Moore JD, Yang JK, Kim HK, Kwon S, Suh S (2000) Crystal structure of NAD(+)-dependent DNA ligase: modular architecture and functional implications. EMBO J 19:1119-1129
Liou RF, Blumenthal T (1990) trans-spliced Caenorhabditis elegans mRNAs retain trimethylguanosine caps. Mol Cell Biol 10:1764-1768
Liu Q, Dreyfuss G (1996) A novel nuclear structure containing the survival of motor neurons protein. EMBO J 15:3555-3565
Lockless SW, Cheng HT, Hodel AE, Quiocho FA, Gershon PD (1998) Recognition of capped RNA substrates by VP39, the vaccinia virus-encoded mRNA cap-specific 2'-O-methyltransferase. Biochemistry 37:8564-8574
Lounsbury KM, Beddow AL, Macara IG (1994) A family of proteins that stabilize the Ran/TC4 GTPase in its GTP-bound conformation. J Biol Chem 269:11285-11290
Makarov EM, Makarova OV, Urlaub H, Gentzel M, Will CL, Wilm M, Lührmann R (2002) Small nuclear ribonucleoprotein remodeling during catalytic activation of the spliceosome. Science 298:2205-2208

Marcotrigiano J, Gingras AC, Sonenberg N, Burley SK (1997) Cocrystal structure of the messenger RNA 5' cap-binding protein (eIF4E) bound to 7-methyl-GDP. Cell 89:951-961
Massenet S, Pellizzoni L, Paushkin S, Mattaj IW, Dreyfuss G (2002) The SMN complex is associated with snRNPs throughout their cytoplasmic assembly pathway. Mol Cell Biol 22:6533-6541
Masuyama K, Taniguchi I, Kataoka N, Ohno M (2004) RNA length defines RNA export pathway. Genes Dev 18:2074-2085
Matsuo H, Li H, McGuire A, Fletcher CM, Gingras AC, Sonenberg N, Wagner G (1997) Structure of translation factor eIF4E bound to m7GDP and interaction with 4E-binding protein. Nat Struct Biol 4:717-724
Mattaj IW, De Robertis EM (1985) Nuclear segregation of U2 snRNA requires binding of specific snRNP proteins. Cell 40:111-118
Mazza C, Ohno M, Segref A, Mattaj IW, Cusack S (2001) Crystal structure of the human nuclear cap binding complex. Mol Cell 8:383-396
Mazza C, Segref A, Mattaj IW, Cusack S (2002) Large-scale induced fit recognition of an m(7)GpppG cap analogue by the human nuclear cap-binding complex. EMBO J 21:5548-5557
Medlin JE, Uguen P, Taylor A, Bentley DL, Murphy S (2003) The C-terminal domain of pol II and a DRB-sensitive kinase are required for 3' processing of U2 snRNA. EMBO J 22:925-934
Meister G, Buhler D, Pillai R, Lottspeich F, Fischer U (2001a) A multiprotein complex mediates the ATP-dependent assembly of spliceosomal U snRNPs. Nat Cell Biol 3:945-949
Meister G, Eggert C, Buhler D, Brahms H, Kambach C, Fischer U (2001b) Methylation of Sm proteins by a complex containing PRMT5 and the putative U snRNP assembly factor pICln. Curr Biol 11:1990-1994
Meister G, Eggert C, Fischer U (2002) SMN-mediated assembly of RNPs: a complex story. Trends Cell Biol 12:472-478
Meister G, Hannus S, Plottner O, Baars T, Hartmann E, Fakan S, Laggerbauer B, Fischer U (2001c) SMNrp is an essential pre-mRNA splicing factor required for the formation of the mature spliceosome. EMBO J 20:2304-2314
Melchior F, Paschal B, Evans J, Gerace L (1993a) Inhibition of nuclear protein import by nonhydrolyzable analogues of GTP and identification of the small GTPase Ran/TC4 as an essential transport factor [published erratum appears in J Cell Biol 1994 Jan;124(12):217]. J Cell Biol 123:1649-1659
Melchior F, Weber K, Gerke V (1993b) A functional homologue of the RNA1 gene product in Schizosaccharomyces pombe: purification, biochemical characterization, and identification of a leucine-rich repeat motif. Mol Biol Cell 4:569-581
Miyoshi H, Dwyer DS, Keiper BD, Jankowska-Anyszka M, Darzynkiewicz E, Rhoads RE (2002) Discrimination between mono- and trimethylated cap structures by two isoforms of Caenorhabditis elegans eIF4E. EMBO J 21:4680-4690
Moore MS, Blobel G (1993) The GTP-binding protein Ran/TC4 is required for protein import into the nucleus. Nature 365:661-663
Moroianu J, Blobel G, Radu A (1995) Previously identified protein of uncertain function is karyopherin alpha and together with karyopherin beta docks import substrate at nuclear pore complexes. Proc Natl Acad Sci USA 92:2008-2011

Mouaikel J, Bujnicki JM, Tazi J, Bordonne R (2003a) Sequence-structure-function relationships of Tgs1, the yeast snRNA/snoRNA cap hypermethylase. Nucleic Acids Res 31:4899-4909 Mouaikel J, Narayanan U, Verheggen C, Matera AG, Bertrand E, Tazi J, Bordonne R (2003b) Interaction between the small-nuclear-RNA cap hypermethylase and the spinal muscular atrophy protein, survival of motor neuron. EMBO Rep 4:616-622 Mouaikel J, Verheggen C, Bertrand E, Tazi J, Bordonne R (2002) Hypermethylation of the cap structure of both yeast snRNAs and snoRNAs requires a conserved methyltransferase that is localized to the nucleolus. Mol Cell 9:891-901 Murthy KG, Park P, Manley JL (1991) A nuclear micrococcal-sensitive, ATP-dependent exoribonuclease degrades uncapped but not capped RNA substrates. Nucleic Acids Res 19:2685-2692 Narayanan U, Ospina JK, Frey MR, Hebert MD, Matera A (2002) SMN, the spinal muscular atrophy protein, forms a pre-import snRNP complex with snurportin1 and importin beta. Hum Mol Genet 11:1785-1795 Narayanan U, Achsel T, Lührmann R, Matera AG (2004) Coupled in vitro import of UsnRNPs and SMN, the spinal muscular atrophy protein. Mol Cell 16:223-234 Neuman de Vegvar HE, Dahlberg JE (1990) Nucleocytoplasmic transport and processing of small nuclear RNA precursors. Mol Cell Biol 10:3365-3375 Niedzwiecka A, Marcotrigiano J, Stepinski J, Jankowska-Anyszka M, Wyslouch-Cieszynska A, Dadlez M, Gingras AC, Mak P, Darzynkiewicz E, Sonenberg N, Burley SK, Stolarski R (2002) Biophysical studies of eIF4E cap-binding protein: recognition of mRNA 5' cap structure and synthetic fragments of eIF4G and 4E-BP1 proteins. J Mol Biol 319:615-635 Ohno M, Kataoka N, Shimura Y (1990) A nuclear cap binding protein from HeLa cells. Nucleic Acids Res 18:6989-6995 Ohno M, Segref A, Bachi A, Wilm M, Mattaj IW (2000) PHAX, a mediator of U snRNA nuclear export whose activity is regulated by phosphorylation.
Cell 101:187-198 Ohno M, Segref A, Kuersten S, Mattaj IW (2002) Identity elements used in export of mRNAs. Mol Cell 9:659-671 Ohtsubo M, Okazaki H, Nishimoto T (1989) The RCC1 protein, a regulator for the onset of chromosome condensation locates in the nucleus and binds to DNA. J Cell Biol 109:1389-1397 Ossareh-Nazari B, Maison C, Black BE, Levesque L, Paschal BM, Dargemont C (2000) RanGTP-binding protein NXT1 facilitates nuclear export of different classes of RNA in vitro. Mol Cell Biol 20:4562-4571 Palacios I, Hetzer M, Adam SA, Mattaj IW (1997) Nuclear import of U snRNPs requires importin beta. EMBO J 16:6783-6792 Paraskeva E, Izaurralde E, Bischoff FR, Huber J, Kutay U, Hartmann E, Lührmann R, Görlich D (1999) CRM1-mediated recycling of snurportin 1 to the cytoplasm. J Cell Biol 145:255-264 Paule MR, White RJ (2000) Survey and summary: transcription by RNA polymerases I and III. Nucleic Acids Res 28:1283-1298 Paushkin S, Gubitz AK, Massenet S, Dreyfuss G (2002) The SMN complex, an assemblyosome of ribonucleoproteins. Curr Opin Cell Biol 14:305-312 Pellizzoni L, Kataoka N, Charroux B, Dreyfuss G (1998) A novel function for SMN, the spinal muscular atrophy disease gene product, in pre-mRNA splicing. Cell 95:615-624

Plessel G, Fischer U, Lührmann R (1994) m3G cap hypermethylation of U1 small nuclear ribonucleoprotein (snRNP) in vitro: evidence that the U1 small nuclear RNA(guanosine-N2)-methyltransferase is a non-snRNP cytoplasmic protein that requires a binding site on the Sm core domain. Mol Cell Biol 14:4160-4172 Pu WT, Krapivinsky GB, Krapivinsky L, Clapham DE (1999) pICln inhibits snRNP biogenesis by binding core spliceosomal proteins. Mol Cell Biol 19:4113-4120 Quiocho FA, Hu G, Gershon PD (2000) Structural basis of mRNA cap recognition by proteins. Curr Opin Struct Biol 10:78-86 Raker VA, Plessel G, Lührmann R (1996) The snRNP core assembly pathway: identification of stable core protein heteromeric complexes and an snRNP subcore particle in vitro. EMBO J 15:2256-2269 Reddy R, Henning D, Das G, Harless M, Wright D (1987) The capped U6 small nuclear RNA is transcribed by RNA polymerase III. J Biol Chem 262:75-81 Reichelt R, Holzenburg A, Buhle EL Jr, Jarnik M, Engel A, Aebi U (1990) Correlation between structure and mass distribution of the nuclear pore complex and of distinct pore complex components. J Cell Biol 110:883-894 Ribbeck K, Lipowsky G, Kent HM, Stewart M, Görlich D (1998) NTF2 mediates nuclear import of Ran. EMBO J 17:6587-6598 Romac JM, Graff DH, Keene JD (1994) The U1 small nuclear ribonucleoprotein (snRNP) 70K protein is transported independently of U1 snRNP particles via a nuclear localization signal in the RNA-binding domain. Mol Cell Biol 14:4662-4670 Rout MP, Aitchison JD, Suprapto A, Hjertaas K, Zhao Y, Chait BT (2000) The yeast nuclear pore complex. Composition, architecture, and transport mechanism. J Cell Biol 148:635-652 Ruszczynska K, Kamienska-Trela K, Wojcik J, Stepinski J, Darzynkiewicz E, Stolarski R (2003) Charge distribution in 7-methylguanine regarding cation-pi interaction with protein factor eIF4E.
Biophys J 85:1450-1456 Saitoh H, Dasso M (1995) The RCC1 protein interacts with Ran, RanBP1, hsc70, and a 340-kDa protein in Xenopus extracts. J Biol Chem 270:10658-10663 Salditt-Georgieff M, Harpold M, Chen-Kiang S, Darnell JE Jr (1980) The addition of 5' cap structures occurs early in hnRNA synthesis and prematurely terminated molecules are capped. Cell 19:69-78 Schaffert N, Hossbach M, Heintzmann R, Achsel T, Lührmann R (2004) RNAi knockdown of hPrp31 leads to an accumulation of U4/U6 di-snRNPs in Cajal bodies. EMBO J 23:3000-3009 Segref A, Mattaj IW, Ohno M (2001) The evolutionarily conserved region of the U snRNA export mediator PHAX is a novel RNA-binding domain that is essential for U snRNA export. RNA 7:351-360 Shatkin AJ (1976) Capping of eucaryotic mRNAs. Cell 9:645-653 Shimotohno K, Kodama Y, Hashimoto J, Miura KI (1977) Importance of 5'-terminal blocking structure to stabilize mRNA in eukaryotic protein synthesis. Proc Natl Acad Sci USA 74:2734-2738 Shuman S (2002) What messenger RNA capping tells us about eukaryotic evolution. Nat Rev Mol Cell Biol 3:619-625 Sleeman JE, Ajuh P, Lamond AI (2001) snRNP protein expression enhances the formation of Cajal bodies containing p80-coilin and SMN. J Cell Sci 114:4407-4419 Sleeman JE, Lamond AI (1999) Newly assembled snRNPs associate with coiled bodies before speckles, suggesting a nuclear snRNP maturation pathway. Curr Biol 9:1065-1074

Stanek D, Rader SD, Klingauf M, Neugebauer KM (2003) Targeting of U4/U6 small nuclear RNP assembly factor SART3/p110 to Cajal bodies. J Cell Biol 160:505-516 Stark H, Dube P, Lührmann R, Kastner B (2001) Arrangement of RNA and proteins in the spliceosomal U1 small nuclear ribonucleoprotein particle. Nature 409:539-542 Stoffler D, Feja B, Fahrenkrog B, Walz J, Typke D, Aebi U (2003) Cryo-electron tomography provides novel insights into nuclear pore architecture: implications for nucleocytoplasmic transport. J Mol Biol 328:119-130 Tomoo K, Shen X, Okabe K, Nozoe Y, Fukuhara S, Morino S, Ishida T, Taniguchi T, Hasegawa H, Terashima A, Sasaki M, Katsuya Y, Kitamura K, Miyoshi H, Ishikawa M, Miura K (2002) Crystal structures of 7-methylguanosine 5'-triphosphate (m(7)GTP)- and P(1)-7-methylguanosine-P(3)-adenosine-5',5'-triphosphate (m(7)GpppA)-bound human full-length eukaryotic initiation factor 4E: biological importance of the C-terminal flexible region. Biochem J 362:539-544 Tomoo K, Shen X, Okabe K, Nozoe Y, Fukuhara S, Morino S, Sasaki M, Taniguchi T, Miyagawa H, Kitamura K, Miura K, Ishida T (2003) Structural features of human initiation factor 4E, studied by X-ray crystal analyses and molecular dynamics simulations. J Mol Biol 328:365-383 Uguen P, Murphy S (2003) The 3' ends of human pre-snRNAs are produced by RNA polymerase II CTD-dependent RNA processing. EMBO J 22:4544-4554 Uguen P, Murphy S (2004) 3'-box-dependent processing of human pre-U1 snRNA requires a combination of RNA and protein co-factors. Nucleic Acids Res 32:2987-2994 Van Doren K, Hirsh D (1990) mRNAs that mature through trans-splicing in Caenorhabditis elegans have a trimethylguanosine cap at their 5' termini. Mol Cell Biol 10:1769-1772 van Hoof A, Lennertz P, Parker R (2000a) Three conserved members of the RNase D family have unique and overlapping functions in the processing of 5S, 5.8S, U4, U5, RNase MRP and RNase P RNAs in yeast.
EMBO J 19:1357-1365 van Hoof A, Lennertz P, Parker R (2000b) Yeast exosome mutants accumulate 3'-extended polyadenylated forms of U4 small nuclear RNA and small nucleolar RNAs. Mol Cell Biol 20:441-452 Weis K, Mattaj IW, Lamond AI (1995) Identification of hSRP1 alpha as a functional receptor for nuclear localization sequences. Science 268:1049-1053 Will CL, Lührmann R (2001) Spliceosomal UsnRNP biogenesis, structure and function. Curr Opin Cell Biol 13:290-301 Yokoyama N, Hayashi N, Seki T, Pante N, Ohba T, Nishii K, Kuma K, Hayashida T, Miyata T, Aebi U, Fukui M, Nishimoto T (1995) A giant nucleopore protein that binds Ran/TC4. Nature 376:184-188 Yong J, Golembe TJ, Battle DJ, Pellizzoni L, Dreyfuss G (2004) snRNAs contain specific SMN-binding domains that are essential for snRNP assembly. Mol Cell Biol 24:2747-2756 Yong J, Pellizzoni L, Dreyfuss G (2002) Sequence-specific interaction of U1 snRNA with the SMN complex. EMBO J 21:1188-1196 Young PJ, Le TT, thi Man N, Burghes AH, Morris GE (2000) The relationship between SMN, the spinal muscular atrophy protein, and nuclear coiled bodies in differentiated tissues and cultured cells. Exp Cell Res 256:365-374 Yu YT, Shu MD, Steitz JA (1998) Modifications of U2 snRNA are required for snRNP assembly and pre-mRNA splicing. EMBO J 17:5783-5795

Zhao X, Yu YT (2004) Pseudouridines in and near the branch site recognition region of U2 snRNA are required for snRNP biogenesis and pre-mRNA splicing in Xenopus oocytes. RNA 10:681-690 Zhu Y, Qi C, Cao WQ, Yeldandi AV, Rao MS, Reddy JK (2001) Cloning and characterization of PIMT, a protein with a methyltransferase domain, which interacts with and enhances nuclear receptor coactivator PRIP function. Proc Natl Acad Sci USA 98:10380-10385

Dickmanns, Achim
Abteilung für Molekulare Strukturbiologie, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Justus-von-Liebig-Weg 11, D-37077 Göttingen, Germany

Ficner, Ralf
Abteilung für Molekulare Strukturbiologie, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Justus-von-Liebig-Weg 11, D-37077 Göttingen, Germany

Role of a conserved pseudouridine in U2 snRNA on the structural and electrostatic features of the spliceosomal pre-mRNA branch site

Nancy L. Greenbaum

Abstract A pseudouridine (ψ) residue at a phylogenetically conserved position of U2 snRNA, which pairs with the intron to form the pre-mRNA branch site helix of S. cerevisiae, has been shown to induce a dramatically altered architecture compared with that of its unmodified counterpart. In the ψ-dependent structure, the branch site adenosine is in an extrahelical position, with the nucleophilic 2'OH positioned at the surface of the widened major groove. Clustering of electronegative functional groups and kinking of the backbone in the modified structure also result in exceptional negativity in the region of the 2'OH. These features may assist in recognition and activity of the branch site during the first step of splicing. This is the first case in which a native ψ has been shown to induce a major alteration in structure. However, it is likely that other conserved modification sites in the spliceosome and ribosome also have a structural impact on assembly and function.

Topics in Current Genetics, H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing. DOI 10.1007/b106846 / Published online: 27 January 2005. © Springer-Verlag Berlin Heidelberg 2005

1 Introduction

1.1 The spliceosome

Following transcription, precursor messenger (pre-m)RNA molecules undergo a series of processing reactions prior to translation of their message into protein. Among these essential reactions is the removal of introns, or noncoding regions of the pre-mRNA, and the ligation of exons, the flanking coding regions. In some cases, such as Group I and Group II introns, the catalytic power resides entirely within the RNA component. In eukaryotes, the splicing reaction utilizes the same chemical mechanism as the Group II intron, but is catalyzed by the spliceosome, a dynamic ribonucleoprotein machine requiring both small nuclear (sn)RNA and protein components (Moore et al. 1993). Although the RNA components make up only a small percentage of the total spliceosomal mass, evidence is accumulating to support the hypothesis that the RNA fraction is the catalytic agent. The cyclic nature of spliceosome activity requires that at least some of the components reassemble around each pre-mRNA substrate. The dynamic and highly specific nature of spliceosome activity makes details of its assembly and

the molecular basis of recognition among components a particularly fascinating and challenging target of study.

1.1.1 Role of snRNAs in splicing chemistry

Among the five spliceosomal U snRNAs, U2 and U6 snRNA are the only two to be involved in both the first and second transesterification reactions of splicing, and they exhibit the greatest phylogenetic conservation. Multiple segments of pairing interactions between U2 and U6 snRNA (Fig. 1) are critical for splicing activity (Madhani and Guthrie 1992). Evidence for the direct role of U2 and U6 snRNA in splicing comes from experiments by Valadkhan and Manley (2001, 2003), which showed that a protein-free complex of these snRNAs, along with an intron strand and divalent metal ions, was sufficient to catalyze formation of covalent products.

1.1.2 The pre-mRNA branch site

In addition to essential pairing interactions between U2 and U6 snRNA, a segment of U2 snRNA identifies and pairs with a consensus sequence of the intron (Fig. 1). This sequence is absolutely conserved in yeast, but greater flexibility is seen in higher eukaryotes; however, the relative position of purines and pyrimidines, as well as the identity of several residues, is strictly conserved. The U2 snRNA-intron pairing forms a short complementary helix (seven base pairs in yeast) with an invariant single unpaired adenosine residue on the intron strand. The 2'OH of this adenosine is the nucleophile in the first cleavage reaction; the residue is known as the branch site because the product is branched, carrying both 2'-5' and 3'-5' phosphodiester linkages at this position. In addition to its catalytic role, specific recognition of the branch site region of the pre-mRNA (i.e. prior to the splicing reaction) is likely to assist in recruitment of other components of the active spliceosome. The structural features contributing to positioning of the 2'OH for nucleophilic activity and subsequent activation, and/or recognition by other components of the catalytic core, are major issues in understanding the structural biology of RNA splicing.
1.2 Modified bases in structural RNAs

Shape and charge are two major criteria in predicting interactions between biomolecules. RNA molecules, through formation of loops, bulges, and base mismatches, present varied topological and electrostatic landscapes to potential ligands. As a mechanism to increase opportunities for folding and recognition, to enhance ion binding affinity, and/or to increase thermal stability, structural/functional RNA molecules (e.g. tRNAs, rRNAs, snRNAs) expand the limited vocabulary of four bases by undergoing site-specific post-transcriptional chemical modification of selected bases. Such modifications alter the local electrostatic and topological landscape of regions important for intra- or intermolecular interaction and, therefore, have the potential to augment opportunities for RNA-ligand recognition.

Fig. 1. RNA components of the spliceosomal catalytic core. U6 snRNA (upper strand) pairs with U2 snRNA to form several helices; a consensus sequence of U2 pairs with a complementary segment of the intron to form a helix with a single bulged adenosine. It is the 2’OH of this adenosine that becomes the nucleophile in the first step of splicing. The duplexes uBP (left) and ψBP (right) represent the minimal experimental sequence for the unmodified and ψ-modified branch site helix, respectively. Bases shown in gray were added to stabilize the helices in solution.

1.2.1 Pseudouridine (ψ) and its features

The most abundant of the modified bases is pseudouridine (ψ; 5-β-D-ribofuranosyl-2,4(1H,3H)-pyrimidinedione; Fig. 2), a rotational isomer of uridine whose formation is catalyzed by protein enzymes in a snoRNA-dependent or -independent reaction (see related articles in this volume). ψ residues are particularly common in tRNA, ribosomal (r)RNA, and some snRNAs of the spliceosome. Higher eukaryotes have an increased number of modifications as compared with lower forms. When in an anti conformation with respect to the ribose, the modified base maintains the same relative position of potential hydrogen bond donors and acceptors on its Watson-Crick face as does uridine. However, the N1 atom of ψ, no longer involved in the glycosidic bond to the ribose, is exposed in the major groove and is protonated at physiological pH (NH1). Presence of ψ in paired strands (Davis and Poulter 1991; Hall and McLaughlin 1991) or in a native position within a tRNA structure (Durant and Davis 1999; Arnez and Steitz 1994) is not typically associated with significant structural change as compared with unmodified analogues, although there is evidence for conformers unique to native (modified) tRNAs not seen in transcribed (unmodified) analogues (Derrick and Horowitz 1993). However, ψ in a sequence typically

Fig. 2. Uridine (U, left) and pseudouridine (ψ, right).

confers additional thermal stability to the structure (e.g. Davis and Poulter 1991, 1999; Hall and McLaughlin 1991; Arnez and Steitz 1994; Kowalak et al. 1994; Newby and Greenbaum 2001). Inclusion of one or two ψ residues in a complementary RNA helix was accompanied by increased thermal stability, corresponding to a free energy change of approximately -0.5 kcal/mol per ψ (Hall and McLaughlin 1991). In accord with these data, native tRNA molecules, which include a number of ψ and other post-transcriptionally modified bases, exhibit higher, and more cooperative, melting transitions than their unmodified counterparts. Interestingly, a positive correlation has been observed between the number of modified bases in tRNA of thermophilic bacteria and their growth temperature (Kowalak et al. 1994). In contrast with these findings, however, Meroueh et al. (2000) found that some of the conserved ψ residues in the LSU 1915 loop of rRNA were associated with lower melting transitions than analogous unmodified loops. The increased thermal stability of RNA secondary structure conferred by ψ was hypothesized to result from additional hydrogen bonds involving the ψNH1 (Davis and Poulter 1991; Auffinger and Westhof 1998) or from more favorable stacking interactions between ψ and neighboring bases (Yarian et al. 1994; Chui et al. 2002). In support of the former hypothesis, a water-mediated hydrogen bond was observed between ψNH1 and a phosphate oxygen atom of the same or neighboring residue in a crystal structure of tRNAGln (Arnez and Steitz 1994). Also, results of NMR studies imply that a similar interaction involving ψNH1 occurs in a complementary RNA duplex in solution (Newby and Greenbaum 2002b).

1.2.2 ψ residues in the spliceosome

ψ residues are prevalent in the 5' segment of U2 snRNA, where some of them have been associated with spliceosome assembly (Yu et al. 1998; Zhao and Yu 2004; other chapters in this volume).
A ψ residue in the region of U2 snRNA that opposes the 5' neighbor of the branch site adenosine has been identified in all eukaryotes investigated to date (Reddy and Busch 1988; Patton et al. 1994; Gu et al. 1996; Massenet et al. 1999), with the exception of C. elegans (Patton and Padgett 2003). Along with the branch site adenosine, this is the most invariant nucleoside in the branch site region.

Fig. 3. Structural models of uBP, the unmodified branch site duplex. Shown is a representative structure (one of an ensemble of 10 low-energy models) of the unmodified branch site duplex. Analysis of the models suggests hydrogen bond formation between both A24 (the branch site A) and A23 (its 5’ neighbor) and the opposing U6. The sequence of the strands is shown in Figure 1 (lower left).

2 Structure of branch site duplexes

Studies to evaluate the role of ψ in the region of U2 snRNA pairing with the intron addressed the relative stability and solution structural features of ψ-modified and unmodified duplexes representing the pre-mRNA branch site of S. cerevisiae (Newby and Greenbaum 2001, 2002a). Measurement of melting transitions indicated that the ψ-modified duplex was only ~0.7 kcal/mol more stable than its unmodified analogue and displayed a slightly more cooperative melting transition, although neither was as thermally stable as a complementary duplex (Newby and Greenbaum 2001). More detailed structural features were determined by NMR spectroscopic techniques (Newby and Greenbaum 2001, 2002a, 2002b). A torsion angle molecular dynamics (TAMD) protocol (Rice and Brünger 1994; Stein et al. 1997) was utilized in combination with simulated annealing as part of the X-PLOR program (Brünger 1996) for the generation of families of structures from random starting structures. Coordinates for each structure in the ensemble of the 10 lowest energy unmodified structures (1LMV) and the 9 lowest energy structures of the ψ-modified sequence (1LPW) have been deposited in the Protein Data Bank.

2.1 Structural features of the unmodified branch site helix

The unmodified branch site duplex (uBP) displays a continuous A-type helical geometry, with the branch site adenosine (A24) stacked within the helix. Both A24 and A23, its 5' neighbor, form hydrogen bonds with the opposing U6. The structure (Fig. 3) is very similar to that determined for an RNA stem loop representing the phage coat protein-binding site, which differs only in one of the base pairs flanking the extra adenosine residue (Smith and Nikonowicz 2001). These authors noted that, at low pH, the adenine base 5' to the branch site adenine (A23 in our studies) was protonated. Our structure of the unmodified duplex was calculated from spectral data acquired at pH 6.4 (at which A23 would not have been protonated). However, we observed no change in resonance location at pH values within ±1 pH unit, indicating that the helical structure was not pH-dependent. Several lines of evidence suggest that A24 sometimes adopts an extrahelical position in uBP. The paucity of NOEs involving exchangeable protons in this region and the broadness of the A23 and A24 ribose proton resonances suggest conformational flexibility. As further evidence, we see a small, very broad peak in a one-dimensional spectrum at the same chemical shift as A24 H2 of ψBP. Moreover, the lower thermal stability of uBP relative to ψBP (by ~0.7 kcal/mol; Newby and Greenbaum 2001), combined with fluorescence data suggesting that A24 in uBP is not fully stacked, paints a dynamic picture of this region. Such conformational flexibility may explain why knockout of the pseudouridylase gene is not lethal and why certain base substitutions in the region are tolerated to some extent. It, therefore, appears that the role of the pseudouridine is to stabilize this extrahelical conformation, as one of many mechanisms utilized by the splicing machinery to facilitate correct branch site selection and to maintain branching efficiency.

2.2 Structural features of the ψ-modified branch site helix

A markedly different structure was observed for the ψ-modified duplex (ψBP). The U2 snRNA strand maintains roughly helical parameters, but the backbone of the intron strand in the branch site region has a pronounced kink (Fig.
3) and a number of riboses in this region (the branch site A and the two previous bases) have non-C3'-endo conformations. Most notably, the branch site A is extruded from the helix. Additional support for the extrahelical position of the branch site adenine came from measurements of fluorescence of 2-aminopurine (2AP) in duplexes in which the fluorophore replaced adenine (Zagorowska and Adamiak 1996). 2AP fluorescence is quenched by the adjacent stacking of bases and not by hydrogen bonding (Rachofsky et al. 2001) and, like A, 2AP can form two hydrogen bonds with an opposing U. 2AP fluorescence in uBP was similar to that of the probe in a single intron strand, yet several times greater than in the complementary duplex, implying partial stacking in the unmodified branch site duplex. Fluorescence of 2AP in the ψBP, however, was almost twice that of uBP and six times greater than in the complementary duplex, providing convincing evidence that the branch site adenosine adopts an extrahelical orientation in the ψ-modified duplex (Newby and Greenbaum 2001, 2002a).

2.2.1 Stabilization of the extrahelical branch site adenosine by a base triple

The major groove edge of the extrahelical base forms close contacts with the minor groove of the helix, including hydrogen bonding between A24 H62 and N3 of A7, two base pairs upstream. This hydrogen bonding pattern is similar to that seen for the G-A pseudo-pair in a GNRA tetraloop (Jucker et al. 1996), but different from the “A-minor motif” common in the ribosome (Nissen et al. 2001). The result is a nearly coplanar base triple between A24 and the A7-U22 Watson-Crick pair (Fig. 3). This local conformation is consistent with biochemical data demonstrating accessibility of functional groups of the adenine base (Query et al. 1994). The importance of the base triple in stabilizing the observed structure was shown by base mutation studies. When the positions of A7 and U22 are reversed (i.e. U7 and A22), a base pair is maintained, yet base triple formation is disfavored because there is no longer a hydrogen bond acceptor accessible in the position that the A7 N3 had occupied. Fluorescence studies in which 2AP replaced the branch site adenine (as above) were performed on this mutated helix and compared with the other helices. The relative fluorescence of 2AP in the helix with the A7-U22→U7-A22 mutation was very similar to that of the unmodified branch site helix, and considerably less than that of the ψ-modified helix. These results imply that when the base triple is disfavored, the branch site A reverts to a predominantly intrahelical position (Nelson and Greenbaum, unpublished data).

2.2.2 Interaction of ψ with the opposing adenosine

At ambient temperature, many imino protons exchange rapidly with solvent, are broadened beyond detection, and are, therefore, not useful in structure characterization. This was the case for the NH3 proton of ψ6 in the modified branch site duplex.
No difference was observed in spectra acquired at pH values ranging from ~4.5 to 7, so that effects of pKa shifts (leading to a protonated A23 or deprotonated ψ) could be excluded from this or any of the other ψ-dependent structural changes. Structural calculations resulted in models in which ψ6 and A23 appeared partially overlapped in most structures instead of forming a canonical base pair, with no consensus as to which base stacks closer to G5. In order to slow the exchange of imino protons sufficiently to identify the resonance location of ψNH3, we performed NMR experiments of exchangeable protons at supercooled temperatures (Schroeder, Skalicky, and Greenbaum, submitted). As temperatures were decreased below -15 °C, a relatively broad new peak appeared at ~11.2 ppm, in the region where non-Watson-Crick imino protons resonate. This resonance location was similar to that for the proton in free ψ monophosphate (~10.7 ppm) and very different from that of ψNH3 when forming a Watson-Crick base pair (13.1 ppm). This finding further supports the original model that ψ and the opposing A23 do not form a base pair in the branch site helix. The G5-C25 base pair downstream of the branch site adenosine is of normal Watson-Crick geometry. It may seem somewhat paradoxical that an unusual conformation would be stable without maximal base pairing. Nonetheless, it has been shown that the integrity of DNA helices is not primarily dependent upon base pair formation between strands (e.g. Smirnov et al. 2002).

2.2.3 Exposure of the 2'OH nucleophile

Significantly for biological function, the chain reversal also exposes the 2'OH of A24 to the major groove of the helix (Fig. 3). Despite the apparent flexibility of the backbone in the region of the bulged base, the position of residues forming the base triple that anchors the branch site A was well fixed, and the orientation of its 2'OH was essentially identical in all structures. The spatial context of the 2'OH provides a structural basis for recognition of, and access to, the nucleophile by the RNA substrate strand in the first step of splicing.

3 Stabilization of the ψ-modified pre-mRNA branch site helix by interactions with water molecules

It is not only inter-nucleotide interaction that provides stability for a particular RNA conformation, but nucleotide-solvent interactions as well. There have been a number of studies addressing hydration of DNA helices (e.g. Berman and Schneider 1999) and fewer of RNA (e.g. Egli et al. 1996; Sundaralingam and Pan 2002; Auffinger and Westhof 2001). However, our understanding of the role of water molecules in stabilizing RNA motifs is still rudimentary (e.g. Auffinger and Westhof 1997).

3.1 Global interactions with solvent

Adenosine residues appear unpaired more often than other nucleosides, perhaps because adenine is the only base that cannot form three hydrogen bonds and it does not have a carbonyl oxygen, i.e. a strong electron-withdrawing group. “Extra”, i.e. unpaired, adenosines are most often found stacked within the helix (Borer et al. 1995; Smith and Nikonowicz 1998; Lynch and Puglisi 2001), presumably because of the energetic cost of solvating the relatively hydrophobic base in polar solvent, although exceptions have been noted (Greenbaum et al. 1996; Berglund et al. 2001). Numerous studies have shown the stacking propensity of adenosine (e.g. Chou and Chin 2001; Znosko et al. 2002; Freier et al. 1986). It is, therefore, particularly interesting that this adenosine is in an extrahelical position under all conditions examined (i.e. this is not a pH-dependent effect). For this reason, we have been examining the contribution of water molecules on both a global and a local scale. Calculation of the solvent accessible surface area (SASA) of the ψ-modified branch site duplex and its unmodified counterpart has shown no significant difference between modified and unmodified duplexes in either the total accessible area or the ratio of total polar:nonpolar groups (Xu et al. 2005). This finding is consistent with the very small difference in ∆G for helix formation of the modified and unmodified helices (ψBP was ~0.7 kcal/mol more stable than uBP; Newby and Greenbaum 2001). Calculations revealed substantially greater exposure of nonpolar area of the adenine base in the ψ-modified structure (Xu, Greenbaum, Fenley, submitted), which is apparently compensated energetically by formation of additional nonbonding interactions. At the same time, there is less exposure of the 2'OH (which is depicted in structural models at the surface of the major groove). We speculate that this environment, where there would be less shielding by polar groups in solution, may also contribute to recognition of this group.

3.2 Local interactions with water

The question arose how ψ, which differs from U primarily in the presence of an additional NH group, favors formation of such a markedly different structure, with implications for function. In all NMR-derived structural models, the ψNH1, which resides in the major groove, is in line with its own phosphate oxygen (O1P) and within an appropriate distance (3.6 ± 0.1 Å; Newby and Greenbaum 2002a) to form a water-mediated hydrogen bond. An NOE between ψNH1 and the hydrogen atom on the adjacent carbon (ψH6) implies that this proton is protected from exchange with solvent. NMR experiments have shown that, analogous to the case observed in a complementary duplex containing ψ (Newby and Greenbaum 2002b), the ψNH1 proton in the branch site RNA duplex is exchanging with the bulk solvent on a timescale faster than the correlation time of the molecule and is cross-relaxing with water, entirely consistent with a water-mediated hydrogen bond in the major groove. By comparison, no such cross-relaxation events involving imino protons occur elsewhere in the ψ-modified helix or in the unmodified helix.
Additionally, preliminary data from molecular dynamics simulations identify two water molecules associated with the ψ-modified duplex with lifetimes ~100 ps: one at ψNH1 and one associated with the base triple (Nagan, Clark, Xu, Fenley, and Greenbaum, unpublished data). The presence of ψ in the branch site helix is associated with a number of differences in hydrogen bonding, stacking, and interactions with water. The only change that can be directly attributed to the ψ base itself is formation of this particular hydrogen bond. It is not reasonable to think in terms of a single hydrogen bond as the sole factor in stabilizing an alternative structure in the “all or nothing” sense. It is more realistic to think of it as favoring the alternative conformation, which then is stabilized by an array of different stacking and hydrogen bonding interactions.
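As a rough illustration of why the ~0.7 kcal/mole stability difference quoted above counts as small, the following sketch converts a free-energy difference into the corresponding ratio of equilibrium constants using K2/K1 = exp(ΔΔG/RT). The temperature of 298.15 K and the function name `population_ratio` are assumptions made for this example; the text does not state the conditions of the comparison.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def population_ratio(ddg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Factor by which an equilibrium constant changes for a free-energy
    difference ddg (kcal/mol), from K2/K1 = exp(ddG / RT)."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# 0.7 kcal/mol is the value quoted in the text; 298.15 K is an assumed temperature.
ratio = population_ratio(0.7)
print(f"Equilibrium constant ratio (psi-BP vs uBP): {ratio:.2f}")
```

Under these assumptions the difference corresponds to roughly a threefold change in the duplex formation equilibrium constant, which fits the view that ψ tips the balance toward the alternative conformation rather than acting in an "all or nothing" manner.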


Fig. 4. Structural model of the ψBP, the ψ-modified branch site duplex (one of an ensemble of 9 low-energy models). A24, the branch site adenosine, is in an extrahelical position. The base triple between A24 and the A7-U22 Watson-Crick pair is shown to the upper left, and the 2’OH pointing into the major groove is shown in the lower left. The sequence of the strands is shown in Figure 1 (lower right).

4 Global features of the ψ-modified branch site helix

In addition to the local changes associated with the ψ-dependent structure, backbone deformation may contribute to formation of recognition elements in the splicing process. The observed over-winding of the helix (Fig. 3) is the result of a lateral excursion of the intron strand backbone at the level of the extrahelical adenosine. The net result is an offset of the helical axis in the ψ-modified duplex between the stems flanking the branch site base, as compared with a single axis through the duplex of the unmodified counterpart (Fig. 4).

4.1 Electrostatic surface features of the ψ-dependent branch site helix

The electrostatic potential mapped on the surface of the unmodified branch site displayed features similar to those of a typical A-form RNA helix. As expected, the backbones have mostly negative electrostatic potential as a result of the negatively charged phosphate groups. The grooves display some patches of positive potential, reflecting the presence of electropositive base atoms, but the overall surface potential of the major groove is considerably more negative than that of the minor groove, typical of the A-form RNA helix. Presence of the additional adenosine stacked in the helix did not impact the electrostatic surface.


Fig. 5. Schematic view of branch site helices in the unmodified (left) and ψ-modified (right) duplexes. Presence of the modified base results in extrusion of the branch site adenosine into the minor groove, which, in turn, results in marked deformation of the helical structure seen in the unmodified helix. The axis of the unmodified helix is shown by a single arrow. The axis of the ψ-modified duplex, however, appears to be offset at the level of the branch site residue as a result of the overwinding of the intron strand.

The distortion of the backbone and base pairing in this structural motif has implications for the electrostatic profile of the major groove at the level of the branch site. Electrostatic calculations indicate a region of enhanced electronegativity in the major groove in the vicinity of the 2'OH that is not observed in the unmodified analogue (Fig. 5; Xu et al. 2005). The electrostatic surface potential calculation of the ψ-modified helix points out an intriguing difference between the two duplexes that is likely to play an important role in recognition. The backbone region and the grooves of the regions several base pairs away from the branch site have features similar to those of the unmodified helix, but there is a distinct and exceptionally negative region in the major groove (Fig. 4b, 4c) corresponding to the region surrounding the 2'OH of the branch site adenosine. The electrostatic potential in the minor groove of ψBP was not significantly different from that of the minor groove of uBP. NMR and luminescence experiments have not indicated direct metal ion binding in this region, nor is there any direct evidence that these electrostatic features are directly correlated with splicing chemistry. Nonetheless, we speculate that this region of pronounced negativity contributes to cation binding in the context of the catalytic core in order to assist in activation of the nucleophile and/or is directly recognized by components during assembly of the U2 snRNP. Calculations in which the charges contributed by the backbone phosphates and partial charges contributed by bases and riboses were considered independently suggest that the kinked backbone and the bases are each partially responsible for creating the exceptionally negative electrostatic potential region. However, the backbone makes the largest contribution to the total potential in this pocket of enhanced negative electrostatic potential.

Fig. 6. Surface electrostatic potential maps of the major groove of uBP (left) and ψBP (right). The atomic charges and radii are those from the AMBER94 molecular mechanical force field. The color scheme used in this map is as follows: yellow is the most negative (-3 kcal/mol/e) and green is the most positive (1 kcal/mol/e). White is neutral. Red and blue represent negative and positive potentials, respectively. The calculation was performed using a hybrid nonlinear PBE approach. The surface electrostatic potential map of the branch site region of ψBP, combined with a ball-and-stick structure of the molecule, is shown in expanded form. The yellow region within the major groove of ψBP suggests a region of significant electronegative potential surrounding the 2'OH of the branch site nucleophile.
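The observation that the kinked backbone dominates the negative pocket can be illustrated with a deliberately crude model. The sketch below is not the hybrid nonlinear Poisson-Boltzmann calculation reported for uBP and ψBP; it swaps in a linearized Debye-Hückel sum over point charges in a uniform dielectric, only to show how bringing phosphate-like charges closer to a probe position deepens the potential there. The geometry, the dielectric constant (80), the inverse screening length (0.1 per Å), and the function name `screened_potential` are all assumptions made for this example.

```python
import math

COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def screened_potential(point, charges, dielectric=80.0, kappa=0.1):
    """Debye-Hueckel potential (kcal/mol/e) at `point` from point charges.

    charges: iterable of ((x, y, z), q), coordinates in Angstrom, q in units of e.
    dielectric and kappa (inverse screening length, 1/Angstrom) are assumed values.
    """
    total = 0.0
    for position, q in charges:
        r = math.dist(point, position)  # distance between probe and charge
        total += COULOMB_KCAL * q * math.exp(-kappa * r) / (dielectric * r)
    return total


# Hypothetical geometry: two -1 e phosphate-like charges flanking a probe point,
# first at 7 Angstrom (regular backbone), then pulled in to 4 Angstrom (kinked).
probe = (0.0, 0.0, 0.0)
regular = [((7.0, 0.0, 0.0), -1.0), ((-7.0, 0.0, 0.0), -1.0)]
kinked = [((4.0, 0.0, 0.0), -1.0), ((-4.0, 0.0, 0.0), -1.0)]

print("regular backbone:", screened_potential(probe, regular))
print("kinked backbone: ", screened_potential(probe, kinked))
```

In this toy setting the potential at the probe becomes more than twice as negative when the two charges move from 7 Å to 4 Å, which is qualitatively in line with a backbone distortion concentrating negative potential near the 2'OH. Quantitative conclusions require the full Poisson-Boltzmann treatment with atomic charges and radii.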

5 Extrapolation from the branch site duplex to the native context

It can be argued that such a minimalist view of the pre-mRNA branch site helix does not ensure that a similar structure exists in the context of the assembled spliceosome. However, several lines of evidence suggest that ψ impacts both structural and functional features in the larger, more native, context. For example, measurements of 2AP fluorescence of a longer intron strand paired with a U2 snRNA strand in the context of its interaction with U6 snRNA indicated that the branch site adenosine was still in an extrahelical (increased fluorescence) position (Popovic, Mundoma, Greenbaum, unpublished). Functional experiments by Yu and coworkers (described elsewhere in this volume) provide strong evidence that this ψ, in its in situ context, exerts a biological role. Also, the observation of significantly enhanced formation of ‘RNA X’ in the protein-free splicing system when this one ψ residue was included in U2 snRNA (Valadkhan and Manley 2003) implies that the structural features observed in the minimal duplex were maintained in the larger complex and that presence of ψ correlates with increased efficiency of the splicing reaction.

6 Functional role of ψ in the branch site

The apparently strict conservation of this particular base modification argues strongly that it maintains an important functional role in vivo. Two lines of experimentation provide compelling evidence that this particular ψ modification (ψ35 in S. cerevisiae) favors formation of a structure that facilitates splicing: 1) formation of a catalytic product (‘RNA X’) by a protein-free complex of U2 snRNA, U6 snRNA, and a segment of the intron was greatly enhanced in the presence of this ψ (Valadkhan and Manley 2003); and 2) genetic knockout of the pseudouridylating enzyme responsible for modifying this site in yeast, Pus7, produced cells that were growth disadvantaged when grown in competition with wild type cells (Y-T Yu, personal communication). We also speculate that the ψ-dependent structure assists in recognition of the branch site region by U2 snRNP splicing factors in assembly of the spliceosome. A ψ has been identified at this specific position of U2 snRNA in virtually every eukaryote investigated thus far (Zhao and Yu 2004b), as well as in the human U12 snRNA of the atac spliceosome (Massenet and Branlant 1999), and the structural features of ψBP are entirely consistent with all biochemical observations concerning branch site recognition and activity. We, therefore, consider it likely that the structural role of this ψ in the branch site motif seen in S. cerevisiae is a universal feature of eukaryotic spliceosomes and that, by helping to define a stable orientation for the branch site nucleophile, the branch site motif favored by ψ establishes a rationale for its phylogenetic preservation. We also note that, in addition to our observations of nucleophile positioning in the spliceosomal pre-mRNA branch site, the exogenous guanosine nucleophile in Group I self-splicing introns is anchored by formation of a base triple (Kitamura et al. 2002).

7 Outlook

The spliceosome is not the only place where ψ residues appear to maintain important functional roles. Ribosome assembly and function may well also depend on the presence of certain base modifications, particularly ψ, a number of which are located in and around the peptidyl transfer region of the E. coli large ribosomal subunit. For example, helix 69 of the large ribosomal subunit contains 3 ψ in its stem loop (ψ1911, mψ1915, and ψ1917; Ofengand 2002), the latter two of which are conserved in all three kingdoms. This helix contacts tRNAs at both the A and P sites (Yusupov et al. 2001; Stark et al. 2002; Bashan et al. 2003), suggesting a role in subunit assembly, so it is likely that conservation of these modifications relates to recognition. Similarly, loss of a conserved ψ modification in the regions of the A site involved in binding tRNA was shown to decrease the level of translocation substantially (King et al. 2003). As was the case in U2 snRNA (Zhao and Yu 2004a), synergistic effects were observed upon combined loss of other modified sites. It is, therefore, very possible that other ψ sites in the spliceosome, ribosome, or other RNA machines may exert their effect through structural control.

Acknowledgements The author thanks Dr. Marcia Fenley, Darui Xu, Dr. Maria Nagan, Nina Clarke, Kersten Schroeder, Dr. Claudius Mundoma, Joycelynn Nelson, and Milena Popovic for sharing unpublished data, and Darui Xu, Dr. Meredith Newby Lambert and Kersten Schroeder for assistance with figures. This work was supported by NIH Grant 2RO1-GM54008 to N.L.G.

References

Arnez J, Steitz T (1994) Crystal structure of unmodified tRNAGln complexed with glutaminyl-tRNA synthetase and ATP suggests a possible role for pseudo-uridines in stabilization of RNA structure. Biochemistry 30:7560-7567
Auffinger P, Westhof E (1998) Effects of pseudouridylation in tRNA hydration and dynamics; a theoretical approach. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington DC, pp 103-112
Bashan A, Agmon I, Zarivach R, Schluenzen F, Harms J, Berisio R, Bartels H, Franceschi F, Auerbach T, Hansen HA, Kossoy E, Kessler M, Yonath A (2003) Structural basis of the ribosomal machinery for peptide bond formation, translocation, and nascent chain progression. Mol Cell 11:91-102
Berglund JA, Rosbash M, Schultz SC (2001) Crystal structure of a model branchpoint U2 snRNA duplex containing bulged adenosines. RNA 7:682-691
Borer PN, Lin Y, Wang S, Roggenbuck MW, Gott JM, Uhlenbeck OC, Pelczer I (1995) Proton NMR and structural features of a 24-nucleotide RNA hairpin. Biochemistry 34:6488-6503
Brünger AT (1996) X-PLOR Version 3.851: A System for Crystallography and NMR. Yale University Press, New Haven
Charette M, Gray MW (2000) Pseudouridine in RNA: what, where, how, and why. IUBMB Life 49:341-351
Chou SH, Chin KH (2001) Novel cross-strand three-purine stack of the highly conserved 5’-GAAAG-5’ internal loop at the 3’-end termini of Parvovirus genomes. J Biomol NMR 21:307-319
Chui HM-P, Desaulniers J-P, Scaringe SA, Chow CS (2002) Synthesis of helix 69 of Escherichia coli 23S rRNA containing its natural modified nucleosides, m3ψ and ψ. J Org Chem 67:8847-8854
Davis D, Poulter C (1991) 1H-15N NMR studies of Escherichia coli tRNAphe from hisT mutants: A structural role for pseudouridine. Biochemistry 30:4223-4231
Derrick WB, Horowitz J (1993) Probing structural differences between native and in vitro transcribed Escherichia coli valine transfer RNA: evidence for stable base modification-dependent conformers. Nucleic Acids Res 21:4948-4953
Desaulniers J-P, Ksebati B, Chow CS (2003) Synthesis of 15N-enriched pseudouridine derivatives. Org Lett 5:4093-4096
Durant P, Davis D (1999) Stabilization of the anticodon stem-loop of the tRNALys by an A+-C basepair and by pseudouridine. J Mol Biol 285:115-131
Freier SM, Kierzek R, Jaeger JA, Sugimoto N, Caruthers MH, Neilson T, Turner DH (1986) Improved free-energy parameters for predictions of RNA duplex stability. Proc Natl Acad Sci USA 83:9373-9377
Greenbaum NL, Radhakrishnan I, Patel DJ, Hirsh D (1996) Solution structure of the donor site of a trans-splicing RNA. Structure 4:725-733
Gu J, Patton J, Shimba S, Reddy R (1996) Localization of modified nucleotides in Schizosaccharomyces pombe spliceosomal small nuclear RNAs: modified nucleotides are clustered in functionally important regions. RNA 2:909-918
Hall K, McLaughlin L (1991) Properties of a U1/mRNA 5’ splice site duplex containing pseudouridine as measured by thermodynamic and NMR methods. Biochemistry 30:1795-1801
Hwang T-L, Mori S, Shaka AJ, van Zijl PCM (1997) Application of phase-modulated CLEAN chemical exchange spectroscopy (CLEANEX-PM) to detect water-protein proton exchange and intermolecular NOEs. J Am Chem Soc 119:6203-6204
Jucker FM, Heus HA, Yip PF, Moors EHM, Pardi A (1996) A network of heterogeneous hydrogen bonds in GNRA tetraloops. J Mol Biol 264:968-980
King TH, Liu B, McCully RR, Fournier MJ (2003) Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell 11:425-435
Kitamura A, Muto Y, Watanabe S, Kim I, Ito T, Nishiya Y, Sakamoto K, Ohtsuki T, Kawai G, Watanabe K, Hosono K, Takaku H, Katoh E, Yamazaki T, Inoue T, Yokoyama S (2002) Solution structure of an RNA fragment with the P7/P9.0 region and the 3’-terminal guanosine of the Tetrahymena group I intron. RNA 8:440-451
Kowalak JA, Dalluge JJ, McCloskey JA, Stetter KO (1994) The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry 33:7869-7876
Lane BG (1998) Historical perspectives on RNA nucleoside modification. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington DC, pp 120
Lynch SR, Puglisi JD (2001) Structure of a eukaryotic decoding region A-site. J Mol Biol 306:1023-1035
Madhani HD, Guthrie C (1994) A novel base-pairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome. Annu Rev Genet 28:1-26
Maiväli Ü, Remme J (2004) Definition of bases in 23S rRNA essential for ribosomal subunit association. RNA 10:600-604
Massenet S, Branlant C (1999) A limited number of pseudouridine residues in the human atac spliceosomal U snRNAs as compared to human major spliceosomal U snRNAs. RNA 5:1495-1503
Massenet S, Motorin Y, Lafontaine D, Hurt E, Grosjean H, Branlant C (1999) Pseudouridine mapping in the Saccharomyces cerevisiae spliceosomal U small nuclear RNAs (snRNAs) reveals that pseudouridine synthase pus1p exhibits a dual substrate specificity for U2 snRNA and tRNA. Mol Cell Biol 19:2142-2154
Meroueh M, Grohar PJ, Qiu J, SantaLucia J, Scaringe SA, Chow CS (2000) Nucleic Acids Res 28:2075-2083
Moore M, Query C, Sharp P (1993) Splicing of precursors to mRNA by the spliceosome. In: Gesteland RF, Atkins JF (eds) The RNA World. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, pp 303-357
Newby MI, Greenbaum NL (2001) A conserved pseudouridine modification in eukaryotic U2 snRNA induces a change in branch site architecture. RNA 7:833-845
Newby MI, Greenbaum NL (2002a) Sculpting of the spliceosomal branch site recognition motif by a conserved pseudouridine. Nat Struct Biol 9:958-965
Newby MI, Greenbaum NL (2002b) Investigation of Overhauser effects between pseudouridine and water protons in RNA helices. Proc Natl Acad Sci USA 99:12697-12702
Nissen P, Ippolito JA, Ban N, Moore PB, Steitz TA (2001) RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc Natl Acad Sci USA 98:4899-4903
Ofengand J, Malhotra A, Remme J, Gutgsell NS, Del Campo M, Jean-Charles S, Peil L, Kaya Y (2001) Pseudouridines and pseudouridine synthases of the ribosome. Cold Spring Harb Symp Quant Biol 66:147-159
Patton J, Jacobsen M, Pederson T (1994) Pseudouridine formation in U2 small nuclear RNA. Proc Natl Acad Sci USA 91:3324-3328
Patton JR, Padgett RW (2003) Caenorhabditis elegans pseudouridine synthase 1 activity in vivo: tRNA is a substrate, but not U2 small nuclear RNA. Biochem J 372:595-602
Query C, Strobel S, Sharp P (1996) Three recognition events at the branch-site adenine. EMBO J 15:1392-1402
Rachofsky EL, Osman R, Ross JBA (2001) Probing structure and dynamics of DNA with 2-aminopurine. Biochemistry 40:946-956
Reddy R, Busch H (1988) Small nuclear RNAs: RNA sequences, structure, and modifications. In: Birnstiel ML (ed) Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles. Springer-Verlag Press, Berlin, pp 1-37
Rice LM, Brünger AT (1994) Torsion angle dynamics – reduced variable conformational sampling enhances crystallographic structure refinement. Proteins 19:277-290
Smirnov S, Matray TJ, Kool ET, de los Santos C (2002) Integrity of duplex structures without hydrogen bonding: DNA with pyrene paired at abasic sites. Nucleic Acids Res 30:5561-5569
Smith JS, Nikonowicz EP (1998) NMR structure and dynamics of an RNA motif common to the spliceosome branch-point helix and the RNA-binding site for phage GA coat protein. Biochemistry 37:13486-13498
Stark H, Rodnina MV, Wieden HJ, Zemlin F, Wintermeyer W, van Heel M (2002) Ribosome interactions of aminoacyl-tRNA and elongation factor Tu in the codon-recognition complex. Nat Struct Biol 9:849-854
Stein EG, Rice LM, Brünger AT (1997) Torsion-angle molecular dynamics as a new efficient tool for NMR structure calculation. J Magn Res 124:154-164
Valadkhan S, Manley JM (2001) Splicing-related catalysis by protein-free snRNAs. Nature 413:701-707
Valadkhan S, Manley JM (2003) Characterization of the catalytic activity of U2 and U6 snRNAs. RNA 9:892-904
Yarian CS, Basti MM, Cain RJ, Ansari G, Guenther RH, Sochacka E, Czerwinska G, Malkiewicz A, Agris PF (1999) Structural and functional roles of the N1- and N3-protons of ψ at tRNA’s position 39. Nucleic Acids Res 27:3543-3549
Xu D, Greenbaum NL, Fenley MO (2005) Recognition of the spliceosomal branch site RNA helix on the basis of surface and electrostatic features. Nucleic Acids Res, in press
Yu Y-T, Shu M-D, Steitz JA (1998) Modifications of U2 snRNA are required for snRNP assembly and pre-mRNA splicing. EMBO J 17:5783-5795
Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JH, Noller HF (2001) Crystal structure of the ribosome at 5.5 Å resolution. Science 292:883-896
Zagorowska I, Adamiak RW (1996) 2-Aminopurine labeled RNA bulge loops: synthesis and thermodynamics. Biochimie 78:123-130
Zhang L, Doudna JA (2002) Structural insights into Group II intron catalysis and branch-site selection. Science 295:2084-2087
Zhao X, Yu YT (2004a) Pseudouridines in and near the branch site recognition region of U2 snRNA are required for snRNP biogenesis and pre-mRNA splicing in Xenopus oocytes. RNA 10:681-690
Zhao X, Yu YT (2004b) Detection and quantitation of RNA base modifications. RNA 6:996-1002
Znosko BM, Burkard ME, Krugh TR, Turner DH (2002) Molecular recognition in purine-rich internal loops: thermodynamic, structural, and dynamic consequences of purine for adenine substitution in 5’(rGGCAAGCCU)2. Biochemistry 41:14978-14987

Greenbaum, Nancy L. Department of Chemistry and Biochemistry, Florida State University, Tallahassee, FL 32306-4390, USA

Mechanisms and functions of RNA-guided RNA modification Yi-Tao Yu, Rebecca M. Terns, and Michael P. Terns

Abstract

RNA-guided 2'-O-methylations and pseudouridylations occur in several different types of RNAs and in a wide range of organisms. Hundreds of the RNAs that guide these modifications have been identified, leading to breakthroughs in our understanding of the mechanisms of RNA-guided RNA modifications and, to some extent, the functions of 2'-O-methylated residues and pseudouridines. There are two classes of guide RNAs, namely box C/D and box H/ACA RNAs, which direct 2'-O-methylations and pseudouridylations, respectively. The guide RNAs function primarily by binding to complementary regions in the target RNAs. Cellular guide RNAs exist in RNA-protein complexes composed of one guide RNA and a set of proteins that includes the modifying enzyme (2'-O-methylase or pseudouridylase). We are beginning to understand the basis for the importance of the RNA-guided modifications, which are well conserved and clustered in functionally important regions of RNAs. Recent reports indicate that modified nucleotides in rRNAs and spliceosomal snRNAs contribute to protein synthesis and pre-mRNA splicing, respectively.

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b105585 / Published online: 14 December 2004 © Springer-Verlag Berlin Heidelberg 2005

1 Introduction

Post-transcriptional modifications occur in a large number of cellular RNAs and are an important component of RNA maturation. Modifications can occur within the base, sugar ring (ribose), or both, and thereby increase the diversity and functional potential of RNAs. In fact, a large collection of naturally occurring modified nucleotides has been identified (Motorin and Grosjean 1998). Importantly, modified nucleotides are, in most cases, conserved from species to species and are often clustered in regions of functional importance within RNAs (Massenet et al. 1998; Ofengand and Fournier 1998; Decatur and Fournier 2002). The fact that modified RNA nucleotides are widespread, conserved and located in strategic locations within RNAs leaves little doubt about their functional relevance. Yet despite intense work over many years, the question of what roles the modified nucleotides play in cellular processes remains largely unanswered. Pseudouridylation and 2'-O-methylation are the most abundant internal modifications found in stable RNAs, namely tRNAs (Bjork 1995; Grosjean et al. 1995; Auffinger and Westhof 1998; Sprinzl et al. 1998; Hopper and Phizicky 2003), rRNAs (Maden 1990; Bachellerie and Cavaille 1998; Ofengand and Fournier 1998) and spliceosomal snRNAs (some snoRNAs as well) (Reddy and Busch 1988; Massenet et al. 1998). In fact, these are the predominant modifications found in rRNAs and spliceosomal snRNAs. The mammalian rRNAs contain ~100 pseudouridines (Ψ) and ~100 2'-O-methylated residues (Maden 1990; Bachellerie and Cavaille 1998; Ofengand and Fournier 1998), and a total of 30 2'-O-methylated residues and 24 pseudouridines have been reported in the major vertebrate spliceosomal snRNAs (including U1, U2, U4, U5, and U6 snRNAs) (Reddy and Busch 1988; Massenet et al. 1998). 2'-O-methylation and pseudouridylation are also the predominant modifications in U3, a well-characterized snoRNA (Reddy and Busch 1988). While pseudouridylations and 2'-O-methylations may not be the most prevalent modifications in tRNA, they are present in all tRNAs (Bjork 1995; Grosjean et al. 1995; Auffinger and Westhof 1998).

RNA modifications can be categorized as either RNA-dependent or RNA-independent, based on the mechanism by which they are generated. RNA-dependent modifications are introduced by RNA-protein complexes (for example small nucleolar ribonucleoprotein complexes or snoRNPs), in which the RNA component serves as a guide that base-pairs with the target RNA to direct modification at a specific site or sites (Kiss 2001; Bachellerie et al. 2002; Filipowicz and Pogacic 2002; Kiss 2002; Terns and Terns 2002; Decatur and Fournier 2003). RNA-independent modifications are catalyzed by a single protein or a protein complex that recognizes and binds to a specific RNA sequence or structure (Bjork 1995; Alexandrov et al. 2002; Ofengand 2002; Ferre-D'Amare 2003; Ma et al. 2003).
While most RNA base modifications are catalyzed by the RNA-independent (protein only) mechanism, 2'-O-methylations and pseudouridylations are introduced by both RNA-independent and RNA-dependent mechanisms depending on the RNA type and organism. Computational and experimental evidence indicates that 2'-O-methylation and pseudouridylation of eukaryotic and archaeal rRNAs and higher eukaryotic spliceosomal snRNAs are almost exclusively catalyzed by the RNA-dependent mechanism (Dennis et al. 2001; Kiss 2001, 2002; Bachellerie et al. 2002; Terns and Terns 2002; Decatur and Fournier 2003; Omer et al. 2003). 2'-O-methylation of archaeal tRNA is also catalyzed by an RNA-dependent mechanism (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Recent reports suggest that RNA-guided modifications may occur in certain mRNAs as well (Cavaille et al. 2000; Liang et al. 2002). In this review, we focus on RNA-dependent RNA modifications, including RNA-guided pseudouridylation and 2'-O-methylation in various organisms.

2 Discovery of eukaryotic snoRNAs that guide rRNA modifications

The nucleolus of eukaryotic cells harbors, in addition to precursor and mature rRNAs and ribosomal proteins, a huge number of small RNAs (termed small nucleolar RNAs or snoRNAs) ranging from ~60 to ~300 nucleotides in length in metazoans and from ~60 to ~600 nucleotides in unicellular organisms. In recent years, scores of snoRNAs have been identified and characterized, allowing the elucidation of common features and mechanisms of action. We now understand that most of the identified snoRNAs function as guides that direct site-specific 2'-O-methylations and pseudouridylations in rRNA (and perhaps other RNA substrates that pass transiently through the nucleolus). The first snoRNA was discovered nearly four decades ago. We have come a long way in our appreciation of the snoRNAs and our understanding of the mechanisms underlying rRNA modifications.

2.1 Early studies of snoRNAs

Research on snoRNAs started in the late 1960s and early 1970s. The first snoRNAs discovered and characterized were a subset involved in pre-rRNA processing rather than modification. Characterization of these first snoRNAs provided a foundation for understanding the closely related modification guide RNAs. In 1968, U3 became the first snoRNA to be identified in mammalian cells (Hodnett and Busch 1968). Because of its unusually high abundance (~2 × 10⁵ copies per cell), U3 is readily detectable by denaturing gel electrophoresis of total nuclear RNA. U3 is a relatively large snoRNA (~200 nt in length) that possesses a 5' trimethylguanosine (TMG) cap structure (like the nucleoplasmic spliceosomal snRNAs), associates with the abundant nucleolar protein fibrillarin (Nop1p), a target of autoantibodies, and is essential in yeast (Reddy and Busch 1988; Tollervey et al. 1991). Careful inspection of U3 sequences from various species revealed short conserved sequence elements, two of which [termed boxes C’ (UGAUGA/U) and D (CUGA)] are critical for U3 RNA processing, transport, protein association and function (Terns and Terns 2002).
UV cross-linking analysis indicates that U3 binds to the 5' ETS (external transcribed spacer) region of pre-rRNA and is involved in ETS primary cleavage, an early step during pre-rRNA processing (Maser and Calvet 1989; Stroke and Weiner 1989; Kass et al. 1990; Maxwell and Fournier 1995). U3 also contributes to pre-rRNA processing at the ITS1 (the first internal transcribed spacer)-5.8S boundary (Gerbi et al. 1990; Maxwell and Fournier 1995).

It was suspected that there might be additional snoRNAs in the nucleolus that participated in pre-rRNA processing but were not initially detected due to lower abundance. Several techniques were developed to isolate or enrich nucleolar RNAs. For instance, small nucleolar RNAs were separated from other cellular RNAs by nucleolar fractionation, sucrose gradient fractionation, cross-linking or hybridization to rRNAs, and immunoprecipitation using antibodies (or autoantibodies) against the TMG cap structure or fibrillarin (Maxwell and Martin 1986; Trinh-Rohlik and Maxwell 1988; Tyc and Steitz 1989; Ruff et al. 1993). Various approaches, employed independently or in combination and coupled with denaturing gel electrophoresis and sequencing, yielded fruitful results. By 1995, many new snoRNAs had been identified in mammals, including U8, U13, U14–U24, MRP-7.2 RNA, E2 and E3 (Maxwell and Fournier 1995). Indeed, these snoRNAs are relatively low in abundance, with U8 and U13 (both 5'-TMG capped) being present at ~4 × 10⁴ copies per cell and the others at roughly 10³–10⁴ copies per cell (Maxwell and Fournier 1995; Yu et al. 1999). Biochemical approaches were also used to identify small nucleolar RNAs in yeast (Wise et al. 1983; Zagorski et al. 1988; Li et al. 1990; Balakin et al. 1993). For instance, Wise et al. (1983) used semi-denaturing two-dimensional gel electrophoresis followed by 5' cap labeling (decapping followed by recapping with [α-32P]GTP) to identify 5' TMG-capped snoRNAs in yeast.

It quickly became clear that the small RNAs found in nucleoli were a disparate group. Only a few of the newly identified snoRNAs (U8, U14, U22, snR30, and 7.2/MRP) were shown to be involved in pre-rRNA processing; most of the RNAs were not essential and their functions were unknown (Maxwell and Fournier 1995). Unlike U3, most mammalian snoRNAs, including U14–U24, MRP-7.2 RNA, E1, E2, and E3, do not possess a 5' TMG cap. In addition, while some of the new RNAs shared conserved sequence elements box C and box D with U3, others (mammalian TMG-minus snoRNAs U17/E1, E2, E3, U19 and U23, and a large number of yeast snoRNAs, both TMG-capped and TMG-minus) do not contain a box C/D motif and do not associate with fibrillarin (or Nop1p in yeast) (Maxwell and Fournier 1995).

2.2 Two classes of snoRNAs: box C/D and box H/ACA snoRNAs

In an effort to classify the snoRNAs, the Fournier group conducted a comparative analysis that identified conserved sequences and structural elements common to snoRNAs lacking boxes C and D (Balakin et al. 1996). They showed that, except for MRP-7.2 RNA, all human and yeast non-C/D snoRNAs possess a common predicted secondary structure of two or more stem-loops separated and flanked by single-stranded regions (Balakin et al. 1996; Ni et al. 1997). The structure of this class of RNAs was later refined to the current “hairpin-hinge-hairpin-tail” model (Ganot et al. 1997a) (Fig. 1).
Another defining feature, the trinucleotide sequence ACA, was found in the 3' single-stranded region, three nucleotides away from the 3' terminus (Balakin et al. 1996; Ganot et al. 1997a; Ni et al. 1997). In addition, there was a variant ACA sequence (ANANNA), termed the H box, located in the hinge region between the two hairpins. The H and ACA boxes were found to be essential for RNA stability and for binding of the nucleolar protein Gar1p (Balakin et al. 1996; Ganot et al. 1997a, 1997b). Thus, with the exception of the MRP-7.2 RNA, all known snoRNAs were classified into two major families: box C/D snoRNAs that associate with fibrillarin and box H/ACA snoRNAs that bind Gar1 (Balakin et al. 1996) (Fig. 1).

2.3 Discovery that Box C/D snoRNAs guide rRNA 2'-O-methylation

A great deal of effort was devoted to determining the function of the rapidly increasing number of newly identified snoRNAs. Sequence inspection revealed significant complementarity between box C/D snoRNAs and rRNAs, suggesting that

Mechanisms and functions of RNA-guided RNA modification 227

Fig. 1. RNA-guided RNA 2'-O-methylation and pseudouridylation. Box C/D and box H/ACA snoRNAs guide 2'-O-methylation and pseudouridylation, respectively, by binding to complementary regions in target RNAs. Boxes C, D, C', D', H, and ACA are shown. 2'OMe represents the target 2'-O-methylation site that is always the fifth nucleotide from box D or D', and Ψ is the target pseudouridylation site that is always left unpaired in the pseudouridylation pocket. N, any nucleotide.

snoRNAs might function as chaperones for ribosome biogenesis (Bachellerie et al. 1995; Steitz and Tycowski 1995). A correlation between the locations of 2'-O-methylated residues in rRNAs and regions of snoRNA-rRNA complementarity was also noted (Bachellerie et al. 1995). Moreover, it was found that mutations in Nop1p, the yeast fibrillarin homologue, greatly reduced 2'-O-methylation of rRNA (Tollervey et al. 1993). These observations led to the hypothesis that box C/D snoRNAs function as guides that direct 2'-O-methylation of rRNA (Bachellerie et al. 1995). The experimental evidence arrived soon after the hypothesis was proposed. In 1996, it was discovered that rRNA 2'-O-methylation always occurs in the residue basepaired to the nucleotide in snoRNA precisely 5 nucleotides upstream from box D (or D') (Fig. 1) (Cavaille et al. 1996; Kiss-Laszlo et al. 1996). In addition, it was shown that deletion of a particular box C/D snoRNA resulted in loss of 2'-O-methylation at the target site in rRNA in yeast. Importantly, site-specific methylation could be restored upon reintroduction of the box C/D snoRNA into the deletion strain, and modification of novel target sites could be directed by introduction of snoRNAs with appropriate guide sequences (Cavaille et al. 1996). Thus, the “Box D +5 rule” for prediction of the site of 2'-O-methylation guided by snoRNAs was established in 1996, and has since been confirmed in various organisms including Xenopus and human, suggesting that RNA-guided 2'-O-methylation of rRNA is universal among eukaryotes (Peculis 1997; Smith and Steitz 1997; Kiss 2001, 2002).
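The Box D +5 rule can be illustrated with a short computation. The Python sketch below is not taken from the cited studies. The function names, the default guide length of 10 nucleotides, the use of the consensus box D sequence CUGA, the assumption that the guide element ends immediately upstream of box D, and the restriction to Watson-Crick pairs (no G-U wobble) are all simplifications introduced here for illustration.

```python
# Illustrative sketch of the "Box D +5 rule" (all names are hypothetical).
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string (ACGU only)."""
    return rna.translate(COMPLEMENT)[::-1]

def predict_2ome_site(snorna, target, guide_len=10, box_d="CUGA"):
    """Return the 0-based index in `target` of the nucleotide predicted to be
    2'-O-methylated, or None if no box D or no complementary site is found.

    Assumption: the guide is the `guide_len` nucleotides immediately
    upstream of the last box D occurrence in the snoRNA.
    """
    d_start = snorna.rfind(box_d)
    if d_start < guide_len:          # also covers "box D not found" (-1)
        return None
    guide = snorna[d_start - guide_len:d_start]
    site = reverse_complement(guide)  # target stretch (5'->3') pairing with the guide
    t = target.find(site)
    if t == -1:
        return None
    # The guide nucleotide adjacent to box D pairs with target[t];
    # the fifth nucleotide upstream of box D therefore pairs with target[t + 4].
    return t + 4
```

Because the duplex is antiparallel, counting five nucleotides upstream from box D in the snoRNA corresponds to counting five nucleotides downstream from the start of the complementary stretch in the target, which is why the function returns `t + 4`.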


2.4 Discovery that Box H/ACA snoRNAs guide rRNA pseudouridylation

The determination of the function of the box H/ACA snoRNAs was more challenging (due to the lack of extensive contiguous stretches of complementarity to rRNAs). However, psoralen cross-linking (which favors the detection of basepairing interactions) had generated cross-links between box H/ACA snoRNAs and rRNAs (Rimoldi et al. 1993), suggesting a basepairing interaction between rRNA and box H/ACA snoRNAs. Inspired by the work linking box C/D snoRNAs to rRNA 2'-O-methylation, the Fournier group and the Kiss group soon demonstrated that box H/ACA snoRNAs function as guides that direct rRNA pseudouridylation, the other major type of rRNA modification for which no mechanism had been ascribed (Ganot et al. 1997a; Ni et al. 1997). The guide sequences in box H/ACA RNAs are found in two segments in the linear RNA sequence that are brought together in internal loops within the hairpins (Fig. 1). Base-pairing between the bipartite guide sequence and the rRNA positions the target uridine at the base of the upper stem of the hairpin, leaving it unpaired within the so-called "pseudouridylation pocket" and located about 14–16 nucleotides upstream of box H or box ACA (Fig. 1). The snoRNA-guided pseudouridylation mechanism has been tested and verified in various systems (Ganot et al. 1997a; Ni et al. 1997; Jady and Kiss 2001; Zhao et al. 2002).

2.5 Toward identification of complete sets of rRNA modification guide snoRNAs

The recognition that snoRNAs serve as guides for 2'-O-methylation and pseudouridylation constituted a major step in understanding the rRNA modification puzzle. Still, the number of snoRNAs identified at that time could not account for the number of modified nucleotides known to exist in rRNAs. This fact has prompted large-scale searches for box C/D and box H/ACA snoRNAs, and development of powerful new approaches for identifying small non-coding RNAs.
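The sequence signatures of box H/ACA RNAs described in sections 2.2 and 2.4 (an ACA triplet three nucleotides from the 3' terminus and an ANANNA H box) can be expressed as a simple motif check. The Python sketch below is a toy illustration only and does not reproduce any published search program: the function names are hypothetical, and it ignores secondary structure, so an ANANNA match is reported wherever it occurs rather than only in the hinge region.

```python
import re

# Zero-width lookahead so that overlapping ANANNA matches are all reported.
H_BOX = re.compile(r"(?=A[ACGU]A[ACGU]{2}A)")

def has_aca_box(rna):
    """True if the triplet ACA is followed by exactly three nucleotides
    before the 3' terminus (i.e. ...ACANNN-3')."""
    return len(rna) >= 6 and rna[-6:-3] == "ACA"

def find_h_boxes(rna):
    """Return the 0-based start index of every ANANNA match in the RNA."""
    return [m.start() for m in H_BOX.finditer(rna)]

def looks_like_h_aca(rna):
    """Crude sequence-only filter: at least one H-box candidate upstream of
    a correctly positioned ACA box. Structure is NOT checked (assumption)."""
    if not has_aca_box(rna):
        return False
    aca_start = len(rna) - 6
    return any(start + 6 <= aca_start for start in find_h_boxes(rna))
```

A sequence-only filter of this kind would yield many false positives on a real genome, which is consistent with the text's remark that computational identification of box H/ACA RNAs is harder than that of box C/D RNAs.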
Bioinformatic approaches have been particularly productive in the identification of new box C/D snoRNAs, and more recently also box H/ACA RNAs. The finding that box C/D snoRNAs guide rRNA 2'-O-methylation allowed the development of computer algorithms that were designed to search genome sequences for patterns of conserved sequence elements (e.g. box C and box D) and complementarities to rRNA (Lowe and Eddy 1999; Samarsky and Fournier 1999; Brown et al. 2003b). Computer-assisted searches have identified a large number of putative 2'-O-methylation guides in various eukaryotic organisms (Lowe and Eddy 1999; Qu et al. 1999; Samarsky and Fournier 1999; Barneche et al. 2001; Brown et al. 2001, 2003b; Accardo et al. 2004). In yeast in particular, almost all the box C/D snoRNAs necessary to account for the known rRNA 2'-O-methylations (51/55) have been identified and confirmed experimentally (Lowe and Eddy 1999; Samarsky and Fournier 1999; Brown et al. 2003b). The computational identification of box H/ACA snoRNAs is more difficult (due to the bipartite nature of the


guide sequence and shorter conserved sequence elements), but was very recently accomplished (Schattner et al. 2004; Huang et al. 2004) and brought the number of yeast rRNA pseudouridines with corresponding H/ACA guide RNAs up to 41 (of 44 known) (Schattner et al. 2004). Powerful experimental approaches have also been applied to snoRNA identification, including yeast genome analysis followed by Northern blotting (Olivas et al. 1997) and construction and analysis of cDNA libraries of selected cellular RNAs, an approach used in identification of some of the original snoRNAs and now referred to as RNomics (Bachellerie and Cavaille 1997; Huttenhofer et al. 2001, 2004; Kiss and Jady 2004). The “RNomics” approach has produced huge numbers of new box C/D and box H/ACA snoRNAs in various organisms (Dunbar et al. 2000; Gaspin et al. 2000; Omer et al. 2000; Huttenhofer et al. 2001; Qu et al. 2001; Kiss 2002; Marker et al. 2002; Tang et al. 2002; Vitali et al. 2003; Yuan et al. 2003; Kiss et al. 2004). Most of the identified snoRNAs are predicted to guide rRNA modification, though the experimental approach (more so than the computational approach, which has largely selected for complementarity to rRNA) also produces snoRNAs that are thought to target other cellular RNAs (see below). The combined approaches appear to have yielded nearly complete sets of rRNA modification guide snoRNAs not only in yeast but also in human where at least 100 of ~105–107 known 2'-O-methylation sites and 79 of ~97 pseudouridylation sites in rRNA are accounted for with snoRNAs (Bachellerie et al. 2002; Huttenhofer et al. 2002; Vitali et al. 2003; Kiss et al. 2004). While it is possible that some of the remaining modifications might be catalyzed by RNA-independent mechanisms, the fact that all but one (see Chapter 9 in this volume) of the eukaryotic rRNA modifications characterized, thus far, are guided by snoRNAs argues that there are a few additional snoRNAs to be identified.

3 RNAs also guide the pseudouridylation and 2'-O-methylation of snRNAs

It has long been known that the spliceosomal snRNAs contain a large number of modified nucleotides (Fig. 2), yet research on the modification mechanisms did not begin until shortly after the discovery of snoRNA-guided rRNA modifications. Given that both snRNAs and rRNAs are extensively modified by 2'-O-methylation and pseudouridylation, it was reasonable to suspect that, like rRNA modification, spliceosomal snRNA modifications are catalyzed by a snoRNA-guided mechanism as well. In 1998, U6 snRNA became the first spliceosomal snRNA for which a snoRNA-guided modification mechanism was reported (Tycowski et al. 1998). Taking advantage of conserved elements identified in RNAs that guide rRNA 2'-O-methylation (i.e., the C/D boxes and the guide sequence complementary to the target RNA across the modified residue), Tycowski et al. (1998) searched available databases and identified two possible box C/D snoRNA guides (mgU6-47 and mgU6-77) in various organisms that might be responsible for U6 snRNA 2'-O-methylation. Using the Xenopus oocyte reconstitution system, they showed that


Fig. 2. Pseudouridines and 2'-O-methylated residues in human spliceosomal snRNAs. Primary and secondary structures of human spliceosomal snRNAs are shown. 2'-O-methylated nucleotides are indicated by circles, and pseudouridines (Ψ) are indicated by rectangles. The thick lines denote the nucleotides involved in RNA-RNA interactions or implicated in catalysis during pre-mRNA splicing. The gray boxes indicate the Sm-binding sites. The 5' caps (2,2,7-trimethylated guanosine cap for U1, U2, U4, and U5, and γ-methylated guanosine cap for U6) are also shown.


depletion of the putative guide RNAs completely abolished U6 2'-O-methylation at the predicted sites. Moreover, site-specific 2'-O-methylation was rescued upon injection of the in vitro transcribed guide RNA into depleted oocytes. Interestingly, one of the box C/D snoRNAs (mgU6-77) exhibited dual substrate specificity, guiding 2'-O-methylation at position 2970 in 28S rRNA as well as position 77 in U6 (Tycowski et al. 1998). Later, another U6 2'-O-methylation guide (mgU6-53) was identified in human, and snoRNA-guided modification was further demonstrated for mammalian U6 (Ganot et al. 1999). It was unclear whether the guide mechanism for U6 2'-O-methylation applied to the other spliceosomal snRNA modifications as well, since U6 differs from all the other major spliceosomal snRNAs (U1, U2, U4, and U5) in many ways (Yu et al. 1999). For instance, U6 is transcribed by RNA polymerase III (Pol III), whereas the other spliceosomal snRNAs are RNA polymerase II (Pol II) transcripts. U6 contains a γ-methyl cap and does not bind to Sm core proteins, whereas the other snRNAs possess a TMG cap and bind tightly to Sm proteins. Finally, whereas U6 probably never leaves the nucleus, the other snRNAs travel through the cytoplasm during their biogenesis. Indications that modification of the Pol II–transcribed spliceosomal snRNAs might also be catalyzed by an RNA-guided mechanism came from the identification of a novel RNA (U85) in human and fruit fly that contains both box C/D and box H/ACA motifs (Jady and Kiss 2001). Careful inspection of the RNA revealed sequences complementary to U5 snRNA, which could position C45 and U46 in U5 for 2'-O-methylation and pseudouridylation, respectively. Indeed, this prediction was confirmed by modification experiments both in vivo and in vitro (Jady and Kiss 2001).
Soon after this report, three more such "hybrid" guide RNAs (U87, U88, and U89) were identified and predicted to guide 2'-O-methylation of U4 and U5 as well as pseudouridylation of U5 (Darzacq et al. 2002). Interestingly, some of these RNAs appeared to have overlapping guide functions that could target the same nucleotide for modification, suggesting redundant modes of modification at certain sites. However, as more spliceosomal snRNA–specific box C/D and box H/ACA guide RNAs were discovered, not all exhibited the hybrid composition of U85, U87, U88, and U89. Instead, most of them fell into either the box C/D or box H/ACA category (Kiss et al. 2004). One exception was U93, which appeared to combine two box H/ACA domains, resulting in four hairpins instead of two (Kiss et al. 2002). One of the hairpins was predicted to guide the formation of Ψ54 in U2. While one H/ACA guide RNA that directs U2 pseudouridylation at two different sites in the branch site recognition region appears to reside within the nucleoplasm of Xenopus oocytes (Zhao et al. 2002), most if not all the other guide RNAs directing spliceosomal snRNA modifications are localized to Cajal bodies (Darzacq et al. 2002). These guide RNAs are therefore designated scaRNAs, for small Cajal body-specific RNAs (Darzacq et al. 2002). Data accumulated thus far from several labs (Tycowski et al. 1998; Ganot et al. 1999; Huttenhofer et al. 2001; Jady and Kiss 2001; Darzacq et al. 2002; Zhao et al. 2002; Vitali et al. 2003; Kiss et al. 2002, 2004) suggest that at least 28 spliceosomal snRNA–specific guide RNAs have been identified in various organisms (e.g. human, mouse, Xenopus, fruit fly), although some of these guide RNAs constitute homologs among the different organisms. These RNAs have been proven (Tycowski et al. 1998; Jady and Kiss 2001; Zhao et al. 2002) or predicted (Ganot et al. 1999; Huttenhofer et al. 2001; Darzacq et al. 2002; Vitali et al. 2003; Kiss et al. 2002, 2004) to guide either 2'-O-methylation at 13 of 30 known sites or pseudouridylation at 12 of 24 known sites in the five spliceosomal snRNAs (U1, U2, U4, U5, and U6). In contrast to the data accumulated for rRNA modifications, only a small fraction of spliceosomal snRNA modification sites have been accounted for. Thus, the challenge remains to identify the guide RNAs for the majority of these sites. However, it is possible that some of the modifications may be catalyzed by an RNA-independent mechanism. In this regard, at least two (Ψ35 and Ψ44) of the three pseudouridylation sites in yeast U2 are generated by single-polypeptide enzymes, Pus7p and Pus1p, respectively (Massenet et al. 1999; Ma et al. 2003). The continued search for small guide RNAs will undoubtedly clarify this issue and further our understanding of the general mechanisms underlying spliceosomal snRNA modifications.
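The contrast between rRNA and snRNA coverage can be made concrete with a little arithmetic on the counts quoted in sections 2.5 and 3. The Python sketch below is illustrative only: the labels and function name are invented here, and because the human rRNA 2'-O-methylation total is given only as ~105–107, the upper bound of 107 is used as an assumption (giving a conservative percentage).

```python
# (sites accounted for by guide RNAs, known sites), as quoted in the text.
# Assumption: 107 is used for the human 2'-O-methylation total (~105-107).
COUNTS = {
    "yeast rRNA 2'-O-methylation": (51, 55),
    "yeast rRNA pseudouridylation": (41, 44),
    "human rRNA 2'-O-methylation": (100, 107),
    "human rRNA pseudouridylation": (79, 97),
    "snRNA 2'-O-methylation": (13, 30),
    "snRNA pseudouridylation": (12, 24),
}

def coverage_percent(accounted, known):
    """Percentage of known modification sites with an assigned guide RNA."""
    return 100.0 * accounted / known

for label, (accounted, known) in COUNTS.items():
    pct = coverage_percent(accounted, known)
    print(f"{label}: {accounted}/{known} = {pct:.1f}%")
```

On these figures, more than 80% of the rRNA sites in each category are accounted for, whereas the snRNA sites reach at most half, and the snRNA counts include predicted as well as proven guides.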

4 sno/scaRNAs may also guide mRNA modifications

Experimental RNomics of mouse brain cells identified four brain-specific snoRNAs, including three box C/D and one box H/ACA snoRNAs (Cavaille et al. 2000; Filipowicz 2000). Interestingly, the three box C/D snoRNA genes, two of which are tandemly repeated multiple times, are located within a chromosomal region implicated in Prader-Willi syndrome (PWS), a neurogenetic disease caused by deficient paternal gene expression. Because these snoRNAs have not been detected in PWS patients or in a PWS mouse model, the expression of the brain-specific snoRNAs is believed to be paternally imprinted (Cavaille et al. 2000). Although sequence inspection revealed no target sites in rRNAs or in the spliceosomal snRNAs, an 18-nucleotide region in one of the box C/D snoRNAs is complementary to the mRNA for the brain-specific serotonin 2C receptor, suggesting a role for this snoRNA in mRNA 2'-O-methylation and processing/function (Cavaille et al. 2000). However, before the connection can be established, it is important to determine whether the predicted site in the mRNA is indeed 2'-O-methylated and where within the cell the box C/D snoRNA is localized. Given that the other three brain-specific snoRNAs exhibit no complementarity to rRNAs or spliceosomal snRNAs, they may also target mRNAs that are yet to be determined. In this regard, many snoRNAs identified by experimental RNomics lack target sites in rRNAs and snRNAs as well, and thus, it is of great interest to determine whether they have target sites in mRNAs (Bachellerie et al. 2002).

SLA-1 was originally identified as the spliced leader (SL)-associated RNA in trypanosomes (Watkins et al. 1994). However, it has been reported that SLA-1 in fact shares characteristics of eukaryotic box H/ACA snoRNAs (Liang et al. 2002). The presumed bipartite guide sequences in the pseudouridylation pocket exhibit complementarity to the SL RNA, perfectly positioning a conserved SL uridine in


the pseudouridylation pocket for pseudouridylation. Pseudouridylation mapping using CMC modification followed by primer-extension confirmed that the predicted target residue is indeed a pseudouridine. Mutagenesis analysis further demonstrated that the formation of pseudouridine at this position in SL RNA is dependent on the intact guide sequences in the box H/ACA-like RNA (Liang et al. 2002). Because the SL sequence is donated to the 3' exon during trans-splicing of pre-mRNA, the newly generated mature mRNAs in trypanosomes inherit a pseudouridine. In this sense, the box H/ACA RNA indirectly targets mRNA for pseudouridylation. Modified nucleotide(s) in mRNAs could play an important role in mRNA processing, transport, stability, or protein translation.

5 Small RNA–guided RNA modification of rRNA and tRNA in archaea

The presence of homologs of the proteins that function with box C/D and box H/ACA RNAs in archaea predicted the presence of homologous RNAs and RNA-guided rRNA modification in this domain of life (Lafontaine and Tollervey 1998; Watanabe and Gray 2000). In 2000, Omer et al. co-precipitated a number of box C/D RNAs using antibodies against the Sulfolobus solfataricus homologs of fibrillarin and Nop56/58 (another box C/D snoRNA–specific protein) (Omer et al. 2000). The computational algorithm developed for identification of yeast box C/D RNAs (Lowe and Eddy 1999) was re-trained to recognize the more compact archaeal RNAs and used to search archaeal genomes for candidate box C/D RNAs (Omer et al. 2000). In the end, over 200 potential RNAs were identified in 7 archaeal genomes (Omer et al. 2000). At the same time the Bachellerie laboratory also identified and characterized predicted box C/D RNAs in the three Pyrococcus genomes (Gaspin et al. 2000). The existence of the computationally predicted RNAs and of modifications at predicted target sites has been confirmed in many cases (Gaspin et al. 2000; Omer et al. 2000). These results suggest that RNA-guided rRNA 2'-O-methylation is an ancient mechanism that functions not only in eukaryotes but also in archaea, which lack a nucleus. When archaeal box C/D RNAs are injected into a eukaryotic nucleus (Xenopus oocyte) the RNAs localize to the eukaryotic nucleolus, interact with the eukaryotic box C/D RNP proteins and guide rRNA 2'-O-methylation in a site-specific manner, indicating the conservation of the essential features of this class of RNAs over 2000 million years of divergent evolution (Speckmann et al. 2002). Interestingly, the number of 2'-O-methylations and sRNAs identified in archaeal species increases with increasing optimum growth temperature (Noon et al. 1998; Dennis et al. 2001; Omer et al. 2003).
This correlation may reflect the function of rRNA 2'-O-methylation (see section 9). With the identification of the box C/D RNAs in archaea, another target for RNA-guided RNA modification was uncovered. Some of the newly identified archaeal box C/D RNAs were found to contain complementarity to tRNA (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Currently, as many as 21


predicted box C/D RNAs are thought to target a total of 23 different sites in tRNAs (or pre-tRNAs) (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Some of the RNAs have the potential to modify a site that is common to multiple (up to 19) tRNAs (http://rna.wustl.edu/snoRNAdb/). The most frequently targeted site is the wobble position within the anticodon loop of tRNAs (position 34) (http://rna.wustl.edu/snoRNAdb/). Box H/ACA RNAs that guide pseudouridylation of rRNA were identified in archaea by RNomics as well as computational approaches (Klein et al. 2002; Tang et al. 2002; Rozhdestvensky et al. 2003). Unlike most eukaryotic box H/ACA snoRNAs that have two hairpin structures, archaeal box H/ACA RNAs can have one, two, or three hairpin structures, each containing a pseudouridylation pocket (Tang et al. 2002; Rozhdestvensky et al. 2003). Five of six predicted rRNA target sites in Archaeoglobus fulgidus have been experimentally confirmed (Tang et al. 2002; Rozhdestvensky et al. 2003). Thus, it appears that rRNA pseudouridylation is also carried out by an RNA-guided modification system in this domain of life.

6 Gene organization and biosynthesis of snoRNAs

As more vertebrate snoRNAs have been identified, it is clear that only a small fraction is transcribed from independent promoters (and contain a 5' TMG cap). Most of the vertebrate snoRNAs contain a 5' monophosphate group (Maxwell and Fournier 1995; Yu et al. 1999) and are encoded within introns of protein coding genes (Maxwell and Fournier 1995; Tycowski and Steitz 2001). Mouse U14 was the first such intron-encoded snoRNA to be identified (Liu and Maxwell 1990). U14 is positioned within an intron of the mouse hsc70 heat shock gene, and its production is coupled with hsc70 pre-mRNA processing (Leverette et al. 1992). Soon after this finding, many snoRNAs were identified within the introns of various host genes, thus accounting for a large number of the cap-minus snoRNAs (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). In general, in vertebrates a single snoRNA gene is found within an intron; however, in other organisms, including Drosophila and rice, clusters of snoRNA genes have been reported within introns (Chen et al. 2003; Huang et al. 2004). Two pathways for processing intron-encoded snoRNAs have been reported in vertebrates (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). In the major pathway, the intron is first spliced out of the host gene pre-mRNA in the form of a lariat. Following debranching of the lariat intron, exonucleolytic trimming of the linear RNA produces a snoRNA with mature 5' and 3' termini. In the minor pathway (e.g. for U16 and U18 in Xenopus and also for U18 in yeast), prototypical splicing does not occur. Instead, endonucleolytic activities initiate cleavages at sites upstream and downstream of snoRNA regions, generating linear snoRNA-containing fragments that are then trimmed from both ends to generate mature snoRNAs (Caffarelli et al. 1994, 1996; Villa et al. 1998).
In yeast and Arabidopsis, while a few snoRNAs are intron-encoded, most snoRNAs are independently transcribed as mono-, di-, or polycistronic precursors


that then mature through a series of endo- and exonucleolytic cleavages (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002; Brown et al. 2003a). In Arabidopsis, most snoRNA genes, including both box C/D and box H/ACA snoRNA genes, are organized into gene clusters scattered over the chromosomes (Brown et al. 2003a). Interestingly, most snoRNA host genes encode proteins involved in nucleolar function, ribosome biogenesis or structure, or protein translation (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). This gene organization may imply that a coordinated regulatory mechanism is imposed on the synthesis of rRNA, ribosomal proteins and proteins of the translational apparatus. Remarkably, in vertebrates and fruit flies, some snoRNA host genes appear to serve purely as a source of snoRNAs (Tycowski and Steitz 2001; Terns and Terns 2002). In these cases, the ligated exons do not appear to have protein coding potential (Tycowski et al. 1996; Bortolin and Kiss 1998; Pelczar and Filipowicz 1998; Smith and Steitz 1998). Another interesting observation is that most if not all vertebrate and fly snoRNA host genes, including protein-coding and non-protein-coding genes, belong to the TOP (terminal oligopyrimidine) family, which represents a group of housekeeping genes that are constitutively transcribed (Pelczar and Filipowicz 1998; Smith and Steitz 1998). The fact that all snoRNAs reside in introns of TOP genes suggests that snoRNAs are produced in a coordinated manner, perhaps at the level of RNA transcription. Much less is known about the biogenesis of box C/D and box H/ACA RNAs in archaea, though it does appear likely that the biogenic pathways will be novel in these organisms. In archaea, sRNA genes are generally not clustered and are typically positioned in the regions between protein coding genes, sometimes overlapping the 5’ or 3’ end of a predicted ORF (Gaspin et al. 2000; Omer et al. 2000). 
One box C/D RNA is unique in its location within the intron of the tRNATrp gene in several organisms, and is likely processed out via the bulge-helix-bulge processing pathway that removes tRNA introns in archaea (Nieuwlandt et al. 1993). Interestingly, this box C/D RNA directs 2'-O-methylation at positions 34 and 39 of its host tRNA either via a cis (Clouet d'Orval et al. 2001) or trans mechanism (Singh et al. 2004). In P. furiosus, it appears that most (or all) box C/D RNAs exist in circular as well as linear forms (Starostina et al. 2004). Since none of the eukaryotic snoRNAs are known to exist as circular RNAs at any point in biogenesis (and indeed very few single-stranded circular RNAs are known to exist in biological systems), these findings imply the existence of a novel pathway for the biogenesis of box C/D RNAs in archaea.

7 Modification guide RNAs function as RNA-protein complexes

The box C/D and box H/ACA guide RNAs establish sites for RNA modification by base-pairing with target RNAs as described above. However, the modifications are executed by protein enzymes that are part of a core set of proteins specifically


associated with each family of guide RNAs. The box C/D and box H/ACA RNPs are composed of three or four core proteins and a guide RNA. A focus of current research is to understand the roles of the proteins within the RNPs. The structure and function of the RNPs is being investigated in archaea as well as eukaryotes, where the RNPs are fundamentally similar and the distinctions that exist broaden our understanding of the modification guide RNP system. Differences in the protein composition of the RNPs found in archaea and eukaryotes seem to reflect gene duplications that allowed an increase in the complexity and specialization of the RNPs in eukaryotes.

7.1 Protein components of methylation guide RNPs

Individual box C/D RNAs direct a set of common proteins to sites of modification. In eukaryotes, the set comprises four proteins: fibrillarin, 15.5 kDa, Nop56, and Nop58 (Fig. 3). In archaea, there is a homologous set of three core proteins: fibrillarin, L7Ae and Nop56/58 (Fig. 3). Fibrillarin (Nop1p in yeast) is the methyltransferase, found in both eukaryotic and archaeal box C/D RNPs (Ochs et al. 1985; Galardi et al. 2002; Omer et al. 2002). The second core component of the box C/D RNP is an RNA binding protein: 15.5 kDa in higher eukaryotes, Snu13p in yeast, or L7Ae in archaea (Watkins et al. 2000; Kuhn et al. 2002). This protein interacts directly with the kink-turn (K-turn), an RNA motif formed by the signature sequences of the box C/D RNAs, box C and box D (Watkins et al. 2000; Klein et al. 2001; Kuhn et al. 2002). The core box C/D RNP is completed by Nop56/58 in archaea, or two paralogues, Nop56 and Nop58 (also called Nop5p in yeast), in eukaryotes (Gautier et al. 1997; Lafontaine and Tollervey 1999, 2000).

7.2 Protein components of pseudouridylation guide RNPs

The core box H/ACA RNP comprises a set of four proteins: Cbf5, Gar1, Nhp2 (L7Ae in archaea), and Nop10 (Henras et al. 1998; Watkins et al. 1998; Dragon et al. 2000; Pogacic et al.
2000; Watanabe and Gray 2000; Rozhdestvensky et al. 2003; Wang and Meier 2004) (Fig. 3). The conversion of uridine to pseudouridine by box H/ACA RNPs is very likely catalyzed by Cbf5, a core component that strongly resembles other pseudouridine synthases (Koonin 1996; Zebarjadian et al. 1999). In humans, this protein is called dyskerin, and mutation of the dyskerin gene results in X-linked dyskeratosis congenita (DC) (Heiss et al. 1998). DC patients typically have abnormal skin pigmentation and nail dystrophy, and often develop life-threatening bone marrow failure and epithelial cancers (Marrone and Mason 2003; Meier 2003). In vertebrates, the RNA component of telomerase, the enzyme involved in telomere length maintenance, also contains an H/ACA motif (Mitchell et al. 1999a; Chen et al. 2000; Lukowiak et al. 2001; Jady et al. 2004; Zhu et al. 2004) and the four H/ACA RNP proteins are also core components of telomerase (Mitchell et al. 1999b; Dragon et al. 2000; Pogacic et al. 2000; Wang and Meier 2004).


Fig. 3. Protein components of RNA-guided RNA modification RNPs in Eukarya and Archaea. Homologous protein components of box C/D and box H/ACA RNPs in eukaryotes and archaea are shown. In eukaryotes, box C/D RNAs are associated with a set of four common proteins: fibrillarin (or Nop1p), 15.5 kDa (or Snu13p), Nop56 and Nop58 (or Nop5p). The eukaryotic box H/ACA RNP also contains four common proteins: Cbf5 (or dyskerin or Nap57), Nhp2, Nop10, and Gar1. Archaeal homologs of the same name exist for fibrillarin, Cbf5, Nop10, and Gar1. Nop56 and Nop58 are related eukaryotic proteins with a single homolog in archaea that is equally similar to the two eukaryotic proteins, and thus, called Nop56/58. L7Ae is homologous to three related eukaryotic proteins, 15.5 kDa and Nhp2 as well as L7a, and is thought to be a component of box C/D and box H/ACA RNPs as well as the ribosome in archaea. In eubacteria, individual (or small numbers of) nucleotide modifications are catalyzed by dedicated protein enzymes.

7.3 Evolutionary relationships between archaeal and eukaryotic modification guide RNPs

The discovery that box C/D and box H/ACA RNPs exist in archaea as well as eukaryotes indicates the ancient evolutionary origin of these RNA-guided RNA modification systems (Terns and Terns 2002; Omer et al. 2003; Tran et al. 2004). Examination of the protein components of the box C/D and box H/ACA RNPs in archaea and eukaryotes suggests that more complex and specialized RNPs have arisen in eukaryotes following gene duplications that occurred after the divergence


of archaea and eukaryotes approximately 2000 million years ago. For example, the Nop56 and Nop58 proteins that are each essential components of box C/D RNPs in eukaryotes appear to be paralogues derived from a single gene found in contemporary archaea (and presumably in the last common ancestor of archaea and eukaryotes). Perhaps the most interesting instance of divergence and specialization, however, involves a set of proteins that bind directly to box C/D and box H/ACA RNAs. In archaea, the L7Ae protein is a core component of box C/D RNPs and box H/ACA RNPs, and also of the ribosome (Ban et al. 2000; Kuhn et al. 2002; Rozhdestvensky et al. 2003). L7Ae binds K-turns present in the RNA components of each of these RNPs in archaea (Ban et al. 2000; Klein et al. 2001; Kuhn et al. 2002; Rozhdestvensky et al. 2003). Duplications of an ancestral L7Ae gene apparently allowed the evolution of three related but distinct RNA binding proteins in eukaryotes – 15.5 kDa (component of box C/D RNP and U4/U6 snRNP), Nhp2 (component of box H/ACA RNP), and L7a (component of ribosome). In eukaryotes, these proteins are not redundant or interchangeable. Each of the three proteins is essential for viability and is specifically associated with particular families of RNAs. Two of the eukaryotic proteins have acquired additional domains. Both Nhp2 and L7a have N terminal extensions not found in the archaeal protein L7Ae (or in 15.5 kDa). Eukaryotic L7a contains an additional C terminal extension. The L7Ae gene duplications may also have allowed greater divergence of the RNA families in eukaryotes than is observed in archaea. In particular, Nhp2 and the eukaryotic H/ACA RNAs appear to have co-evolved significantly from the ancestral L7Ae / K-turn RNA pair. Eukaryotic box H/ACA RNAs do not contain recognizable K-turns and the Nhp2 protein appears to have reduced RNA binding specificity (Henras et al. 2001; Wang and Meier 2004). 
The divergence of the Nhp2 / box H/ACA pair in eukaryotes presumably would have required covariation of the multiple individual RNAs with the protein, a considerable evolutionary constraint. This suggests that the divergence may have occurred at an early point in the evolution of the box H/ACA RNAs, perhaps before significant expansion in the number of individual H/ACA RNAs. Far fewer box H/ACA RNAs are known to exist in archaea, consistent with the possibility that a common ancestor had few box H/ACA RNAs. On the other hand, the box C/D RNAs, which are more numerous in archaea, appear to have diverged less from the L7Ae / K-turn model in eukaryotes. Eukaryotic box C/D RNAs retain the K-turn. In addition, we have found that the eukaryotic 15.5 kDa protein, like all other eukaryotic box C/D RNP proteins essential for function, recognizes archaeal box C/D RNAs (Speckmann et al. 2002). Archaeal box C/D RNAs associate with eukaryotic proteins, localize to the nucleolus and guide rRNA modification when injected into the nucleus of a eukaryotic cell (Speckmann et al. 2002). The gene duplications may also have allowed the evolution of a key spliceosomal RNP in eukaryotes, the U4/U6 snRNP. snRNP-mediated splicing is not known to exist in archaea. However, at least two protein components of the U4/U6 snRNP appear to be directly related to archaeal box C/D RNP proteins. The eukaryotic 15.5 kDa protein (related to archaeal L7Ae) is a component of the U4/U6 snRNP as well as box C/D RNPs in eukaryotes (Watkins et al. 2000). 15.5 kDa

Mechanisms and functions of RNA-guided RNA modification 239

recognizes very similar K-turn motifs in U4 snRNA and box C/D RNAs. In addition, like the L7Ae gene, the Nop56/58 gene appears to have undergone two gene duplications giving rise to three related genes in eukaryotes: Nop56, Nop58, and Prp31 (Gautier et al. 1997; Watkins et al. 2000; Terns and Terns 2002). The Prp31 protein is an essential component of the U4/U6 snRNP and required for mRNA splicing in eukaryotes (Weidenhammer et al. 1997; Makarova et al. 2002). Thus duplications in box C/D RNP protein genes appear to have contributed to the development of snRNP-mediated mRNA processing in eukaryotes.

8 Assembly and structural organization of modification guide RNPs

Very recently, significant advances have been made toward detailing the global architecture of box C/D and box H/ACA modification guide RNPs and dissecting the molecular interactions that underlie their mechanisms of action. Progress has been accelerated by both the development of cell-free modification guide systems and the determination of the structures of components of the RNPs. Studies with cell-free 2’-O-methylation (Galardi et al. 2002; Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003) and pseudouridylation (Wang et al. 2002; Wang and Meier 2004) systems have confirmed that the known core proteins are sufficient to support RNA-guided, site-specific modification. Furthermore, the cell-free systems have enabled detailed analysis of the roles of RNA/protein and protein/protein interactions in the assembly and function of the modification guide RNPs. In addition, guide RNP research has recently reached the “atomic age” primarily thanks to the availability of crystallization-friendly proteins from archaea. Great insight has been gained from recent X-ray structures of individual guide RNP proteins and of co-crystallized RNA/protein and protein/protein complexes.

8.1 Methylation guide RNP structure

The methylation guide RNPs in archaea and eukaryotes are fundamentally similar in structure and function, and a common general model can be visualized (Fig. 4). As described above, the methylation guide RNAs contain one or two functional box C/D units. The homologous proteins L7Ae (in archaea) and 15.5 kDa (in eukaryotes) interact directly with K-turns formed in the RNAs by boxes C and D, and initiate complex formation. Two molecules of Nop56/58 (in archaea) or one each of Nop56 and Nop58 (in eukaryotes) are associated with each RNP and likely bridge between the two box C/D units.
Structural and mutational analysis indicates that fibrillarin catalyzes the 2’-O-methylation of the target RNA, and a fibrillarin molecule presumably resides in proximity to each of the two potential guide sequences. The target RNA is positioned for 2’-O-methylation by basepairing with the guide sequence.
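The positioning of the target by base-pairing with the guide sequence can be illustrated with a minimal Python sketch. It relies on an assumption that is not stated in this section: the widely reported rule that the methylated target nucleotide is the one paired with the fifth guide nucleotide upstream of box D (or D’). The function names, the example sequences, and the restriction to perfect Watson-Crick pairing (no G-U wobble, no mismatches) are illustrative simplifications, not a description of any published prediction tool.

```python
# Hypothetical sketch: locate the target nucleotide selected by a box C/D guide.
# Assumption (not from this section): the nucleotide paired with the 5th guide
# position upstream of box D/D' is the one that is 2'-O-methylated.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def predict_methylation_site(guide, target, offset=5):
    """Return the 0-based index in `target` of the predicted modified nucleotide.

    `guide` is the antisense element written 5'->3', with its last nucleotide
    immediately upstream of box D (or D'). Only perfect Watson-Crick
    complementarity is considered. Returns None if no pairing site is found.
    """
    if len(guide) < offset:
        return None
    site = reverse_complement(guide)   # target segment that pairs with the guide
    start = target.find(site)
    if start == -1:
        return None
    # The target is antiparallel to the guide, so the nucleotide paired with
    # guide[-offset] lies `offset - 1` positions into the matched segment.
    return start + offset - 1


if __name__ == "__main__":
    guide = "GAUCGUACCGA"              # made-up guide element
    target = "AAAAUCGGUACGAUCGGG"      # made-up target containing the paired site
    index = predict_methylation_site(guide, target)
    print(index, target[index])
```

With the made-up sequences above, the predicted nucleotide is the target residue paired with the fifth guide position counted back from box D; a realistic search would also need to tolerate G-U pairs and mismatches.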

Fig. 4. Assembly of modification guide RNP proteins on box C/D and box H/ACA RNAs. A) Assembly of eukaryotic (top half) and archaeal (bottom half) box C/D proteins on a box C/D RNA. The homologs 15.5 kDa (in eukaryotes) and L7Ae (in archaea) interact with box C/D RNAs in the absence of the other proteins, which is thought to nucleate assembly of the RNPs. Archaeal Nop56/58 can then interact with the L7Ae/RNA complex and fibrillarin can then join the complex. The eukaryotic assembly is shown to be similar in this model. B) Assembly of eukaryotic box H/ACA proteins on a box H/ACA RNA. In this case, the proteins pre-assemble into a protein complex that interacts specifically with box H/ACA RNAs. The stoichiometry and organization of the proteins within the complex are not known.

One question that has arisen with regard to box C/D RNP organization is the degree of symmetry between the box C/D and box C’/D’ units in an individual RNP. As is detailed below, there is necessarily greater asymmetry in the eukaryotic box C/D RNPs, where the box C’/D’ unit is often degenerate and non-functional (Kiss-László et al. 1998), and where distinct Nop56 and Nop58 proteins are found at the box C’/D’ and box C/D units respectively (Cahill et al. 2002). Even in a model

eukaryotic double guide RNA, 15.5 kDa was only detected at the box C/D unit, not at the box C’/D’ unit (Szewczak et al. 2002). The archaeal RNPs are more symmetric, with two functional box C/D units found in nearly all of the RNAs, and homodimers of the Nop56/58 protein in place of Nop56 and Nop58. However, while it is clear that L7Ae (the 15.5 kDa homologue) can bind to either a box C/D or box C’/D’ unit, binding of L7Ae at only one unit is sufficient for activity at both units (Rashid et al. 2003; Tran et al. 2003). Thus, it is possible that box C/D RNPs may be asymmetric with regard to L7Ae (i.e. that L7Ae may be present at only one C/D unit) in both archaea and eukaryotes. The recent studies that have illuminated the organization of the box C/D RNPs in eukaryotes and archaea are described herein.

8.1.1 Structure of eukaryotic methylation guide RNPs

The organization of the eukaryotic methylation guide RNP has been studied in vivo and in vitro using a model eukaryotic double guide RNA (Cahill et al. 2002; Szewczak et al. 2002). The contacts of individual RNP proteins with the conserved box C, C’, D, and D’ elements in vivo were mapped by site-specific UV crosslinking experiments performed following injection of U25 box C/D RNA into Xenopus oocytes (Cahill et al. 2002). These studies clearly demonstrated interaction of the Nop58 protein with the box C element (of the terminal C/D motif) and of the Nop56 protein with the box C’ element (of the internal C’/D’ motif). The terminal box C/D motif is known to be particularly important in protecting box C/D RNAs against degradation, and consistent with the mapping results, Nop58 (but not Nop56) is required for the stability of all box C/D snoRNAs in yeast (Lafontaine and Tollervey 1999, 2000). Interaction of the Xenopus 15.5 kDa protein with U25 was not detected by UV crosslinking, but recombinant 15.5 kDa protein assembled into active RNP complexes following injection into Xenopus oocytes (Cahill et al. 2002).
In a separate study, interference mapping both in vivo and in vitro showed that the 15.5 kDa protein interacts with the C/D but not C’/D’ motif (Szewczak et al. 2002). Interactions of fibrillarin with both the terminal C/D and internal C’/D’ motifs (predominantly at box D and box D’) were observed by crosslinking (Cahill et al. 2002). Taken together, these studies indicate that the box C/D and box C’/D’ units do not act as simple structural mirror images of one another and do not each bind all of the four core protein components. Instead, the core protein components are asymmetrically organized with respect to these two functional motifs.

8.1.2 Structure of archaeal methylation guide RNPs

The recent availability of in vitro systems capable of reconstituting enzymatically active archaeal methylation guide RNPs is greatly facilitating investigation of box C/D RNP structure/function relationships (Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003). Site-specific 2’-O-methylation can be reconstituted by incubating a box C/D guide RNA with a target rRNA sequence (both synthesized in vitro), three recombinant box C/D RNP proteins (i.e. L7Ae,

Nop56/58, and fibrillarin) and the methyl donor AdoMet. Order-of-addition experiments have been used to assess the requirements for assembly, and the data indicate that L7Ae can bind the RNA in the absence of other components. Nop56/58 can interact with the RNA only in the presence of L7Ae, and association of fibrillarin (the methyltransferase) with the RNA requires both L7Ae and Nop56/58. These results suggest an ordered assembly of proteins on the RNA to form a functional RNP complex: L7Ae, Nop56/58, fibrillarin. It should be noted, however, that while the box C/D RNP proteins are capable of interaction with a box C/D RNA in a specific sequential order, it is not known whether some or all of the protein components (e.g. Nop56/58 and fibrillarin) pre-assemble prior to interaction with the RNA in vivo. High resolution X-ray structures of fibrillarin proteins from Methanococcus jannaschii (Wang et al. 2000), Archaeoglobus fulgidus (Aittaleb et al. 2003), and Pyrococcus furiosus (Deng et al. 2004) provide compelling evidence that fibrillarin is the RNP component responsible for catalyzing site-specific, RNA-guided 2’-O-methylation. The archaeal fibrillarins exhibit the hallmark topology (a seven-stranded β-sheet flanked by three α-helices on each side), conserved motifs, AdoMet binding site, and invariant amino acid residues present in the catalytic domains of all known AdoMet-dependent methyltransferases (Martin and McMillan 2002; Schubert et al. 2003). The recent functional studies with reconstituted box C/D RNPs have also provided direct evidence that fibrillarin is the methyltransferase (Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003). Solution of the structure of co-crystals of Archaeoglobus fulgidus fibrillarin and Nop56/58 revealed the existence of a tetramer consisting of two molecules of each protein in the configuration fibrillarin-Nop56/58-Nop56/58-fibrillarin (Aittaleb et al. 2003).
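The in vitro order-of-addition results described above for the archaeal box C/D RNP (L7Ae binds alone, Nop56/58 requires L7Ae, fibrillarin requires both) can be written down as a small dependency table. The Python sketch below is only an illustration of that logic: the names `REQUIRES`, `assemble` and `is_complete` are hypothetical, each protein is offered to the RNA exactly once in the given order, and the model says nothing about whether proteins pre-assemble in vivo.

```python
# Hypothetical sketch of the in vitro order-of-addition logic for archaeal
# box C/D RNP assembly. Each protein is mapped to the proteins that must
# already be bound to the guide RNA before it can associate.
REQUIRES = {
    "L7Ae": set(),                          # binds the RNA on its own
    "Nop56/58": {"L7Ae"},                   # needs L7Ae on the RNA
    "fibrillarin": {"L7Ae", "Nop56/58"},    # needs both other proteins
}


def assemble(order):
    """Offer each protein to the RNA once, in `order`; return those that bind."""
    bound = []
    for protein in order:
        if REQUIRES[protein] <= set(bound):
            bound.append(protein)
    return bound


def is_complete(bound):
    """True if all three core proteins are present on the RNA."""
    return set(bound) == set(REQUIRES)


if __name__ == "__main__":
    print(assemble(["L7Ae", "Nop56/58", "fibrillarin"]))   # all three bind
    print(assemble(["fibrillarin", "Nop56/58", "L7Ae"]))   # only L7Ae binds
```

In a real mixture all components are present at once, so the single-pass ordering here is a simplification meant only to mirror the stepwise experiments.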
The interaction of Nop56/58 with fibrillarin is primarily brought about by extensive surface complementarity between the concave N-terminal domain of the Nop56/58 protein and a convex central domain of fibrillarin. Interestingly, Nop56/58 interaction with fibrillarin appears to be a requirement for stable binding of the methyl donor (AdoMet) to fibrillarin, indicating the functional importance of the interaction. Extensive coiled-coil interactions between two Nop56/58 proteins result in the formation of a four helix bundle that mediates the dimerization of the fibrillarin-Nop56/58 heterodimers. These findings suggest a model in which the tetramer bridges the two box C/D units, positioning a fibrillarin-Nop56/58 heterodimer at each unit (see Fig. 4). The overall features of this model likely hold true for eukaryotic box C/D RNP organization as well, though in this case a heterodimer of Nop56 and Nop58 (rather than a Nop56/58 homodimer) would bridge the units. Finally, the atomic details of the interaction of L7Ae with short, model box C/D motif RNAs have been described (Hamma and Ferre-D'Amare 2004; Moore et al. 2004) and have illuminated the initial step of box C/D RNP assembly. Like 15.5 kDa in eukaryotes, L7Ae is an RNA binding protein that interacts specifically with K-turns (Kuhn et al. 2002; Rozhdestvensky et al. 2003). The K-turn is an RNA motif originally identified in the U4 spliceosomal RNA (Vidovic et al. 2000) and later recognized to occur in other RNAs including ribosomal RNAs, box C/D

RNAs, and archaeal (but apparently not eukaryotic) box H/ACA RNAs (Ban et al. 2000; Kuhn et al. 2002; Rozhdestvensky et al. 2003). A canonical K-turn consists of a short, asymmetric internal loop in which the phosphodiester backbone undergoes a major bend (or “kink”) of ~120°, and which is flanked by a canonical (Watson-Crick base-paired) stem structure on one side and a non-canonical stem structure that contains tandem sheared G-A base-pairs typically followed by a U-U or G-U base-pair on the other side (Klein et al. 2001). A K-turn is thought to form at the box C/D motif and be recognized by L7Ae (in archaea) and 15.5 kDa (in eukaryotes). The ability of L7Ae to recognize relaxed (non-canonical) K-turn structures likely accounts for its binding to box C’/D’ motifs and also H/ACA RNAs in archaea, which lack the canonical Watson-Crick base-paired stem (Hamma and Ferre-D'Amare 2004). The eukaryotic 15.5 kDa protein demonstrates stricter binding requirements (Kuhn et al. 2002; Hamma and Ferre-D'Amare 2004). With regard to the question of symmetry between the two box C/D units of the archaeal box C/D RNP, in vitro studies performed with recombinant proteins and mutated or truncated RNAs indicate that all three proteins can interact with either the box C/D or box C’/D’ unit (Rashid et al. 2003; Tran et al. 2003). However, there is evidence that binding of L7Ae at both units is not necessary for assembly of the other proteins (Nop56/58 and fibrillarin) at both units, or for function (as is the case for 15.5 kDa in eukaryotes). Mutations that disrupt L7Ae interaction at either the box C/D or box C’/D’ unit do not prevent binding of L7Ae at the other unit, or recruitment of the fibrillarin-Nop56/58-Nop56/58-fibrillarin tetramer to the complex (Rashid et al. 2003). In addition, RNPs formed with reduced concentrations of L7Ae protein (where it is likely that there is only one molecule of L7Ae per RNP) are functional in methylation assays (Rashid et al. 2003).
Thus, it is possible that the archaeal RNPs, like the eukaryotic RNPs, are asymmetric with regard to L7Ae protein distribution between the two units. Examination of whether box C/D RNPs possess one or two molecules of L7Ae in vivo is required to resolve this issue. In addition, it was recently recognized that box C/D RNAs exist as circular as well as linear RNAs, at least in the hyperthermophilic archaeon P. furiosus (Starostina et al. 2004). The circular box C/D RNAs are found in complexes with box C/D RNP proteins in extracts from P. furiosus. It remains to be determined whether and how this fundamental difference in RNA structure will affect the organization of box C/D RNPs. We look forward to a more complete understanding of the molecular interactions underlying methylation guide RNP function that we expect to come from high resolution structures of the entire guide RNP complex including the guide RNA, target RNA and all three protein components.

8.2 Pseudouridylation guide RNP structure

Relative to the box C/D RNPs, less is known about how box H/ACA RNPs are organized and function. While there is solid evidence that Cbf5 is the enzyme that catalyzes uridine isomerization (Koonin 1996; Zebarjadian et al. 1999), the roles

of Gar1, Nop10, and Nhp2 (or L7Ae in archaea) in box H/ACA RNP function are less clear. The organization of the proteins and RNAs within the RNP is also not known. Interestingly, recent evidence indicates that specific interaction with box H/ACA RNAs requires pre-assembly of a complex of most or all of the four core proteins in eukaryotes (Wang and Meier 2004), suggesting that assembly of the proteins onto the guide RNA may occur in a single step. At the same time, recently published studies and unpublished work in our laboratory indicate that two of the box H/ACA RNP proteins (L7Ae and Cbf5) can interact specifically and independently with box H/ACA RNAs in archaea (Rozhdestvensky et al. 2003; Baker, Youssef, Terns and Terns, unpublished data). The availability of H/ACA pseudouridylation guide RNPs from archaea and the development of an in vitro system to study site-specific pseudouridylation (Wang et al. 2002; Wang and Meier 2004) portend rapid progress in understanding this class of modification guide RNPs in the near future.

8.2.1 Structure of eukaryotic pseudouridylation guide RNPs

A fundamental question that remains to be answered with regard to eukaryotic box H/ACA RNPs is the mechanism of specific recognition of the box H/ACA RNAs by the core proteins. UV crosslinking studies of in vitro reconstituted mammalian H/ACA RNPs indicate that all four core proteins contact the RNA in assembled complexes (Dragon et al. 2000), but it does not appear that any single protein interacts directly with the RNA in a sequence-specific fashion in eukaryotes. Nhp2 might be expected to interact specifically with box H/ACA RNAs given the demonstrated roles of the related proteins 15.5 kDa and L7Ae in binding specific motifs in box C/D and other RNAs in eukaryotes and archaea. However, while recombinant Nhp2 does exhibit general RNA binding capacity in vitro, there is no indication that the protein specifically recognizes H/ACA RNAs (Henras et al. 2001; Wang and Meier 2004).
Gar1 has also been reported to bind RNA in vitro (Bagni and Lapeyre 1998), but because the studies used proteins expressed in translation lysates that contain additional proteins and RNAs, it is not clear that Gar1 interacts directly with box H/ACA RNAs. Moreover, depletion of Gar1 in vivo does not affect H/ACA snoRNA stability (Bousquet-Antonelli et al. 1997), and immunodepletion of Gar1 from extracts does not prevent assembly of complexes in vitro (Dragon et al. 2000). These observations suggest that Gar1 does not play a role as a primary H/ACA RNA binding protein. Recent work indicates that formation of a complex of the core proteins precedes RNA binding in the assembly of box H/ACA RNPs in eukaryotes (Wang and Meier 2004). This mode of assembly contrasts with the proposed stepwise box C/D RNP assembly pathway, which is thought to be initiated by binding of 15.5 kDa (or L7Ae in archaea) to the box C/D RNA (see above). Data with mammalian components suggest that Cbf5, Nhp2, and Nop10 (and perhaps also Gar1) form a stable, multi-protein complex in the absence of the H/ACA RNA, and that this protein complex (rather than any individual protein) recognizes H/ACA RNAs in a sequence-specific manner. In yeast, there is evidence for a Cbf5p/Gar1p/Nop10p complex that forms in vivo independent of an association with either Nhp2p or

H/ACA snoRNAs (Henras et al. 2004). In the assembled RNP, both Cbf5 (the pseudouridine synthase) and Gar1 contact the target uridine, suggesting that both of these proteins play important functional roles in target RNA modification (Wang and Meier 2004). It should be noted that the H/ACA RNPs assembled from proteins produced by in vitro transcription and translation were not active in pseudouridylation of target RNAs. However, cell-free, specific pseudouridylation of target RNA was obtained with H/ACA complexes reconstituted from mammalian cytosolic extracts and immunopurified complexes that appear to contain just the four core proteins (Wang et al. 2002; Wang and Meier 2004). Another interesting issue is the overall architecture of the eukaryotic box H/ACA RNP. On one hand, there are indications that each H/ACA RNA guide unit (i.e. hairpin) may bind one complete set of the four core proteins (and thus that the typical double guide RNP would contain two sets of the four proteins). The bipartite nature of the RNAs in most eukaryotes (two hairpins each containing a guide sequence) suggests two independent, symmetrical protein binding domains. Consistent with this view, electron micrographs of purified yeast H/ACA RNPs exhibit a V-like structure that has been interpreted to reflect two sets of the four core proteins, each interacting with one hairpin (Watkins et al. 1998). The estimated molecular weight of the complexes is also in general agreement with this possibility (Lubben et al. 1995; Watkins et al. 1998). In addition, the existence of functional single hairpin H/ACA RNAs in trypanosomes (early diverging unicellular eukaryotes) indicates that a single hairpin is sufficient for the binding of the four core proteins (Uliel et al. 2004). On the other hand, emerging evidence indicates that, like box C/D RNPs, the core H/ACA proteins may be asymmetrically arranged in the RNPs despite the symmetric structure of the double hairpin RNAs.
First, mutation of either box H or box ACA prevents function at both pseudouridylation pockets of an H/ACA RNA (rather than just at the adjacent hairpin) (Bortolin et al. 1999). Second, the pre-assembled box H/ACA protein complex appears to contain sub-stoichiometric amounts (~1/2 the level) of Nhp2 protein relative to Cbf5 and Nop10 (Wang and Meier 2004). Third, in some cases, in vitro reconstitution of H/ACA RNPs in mammalian extract systems has been found to require two hairpins (e.g. U19 and U64). In the case of two RNAs that contain H/ACA motifs but are not known to function as pseudouridylation guides (U17 and telomerase RNA), a single hairpin appears to be sufficient for RNP formation (Pogacic et al. 2000), indicating that the structural organization of the four core proteins may differ in box H/ACA RNPs that are not involved in RNA modification.

8.2.2 Structure of archaeal pseudouridylation guide RNPs

Evidence for the existence of box H/ACA RNPs in archaea has only recently emerged. Archaeal genomes contain putative homologs of the four core eukaryotic box H/ACA proteins (Watanabe and Gray 2000) and box H/ACA RNAs (Tang et al. 2002) (Figs. 3 and 4). The presence of modifications at predicted rRNA target sites provides support for the function of putative box H/ACA RNAs as pseudouridylation guide RNAs (Tang et al. 2002). However, the roles of the protein

homologs in RNA-guided pseudouridylation, and even the fundamental presence of the protein homologs in H/ACA RNP complexes in archaea, remain to be established. Studies to investigate the structural organization and mechanism of action of the archaeal H/ACA RNPs are already providing key information as well as surprises. In contrast to eukaryotic H/ACA RNAs, which generally have two hairpins, predicted archaeal H/ACA RNAs contain one, two, or three hairpins (and corresponding box H or box ACA elements) (Tang et al. 2002). In addition, the archaeal H/ACA RNAs contain non-canonical K-turns located near the terminal loops of the hairpins that have been shown to serve as specific L7Ae binding sites (Rozhdestvensky et al. 2003; Hamma and Ferre-D'Amare 2004). (As described above, the archaeal homolog of Nhp2 is L7Ae, which is also a component of box C/D RNPs and ribosomes in archaea.) Unpublished studies from our laboratory indicate that Cbf5 (the likely pseudouridine synthase) also interacts directly and specifically with H/ACA RNAs in the absence of other proteins (Baker, Youssef, Terns and Terns, unpublished data). The independent and specific interaction of two of the core proteins, Cbf5 and L7Ae, with box H/ACA RNAs in archaea (Rozhdestvensky et al. 2003) contrasts with the binding of a pre-assembled protein complex proposed for eukaryotic H/ACA RNPs. The archaeal and eukaryotic box H/ACA RNPs appear to have diverged to a greater extent than the box C/D RNPs (see above), suggesting that more enlightening differences between RNA-guided RNA modification systems in these two domains of life await discovery.

9 Function of pseudouridylation and 2'-O-methylation

The positions of the extensive modifications found in both rRNAs and spliceosomal snRNAs are similar in various organisms. For instance, the 2'-O-methylated residues and pseudouridines found in the five spliceosomal snRNAs from various species are virtually all concentrated in the 5' half of each RNA molecule, and are all clustered in regions known to be important for pre-mRNA splicing (Fig. 2) (Yu et al. 1999). Although not located in identical positions, the modified nucleotides of rRNAs from different organisms are virtually all distributed in conserved regions known to be functionally important for protein synthesis (Decatur and Fournier 2002; Omer et al. 2003). The conservation in the location of these modified nucleotides in critical regions within each RNA strongly suggests that both 2'-O-methylated residues and pseudouridines are functionally relevant. In fact, globally blocking yeast rRNA 2'-O-methylation by deleting nop1 (Tollervey et al. 1991) or pseudouridylation by mutating cbf5 (Zebarjadian et al. 1999) causes a severe growth defect phenotype. Likewise, mutation of the human homologue of Cbf5, dyskerin, results in loss of rRNA pseudouridylation, and in dyskeratosis congenita, a disease characterized by bone marrow failure (Meier 2003; Ruggero et al. 2003). Recent work is beginning to shed light on the functional roles of these modifications in both rRNA and spliceosomal snRNA, as described in the following sections.

9.1 rRNA modifications occur primarily in functionally important regions of the ribosome

Taking advantage of the recently acquired high-resolution crystal structures of ribosomes, the Fournier group modeled the positions of the known 2'-O-methylated residues and pseudouridines in the context of the ribosome structure and deduced three-dimensional modification maps for E. coli and yeast cytoplasmic ribosomes (Decatur and Fournier 2002). They found that modified nucleotides, which are not distributed randomly in rRNA at the secondary structural level, remain highly concentrated in important sites at the three-dimensional level, including the peptidyl transferase center and the sites where ribosomal subunits interact. Interestingly, modified nucleotides are concentrated in areas free of ribosomal proteins, suggesting that rRNA modifications might not be directly involved in the binding of proteins to rRNA. Likewise, Omer and colleagues mapped predicted 2'-O-methylations on the archaeal rRNA crystal structures and deduced the three-dimensional distribution of the 2'-O-methylated nucleotides in the archaeal ribosome structure (Omer et al. 2003). Their results indicated that 2’-O-methylated nucleotides in archaeal rRNA are likewise located in regions known or expected to be important for ribosome function. It is noteworthy that, as discussed earlier, the number of box C/D small RNA guides (and perhaps the number of modified nucleotides in rRNAs) is higher in archaeal organisms that grow at high temperatures compared with those that grow at low temperatures (Noon et al. 1998; Dennis et al. 2001). This correlation suggests the possibility that the 2'-O-methylated residues in rRNAs contribute directly to the thermostability of the ribosome. In this regard, it has been demonstrated that 2'-O-methylated RNA-RNA structures are more stable than those involving RNA-RNA interactions alone (Davis 1998).
However, it is also possible that higher temperatures require more sRNAs to act as chaperones to direct rRNA folding processes.

9.2 rRNA modifications in the peptidyl transferase center contribute to ribosome function and cell growth

The fact that most 2'-O-methylated residues and pseudouridines correlate with functional sites in the ribosome led to the plausible hypothesis that rRNA modifications might contribute directly to ribosome function and protein synthesis, although it is also possible that they affect the biogenesis of rRNA and ribosomes (Decatur and Fournier 2002). To experimentally test the functional relevance of the pseudouridines, the Fournier group (King et al. 2003) mutated/deleted five yeast box H/ACA snoRNAs predicted to pseudouridylate the large subunit (LSU) rRNAs at positions 2822, 2861, 2876, 2919, 2940, and 2971, all of which are located at the peptidyl transferase center of the ribosome. They subsequently assessed the effects of the mutations/deletions on pseudouridine synthesis, rRNA processing, ribosome function and cell growth. They found that a point mutation in the guide region in snR10 and the deletion of the other four snoRNAs essentially

had no effect on rRNA processing, but specifically abolished pseudouridylation at expected site(s). Importantly, a blockade of pseudouridylation at a single site by mutating/deleting a single box H/ACA snoRNA caused a slight defect in polypeptide synthesis and cell growth, suggesting that an individual pseudouridine in the peptidyl transferase center in rRNA contributes only modestly to healthy growth. Strikingly, however, the simultaneous mutation/deletion of all five snoRNAs eliminated pseudouridylation at all six sites and consequently reduced the rate of protein synthesis, thereby causing a more severe growth defect phenotype. These results suggest that these pseudouridines may contribute to ribosome function in a synergistic manner. Consistent with these results, the Jacquier group reported that the removal of a single box H/ACA snoRNA responsible for the formation of the two most highly conserved pseudouridines in the yeast LSU rRNA (Ψ2258 and Ψ2260) also caused a slight but consistent cell growth defect phenotype, suggesting a functional role of the two pseudouridines in translation (Badis et al. 2003). However, it should be noted that it is still possible that the snoRNAs have additional roles in ribosome function aside from directing pseudouridylation. In this regard, a previous report by the Ofengand group showed that the deletion of RluD, a pseudouridylase responsible for the formation of Ψ1911, Ψ1915 and Ψ1917 in E. coli LSU rRNA, caused a severe growth defect (Raychaudhuri et al. 1998). However, when a point mutation was introduced into the catalytic center of the enzyme, pseudouridylation was completely abolished and yet the mutant cells grew as well as wild type E. coli, suggesting that the enzyme has a second function (unrelated to pseudouridylase activity) that is crucial for cell growth (Gutgsell et al. 2001). 
9.3 Spliceosomal snRNA modifications are required for pre-mRNA splicing

Clues that spliceosomal snRNA modifications might be important for pre-mRNA splicing come from observations in several reconstitution systems in which U2 snRNA function can be assessed. Whereas cellularly derived, fully modified U2 is competent for splicing (Yu et al. 1998), in vitro transcribed U2, which contains no modifications, does not reconstitute splicing in U2-depleted Xenopus oocytes (Pan et al. 1989; Yu et al. 1998) or in U2-depleted HeLa nuclear extract (Segault et al. 1995). Upon prolonged reconstitution, in vitro transcribed U2 is modified at the expected positions and the splicing activity in Xenopus oocytes is regenerated (Yu et al. 1998). Moreover, in vitro transcribed yeast U2 is modified in yeast splicing extracts (Ma and Yu, unpublished data) and reconstitutes pre-mRNA splicing in yeast cell extracts depleted of endogenous U2 snRNA (McPheeters et al. 1989; McPheeters and Abelson 1992). By creating chimeric U2 snRNA molecules in which some of the sequences are from cellularly derived U2 whereas others are from in vitro transcribed U2, Yu et al. (1998) further dissected the modification of U2 and demonstrated that the functionally important modified nucleotides reside within the 5'-most 27 nucleotides, including three pseudouridines and six 2'-O-methylated residues (see Fig. 2). Native gel analysis indicated that U2 snRNA lacking modifications within the 5'-most 27 nucleotides did not participate in spliceosome assembly, suggesting that these modified nucleotides may act at an earlier stage of pre-mRNA splicing. Subsequent analyses using anti-Sm immunoprecipitation, oligonucleotide affinity chromatography, and glycerol gradient centrifugation argued that U2 snRNA modification may directly contribute to the full assembly of the functional U2 snRNP that is essential for spliceosome assembly and splicing.

Very recently, the Luhrmann group performed similar experiments in HeLa nuclear extracts (Donmez et al. 2004). Consistent with the results above, they found that most modified nucleotides (both 2'-O-methylated residues and pseudouridines) within the 5' end region of U2 are necessary for splicing. Furthermore, their data suggest that these modified nucleotides affect an early stage of pre-mRNA splicing, namely complex E assembly (Donmez et al. 2004).

Further dissection of U2 modifications indicates that the pseudouridines in the branch site recognition region of U2 are also required for pre-mRNA splicing in Xenopus oocytes (Zhao and Yu 2004b). Using the Xenopus microinjection system, Zhao and Yu (2004b) observed that pseudouridylation occurs so fast in the U2 branch site recognition region that it is already complete before the splicing assay is performed. This rapid modification makes it impossible to analyze the modified nucleotides in this region using the conventional Xenopus oocyte reconstitution system described above (Yu et al. 1998). In order to analyze these pseudouridines, Zhao and Yu (2004b) took advantage of the fact that injection of oocytes with synthetic U2 snRNA containing 5-fluorouridines only in the branch site recognition region specifically inhibits pseudouridylation in the same region of in vitro transcribed U2 snRNA injected at a later time.
The reconstitution results indicate that prior injection of 5-fluorouridine-containing U2 into U2-depleted oocytes almost completely abrogates the ability of in vitro transcribed U2 to rescue splicing, whereas full rescue is achieved with either cellular U2 or U2 containing pseudouridines in the branch site recognition region. Further analyses using glycerol gradient centrifugation and native gel electrophoresis indicate that U2 RNAs lacking pseudouridines in the branch site recognition region do not participate in the assembly of the fully functional U2 snRNP and the spliceosome. However, because pseudouridylation at all six pseudouridine positions (Fig. 2) is inhibited simultaneously, it remains to be determined whether the pseudouridines act synergistically or whether individual pseudouridines in this region are critical for function. In this regard, the change of a single uridine in the branch site recognition region (U34) to pseudouridine (Ψ34) greatly enhances the production of X-RNA, a product generated by a splicing-related branching reaction in a cell- and protein-free system (Valadkhan and Manley 2003), suggesting that at least one pseudouridine in the branch site recognition region plays a critical role in splicing. This notion is supported by published NMR structural data indicating that Ψ34 is important both for stabilizing the RNA-RNA duplex between the branch site recognition region in U2 and the branch site sequence in pre-mRNA and for maintaining the bulge of the branch point nucleotide (adenosine) for nucleophilic attack during splicing (Newby and Greenbaum 2001, 2002; see chapter by Greenbaum).

9.4 How do modified nucleotides contribute to RNA function?

Although some important modified nucleotides have been identified, how these modifications contribute to function remains unclear. One possibility is that the modifications alter RNA structure either locally or globally, thereby altering function. Accordingly, it has been reported that pseudouridines are important for RNA folding, perhaps acting by stabilizing local base stacking (Davis 1995, 1998). The crystal structure of tRNA(Gln) also revealed the specific binding of a pseudouridine to a water molecule, forming a local structure that may be critical for stabilizing the tRNA (Arnez and Steitz 1994). 2'-O-methylated residues can also contribute to the stabilization of RNA secondary structure (Davis 1998).

Another possibility is that the modified nucleotides are directly recognized by other RNAs or proteins and that the resulting complex is essential for function. It is clear that both 2'-O-methylation and pseudouridylation can change the chemical properties of nucleotide residues. Specifically, the conversion of uridine into pseudouridine creates an extra hydrogen bond donor, whereas 2'-O-methylation makes a nucleotide residue more hydrophobic (Davis 1998). These property changes may allow the modified nucleotides to interact differentially with other proteins or RNAs.

Alternatively, changes in nucleotide properties may directly contribute to catalysis. In this regard, it is especially relevant that many modified nucleotides in 28S rRNA are located in the peptidyl transferase center (Decatur and Fournier 2002; Omer et al. 2003) and that the U2 branch site recognition region, which contains a number of pseudouridines (Fig. 2), is believed to contribute to the catalytic center of the spliceosome that mediates the splicing reaction (Yu et al. 1999).

9.5 Are RNA modifications reversible?

Another important question regarding RNA modifications is whether they are reversible. Molecular modifications are reversible in many instances, and this is especially well appreciated for post-translational modifications of proteins (phosphorylation, methylation, acetylation, etc.). Cells use reversible protein modification to regulate gene expression in response to changes in the intra- or extracellular environment. However, to date there is no evidence for the reversibility of RNA modifications. On the other hand, this lack of data does not rule out the existence of such a process, and further scrutiny is therefore necessary to clarify this issue. As a first step toward solving this problem, it is important to quantitate RNA modification at naturally occurring sites. Although it is widely assumed that modification at a given site goes to completion (100%) in the cell, this assumption clearly needs further experimental examination. With recently developed methods for detecting and quantitating RNA modifications (Bakin and Ofengand 1993; Yu et al. 1997; Grosjean et al. 2004; Zhao and Yu 2004a), it is anticipated that this issue will soon be addressed.

10 Concluding remarks

RNA modification increases the diversity and complexity of the RNA products encoded by a genome. RNA-guided RNA modification is a particularly flexible system, allowing modification of multiple distinct sites by a single enzyme with modular RNA guides. Expansion in the extent of modification and evolution of new modification sites is possible via changes to short linear guide sequences (as opposed to evolution of novel RNA recognition domains in dedicated modification proteins). In addition, the extent and possible reversibility of modifications offer further potential for regulation and fine-tuning of the function of substrate RNAs.

Significant progress has been made in recent years in understanding the mechanisms of RNA-guided RNA modification in eukaryotes and archaea. While many important questions remain about the mechanisms of modification, the more fundamental questions now seem to center on the roles of the modifications in the target RNAs. Why are so many nucleotides within rRNAs and spliceosomal snRNAs 2'-O-methylated and pseudouridylated? What are the functional consequences of the modifications in protein translation and pre-mRNA splicing? Are the modified nucleotides present in all spliceosomal snRNAs (U1, U4, U5, and U6 as well as U2) and in snoRNAs (U3) functionally important? How do the modified nucleotides contribute to function? To what extent are other classes of RNAs, including mRNAs and tRNAs, modified by the RNA-guided modification system? Is every target site fully modified? Is modification reversible? Does it serve as a means to regulate the function of the target RNAs?

With regard to the functions of the modifications introduced by the RNA-guided system, there are clearly more questions than answers. The answers will require a combination of experimental approaches such as functional reconstitution of spliceosomal snRNPs in Xenopus oocytes (Yu et al. 1998; Zhao and Yu 2004b) or in HeLa nuclear extracts (Donmez et al. 2004), yeast genetics targeting rRNA modifications (Badis et al. 2003; King et al. 2003), and high-resolution NMR (Newby and Greenbaum 2001, 2002) and X-ray crystallography that will provide insight into the detailed macromolecular changes induced by discrete post-transcriptional nucleotide modifications.

Acknowledgments

We thank Henri Grosjean for extremely helpful discussions and valuable comments on the manuscript. We also thank our colleagues in the Yu lab and in the Terns lab for discussions and inspiration. Our work was supported by grant GM62937 (to Y.-T. Yu) and grant GM54682 (to M.P. and R.M. Terns) from the National Institutes of Health.

References

Accardo MC, Giordano E, Riccardo S, Digilio FA, Iazzetti G, Calogero RA, Furia M (2004) A computational search for box C/D snoRNA genes in the D. melanogaster genome. Bioinformatics, Advance Access published online on August 5, 2004
Aittaleb M, Rashid R, Chen Q, Palmer JR, Daniels CJ, Li H (2003) Structure and function of archaeal box C/D sRNP core proteins. Nat Struct Biol 10:256-263
Alexandrov A, Martzen MR, Phizicky EM (2002) Two proteins that form a complex are required for 7-methylguanosine modification of yeast tRNA. RNA 8:1253-1266
Arnez JG, Steitz TA (1994) Crystal structure of unmodified tRNA(Gln) complexed with glutaminyl-tRNA synthetase and ATP suggests a possible role for pseudo-uridines in stabilization of RNA structure. Biochemistry 33:7560-7567
Auffinger P, Westhof E (1998) Location and distribution of modified nucleotides in tRNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 569-576
Bachellerie J-P, Cavaille J (1998) Small nucleolar RNAs guide the ribose methylations of eukaryotic rRNAs. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 255-272
Bachellerie JP, Cavaille J (1997) Guiding ribose methylation of rRNA. Trends Biochem Sci 22:257-261
Bachellerie JP, Cavaille J, Huttenhofer A (2002) The expanding snoRNA world. Biochimie 84:775-790
Bachellerie JP, Michot B, Nicoloso M, Balakin A, Ni J, Fournier MJ (1995) Antisense snoRNAs: a family of nucleolar RNAs with long complementarities to rRNA. Trends Biochem Sci 20:261-264
Badis G, Fromont-Racine M, Jacquier A (2003) A snoRNA that guides the two most conserved pseudouridine modifications within rRNA confers a growth advantage in yeast. RNA 9:771-779
Bagni C, Lapeyre B (1998) Gar1p binds to the small nucleolar RNAs snR10 and snR30 in vitro through a nontypical RNA binding element. J Biol Chem 273:10868-10873
Bakin A, Ofengand J (1993) Four newly located pseudouridylate residues in Escherichia coli 23S ribosomal RNA are all at the peptidyltransferase center: analysis by the application of a new sequencing technique. Biochemistry 32:9754-9762
Balakin AG, Schneider GS, Corbett MS, Ni J, Fournier MJ (1993) SnR31, snR32, and snR33: three novel, non-essential snRNAs from Saccharomyces cerevisiae. Nucleic Acids Res 21:5391-5397
Balakin AG, Smith L, Fournier MJ (1996) The RNA world of the nucleolus: two major families of small RNAs defined by different box elements with related functions. Cell 86:823-834
Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289:905-920
Barneche F, Gaspin C, Guyot R, Echeverria M (2001) Identification of 66 box C/D snoRNAs in Arabidopsis thaliana: extensive gene duplications generated multiple isoforms predicting new ribosomal RNA 2'-O-methylation sites. J Mol Biol 311:57-73
Bjork GR (1995) Biosynthesis and function of modified nucleotides. In: Soll D, RajBhandary U (eds) tRNA: Structure, biosynthesis, and function. ASM Press, Washington, DC, pp 165-205
Bortolin ML, Bachellerie JP, Clouet-d'Orval B (2003) In vitro RNP assembly and methylation guide activity of an unusual box C/D RNA, cis-acting archaeal pre-tRNA(Trp). Nucleic Acids Res 31:6524-6535
Bortolin ML, Ganot P, Kiss T (1999) Elements essential for accumulation and function of small nucleolar RNAs directing site-specific pseudouridylation of ribosomal RNAs. EMBO J 18:457-469
Bortolin ML, Kiss T (1998) Human U19 intron-encoded snoRNA is processed from a long primary transcript that possesses little potential for protein coding. RNA 4:445-454
Bousquet-Antonelli C, Henry Y, Gélugne JP, Caizergues-Ferrer M, Kiss T (1997) A small nucleolar RNP protein is required for pseudouridylation of eukaryotic ribosomal RNAs. EMBO J 16:4770-4776
Brown JW, Clark GP, Leader DJ, Simpson CG, Lowe T (2001) Multiple snoRNA gene clusters from Arabidopsis. RNA 7:1817-1832
Brown JW, Echeverria M, Qu LH (2003a) Plant snoRNAs: functional evolution and new modes of gene expression. Trends Plant Sci 8:42-49
Brown JW, Echeverria M, Qu LH, Lowe TM, Bachellerie JP, Huttenhofer A, Kastenmayer JP, Green PJ, Shaw P, Marshall DF (2003b) Plant snoRNA database. Nucleic Acids Res 31:432-435
Caffarelli E, Arese M, Santoro B, Fragapane P, Bozzoni I (1994) In vitro study of processing of the intron-encoded U16 small nucleolar RNA in Xenopus laevis. Mol Cell Biol 14:2966-2974
Caffarelli E, Fatica A, Prislei S, De Gregorio E, Fragapane P, Bozzoni I (1996) Processing of the intron-encoded U16 and U18 snoRNAs: the conserved C and D boxes control both the processing reaction and the stability of the mature snoRNA. EMBO J 15:1121-1131
Cahill NM, Friend K, Speckmann W, Li ZH, Terns RM, Terns MP, Steitz JA (2002) Site-specific cross-linking analyses reveal an asymmetric protein distribution for a box C/D snoRNP. EMBO J 21:3816-3828
Cavaille J, Buiting K, Kiefmann M, Lalande M, Brannan CI, Horsthemke B, Bachellerie JP, Brosius J, Huttenhofer A (2000) Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization. Proc Natl Acad Sci USA 97:14311-14316
Cavaille J, Nicoloso M, Bachellerie JP (1996) Targeted ribose methylation of RNA in vivo directed by tailored antisense RNA guides. Nature 383:732-735
Chen CL, Liang D, Zhou H, Zhuo M, Chen YQ, Qu LH (2003) The high diversity of snoRNAs in plants: identification and comparative study of 120 snoRNA genes from Oryza sativa. Nucleic Acids Res 31:2601-2613
Chen JL, Blasco MA, Greider CW (2000) Secondary structure of vertebrate telomerase RNA. Cell 100:503-514
Clouet d'Orval B, Bortolin ML, Gaspin C, Bachellerie JP (2001) Box C/D RNA guides for the ribose methylation of archaeal tRNAs. The tRNATrp intron guides the formation of two ribose-methylated nucleosides in the mature tRNATrp. Nucleic Acids Res 29:4518-4529
Darzacq X, Jady BE, Verheggen C, Kiss AM, Bertrand E, Kiss T (2002) Cajal body-specific small nuclear RNAs: a novel class of 2'-O-methylation and pseudouridylation guide RNAs. EMBO J 21:2746-2756
Davis DR (1995) Stabilization of RNA stacking by pseudouridine. Nucleic Acids Res 23:5020-5026
Davis DR (1998) Biophysical and conformational properties of modified nucleotides in RNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 85-102
Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351
Decatur WA, Fournier MJ (2003) RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278:695-698
Deng L, Starostina NG, Liu ZJ, Rose JP, Terns RM, Terns MP, Wang BC (2004) Structure determination of fibrillarin from the hyperthermophilic archaeon Pyrococcus furiosus. Biochem Biophys Res Commun 315:726-732
Dennis PP, Omer A, Lowe T (2001) A guided tour: small RNA function in Archaea. Mol Microbiol 40:509-519
Donmez G, Hartmuth K, Luhrmann R (2004) Modified nucleotides in the 5' end of the human U2 snRNA are required for early spliceosome (E complex) formation in vitro. The 2004 RNA meeting abstract:92
Dragon F, Pogacic V, Filipowicz W (2000) In vitro assembly of human H/ACA small nucleolar RNPs reveals unique features of U17 and telomerase RNAs. Mol Cell Biol 20:3037-3048
Dunbar DA, Wormsley S, Lowe TM, Baserga SJ (2000) Fibrillarin-associated box C/D small nucleolar RNAs in Trypanosoma brucei. Sequence conservation and implications for 2'-O-ribose methylation of rRNA. J Biol Chem 275:14767-14776
Ferre-D'Amare AR (2003) RNA-modifying enzymes. Curr Opin Struct Biol 13:49-55
Filipowicz W (2000) Imprinted expression of small nucleolar RNAs in brain: time for RNomics. Proc Natl Acad Sci USA 97:14035-14037
Filipowicz W, Pogacic V (2002) Biogenesis of small nucleolar ribonucleoproteins. Curr Opin Cell Biol 14:319-327
Galardi S, Fatica A, Bachi A, Scaloni A, Presutti C, Bozzoni I (2002) Purified box C/D snoRNPs are able to reproduce site-specific 2'-O-methylation of target RNA in vitro. Mol Cell Biol 22:6663-6668
Ganot P, Bortolin ML, Kiss T (1997a) Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNAs. Cell 89:799-809
Ganot P, Caizergues-Ferrer M, Kiss T (1997b) The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation. Genes Dev 11:941-956
Ganot P, Jady BE, Bortolin ML, Darzacq X, Kiss T (1999) Nucleolar factors direct the 2'-O-ribose methylation and pseudouridylation of U6 spliceosomal RNA. Mol Cell Biol 19:6906-6917
Gaspin C, Cavaille J, Erauso G, Bachellerie JP (2000) Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes. J Mol Biol 297:895-906
Gautier T, Berges T, Tollervey D, Hurt E (1997) Nucleolar KKE/D repeat proteins Nop56p and Nop58p interact with Nop1p and are required for ribosome biogenesis. Mol Cell Biol 17:7088-7098
Gerbi SA, Savino R, Stebbins-Boaz B, Jeppesen C, Rivera-Leon R (1990) A role for U3 small nuclear ribonucleoprotein in the nucleolus? In: Dahlberg A, Garrett RA, Moore PB, Schlessinger D, Warner JR (eds) The ribosome: structure, function and evolution. ASM Press, Washington, DC, pp 452-469
Grosjean H, Keith G, Droogmans L (2004) Detection and quantification of modified nucleotides in RNA using thin-layer chromatography. Methods Mol Biol 265:357-391
Grosjean H, Sprinzl M, Steinberg S (1995) Posttranscriptionally modified nucleosides in transfer RNA: their locations and frequencies. Biochimie 77:139-141
Gutgsell NS, Del Campo MD, Raychaudhuri S, Ofengand J (2001) A second function for pseudouridine synthases: A point mutant of RluD unable to form pseudouridines 1911, 1915, and 1917 in Escherichia coli 23S ribosomal RNA restores normal growth to an RluD-minus strain. RNA 7:990-998
Hamma T, Ferre-D'Amare AR (2004) Structure of protein L7Ae bound to a K-turn derived from an archaeal box H/ACA sRNA at 1.8 Å resolution. Structure (Camb) 12:893-903
Heiss NS, Knight SW, Vulliamy TJ, Klauck SM, Wiemann S, Mason PJ, Poustka A, Dokal I (1998) X-linked dyskeratosis congenita is caused by mutations in a highly conserved gene with putative nucleolar functions. Nat Genet 19:32-38
Henras A, Dez C, Noaillac-Depeyre J, Henry Y, Caizergues-Ferrer M (2001) Accumulation of H/ACA snoRNPs depends on the integrity of the conserved central domain of the RNA-binding protein Nhp2p. Nucleic Acids Res 29:2733-2746
Henras A, Henry Y, Bousquet-Antonelli C, Noaillac-Depeyre J, Gélugne JP, Caizergues-Ferrer M (1998) Nhp2p and Nop10p are essential for the function of H/ACA snoRNPs. EMBO J 17:7078-7090
Henras AK, Capeyrou R, Henry Y, Caizergues-Ferrer M (2004) Cbf5p, the putative pseudouridine synthase of H/ACA-type snoRNPs, can form a complex with Gar1p and Nop10p in absence of Nhp2p and box H/ACA snoRNAs. RNA 10:1704-1712
Hodnett JL, Busch H (1968) Isolation and characterization of uridylic acid-rich 7 S ribonucleic acid of rat liver nuclei. J Biol Chem 243:6334-6342
Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. Genes Dev 17:162-180
Huang ZP, Zhou H, Liang D, Qu LH (2004) Different expression strategy: multiple intronic gene clusters of box H/ACA snoRNA in Drosophila melanogaster. J Mol Biol 341:669-683
Huttenhofer A, Brosius J, Bachellerie JP (2002) RNomics: identification and function of small, non-messenger RNAs. Curr Opin Chem Biol 6:835-843
Huttenhofer A, Cavaille J, Bachellerie JP (2004) Experimental RNomics: a global approach to identifying small nuclear RNAs and their targets in different model organisms. Methods Mol Biol 265:409-428
Huttenhofer A, Kiefmann M, Meier-Ewert S, O'Brien J, Lehrach H, Bachellerie JP, Brosius J (2001) RNomics: an experimental approach that identifies 201 candidates for novel, small, non-messenger RNAs in mouse. EMBO J 20:2943-2953
Jady BE, Bertrand E, Kiss T (2004) Human telomerase RNA and box H/ACA scaRNAs share a common Cajal body-specific localization signal. J Cell Biol 164:647-652
Jady BE, Kiss T (2001) A small nucleolar guide RNA functions both in 2'-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J 20:541-551
Kass S, Tyc K, Steitz JA, Sollner-Webb B (1990) The U3 small nucleolar ribonucleoprotein functions in the first step of preribosomal RNA processing. Cell 60:897-908
King TH, Liu B, McCully RR, Fournier MJ (2003) Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell 11:425-435
Kiss AM, Jady BE, Bertrand E, Kiss T (2004) Human box H/ACA pseudouridylation guide RNA machinery. Mol Cell Biol 24:5797-5807
Kiss AM, Jady BE, Darzacq X, Verheggen C, Bertrand E, Kiss T (2002) A Cajal body-specific pseudouridylation guide RNA is composed of two box H/ACA snoRNA-like domains. Nucleic Acids Res 30:4643-4649
Kiss T (2001) Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs. EMBO J 20:3617-3622
Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109:145-148
Kiss T, Jady BE (2004) Functional characterization of 2'-O-methylation and pseudouridylation guide RNAs. Methods Mol Biol 265:393-408
Kiss-László Z, Henry Y, Bachellerie JP, Caizergues-Ferrer M, Kiss T (1996) Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell 85:1077-1088
Kiss-László Z, Henry Y, Kiss T (1998) Sequence and structural elements of methylation guide snoRNAs essential for site-specific ribose methylation of pre-rRNA. EMBO J 17:797-807
Klein DJ, Schmeing TM, Moore PB, Steitz TA (2001) The kink-turn: a new RNA secondary structure motif. EMBO J 20:4214-4221
Klein RJ, Misulovin Z, Eddy SR (2002) Noncoding RNA genes identified in AT-rich hyperthermophiles. Proc Natl Acad Sci USA 99:7542-7547
Koonin EV (1996) Pseudouridine synthases: four families of enzymes containing a putative uridine-binding motif also conserved in dUTPases and dCTP deaminases. Nucleic Acids Res 24:2411-2415
Kuhn JF, Tran EJ, Maxwell ES (2002) Archaeal ribosomal protein L7 is a functional homolog of the eukaryotic 15.5kD/Snu13p snoRNP core protein. Nucleic Acids Res 30:931-941
Lafontaine DL, Tollervey D (1998) Birth of the snoRNPs: the evolution of the modification-guide snoRNAs. Trends Biochem Sci 23:383-388
Lafontaine DL, Tollervey D (2000) Synthesis and assembly of the box C+D small nucleolar RNPs. Mol Cell Biol 20:2650-2659
Lafontaine DLJ, Tollervey D (1999) Nop58p is a common component of the box C+D snoRNPs that is required for snoRNA stability. RNA 5:455-467
Leverette RD, Andrews MT, Maxwell ES (1992) Mouse U14 snRNA is a processed intron of the cognate hsc70 heat shock pre-messenger RNA. Cell 71:1215-1221
Li HD, Zagorski J, Fournier MJ (1990) Depletion of U14 small nuclear RNA (snR128) disrupts production of 18S rRNA in Saccharomyces cerevisiae. Mol Cell Biol 10:1145-1152
Liang XH, Xu YX, Michaeli S (2002) The spliced leader-associated RNA is a trypanosome-specific sn(o) RNA that has the potential to guide pseudouridine formation on the SL RNA. RNA 8:237-246
Liu J, Maxwell ES (1990) Mouse U14 snRNA is encoded in an intron of the mouse cognate hsc70 heat shock gene. Nucleic Acids Res 18:6565-6571
Lowe TM, Eddy SR (1999) A computational screen for methylation guide snoRNAs in yeast. Science 283:1168-1171
Lubben B, Fabrizio P, Kastner B, Luhrmann R (1995) Isolation and characterization of the small nucleolar ribonucleoprotein particle snR30 from Saccharomyces cerevisiae. J Biol Chem 270:11549-11554
Lukowiak AA, Narayanan A, Li ZH, Terns RM, Terns MP (2001) The snoRNA domain of vertebrate telomerase RNA functions to localize the RNA within the nucleus. RNA 7:1833-1844
Ma X, Zhao X, Yu YT (2003) Pseudouridylation (Psi) of U2 snRNA in S. cerevisiae is catalyzed by an RNA-independent mechanism. EMBO J 22:1889-1897
Maden BE (1990) The numerous modified nucleotides in eukaryotic ribosomal RNA. Prog Nucleic Acid Res Mol Biol 39:241-303
Makarova OV, Makarov EM, Liu S, Vornlocher HP, Luhrmann R (2002) Protein 61K, encoded by a gene (PRPF31) linked to autosomal dominant retinitis pigmentosa, is required for U4/U6*U5 tri-snRNP formation and pre-mRNA splicing. EMBO J 21:1148-1157
Marker C, Zemann A, Terhorst T, Kiefmann M, Kastenmayer JP, Green P, Bachellerie JP, Brosius J, Huttenhofer A (2002) Experimental RNomics: identification of 140 candidates for small non-messenger RNAs in the plant Arabidopsis thaliana. Curr Biol 12:2002-2013
Marrone A, Mason PJ (2003) Dyskeratosis congenita. Cell Mol Life Sci 60:507-517
Martin JL, McMillan FM (2002) SAM (dependent) I AM: the S-adenosylmethionine-dependent methyltransferase fold. Curr Opin Struct Biol 12:783-793
Maser RL, Calvet JP (1989) U3 small nuclear RNA can be psoralen-cross-linked in vivo to the 5' external transcribed spacer of pre-ribosomal-RNA. Proc Natl Acad Sci USA 86:6523-6527
Massenet S, Motorin Y, Lafontaine DL, Hurt EC, Grosjean H, Branlant C (1999) Pseudouridine mapping in the Saccharomyces cerevisiae spliceosomal U small nuclear RNAs (snRNAs) reveals that pseudouridine synthase pus1p exhibits a dual substrate specificity for U2 snRNA and tRNA. Mol Cell Biol 19:2142-2154
Massenet S, Mougin A, Branlant C (1998) Posttranscriptional modifications in the U small nuclear RNAs. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 201-228
Maxwell ES, Fournier MJ (1995) The small nucleolar RNAs. Annu Rev Biochem 64:897-934
Maxwell ES, Martin TE (1986) A low-molecular-weight RNA from mouse ascites cells that hybridizes to both 18S rRNA and mRNA sequences. Proc Natl Acad Sci USA 83:7261-7265
McPheeters DS, Abelson J (1992) Mutational analysis of the yeast U2 snRNA suggests a structural similarity to the catalytic core of group I introns. Cell 71:819-831
McPheeters DS, Fabrizio P, Abelson J (1989) In vitro reconstitution of functional yeast U2 snRNPs. Genes Dev 3:2124-2136
Meier UT (2003) Dissecting dyskeratosis. Nat Genet 33:116-117
Mitchell JR, Cheng J, Collins K (1999a) A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Mol Cell Biol 19:567-576
Mitchell JR, Wood E, Collins K (1999b) A telomerase component is defective in the human disease dyskeratosis congenita. Nature 402:551-555
Moore T, Zhang Y, Fenley MO, Li H (2004) Molecular basis of box C/D RNA-protein interactions; cocrystal structure of archaeal L7Ae and a box C/D RNA. Structure (Camb) 12:807-818
Motorin Y, Grosjean H (1998) Chemical structures and classification of posttranscriptionally modified nucleotides in RNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 543-549
Newby MI, Greenbaum NL (2001) A conserved pseudouridine modification in eukaryotic U2 snRNA induces a change in branch-site architecture. RNA 7:833-845
Newby MI, Greenbaum NL (2002) Sculpting of the spliceosomal branch site recognition motif by a conserved pseudouridine. Nat Struct Biol 9:958-965
Ni J, Tien AL, Fournier MJ (1997) Small nucleolar RNAs direct site-specific synthesis of pseudouridine in ribosomal RNA. Cell 89:565-573
Nieuwlandt DT, Carr MB, Daniels CJ (1993) In vivo processing of an intron-containing archael tRNA. Mol Microbiol 8:93-99
Noon KR, Bruenger E, McCloskey JA (1998) Posttranscriptional modifications in 16S and 23S rRNAs of the archaeal hyperthermophile Sulfolobus solfataricus. J Bacteriol 180:2883-2888
Ochs RL, Lischwe MA, Spohn WH, Busch H (1985) Fibrillarin: a new protein of the nucleolus identified by autoimmune sera. Biol Cell 54:123-133
Ofengand J (2002) Ribosomal RNA pseudouridines and pseudouridine synthases. FEBS Lett 514:17-25
Ofengand J, Fournier M (1998) The pseudouridine residues of rRNA: number, location, biosynthesis, and function. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 229-253
Olivas WM, Muhlrad D, Parker R (1997) Analysis of the yeast genome: identification of new non-coding and small ORF-containing RNAs. Nucleic Acids Res 25:4619-4625
Omer AD, Lowe TM, Russell AG, Ebhardt H, Eddy SR, Dennis PP (2000) Homologs of small nucleolar RNAs in Archaea. Science 288:517-522
Omer AD, Ziesche S, Decatur WA, Fournier MJ, Dennis PP (2003) RNA-modifying machines in archaea. Mol Microbiol 48:617-629
Omer AD, Ziesche S, Ebhardt H, Dennis PP (2002) In vitro reconstitution and activity of a C/D box methylation guide ribonucleoprotein complex. Proc Natl Acad Sci USA 99:5289-5294
Pan ZQ, Ge H, Fu XY, Manley JL, Prives C (1989) Oligonucleotide-targeted degradation of U1 and U2 snRNAs reveals differential interactions of simian virus 40 pre-mRNAs with snRNPs. Nucleic Acids Res 17:6553-6568
Peculis B (1997) RNA processing: pocket guides to ribosomal RNA. Curr Biol 7:R480-482
Pelczar P, Filipowicz W (1998) The host gene for intronic U17 small nucleolar RNAs in mammals has no protein-coding potential and is a member of the 5'-terminal oligopyrimidine gene family. Mol Cell Biol 18:4509-4518
Pogacic V, Dragon F, Filipowicz W (2000) Human H/ACA small nucleolar RNPs and telomerase share evolutionarily conserved proteins NHP2 and NOP10. Mol Cell Biol 20:9028-9040
Qu LH, Henras A, Lu YJ, Zhou H, Zhou WX, Zhu YQ, Zhao J, Henry Y, Caizergues-Ferrer M, Bachellerie JP (1999) Seven novel methylation guide small nucleolar RNAs are processed from a common polycistronic transcript by Rat1p and RNase III in yeast. Mol Cell Biol 19:1144-1158
Qu LH, Meng Q, Zhou H, Chen YQ (2001) Identification of 10 novel snoRNA gene clusters from Arabidopsis thaliana. Nucleic Acids Res 29:1623-1630
Rashid R, Aittaleb M, Chen Q, Spiegel K, Demeler B, Li H (2003) Functional requirement for symmetric assembly of archaeal box C/D small ribonucleoprotein particles. J Mol Biol 333:295-306
Raychaudhuri S, Conrad J, Hall BG, Ofengand J (1998) A pseudouridine synthase required for the formation of two universally conserved pseudouridines in ribosomal RNA is essential for normal growth of Escherichia coli. RNA 4:1407-1417
Reddy R, Busch H (1988) Small nuclear RNAs: RNA sequences, structure, and modifications. In: Birnstiel ML (ed) Structure and function of major and minor small nuclear ribonucleoprotein particles. Springer-Verlag, Heidelberg, pp 1-37
Rimoldi OJ, Raghu B, Nag MK, Eliceiri GL (1993) Three new small nucleolar RNAs that are psoralen cross-linked in vivo to unique regions of pre-rRNA. Mol Cell Biol 13:4382-4390
Rozhdestvensky TS, Tang TH, Tchirkova IV, Brosius J, Bachellerie JP, Huttenhofer A (2003) Binding of L7Ae protein to the K-turn of archaeal snoRNAs: a shared RNA binding motif for C/D and H/ACA box snoRNAs in Archaea. Nucleic Acids Res 31:869-877
Ruff EA, Rimoldi OJ, Raghu B, Eliceiri GL (1993) Three small nucleolar RNAs of unique nucleotide sequences. Proc Natl Acad Sci USA 90:635-638
Ruggero D, Grisendi S, Piazza F, Rego E, Mari F, Rao PH, Cordon-Cardo C, Pandolfi PP (2003) Dyskeratosis congenita and cancer in mice deficient in ribosomal RNA modification. Science 299:259-262
Samarsky DA, Fournier MJ (1999) A comprehensive database for the small nucleolar RNAs from Saccharomyces cerevisiae. Nucleic Acids Res 27:161-164
Schattner P, Decatur WA, Davis CA, Ares M Jr, Fournier MJ, Lowe TM (2004) Genome-wide searching for pseudouridylation guide snoRNAs: analysis of the Saccharomyces cerevisiae genome. Nucleic Acids Res 32:4281-4296
Schubert HL, Blumenthal RM, Cheng X (2003) Many paths to methyltransfer: a chronicle of convergence. Trends Biochem Sci 28:329-335
Segault V, Will CL, Sproat BS, Luhrmann R (1995) In vitro reconstitution of mammalian U2 and U5 snRNPs active in splicing: Sm proteins are functionally interchangeable and are essential for the formation of functional U2 and U5 snRNPs. EMBO J 14:4010-4021
Singh SK, Gurha P, Tran EJ, Maxwell ES, Gupta R (2004) A trans mechanism for archaeal tRNATrp nucleotide 2'-O-methylation guided by the pre-tRNATrp intron-encoded box C/D RNPs. The 2004 RNA meeting abstract:744
Smith CM, Steitz JA (1997) Sno storm in the nucleolus: new roles for myriad small RNPs. Cell 89:669-672
Smith CM, Steitz JA (1998) Classification of gas5 as a multi-small-nucleolar-RNA (snoRNA) host gene and a member of the 5'-terminal oligopyrimidine gene family reveals common features of snoRNA host genes. Mol Cell Biol 18:6897-6909
Speckmann WA, Li ZH, Lowe TM, Eddy SR, Terns RM, Terns MP (2002) Archaeal guide RNAs function in rRNA modification in the eukaryotic nucleus. Curr Biol 12:199-203
Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153
Starostina NG, Marshburn S, Johnson LS, Eddy SR, Terns RM, Terns MP (2004) Circular box C/D RNAs in Pyrococcus furiosus. Proc Natl Acad Sci USA 101:14097-14101
Steitz JA, Tycowski KT (1995) Small RNA chaperones for ribosome biogenesis. Science 270:1626-1627
Stroke IL, Weiner AM (1989) The 5' end of U3 snRNA can be crosslinked in vivo to the external transcribed spacer of rat ribosomal RNA precursors. J Mol Biol 210:497-512
260 Yi-Tao Yu, Rebecca M. Terns, and Michael P. Terns Szewczak LB, DeGregorio SJ, Strobel SA, Steitz JA (2002) Exclusive interaction of the 15.5 kD protein with the terminal box C/D motif of a methylation guide snoRNP. Chem Biol 9:1095-1107 Tang TH, Bachellerie JP, Rozhdestvensky T, Bortolin ML, Huber H, Drungowski M, Elge T, Brosius J, Huttenhofer A (2002) Identification of 86 candidates for small nonmessenger RNAs from the archaeon Archaeoglobus fulgidus. Proc Natl Acad Sci USA 99:7536-7541 Terns MP, Terns RM (2002) Small nucleolar RNAs: versatile trans-acting molecules of ancient evolutionary origin. Gene Expr 10:17-39 Tollervey D, Lehtonen H, Carmo-Fonseca M, Hurt EC (1991) The small nucleolar RNP protein NOP1 (fibrillarin) is required for pre-rRNA processing in yeast. EMBO J 10:573-583 Tollervey D, Lehtonen H, Jansen R, Kern H, Hurt EC (1993) Temperature-sensitive mutations demonstrate roles for yeast fibrillarin in pre-rRNA processing, pre-rRNA methylation, and ribosome assembly. Cell 72:443-457 Tran E, Brown J, Maxwell ES (2004) Evolutionary origins of the RNA-guided nucleotide-modification complexes: from the primitive translation apparatus? Trends Biochem Sci 29:343-350 Tran EJ, Zhang X, Maxwell ES (2003) Efficient RNA 2'-O-methylation requires juxtaposed and symmetrically assembled archaeal box C/D and C'/D' RNPs. EMBO J 22:3930-3940 Trinh-Rohlik Q, Maxwell ES (1988) Homologous genes for mouse 4.5S hybRNA are found in all eukaryotes and their low molecular weight RNA transcripts intermolecularly hybridize with eukaryotic 18S ribosomal RNAs. Nucleic Acids Res 16:6041-6056 Tyc K, Steitz JA (1989) U3, U8 and U13 comprise a new class of mammalian snRNPs localized in the cell nucleolus. EMBO J 8:3113-3119 Tycowski KT, Shu MD, Steitz JA (1996) A mammalian gene with introns instead of exons generating stable RNA products.
Nature 379:464-466 Tycowski KT, Steitz JA (2001) Non-coding snoRNA host genes in Drosophila: expression strategies for modification guide snoRNAs. Eur J Cell Biol 80:119-125 Tycowski KT, You ZH, Graham PJ, Steitz JA (1998) Modification of U6 spliceosomal RNA is guided by other small RNAs. Mol Cell 2:629-638 Uliel S, Liang XH, Unger R, Michaeli S (2004) Small nucleolar RNAs that guide modification in trypanosomatids: repertoire, targets, genome organisation, and unique functions. Int J Parasitol 34:445-454 Valadkhan S, Manley JL (2003) Characterization of the catalytic activity of U2 and U6 snRNAs. RNA 9:892-904 Vidovic I, Nottrott S, Hartmuth K, Luhrmann R, Ficner R (2000) Crystal structure of the spliceosomal 15.5kD protein bound to a U4 snRNA fragment. Mol Cell 6:1331-1342 Villa T, Ceradini F, Presutti C, Bozzoni I (1998) Processing of the intron-encoded U18 small nucleolar RNA in the yeast Saccharomyces cerevisiae relies on both exo- and endonucleolytic activities. Mol Cell Biol 18:3376-3383 Vitali P, Royo H, Seitz H, Bachellerie JP, Huttenhofer A, Cavaille J (2003) Identification of 13 novel human modification guide RNAs. Nucleic Acids Res 31:6543-6551 Wang C, Meier UT (2004) Architecture and assembly of mammalian H/ACA small nucleolar and telomerase ribonucleoproteins. EMBO J 23:1857-1867

Wang C, Query CC, Meier UT (2002) Immunopurified small nucleolar ribonucleoprotein particles pseudouridylate rRNA independently of their association with phosphorylated Nopp140. Mol Cell Biol 22:8457-8466 Wang H, Boisvert D, Kim KK, Kim R, Kim SH (2000) Crystal structure of a fibrillarin homologue from Methanococcus jannaschii, a hyperthermophile, at 1.6 A resolution. EMBO J 19:317-323 Watanabe Y, Gray MW (2000) Evolutionary appearance of genes encoding proteins associated with box H/ACA snoRNAs: cbf5p in Euglena gracilis, an early diverging eukaryote, and candidate Gar1p and Nop10p homologs in archaebacteria. Nucleic Acids Res 28:2342-2352 Watkins KP, Dungan JM, Agabian N (1994) Identification of a small RNA that interacts with the 5' splice site of the Trypanosoma brucei spliced leader RNA in vivo. Cell 76:171-182 Watkins NJ, Gottschalk A, Neubauer G, Kastner B, Fabrizio P, Mann M, Lührmann R (1998) Cbf5p, a potential pseudouridine synthase, and Nhp2p, a putative RNA-binding protein, are present together with Gar1p in all H BOX/ACA-motif snoRNPs and constitute a common bipartite structure. RNA 4:1549-1568 Watkins NJ, Segault V, Charpentier B, Nottrott S, Fabrizio P, Bachi A, Wilm M, Rosbash M, Branlant C, Luhrmann R (2000) A common core RNP structure shared between the small nucleolar box C/D RNPs and the spliceosomal U4 snRNP. Cell 103:457-466 Weidenhammer EM, Ruiz-Noriega M, Woolford JL Jr (1997) Prp31p promotes the association of the U4/U6 x U5 tri-snRNP with prespliceosomes to form spliceosomes in Saccharomyces cerevisiae. Mol Cell Biol 17:3580-3588 Wise JA, Tollervey D, Maloney D, Swerdlow H, Dunn EJ, Guthrie C (1983) Yeast contains small nuclear RNAs encoded by single copy genes. Cell 35:743-751 Yu YT, Scharl EC, Smith CM, Steitz JA (1999) The growing world of small nuclear ribonucleoproteins. In: Gesteland RF, Cech TR, Atkins JF (eds) The RNA world, 2nd edn.
Cold Spring Harbor laboratory Press, Cold Spring Harbor, New York, pp 487-524 Yu YT, Shu MD, Steitz JA (1997) A new method for detecting sites of 2'-O-methylation in RNA molecules. RNA 3:324-331 Yu YT, Shu MD, Steitz JA (1998) Modifications of U2 snRNA are required for snRNP assembly and pre-mRNA splicing. EMBO J 17:5783-5795 Yuan G, Klambt C, Bachellerie JP, Brosius J, Huttenhofer A (2003) RNomics in Drosophila melanogaster: identification of 66 candidates for novel non-messenger RNAs. Nucleic Acids Res 31:2495-2507 Zagorski J, Tollervey D, Fournier MJ (1988) Characterization of an SNR gene locus in Saccharomyces cerevisiae that specifies both dispensable and essential small nuclear RNAs. Mol Cell Biol 8:3282-3290 Zebarjadian Y, King T, Fournier MJ, Clarke L, Carbon J (1999) Point mutations in yeast CBF5 can abolish in vivo pseudouridylation of rRNA. Mol Cell Biol 19:7461-7472 Zhao X, Li ZH, Terns RM, Terns MP, Yu YT (2002) An H/ACA guide RNA directs U2 pseudouridylation at two different sites in the branchpoint recognition region in Xenopus oocytes. RNA 8:1515-1525 Zhao X, Yu YT (2004a) Detection and quantitation of RNA base modifications. RNA 10:996-1002 Zhao X, Yu YT (2004b) Pseudouridines in and near the branch site recognition region of U2 snRNA are required for snRNP biogenesis and pre-mRNA splicing in Xenopus oocytes. RNA 10:681-690

Zhu Y, Tomlinson RL, Lukowiak AA, Terns RM, Terns MP (2004) Telomerase RNA accumulates in Cajal bodies in human cancer cells. Mol Biol Cell 15:81-90

Terns, Michael P.
Department of Biochemistry and Molecular Biology, University of Georgia, Life Sciences Building, Athens, Georgia 30602, USA

Terns, Rebecca M.
Department of Biochemistry and Molecular Biology, University of Georgia, Life Sciences Building, Athens, Georgia 30602, USA

Yu, Yi-Tao
Department of Biochemistry and Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Rochester, NY 14642, USA

Conserved ribosomal RNA modification and their putative roles in ribosome biogenesis and translation Bruno Lapeyre

Abstract rRNA maturation requires extensive covalent modification of riboses and bases. These modifications are confined to the most conserved regions of the molecule, and some of them are highly conserved throughout evolution. In bacteria, rRNA modification is achieved exclusively by site-specific enzymes, while in archaea and eukaryotes the formation of 2’-O-methylriboses and pseudouridines is guided by numerous snoRNAs that direct a catalytic machinery to the target sites on the pre-rRNA. The exact function of these modifications remains elusive, since preventing their formation generally leads to no detectable phenotype. However, most of the enzymes that catalyze the formation of these modifications are encoded by essential genes in yeast. Moreover, in some cases, preventing the formation of several modifications simultaneously affects ribosome biogenesis and translation. This review presents rRNA modifications that have been conserved throughout evolution and gives special emphasis to the recently characterized 2’-O-ribose RNA methyltransferase Spb1p, which broke the “snoRNA-guided only” methylation dogma.

1 Introduction

Ribosomes are large ribonucleoprotein complexes whose main function is to translate genetic information into proteins. The ribosome controls translation fidelity by ensuring that the proper tRNA is selected for a given codon. The ribosome then catalyzes the polymerization of amino acids into proteins by transferring the growing polypeptide chain from the peptidyl-tRNA present in the P-site to the incoming aminoacyl-tRNA present in the A-site. The peptidyl transferase reaction is catalyzed by the ribosomal RNA, as foreseen long ago (Crick 1968) and then demonstrated step by step, from the photocrosslinking of tRNA derivatives to rRNA (Barta et al. 1984) to the discovery that deproteinized rRNA from the large subunit is sufficient to achieve the peptidyl transfer reaction (Noller et al. 1992). Later, these results were confirmed by showing that there is no protein in the catalytic center of the ribosome (Ban et al. 2000) and that an

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b105433 / Published online: 14 December 2004 © Springer-Verlag Berlin Heidelberg 2005


adenine of the large rRNA is likely directly involved in the catalytic reaction (Nissen et al. 2000). Ribosomes from prokaryotes and eukaryotes share numerous features that form the core elements. Since all ribosomes are able to polymerize proteins, it is assumed that the elements required for translation are present in the core elements. Throughout evolution, ribosomes have acquired additional RNA sequences and new ribosomal proteins. These additional elements are likely to be involved in other aspects of protein synthesis that might be important for more complex organisms. For instance, metazoans must synthesize many different kinds of proteins in different cell types and in a timely fashion. Also, larger ribosomes might be involved in mRNA localization, mRNA turnover, translational regulation, or the targeting of proteins to certain cellular compartments. A major difference between organelles, bacteria, and archaea on the one hand and eukaryotes on the other is that, in the latter, ribosomes are synthesized, assembled, and processed in a cellular compartment different from the one in which translation takes place. The increased complexity of ribosome biogenesis in eukaryotes is illustrated by the fact that the number of genes encoding nucleolar components is about twice the number of ribosomal components per se. Although ribosomes from organelles, bacteria, and archaea do undergo processing to become functional, this processing is simpler than in eukaryotes. In addition, it was shown long ago that functional prokaryotic ribosomes can be assembled in vitro from their isolated constituents (Hosokawa et al. 1966), while this has not been achieved for eukaryotic ones. This result suggests that ribosome biogenesis in eukaryotes must follow an ordered pathway that includes some essential intermediate steps. It is likely that these steps require internal and external transcribed spacers and nucleolar components, which are not part of mature ribosomes.
It is generally assumed that assembling ribosomes in the nucleus rather than in the cytoplasm may give the cell several levels of control over their synthesis. Final processing stages occur in the cytoplasm, possibly to prevent the accumulation of functional ribosomes in the nucleus. The extent of ribosome expansion during evolution is quite significant and may account for the additional functions performed by ribosomes from higher eukaryotes, as mentioned above. However, this increase in size is not as spectacular as the enormous increase in the diversity of the proteins a ribosome might synthesize, as shown in Table 1. In contrast, the number of rRNA modifications increases very significantly, along with the complexity of the organism. For instance, it is quite striking that yeast mitochondrial ribosomes, which are required for the synthesis of ~10 proteins, contain mitochondrial rRNA that possesses only three modifications. In contrast, ribosomes from metazoans synthesize tens of thousands of different proteins and contain ~200 modified nucleotides. Nascent polypeptides exit from the ribosome through a tunnel that is ~15 Å in diameter, and 80% of the methylated nucleotides have been mapped to the inner face of this tunnel in H. marismortui (Nissen et al. 2000). The hydrophobic nature of the methyl groups may prevent the nascent polypeptide chain from sticking to this tunnel. According to this view, the greater the complexity of a given organism’s proteome, the higher the chance to encounter certain amino-acid motifs

RNA modification in ribosome biogenesis and translation


Table 1. Correlation between ribosome composition and the diversity of proteins synthesized in different organisms (Maden 1990; Del Campo et al. 2001; Ofengand et al. 2001).

               Ribosome   rRNA    Number of    Number of proteins   Number of
               size       size    r-proteins   synthesized          modifications
Mito (yeast)   4.2 Md     5,000   77           ~10                  3
Bacteria       2.5 Md     4,400   55           ~3,000               37
Yeast          4 Md       5,200   77           ~6,000               112
Human          4.2 Md     6,800   83           ~250,000             250

that are prone to stick to the exit tunnel. Therefore, more methyl groups would be required to prevent this sticking (Pintard et al. 2002a).

The presence of modified nucleotides in RNA was established nearly five decades ago, and was followed by the identification of about one hundred different types of modified nucleotides, which are detected in every type of RNA molecule and in every known organism (Grosjean and Benne 1998). These discoveries have been vividly recounted elsewhere (Lane 1998). The last decade has seen two major achievements. The first one concerns the identification and characterization of a rapidly growing number of modification enzymes and of their targets. For instance, most of the enzymes that catalyze the modification of yeast cytoplasmic tRNA have been identified and one can predict that all remaining activities will be identified in the next few years (Hopper and Phizicky 2003; see Johansson and Byström, this volume). The second major achievement was the discovery of the snoRNA-guided mechanism that leads to the formation of 2’-O-methylriboses (Nm) and pseudouridines (Ψ) in rRNA molecules. Originally discovered in yeast (Cavaillé et al. 1996; Kiss-Laszlo et al. 1996; Ni et al. 1997), this mechanism was then shown also to operate in archaea (Gaspin et al. 2000; Omer et al. 2000). Moreover, other molecules such as tRNA or snRNA can also be modified using a small RNA-guided mechanism (Kiss 2001; Decatur and Fournier 2003; see Yu et al., this volume). RNA modification appears to be a universal feature of RNA maturation. The huge diversity of modifications, their high frequency of occurrence in every RNA molecule and, often, their high conservation between very distantly related species argue strongly for a major role of these modifications in RNA function. Along this line, several enzymes involved in RNA modification turned out to be essential for cell growth in different organisms.
Also, most proteins of the snoRNP involved in Nm and Ψ formation are encoded by essential genes in yeast. However, the situation is more complex than it seems at first glance. For several enzymes involved in RNA modification, it has been possible to separate the essential function of the protein from its catalytic activity in RNA modification, which thus turned out not to be essential (Lafontaine et al. 1998). In addition, numerous snoRNAs can be deleted with nearly no effect on ribosome synthesis or function (Parker et al. 1988). This difficulty in assigning a precise function to each RNA modification has led to an underestimation of the potential impact of modifications


Table 2. rRNA modifications in E. coli that have been conserved in the yeast S. cerevisiae and their mechanisms of formation.

SSU

E. coli

Enzyme

Ψ516 m7G527 m5U788

RsuA

m2G966 m5C967 m2G1207 m4Cm1402 m7G1405 m5C1407 m3U1498 m2G1516 m62A1518 m62A1519 LSU

m1G745 Ψ746 m5U747 Ψ955 m6A1618 m2G1835 Ψ1911 m3Ψ1915 Ψ1917 m5U1939 m5C1962 m6A2030 m6A2058 m7G2069 Gm2251 m2G2445 D2449 Ψ2457 Cm2498 m2A2503 Ψ2504 Um2552 Ψ2580 Ψ2604 Ψ2605

S. cerevisiae

Catalyzed by Mito yeast

m1acp3Ψ1189

snR35 (Ψ)

Cm1638

snR70

m62A1780

Dim1p, Dim2p

m62A1781

Dim1p, Dim2p

Ψ2258 Ψ2260

snR191 snR191

RlmB

Gm2619

snR67

RluE

Ψ2826

snR34

Um2921 Gm2922

snR52 (Spb1p) Spb1p

Enzyme

TrmA (RUMT) (in vitro)1 RsmB RsmC Arm (Antibiotr)2

RsmA (KsgA=Dim1) RsmA (KsgA=Dim1) RlmA (RrmA) RluA (+tRNA Ψ32) RumB (YbjF) RluC RluD RluD RluD RumA (YgcA)3 Erm (Antibiotr)4

RluC RrmJ RluC RluF RluB

Gm2270

Pet56/Mrm1p

Um2791

Mrm2p

Ψ2918

Pus5p

Yeast numbering is given according to the sequence available at SGD: http://www.yeastgenome.org/. (Limbach et al. 1994; Rozenski et al. 1999; Ofengand 2002). 1 (Gu et al. 1994), 2(Bujnicki and Rychlewski 2001; Galimand et al. 2003), 3(Agarwalla et al. 2002), 4(Vester et al. 1995).


on RNA function. Several lines of evidence indicate that modifications may act cooperatively. McCloskey and his collaborators observed a correlation between the number of methylated positions and the optimal growth temperature of thermophilic organisms (Noon et al. 1998), as was previously noted for tRNA (Agris et al. 1973; Kowalak et al. 1994). These observations suggest that 2’-O-methylriboses act together to stabilize the RNA secondary structure (see Johansson and Byström, Chapter 3). Along the same line, the site-specific rRNA MTase RrmJ from E. coli belongs to a heat-shock operon, thus establishing another link between rRNA methylation and growth temperature or stress (Bügl et al. 2000; Caldas et al. 2000a). Recently, several groups have unraveled a new aspect of RNA modifications by showing that they may operate in a coordinated manner. For instance, while individual deletions of guide snoRNAs have little effect on rRNA synthesis, it was found that removing a certain set of H/ACA snoRNAs may affect ribosome function (see below) (King et al. 2003). Also, several enzymes catalyzing tRNA modifications can be deleted one by one with no detectable phenotype; however, combining certain gene deletions leads to a growth defect (Purushothaman et al. 2004).

2 Pseudouridylations conserved in bacteria and eukaryotes

Several modifications are highly conserved throughout evolution; however, the mechanisms responsible for these modifications are not necessarily the same in different phyla. For instance, helix 69 of the LSU (numbered according to Yusupov et al. 2001) is nearly invariant and is highly modified in bacteria and in eukaryotes (Table 2). In E. coli, it possesses three Ψ at positions 1911, 1915, and 1917, whose formation is catalyzed by the pseudouridylase RluD. Unlike the deletion of any other rRNA pseudouridylase gene, deletion of the RluD gene leads to a severe growth defect in E. coli (Raychaudhuri et al. 1998). However, mutations that specifically affect the catalytic domain of RluD block the formation of the three Ψ while the mutated proteins are still able to restore normal growth, thus demonstrating that these Ψ are not required for normal cell growth (Gutgsell et al. 2001). In yeast, helix 69 is also highly modified, with Am2256, Ψ2258, Ψ2260, Ψ2264 and Ψ2266 (S. cerevisiae rRNA nucleotide numbering is given according to the yeast genomic sequence available at SGD: http://www.yeastgenome.org/). Ψ2264 is guided by snR3, which can be deleted with no detectable phenotype (Lowe and Eddy 1999). In contrast, Ψ2258 and Ψ2260, which are the equivalents of Ψ1915 and Ψ1917 in E. coli, are guided by snR191, whose deletion leads to a slight growth disadvantage in yeast (Badis et al. 2003). It is striking that different phyla have selected two completely different strategies to catalyze the pseudouridylation of these nucleotides: site-specific enzymes in bacteria and snoRNA-guided modification in eukaryotes. This strongly suggests that these modifications are of great importance for the cell.


Another example of pseudouridylation conserved throughout evolution is Ψ2457 (E. coli numbering), whose formation is catalyzed by RluE in E. coli and, at the equivalent position, guided by snR34 in yeast (Table 2). However, despite the conservation of this modification in prokaryotes and eukaryotes, which may suggest an important function, deletion of the genes coding for RluE in E. coli or for snR34 in yeast leads to no detectable phenotype (Samarsky et al. 1995; Del Campo et al. 2001). One particular Ψ is conserved between bacterial and mitochondrial LSU, but is absent from yeast cytoplasmic LSU. It is located in helix 90, at position 2580 in E. coli and at position 2819 in yeast mitochondria (Table 2), while the corresponding position in yeast cytoplasmic LSU, position 2949, is occupied by a U (Fig. 1). Two other pseudouridines are located nearby on helix 90 of yeast cytoplasmic LSU, at position 2880 (on the opposite strand, compared to bacteria or mitochondria) and at position 2944 (on the same strand, compared to bacteria and mitochondria). It is conceivable that the pseudouridines present in cytoplasmic LSU play a role similar to the one played by Ψ2580 in bacteria, even if they are not located at exactly the same position. Indeed, it has been proposed that pseudouridines may affect RNA conformation and exert their action at some distance from their actual position (Ofengand 2002). It is noteworthy that the two enzymes that catalyze the formation of these pseudouridines in bacteria and mitochondria, RluC and Pus5p respectively, are related to each other (Conrad et al. 1998; Ansmant et al. 2000). Recently, it has been shown that combining the deletion of several guide H/ACA snoRNAs prevents the formation of several Ψ, which in turn affects the rate of translation to a certain extent (King et al. 2003). Taken together, these observations indicate that pseudouridylation of rRNA has been conserved throughout evolution and can be achieved either by site-specific enzymes or by a snoRNA-guided machinery.
Ψ are located in the core domains of the rRNA, near the functionally important domains of the ribosome, the peptidyl transferase center, and the tRNA attachment sites suggesting that these modifications participate in ribosome biogenesis and function (Maden 1990; Decatur and Fournier 2002).

3 Base modifications and their enzymes

Base modifications (mN) are the only modifications that are more frequent in bacteria than in lower or higher eukaryotes (Maden 1990; Ofengand et al. 2001) (see Douthwaite et al., this volume). Except for KsgA/Dim1p (see below), mN of rRNA has drawn little attention since its original characterization forty years ago (Starr and Fefferman 1964). While most tRNA base MTases have been characterized in recent years, only three base MTases have been described for E. coli SSU and three for the LSU. Together, they account for only seven modifications out of the twenty-one known mN in E. coli (Ofengand et al. 2001). The situation lags even further behind in eukaryotes, since Dim1p is still the only known base MTase; it catalyzes the formation of a doublet of dimethyladenosine (m62A) located near the 3’ end of the SSU. It is commonly assumed that mN formation in eukaryotes


Fig. 1 (overleaf). Secondary structure representation of a region of the yeast LSU comprising the PTC: factors involved in nucleotide modification within the A-loop. Top: A portion of the 25S LSU (nucleotides 2365 to 2423 and 2607 to 2994) is represented, according to Cannone et al. (2002). Modified nucleotides are represented as follows: pseudouridine (Ψ), black triangle flag; 2’-O-methylriboses, black circle flag; methylbases, black arrow. The m5U at position 2924, which is present in non-stoichiometric amounts (see text), is depicted with a question mark. The putative tRNA docking sites are boxed: the A-site, with nucleotide Gm2922, and the P-site, with nucleotides Gm2619G2620. Two nucleotides (A2820 and G2816) potentially involved in catalyzing the peptidyl transfer reaction are shown in bold typeface and are underlined. Three nucleotides (U2873, U2954 and A2971) that might be important for peptide release are circled. Bottom: A generic A-loop is shown on the left, with the five conserved nucleotides UαGβUγUδCε. A-loops from E. coli, yeast mitochondria and yeast cytoplasmic LSU are represented, with their modified nucleotides (Um2552, Um2791 and Um2921Gm2922Ψ2923m5U2924) along with the factors directing their formation.

might be catalyzed by site-specific enzymes, as for Dim1p, rather than by a small RNA-guided mechanism, but this remains to be established. The doublet of m62A found at positions 1518 and 1519 (E. coli) or 1780 and 1781 (S. cerevisiae) is very special for several reasons. It is the only base methylation conserved between prokaryotes and eukaryotes. In E. coli, deletion of the gene encoding the dimethyltransferase KsgA leads to resistance to kasugamycin but not to a significant growth defect (reviewed in van Knippenberg 1986; see Douthwaite et al., this volume). In sharp contrast, the yeast homolog Dim1p is encoded by an essential gene (Lafontaine et al. 1994). However, the conditional thermosensitive allele dim1-2 blocks the formation of m62A1780m62A1781 at permissive temperature but not the accumulation of 18S rRNA (Lafontaine et al. 1998). The authors therefore concluded that methylation of the 3’ end of the 18S rRNA is not essential per se and that Dim1p might rather be involved in a quality control mechanism in ribosome biogenesis. Dim1p participates in the late steps of 40S maturation that occur in the cytoplasm (Fatica and Tollervey 2002). It is likely that, once all the enzymes involved in tRNA modification are characterized, some of the laboratories that have contributed to this task will turn their interest to rRNA and use their expertise to identify the remaining modification enzymes, which would greatly benefit the field of ribosome biogenesis and translation.

4 2’-O-ribose methylations conserved in bacteria and eukaryotes

In E. coli, there is only one 2’-O-methylribose (Nm) in the SSU and there are three in the LSU (Maden 1990). While base methylations are not conserved between prokaryotes and eukaryotes, except for the doublet of m62A, the methylriboses are rather well conserved: three Nm out of four have been conserved between E. coli and S. cerevisiae, two of which are also conserved in yeast mitochondria (Table


2). Actually, these two positions are of outstanding interest since Um2552 is located in the aminoacyl-tRNA acceptor site (A-site) and Gm2251 is located in the peptidyl-tRNA acceptor site (P-site) of the LSU from E. coli. A comparison of the modification status of the nucleotides of the A-loop between different bacteria, archaea, and eukaryotes (Fig. 1) has shown that this loop is highly modified in every examined species (Hansen et al. 2002). For the four Nm in E. coli rRNA, only two enzymes have been identified: RlmB catalyzes the formation of Gm2251 (Lovgren and Wikstrom 2001) and RrmJ that of Um2552 of the LSU (Bügl et al. 2000; Caldas et al. 2000a). RlmB is the ortholog of Pet56/Mrm1p, which catalyzes the formation of Gm2270 in yeast mitochondrial LSU, a position equivalent to Gm2251 in E. coli LSU (Sirum-Connolly and Mason 1993). It is remarkable that the same enzyme has been conserved and performs the same function in prokaryotes and in mitochondria. However, there are also some differences, since deletion of the rlmB gene in E. coli does not affect ribosome maturation, while Pet56/Mrm1p was proposed to be required for LSU biogenesis in yeast mitochondria (Sirum-Connolly and Mason 1993; Mason 1998). In eukaryotic cytoplasmic rRNA, the corresponding position (Gm2619) is also 2’-O-methylated; however, this methylation is not catalyzed by a site-specific enzyme but by the snoRNA-guided machinery, which uses snR67 to target this position. snR67 also guides the formation of Um2724; however, its deletion, which prevents the formation of both Gm2619 and Um2724, does not appear to affect cell growth (Lowe and Eddy 1999). For Cm2498, the E. coli enzyme has not yet been characterized and the corresponding position is not methylated in yeast, which argues against a major role for this modification. For m4Cm1402, neither the base MTase nor the 2’-O-ribose MTase has been characterized in E. coli; however, the corresponding position in yeast SSU is Cm1638, whose formation is guided by snR70, another non-essential C/D snoRNA (Lowe and Eddy 1999). The last Nm, Um2552, will be described below.
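The conservation pattern described in this section can be restated as a small lookup table. The following Python sketch is only an illustrative summary, not part of the original analysis: the class name `NmSite`, its field names and the two helper functions are hypothetical, while the positions and factors are those given in the text and in Table 2 (`None` marks an enzyme that has not been characterized or a position with no methylated counterpart).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NmSite:
    """One E. coli 2'-O-methylribose and its yeast counterparts (hypothetical layout)."""
    ecoli: str                          # E. coli position
    ecoli_enzyme: Optional[str]         # None: enzyme not yet characterized
    yeast_cyto: Optional[str]           # None: position not methylated in yeast
    yeast_cyto_factor: Optional[str]    # guide snoRNA or enzyme, as listed in Table 2
    yeast_mito: Optional[str]           # None: no counterpart in mitochondrial rRNA
    yeast_mito_enzyme: Optional[str]


# Values transcribed from the text of this section and from Table 2.
NM_SITES = [
    NmSite("m4Cm1402", None, "Cm1638", "snR70", None, None),
    NmSite("Gm2251", "RlmB", "Gm2619", "snR67", "Gm2270", "Pet56/Mrm1p"),
    NmSite("Cm2498", None, None, None, None, None),
    NmSite("Um2552", "RrmJ", "Um2921", "snR52 (Spb1p)", "Um2791", "Mrm2p"),
]


def conserved_in_yeast_cytoplasm(sites):
    """Sites whose equivalent position is 2'-O-methylated in yeast cytoplasmic rRNA."""
    return [s for s in sites if s.yeast_cyto is not None]


def conserved_in_yeast_mitochondria(sites):
    """Sites whose equivalent position is 2'-O-methylated in yeast mitochondrial rRNA."""
    return [s for s in sites if s.yeast_mito is not None]


if __name__ == "__main__":
    cyto = conserved_in_yeast_cytoplasm(NM_SITES)
    mito = conserved_in_yeast_mitochondria(NM_SITES)
    print(f"{len(cyto)} of {len(NM_SITES)} E. coli Nm conserved in yeast cytoplasmic rRNA")
    print(f"{len(mito)} of {len(NM_SITES)} also conserved in yeast mitochondrial rRNA")
    for s in NM_SITES:
        print(s.ecoli, "->", s.yeast_cyto or "unmethylated", "/", s.yeast_mito or "-")
```

Laid out this way, the table makes the contrast in mechanism easy to read off: the bacterial and mitochondrial entries name site-specific enzymes, whereas the yeast cytoplasmic entries name guide snoRNAs, with Um2921 as the case taken up in the next section.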

5 The outstanding case of Spb1p: an essential site-specific enzyme in a world of snoRNA-guided modifications

The A-loop of the LSU is located in the peptidyl transferase center of the ribosome and is one of the most conserved regions of the rRNA (Fig. 1). For instance, a region of 194 nucleotides encompassing the peptidyl transferase center (containing the putative catalytic site, the A-loop and a region possibly involved in peptide release), from position 2801 to position 2994 (yeast LSU numbering), is 94% identical between yeast and human, and the UαGβUγUδCε motif found in the A-loop is invariant (Fig. 1, bottom). This sequence contains modified nucleotides in every studied species, either Uαm, Gβm, or both. In addition, in eukaryotes, a Ψ is often present instead of Uγ. m5U has also been detected at position Uδ, although in non-stoichiometric amounts (Bakin et al. 1994; Lapeyre and Purushothaman 2004). This preliminary observation opens the possibility that rRNA modifications may


not all be constitutively added but that some of them might be regulated, for instance, in response to stress or to changes in physiological conditions.

5.1 The universally conserved Uαm is catalyzed by the site-specific MTase RrmJ in bacteria

The formation of Um2552 is catalyzed in E. coli by the site-specific rRNA MTase RrmJ, which is encoded by a gene that belongs to a heat-shock operon. Deletion of that gene leads to a severe growth defect, which correlates with a defect in ribosomes (Bügl et al. 2000; Caldas et al. 2000a). Unlike the absence of Pet56/Mrm1p, which leads to a decreased production of LSU, a phenotype that is the signature of a factor involved in LSU biogenesis (Sirum-Connolly and Mason 1993), the lack of RrmJ correlates with lower amounts of 70S and higher relative amounts of 30S and 50S (Bügl et al. 2000; Caldas et al. 2000b). This phenotype suggests that the absence of RrmJ may lead to a defect in the joining of the two subunits. However, this defect is not necessarily linked to the absence of modification at Um2552, since RrmJ was found to have another, non-enzymatic function in ribosome assembly. RrmJ is associated stoichiometrically with 50S subunits until they join the 30S subunits, which would then displace RrmJ (Hager et al. 2002, 2004). An important property of RrmJ is that it normally requires the 50S RNP to be active and that the methylation it catalyzes is a late event in ribosome biogenesis (Bügl et al. 2000; Caldas et al. 2000a), in sharp contrast to most prokaryotic (Lovgren and Wikstrom 2001) and eukaryotic (Retel et al. 1969) Nm, which are formed at a very early stage of pre-rRNA processing.

5.2 S. cerevisiae possesses three homologs of RrmJ

5.2.1 Mrm2p is a mitochondrial rRNA MTase

RrmJ possesses an ortholog in yeast mitochondria that is named Mrm2p (Pintard et al. 2002a). Mrm2p catalyzes the formation of Um2791 (position Uα), the mitochondrial equivalent of Um2552 in E. coli or Um2921 in yeast cytoplasmic LSU (Fig. 1).
Mrm2p shares several characteristics with RrmJ; in particular, it is active on LSU RNP and not on deproteinized rRNA, suggesting that the modification takes place at a late stage of ribosome assembly. In contrast, Pet56/Mrm1p, which catalyzes the formation of Gm2270, the other Nm of the mitochondrial rRNA, is active on naked RNA and is likely acting at an early processing stage (Sirum-Connolly and Mason 1993). Cells deprived of Mrm2p are impaired in mitochondrial protein synthesis, are thermosensitive for respiration and rapidly lose their mitochondrial DNA (Pintard et al. 2002a). Interestingly, yeast mitochondrial rRNA possesses only three modifications, each catalyzed by a site-specific enzyme, a situation that is reminiscent of bacteria but that differs from cytoplasmic ribosomes in eukaryotes. The two mitochondrial rRNA MTases Pet56/Mrm1p and Mrm2p are required for proper ribosome biogenesis, although the lack of Pet56/Mrm1p seems to have a more dramatic effect on LSU production and the enzyme operates at an early stage

RNA modification in ribosome biogenesis and translation


of the processing. In contrast, the lack of Mrm2p appears to have a less severe effect on ribosome biogenesis and does not seem to affect the ratio of the subunits, and the enzyme likely operates at a late stage of processing. Deletion of PUS5, which encodes the third mitochondrial rRNA modification enzyme, does not appear to cause any detectable phenotype.

5.2.2 Trm7p is a cytoplasmic tRNA MTase

In yeast, in addition to Mrm2p, there are two other homologs of RrmJ, Trm7p and Spb1p. Trm7p is a cytoplasmic protein that is well conserved throughout evolution. It is a tRNA MTase that catalyzes the formation of Nm at positions 32 and 34 of the anticodon loop of tRNA Leu, Phe, and Trp (Pintard et al. 2002b). The tRNA anticodon loop exhibits striking similarities with the A-loop of the LSU, which may explain the high level of sequence similarity between the substrate-binding pockets of RrmJ, Mrm2p and Trm7p (Bujnicki et al. 2004).

5.2.3 Spb1p is a nucleolar site-specific rRNA MTase

The third homolog of RrmJ in yeast is Spb1p. The spb1-1 allele was identified as a suppressor of a mutation in the gene encoding the poly(A)-binding protein (Pab1p) (Sachs and Davis 1989). A few years later, spb1-2 was isolated independently as a suppressor of silencing in yeast (Loo et al. 1995). The link between Pab1p, its suppressors and ribosome biogenesis or gene silencing remains puzzling. Later, Spb1p was shown to be an AdoMet-binding protein, located within the nucleolus and required for normal pre-rRNA processing (Kressler et al. 1999; Pintard et al. 2000). It was tempting to hypothesize that Spb1p could be the main rRNA MTase in yeast; however, several lines of evidence pointed to Fibrillarin/Nop1p as being the snoRNA-guided Nm MTase in eukaryotes (Schimmang et al. 1989; Lapeyre et al. 1990; Tollervey et al. 1993; Niewmierzycka and Clarke 1999; Wang et al. 2000).
Yeast cytoplasmic rRNA contains approximately 55 2’-O-methylated nucleotides, of which 54 have now been assigned to a precise position (Lowe and Eddy 1999). An extensive search for C/D snoRNAs in yeast has led to the identification of 41 snoRNAs that target the modification of 51 of the 55 2’-O-methylated nucleotides, leaving four orphan sites (Lowe and Eddy 1999). Two of these sites are next to each other (Um2921Gm2922) and located within the A-loop of the LSU (Fig. 1) (n.b. nucleotides Um2921Gm2922 in the standardized numbering system used here correspond to Um2918Gm2919 in the numbering used previously). Alkaline hydrolysis of rRNA generates a number of dinucleotides (XmXp) and four trinucleotides that correspond to positions that possess two adjacent 2’-O-methylriboses (doublets). Methylation of three of these doublets is guided by U18, U24, and snR13. These snoRNAs can direct the formation of the two Nm of the doublet, possibly by sliding from one nucleotide to the next (Kiss-Laszlo et al. 1996; Lowe and Eddy 1999). snR52 was predicted to target the methylation of Um2921 of the fourth doublet (Um2921Gm2922), and possibly Gm2922, by analogy with the three other doublets. However, this prediction was not confirmed experimentally (Lowe and Eddy 1999). Another candidate for modifying the Um2921Gm2922 doublet is Spb1p, which is a nucleolar protein with an MTase domain. Spb1p is homologous to RrmJ and Mrm2p, which catalyze the formation of Um at the homologous position (Uα) within the A-loop of the LSU in bacteria and mitochondria, respectively. The putative target for Spb1p, the Um2921Gm2922 doublet of the A-loop, is quite different from any other Nm in rRNA. Indeed, while all other Nm are formed at an early stage of rRNA processing, on the 35S pre-rRNA, the trinucleotide Um2921GmΨp is detected in mature rRNA species, but not in the early precursors (Brand et al. 1977; Eladari et al. 1977; Dubin and Taylor 1978). The absence of this trinucleotide in 35S means that either Um2921 or Gm2922 (or both) is not formed at this stage, but at a later processing stage. Therefore, this trinucleotide will be referred to hereafter as the paradoxical trinucleotide. Recently, a study addressed the function of Spb1p and snR52 in pre-rRNA processing, using primer extension mapping and RNA prepared from strains either deleted for SNR52 or expressing an MTase-deficient version of Spb1p (Bonnerot et al. 2003). The principle of this method is that reverse transcriptase is able to discriminate between 2’-OH ribose and 2’-O-methylribose when nucleotides are in limiting amounts (Maden et al. 1995; Kiss-Laszlo et al. 1996; Lowe and Eddy 1999). Consequently, a primer extension pause is detected when a Nm is encountered by the polymerase. This method is rapid and efficient for mapping isolated modified nucleotides; however, it cannot discriminate between modified nucleotides that are located next to each other. In the aforementioned study, the authors incorrectly interpreted the primer extension pause that they observed in the A-loop region as being due to Um2921. This mistake led them to two major misinterpretations.
First, the primer extension pause that they observed is due to Gm2922, which is the first to be encountered by the polymerase, not to Um2921 as was claimed in their report. Second, they could not discriminate between pauses, and therefore between 2’-O-methylations, at positions 2921 and 2922. They therefore interpreted their data as if only one nucleotide was modified by either snR52 or Spb1p and concluded improperly that the two systems are redundant. Instead, we have analyzed directly the status of the nucleotides of the A-loop, to avoid the artifacts of indirect primer extension mapping, which cannot discriminate between modified nucleotides located next to each other. We have demonstrated that formation of the paradoxical trinucleotide occurs in two distinct steps (Lapeyre and Purushothaman 2004). snR52 directs the formation of Um2921 on the 35S pre-rRNA, i.e. simultaneously with all other snoRNA-guided modifications. However, and rather unexpectedly, the formation of Gm2922, which is located next to Um2921, is catalyzed by Spb1p, and this modification occurs at a late processing stage, before the conversion of 27S into 25S. It is interesting to note that Spb1p is also able to catalyze the formation of Um2921 (Uα), like RrmJ and Mrm2p, although this activity is not used in normal growth conditions since snR52 acts on rRNA prior to Spb1p. Recent studies have shown that the formation of the pre-60S subunits is different from what was originally hypothesized (Fatica and Tollervey 2002) (Fig. 2).


The 90S pre-ribosomes that form on the 35S pre-rRNA contain mostly 40S maturation factors, while the 60S maturation factors associate with the 27S precursors later and in several different waves. Spb1p joins the subunit at a late stage, before cleavage at C2, and is then released before the recruitment of the exosome (Fig. 2) (Bassler et al. 2001; Harnpicharnchai et al. 2001). Therefore, Spb1p joins the pre-60S at a relatively late stage of maturation, which is compatible with the hypothesis that formation of Gm2922 is a late event occurring on the 27S species. The snoRNA-guided machinery is responsible for the formation of most Nm and Ψ in rRNA in archaea and eukaryotes. Since this mechanism uses a common set of proteins for the catalytic machinery and simply an antisense guide RNA for specifying the target, it can adapt and expand rapidly to new targets, by duplicating a guide RNA and adapting its complementary sequence to the new target. In contrast, the site-specific enzyme mechanism requires the selection of an enzyme with a new specificity for each new target, a process that is likely far more complicated. Modification of nucleotides Uα and Gβ provides an extraordinary example of how evolution plays with the toolbox provided by nature. In bacteria and in mitochondria, Uα is methylated by a site-specific enzyme (RrmJ and Mrm2p, respectively), while in eukaryotes its methylation is guided by snR52. However, an ortholog of RrmJ exists in eukaryotes, which has a new specificity since it is able to methylate the nucleotide located next to Um2921 in the A-loop, Gm2922. One might wonder whether the emergence of the site-specific enzyme mechanism, which is the only one known so far to be used in bacteria, preceded or followed the appearance of the snoRNA-guided mechanism that operates in archaea and in eukaryotes. We observed that the number of modifications catalyzed by the site-specific enzyme mechanism has stalled throughout evolution (10 mN in S. cerevisiae and 10 mN in human) while the number of snoRNA-guided modifications has doubled when comparing yeast to human (98 to 196 Nm and Ψ). Several lines of evidence indicate that Gm2922, Spb1p’s target, plays a more prominent role in the cell than Um2921. First, in bacteria, mutation of Gβ, but not of Uα, strongly affects cell growth (Kim and Green 1999). Second, Gβ, not Uα, base-pairs with the 3’-end of tRNA (Green et al. 1998). Third, in yeast, blocking the formation of Gm, but not of Um, strongly affects ribosome biogenesis and/or translation (Lapeyre and Purushothaman 2004). Spb1p is significantly larger than RrmJ (841 versus 209 residues) and it is likely that its additional domains play other roles in ribosome biogenesis. In rrmJ mutants, the decreased level of 70S suggests that RrmJ could be involved in subunit joining. It is unlikely that Spb1p performs the same function in eukaryotes since it operates in the nucleolus while the joining of the two subunits occurs in the cytoplasm. However, Spb1p interacts with a complex machinery, the eukaryotic pre-ribosome, in which it may play additional roles as compared to RrmJ in E. coli. It is noteworthy that a mutation affecting the AdoMet-binding site of Spb1p abolishes its MTase activity and leads to a strong growth defect and severe impairment of 60S synthesis. However, this mutation is not lethal, unlike the deletion of the gene (Lapeyre and Purushothaman 2004). We conclude from this result that Spb1p likely fulfills some other essential function in the cell.
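The contrast drawn above between retargeting a guide RNA and selecting a new site-specific enzyme can be made concrete with a small sketch. The Python code below is only a toy model under stated assumptions: the sequences are invented, only Watson-Crick pairs are allowed, and the spacing rule (the methylated nucleotide pairs with the fifth guide nucleotide upstream of box D) is not stated in this chapter but is taken as an assumption from the C/D snoRNA literature cited here (Kiss-Laszlo et al. 1996). All function names are hypothetical.

```python
# Toy model of C/D snoRNA-guided 2'-O-methylation (illustrative sketch only).
# Assumptions: Watson-Crick pairs only; the antisense element ends right at
# box D; the target nucleotide pairs with the 5th guide nucleotide upstream
# of box D. All sequences below are invented for the example.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
BOX_D_OFFSET = 5  # assumed spacing between box D and the methylated position


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_guided_site(target, guide):
    """Return the 1-based target position selected by the guide, or None.

    The guide is the antisense element written 5'->3'; its 3' end abuts box D.
    """
    site = reverse_complement(guide)  # stretch of the target pairing with the guide
    start = target.find(site)         # 0-based start of the duplex on the target
    if start < 0:
        return None
    # The 3'-most guide nucleotide pairs with target[start], so the 5th guide
    # nucleotide upstream of box D pairs with target[start + 4].
    return start + BOX_D_OFFSET


def design_guide(target, position, length=12):
    """Build an antisense element directing methylation to `position` (1-based)."""
    start = position - BOX_D_OFFSET
    if start < 0 or start + length > len(target):
        raise ValueError("duplex would run off the end of the target")
    return reverse_complement(target[start:start + length])


if __name__ == "__main__":
    rrna = "GGAUCCGAUUGCAUGGCUAAGCUUACG"  # hypothetical rRNA fragment
    guide = design_guide(rrna, 10)
    print(guide, find_guided_site(rrna, guide))
    # Retargeting only requires a new antisense element, not a new enzyme.
    new_guide = design_guide(rrna, 14)
    print(new_guide, find_guided_site(rrna, new_guide))
```

In this model, moving the modification to a new position changes only the antisense sequence; nothing resembling the catalytic machinery appears in the code, which mirrors the argument that the guided system can expand to new targets more easily than a site-specific enzyme.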


Fig. 2. Schematic diagram of the ribosome biogenesis pathway. A simplified version of the complex pathway leading to the formation of the 40S and the 60S subunits is presented (Fatica and Tollervey 2002). The two types of RNA transcripts that give rise to the four rRNA molecules are shown at the top, with the position of the cleavage sites. The 90S is formed by the association of the 35S with the 40S maturation factors, while the first set of 60S maturation factors assembles onto the 27S pre-rRNA only after cleavages at sites A0, A1, and A2. Then, other sets of 60S maturation factors are recruited in a stepwise manner, as the maturation of the 27S proceeds. The snoRNA-guided modifications take place early on the 35S, including the formation of Um2921 guided by snR52. In sharp contrast, formation of Gm2922 takes place at a later stage, likely at the time Spb1p is recruited onto 60S subunits, as shown on the diagram. Recruitment of other factors is summarized by arrows in or out. When cleavage at sites A0, A1, and A2 is affected, a 23S RNA dead-end product accumulates, which is not converted into 18S.


Fig. 3. Schematic representation of the interactions between the tRNA and the ribosome during translation. The two ribosomal subunits are represented (light gray, SSU; dark gray, LSU). The three tRNA binding sites on the ribosome are designated A (aminoacyl-tRNA acceptor site), P (peptidyl-tRNA site), and E (exit site). The incoming aminoacyl-tRNA (on the right) binds to the A-site and its CCA 3’ end interacts with the A-loop of the LSU, likely by establishing a hydrogen bond between its C75 and Gm2922. After the peptidyl transfer reaction has taken place, accompanied by the translocation of the tRNA from the A-site to the P-site, the peptidyl-tRNA establishes another interaction with the P-loop at the P-site, through two hydrogen bonds: C74:G2620 and C75:Gm2619. The growing polypeptide chain (represented here as residues 1 to 8) exits the LSU by a narrow tunnel whose internal face is decorated with numerous Nm (represented with white m). The hydrophobic environment provided by the methyl groups would facilitate the exit of the peptide chain by preventing it from sticking to the tunnel.

However, formation of Gm2922 is also important for normal cell growth and it is so far the only rRNA modification having such a function. We cannot yet conclude whether Gm2922 is uniquely required for LSU biogenesis or if it is required for achieving proper rates of translation, as suggested by its tRNA docking role in the A-site of the ribosome. The availability of an Spb1p MTase-deficient strain will make it possible to address these questions in the future.
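The difference between the two assays discussed in Section 5.2.3 can be illustrated with a toy model. The Python sketch below is not the procedure used in the studies cited; it only encodes two simplifying assumptions: alkali cleaves 3' of every nucleotide carrying a free 2'-OH and never 3' of a Nm, and a primer extension pause at a Nm masks a second Nm lying immediately 5' of it on the template. The five-nucleotide sequence stands for the UαGβUγUδCε motif (Ψ is written as U), and the function names are hypothetical.

```python
# Toy comparison of two ways of detecting 2'-O-methylated nucleotides (Nm).
# Illustrative sketch only; both rules below are simplifying assumptions.

def alkaline_fragments(seq, nm_positions, first=1):
    """Fragments left by complete alkaline hydrolysis.

    Alkali cleaves 3' of each nucleotide with a free 2'-OH (written 'Np');
    a 2'-O-methylribose blocks the cleavage 3' of it (written 'Nm').
    `first` is the number of the first nucleotide in `seq`.
    """
    fragments, current = [], ""
    for pos, base in enumerate(seq, start=first):
        if pos in nm_positions:
            current += base + "m"   # bond 3' of this nucleotide survives
        else:
            current += base + "p"   # cleaved here, leaving a 3' phosphate
            fragments.append(current)
            current = ""
    if current:                     # sequence ended on a Nm
        fragments.append(current)
    return fragments


def primer_extension_pauses(nm_positions):
    """Nm positions reported by primer extension in this toy model.

    Reverse transcriptase copies the template 3'->5', so it meets the
    highest-numbered position first. Assumption: a Nm located immediately
    5' of another Nm is hidden behind the pause at its neighbour.
    """
    pauses = []
    for pos in sorted(nm_positions, reverse=True):
        if pos + 1 in nm_positions:
            continue
        pauses.append(pos)
    return pauses


if __name__ == "__main__":
    a_loop = "UGUUC"                # positions 2921-2925, Psi written as U
    states = {"Um+Gm": {2921, 2922}, "Gm only": {2922},
              "Um only": {2921}, "none": set()}
    for name, nm in states.items():
        print(name, alkaline_fragments(a_loop, nm, first=2921),
              primer_extension_pauses(nm))
```

In this model the doublet and the Gm-only state give the same primer extension readout, whereas only the doublet yields the trinucleotide UmGmUp on hydrolysis, which is why analyzing the nucleotides directly can separate the two modifications.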


6 Perspectives: how can modifications extend the ability of RNA molecules

Several RNA modification enzymes are essential for cell viability, or at least for normal cell growth; however, it is not always their catalytic activity that is required, but sometimes another function carried out by the protein. In several instances, it has been possible to isolate point mutations that abolish the catalytic activity of a modification enzyme while the mutated proteins are still able to restore normal growth. This is the case in E. coli for the tRNA MTase Trm2 (Persson et al. 1992) and the rRNA pseudouridylase RluD (Gutgsell et al. 2001), and in S. cerevisiae for the rRNA MTase Dim1p (Lafontaine et al. 1998) and the mitochondrial MTase Pet56p (Sirum-Connolly and Mason 1993; Ofengand et al. 2001). A strain deleted for RluD exhibits a defect in ribosomal subunit joining, along with a modification of the sedimentation profile of the 30S and 50S subunits and the appearance of peaks at 39S and 27S (Ofengand et al. 2001). This profile likely reflects a ribosome assembly defect, which is corrected when the cell expresses a catalytically inactive form of the protein. The absence of RrmJ also correlates with a subunit joining deficiency and the appearance of intermediate species on sucrose gradients (Bügl et al. 2000; Hager et al. 2002, 2004). These two examples reveal that ribosome biogenesis in prokaryotes also involves non-ribosomal factors that are required for proper assembly and functioning. The putative role of Ψ has been reviewed in detail previously (Ofengand et al. 2001; Decatur and Fournier 2002). It has been proposed that the main effect of converting U into Ψ could be to provide the nucleotide with an extra H-bond donor, thus leading to a potentially more stable molecule (Ofengand et al. 2001). Recently, it has been shown that the Ψ residues present in the peptidyl transferase center are required to achieve the full translation rate (King et al. 2003).
What is the role of RNA modifications, since they can apparently be prevented in most cases with little or no detectable effect on cell growth? Even for the modification enzymes that are required for cell growth, it seems that it is not the modification activity that is important. Why would a protein that is required for folding or assembly of the ribosome be associated with a non-essential modification activity? It has been proposed that the modification could simply be a mark that is affixed to the ribonucleoprotein complex once it has passed a checking mechanism (Lafontaine et al. 1998). However, there is also evidence suggesting that the modification itself may change the chemical or structural properties of the RNA. Addition of a methyl group to the 2’-OH of the ribose protects the RNA from degradation by alkali or by certain endonucleases. But methylation of the ribose at that position also has a more direct effect on the structure of the nucleotide and affects its interaction with other nucleotides (Blanchard and Puglisi 2001). It is quite striking that two of the most highly conserved Nm in LSU are engaged in hydrogen bonds with the aminoacyl-tRNA and peptidyl-tRNA at the A-site (Gm2922) and the P-site (Gm2619) of the ribosome peptidyl transferase center (Fig. 3). It is conceivable that modified nucleotides are involved in: i) aminoacyl-tRNA recognition and binding, ii) peptidyl transfer reaction and


tRNA translocation from A to P-site, and iii) tRNA exit (see Rousset and Namy Chapter 10). In addition, the hydrophobic methyl group modifies the ability of the ribose to engage in hydrogen bonds as the 2’-OH normally does. Since many methyl groups have been mapped on the exit tunnel of the ribosome (Fig. 3), it has been proposed that they may provide a more hydrophobic environment that would prevent nascent polypeptide chains from sticking to the exit tunnel (Nissen et al. 2000; Decatur and Fournier 2002). Finally, the number of Nm increases with the optimal growth temperature of an organism, and at least the RrmJ MTase belongs to a heat-shock operon, two observations that suggest that Nm contribute to stabilizing the RNA secondary structure. Identifying the remaining rRNA modification activities and analyzing them in detail should provide a clearer picture of the role played by these modifications in ribosome biogenesis and translation. For instance, Spb1p catalyzes the formation of the only rRNA modification known so far to be important for ribosome biogenesis and function in eukaryotes. Further characterization of its activity should make it possible to evaluate whether ribose methylation at position G2922 is required for achieving normal rates of translation or if it is only involved in 60S subunit processing.

Acknowledgements

This work was supported by a grant from the Association pour la Recherche sur le Cancer (N° 5914), the Fondation pour la Recherche Médicale, the Ligue Nationale contre le Cancer, and the Centre National de la Recherche Scientifique. I am grateful to J. M. Bujnicki for his help with phylogenetic analysis and to C. Curie and S. K. Purushothaman for critically reading the manuscript.

References Agarwalla S, Kealey JT, Santi DV, Stroud RM (2002) Characterization of the 23 S ribosomal RNA m5U1939 methyltransferase from Escherichia coli. J Biol Chem 277:8835-8840 Agris PF, Koh H, Soll D (1973) The effect of growth temperatures on the in vivo ribose methylation of Bacillus stearothermophilus transfer RNA. Arch Biochem Biophys 154:277-282 Ansmant I, Massenet S, Grosjean H, Motorin Y, Branlant C (2000) Identification of the Saccharomyces cerevisiae RNA:pseudouridine synthase responsible for formation of psi(2819) in 21S mitochondrial ribosomal RNA. Nucl Acids Res 28:1941-1946 Badis G, Fromont-Racine M, Jacquier A (2003) A snoRNA that guides the two most conserved pseudouridine modifications within rRNA confers a growth advantage in yeast. RNA 9:771-779


Bakin A, Lane BG, Ofengand J (1994) Clustering of pseudouridine residues around the peptidyltransferase center of yeast cytoplasmic and mitochondrial ribosomes. Biochemistry 33:13475-13483 Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289:905-920 Barta A, Steiner G, Brosius J, Noller HF, Kuechler E (1984) Identification of a site on 23S ribosomal RNA located at the peptidyl transferase center. Proc Natl Acad Sci USA 81:3607-3611 Bassler J, Grandi P, Gadal O, Lessmann T, Petfalski E, Tollervey D, Lechner J, Hurt E (2001) Identification of a 60S preribosomal particle that is closely linked to nuclear export. Mol Cell 8:517-529 Blanchard SC, Puglisi JD (2001) Solution structure of the A loop of 23S ribosomal RNA. Proc Natl Acad Sci USA 98:3720-3725 Bonnerot C, Pintard L, Lutfalla G (2003) Functional redundancy of Spb1p and a snR52-dependent mechanism for the 2'-O-ribose methylation of a conserved rRNA position in yeast. Mol Cell 12:1309-1315 Brand RC, Klootwijk J, Van Steenbergen TJ, De Kok AJ, Planta RJ (1977) Secondary methylation of yeast ribosomal precursor RNA. Eur J Biochem 75:311-318 Bügl H, Fauman EB, Staker BL, Zheng F, Kushner SR, Saper MA, Bardwell JC, Jakob U (2000) RNA methylation under heat shock control. Mol Cell 6:349-360 Bujnicki JM, Droogmans L, Grosjean H, Purushothaman SK, Lapeyre B (2004) Bioinformatics-guided identification and experimental characterization of novel RNA methyltransferases. In: Bujnicki JM (ed) Practical Bioinformatics. Springer-Verlag, Heidelberg, pp 139-168 Bujnicki JM, Rychlewski L (2001) Sequence analysis and structure prediction of aminoglycoside-resistance 16S rRNA:m7G methyltransferases. Acta Microbiol Pol 50:7-17 Caldas T, Binet E, Bouloc P, Costa A, Desgres J, Richarme G (2000a) The FtsJ/RrmJ heat shock protein of Escherichia coli is a 23S ribosomal RNA methyltransferase.
J Biol Chem 275:16414-16419 Caldas T, Binet E, Bouloc P, Richarme G (2000b) Translational defects of Escherichia coli mutants deficient in the Um(2552) 23S ribosomal RNA methyltransferase RrmJ/FTSJ. Biochem Biophys Res Commun 271:714-718 Cannone JJ, Subramanian S, Schnare MN, Collett JR, D'Souza LM, Du Y, Feng B, Lin N, Madabusi LV, Muller KM, Pande N, Shang Z, Yu N, Gutell RR (2002) The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics 3:2 Cavaillé J, Nicoloso M, Bachellerie JP (1996) Targeted ribose methylation of RNA in vivo directed by tailored antisense RNA guides. Nature 383:732-735 Conrad J, Sun D, Englund N, Ofengand J (1998) The rluC gene of Escherichia coli codes for a pseudouridine synthase that is solely responsible for synthesis of pseudouridine at positions 955, 2504, and 2580 in 23 S ribosomal RNA. J Biol Chem 273:18562-18566 Crick FH (1968) The origin of the genetic code. J Mol Biol 38:367-379 Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351 Decatur WA, Fournier MJ (2003) RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278:695-698 Del Campo M, Kaya Y, Ofengand J (2001) Identification and site of action of the remaining four putative pseudouridine synthases in Escherichia coli. RNA 7:1603-1615


Dubin DT, Taylor RH (1978) Modification of mitochondrial ribosomal RNA from hamster cells: the presence of GmG and late-methylated UmGmU in the large subunit (17S) RNA. J Mol Biol 121:523-540 Eladari ME, Hampe A, Galibert F (1977) Nucleotide sequence neighbouring a late modified guanylic residue within the 28S ribosomal RNA of several eukaryotic cells. Nucl Acids Res 4:1759-1767 Fatica A, Tollervey D (2002) Making Ribosomes. Curr Opin Cell Biol 14:313-318 Galimand M, Courvalin P, Lambert T (2003) Plasmid-mediated high-level resistance to aminoglycosides in enterobacteriaceae due to 16S rRNA methylation. Antimicrob Agents Chemother 47:2565-2571 Gaspin C, Cavaillé J, Erauso G, Bachellerie JP (2000) Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes [published erratum appears in J Mol Biol 2000 Jul 21;300(4):1017-8]. J Mol Biol 297:895-906 Green R, Switzer C, Noller HF (1998) Ribosome-catalyzed peptide-bond formation with an A-site substrate covalently linked to 23S ribosomal RNA. Science 280:286-289 Grosjean H, Benne R (eds) (1998) Modification and editing of RNA. ASM Press, Washington Gu X, Ofengand J, Santi DV (1994) In vitro methylation of Escherichia coli 16S rRNA by tRNA (m5U54)-methyltransferase. Biochemistry 33:2255-2261 Gutgsell NS, Campo MD, Raychaudhuri S, Ofengand J (2001) A second function for pseudouridine synthases: a point mutant of RluD unable to form pseudouridines 1911, 1915, and 1917 in Escherichia coli 23S ribosomal RNA restores normal growth to an RluD-minus strain. RNA 7:990-998 Hager J, Staker BL, Bugl H, Jakob U (2002) Active site in RrmJ, a heat shock-induced methyltransferase. J Biol Chem 277:41978-41986 Hager J, Staker BL, Jakob U (2004) Substrate binding analysis of the 23S rRNA methyltransferase RrmJ. J Bact: in press Hansen MA, Kirpekar F, Ritterbusch W, Vester B (2002) Posttranscriptional modifications in the A-loop of 23S rRNAs from selected archaea and eubacteria.
RNA 8:202-213 Harnpicharnchai P, Jakovljevic J, Horsey E, Miles T, Roman J, Rout M, Meagher D, Imai B, Guo Y, Brame CJ, Shabanowitz J, Hunt DF, Woolford JL Jr (2001) Composition and functional characterization of yeast 66S ribosome assembly intermediates. Mol Cell 8:505-515 Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. Genes Dev 17:162-180 Hosokawa K, Fujimura RK, Nomura M (1966) Reconstitution of functionally active ribosomes from inactive subparticles and proteins. Proc Natl Acad Sci USA 55:198-204 Kim DF, Green R (1999) Base-pairing between 23S rRNA and tRNA in the ribosomal A site. Mol Cell 4:859-864 King TH, Liu B, McCully RR, Fournier MJ (2003) Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell 11:425-435 Kiss T (2001) Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs. EMBO J 20:3617-3622 Kiss-Laszlo Z, Henry Y, Bachellerie JP, Caizergues FM, Kiss T (1996) Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell 85:1077-1088


Kowalak JA, Dalluge JJ, McCloskey JA, Stetter KO (1994) The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry 33:7869-7876 Kressler D, Rojo M, Linder P, de la Cruz J (1999) Spb1p is a putative methyltransferase required for 60S ribosomal subunit biogenesis in Saccharomyces cerevisiae. Nucl Acids Res 27:4598-4608 Lafontaine D, Delcour J, Glasser AL, Desgres J, Vandenhaute J (1994) The DIM1 gene responsible for the conserved m6(2)Am6(2)A dimethylation in the 3'-terminal loop of 18S rRNA is essential in yeast. J Mol Biol 241:492-497 Lafontaine DL, Preiss T, Tollervey D (1998) Yeast 18S rRNA dimethylase Dim1p: a quality control mechanism in ribosome synthesis? Mol Cell Biol 18:2360-2370 Lane BG (1998) Historical perspectives on RNA nucleoside modifications. In: Grosjean H, Benne R (eds) Modification and editing of RNA. American Society for Microbiology, Washington, pp 1-20 Lapeyre B, Mariottini P, Mathieu C, Ferrer P, Amaldi F, Amalric F, Caizergues-Ferrer M (1990) Molecular cloning of Xenopus fibrillarin, a conserved U3 small nuclear ribonucleoprotein recognized by antisera from humans with autoimmune disease. Mol Cell Biol 10:430-434 Lapeyre B, Purushothaman SK (2004) Spb1p-directed formation of Gm2922 in the ribosome catalytic center occurs at a late processing stage. Mol Cell: in press Limbach PA, Crain PF, McCloskey JA (1994) Summary: the modified nucleosides of RNA. Nucl Acids Res 22:2183-2196 Loo S, Laurenson P, Foss M, Dillin A, Rine J (1995) Roles of ABF1, NPL3, and YCL54 in silencing in Saccharomyces cerevisiae. Genetics 141:889-902 Lovgren JM, Wikstrom PM (2001) The rlmB gene is essential for formation of Gm2251 in 23S rRNA but not for ribosome maturation in Escherichia coli. J Bacteriol 183:6957-6960 Lowe TM, Eddy SR (1999) A computational screen for methylation guide snoRNAs in yeast. Science 283:1168-1171 Maden BE (1990) The numerous modified nucleotides in eukaryotic ribosomal RNA.
Prog Nucl Acid Res Mol Biol 39:241-303 Maden BE, Corbett ME, Heeney PA, Pugh K, Ajuh PM (1995) Classical and novel approaches to the detection and localization of the numerous modified nucleotides in eukaryotic ribosomal RNA. Biochimie 77:22-29 Mason TL (1998) Functional aspects of the three modified nucleotides in yeast mitochondrial large-subunit rRNA. In: Grosjean H, Benne R (eds) Modification and editing of RNA. ASM Press, Washington, pp 273-280 Ni J, Tien AL, Fournier MJ (1997) Small nucleolar RNAs direct site-specific synthesis of pseudouridine in ribosomal RNA. Cell 89:565-573 Niewmierzycka A, Clarke S (1999) S-Adenosylmethionine-dependent methylation in Saccharomyces cerevisiae. Identification of a novel protein arginine methyltransferase. J Biol Chem 274:814-824 Nissen P, Hansen J, Ban N, Moore PB, Steitz TA (2000) The structural basis of ribosome activity in peptide bond synthesis. Science 289:920-930 Noller HF, Hoffarth V, Zimniak L (1992) Unusual resistance of peptidyl transferase to protein extraction procedures. Science 256:1416-1419


Noon KR, Bruenger E, McCloskey JA (1998) Posttranscriptional modifications in 16S and 23S rRNAs of the archaeal hyperthermophile Sulfolobus solfataricus. J Bacteriol 180:2883-2888 Ofengand J (2002) Ribosomal RNA pseudouridines and pseudouridine synthases. FEBS Lett 514:17-25 Ofengand J, Malhotra A, Remme J, Gutgsell NS, Del Campo M, Jean-Charles S, Peil L, Kaya Y (2001) Pseudouridines and pseudouridine synthases of the ribosome. Cold Spring Harb Symp Quant Biol 66:147-159 Omer AD, Lowe TM, Russell AG, Ebhardt H, Eddy SR, Dennis PP (2000) Homologs of small nucleolar RNAs in archaea. Science 288:517-522 Parker R, Simmons T, Shuster EO, Siliciano PG, Guthrie C (1988) Genetic analysis of small nuclear RNAs in Saccharomyces cerevisiae: viable sextuple mutant. Mol Cell Biol 8:3150-3159 Persson BC, Gustafsson C, Berg DE, Björk GR (1992) The gene for a tRNA modifying enzyme, m5U54-methyltransferase, is essential for viability in Escherichia coli. Proc Natl Acad Sci USA 89:3995-3998 Pintard L, Bujnicki JM, Lapeyre B, Bonnerot C (2002a) MRM2 encodes a novel yeast mitochondrial 21S rRNA methyltransferase. EMBO J 21:1139-1147 Pintard L, Kressler D, Lapeyre B (2000) Spb1p is a yeast nucleolar protein associated with Nop1p and Nop58p that is able to bind S-adenosyl-L-methionine in vitro. Mol Cell Biol 20:1370-1381 Pintard L, Lecointe F, Bujnicki JM, Bonnerot C, Grosjean H, Lapeyre B (2002b) Trm7p catalyses the formation of two 2'-O-methylriboses in yeast tRNA anticodon loop. EMBO J 21:1811-1820 Purushothaman SK, Bujnicki JM, Grosjean H, Lapeyre B (2004) Trm11 encodes the tRNA MTase that catalyzes the formation of m2G10 in yeast: in preparation Raychaudhuri S, Conrad J, Hall BG, Ofengand J (1998) A pseudouridine synthase required for the formation of two universally conserved pseudouridines in ribosomal RNA is essential for normal growth of Escherichia coli. 
RNA 4:1407-1417 Retel J, van den Bos RC, Planta RJ (1969) Characteristics of the methylation in vivo of ribosomal RNA in yeast. Biochim Biophys Acta 195:370-380 Rozenski J, Crain PF, McCloskey JA (1999) The RNA modification database: 1999 update. Nucleic Acids Res 27:196-197 Sachs AB, Davis RW (1989) The poly(A) binding protein is required for poly(A) shortening and 60S ribosomal subunit-dependent translation initiation. Cell 58:857-867 Samarsky DA, Balakin AG, Fournier MJ (1995) Characterization of three new snRNAs from Saccharomyces cerevisiae: snR34, snR35 and snR36. Nucl Acids Res 23:2548-2554 Schimmang T, Tollervey D, Kern H, Frank R, Hurt EC (1989) A yeast nucleolar protein related to mammalian fibrillarin is associated with small nucleolar RNA and is essential for viability. EMBO J 8:4015-4024 Sirum-Connolly K, Mason TL (1993) Functional requirement of a site-specific ribose methylation in ribosomal RNA. Science 262:1886-1889 Starr JL, Fefferman R (1964) The occurrence of methylated bases in ribosomal ribonucleic acid of Escherichia coli K12 W-6. J Biol Chem 239:3457-3461 Tollervey D, Lehtonen H, Jansen R, Kern H, Hurt EC (1993) Temperature-sensitive mutations demonstrate roles for yeast fibrillarin in pre-rRNA processing, pre-rRNA methylation, and ribosome assembly. Cell 72:443-457


van Knippenberg PH (1986) Structural and functional aspects of the N6, N6 dimethyladenosines in 16S ribosomal RNA. In: Hardesty B, Kramer G (eds) Structure, function, and genetics of ribosomes. Springer-Verlag, New York, pp 412-424 Vester B, Hansen LH, Douthwaite S (1995) The conformation of 23S rRNA nucleotide A2058 determines its recognition by the ErmE methyltransferase. RNA 1:501-509 Wang H, Boisvert D, Kim KK, Kim R, Kim SH (2000) Crystal structure of a fibrillarin homologue from Methanococcus jannaschii, a hyperthermophile, at 1.6 Å resolution. EMBO J 19:317-323 Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JH, Noller HF (2001) Crystal structure of the ribosome at 5.5 Å resolution. Science 292:883-896

Lapeyre, Bruno Centre de Recherche de Biochimie Macromoléculaire du CNRS, 1919 Route de Mende, Montpellier, France

Nucleotide methylations in rRNA that confer resistance to ribosome-targeting antibiotics Stephen Douthwaite, Dominique Fourmy, and Satoko Yoshizawa

Abstract Methylation of rRNA nucleotides is an effective means of conferring resistance to antibiotics that target the bacterial ribosome. This type of resistance seems to have evolved as a self-defence mechanism in bacteria such as Streptomyces species that synthesize ribosome-targeting drugs. These self-defence mechanisms were subsequently recruited by pathogenic bacteria including streptococcal and staphylococcal species, where resistance to macrolides and related drugs is now a prevalent clinical problem. In this article, we review the methylation events in bacterial rRNA that confer resistance, and discuss how the molecular mechanisms of resistance can be explained from the recent crystal structures of antibiotics bound to the ribosome.

1 Introduction Many clinically important antibiotics inhibit the growth of bacteria by blocking protein synthesis on the ribosomes (Vázquez 1979; Gale et al. 1981; Spahn and Prescott 1996). These antibiotics bind to regions of the ribosome that are concerned with essential steps in protein synthesis such as peptide bond formation, GTP hydrolysis and mRNA decoding. The main contact sites for the antibiotics are on the rRNA, rather than on the ribosomal protein components (Cundliffe 1990), which is consistent with the view that the rRNA carries out the primary functions of the ribosome, including the formation of the peptide bond (Green and Noller 1997; Nissen et al. 2000). Not surprisingly therefore, changes in the ribosome structure that confer antibiotic resistance are mainly to be found in the rRNA, and consist of nucleotide methylations or base substitutions (Cundliffe 1990). There are indeed cases of ribosomal protein (r-protein) mutations which confer resistance to ribosome-targeting antibiotics in laboratory (Cundliffe 1990; Chittum and Champney 1994; Belova et al. 2001; Kofoed and Vester 2002; Bosling et al. 2003) plus veterinary and clinical strains (Aarestrup and Jensen 2000; Adrian et al. 2000b; McNicholas et al. 2000; Tait-Kamradt et al. 2000; Farrell et al. 2003). However, these r-protein mutations tend to confer resistance in an indirect manner by influencing the conformation of adjacent rRNA structures that

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b105586 / Published online: 14 December 2004 © Springer-Verlag Berlin Heidelberg 2005


make contact with the antibiotic (Gregory and Dahlberg 1999; Gabashvili et al. 2001). In this review, we concentrate on rRNA methylations (Fig. 1) that are directed by specific methyltransferase enzymes and confer resistance to ribosome-targeting drugs (Table 1). This form of resistance is found in many pathogenic and drug-producing bacteria and tends to be absent in other bacterial species. The resistance methyltransferases seem to have no function in the absence of the antibiotic against which they offer protection; indeed, in many cases these are expressed only when the antibiotic is present. This contrasts with the many other modifications in rRNAs that are present under most if not all growth conditions and carry out ‘house-keeping’ roles important for the general functioning of rRNA during protein synthesis.

2 Ribosomal RNA modifications Bacterial rRNAs can contain over thirty house-keeping modifications, all of which are added post-transcriptionally. The sites and types of modification have been most accurately mapped in Escherichia coli, where 16S and 23S rRNAs contain eleven and twenty-four modifications, respectively, that consist of pseudouridinylations, base methylations, and the 2´-O-methylation of riboses (Rozenski et al. 1999; Andersen et al. 2004). These rRNA modifications are present under most growth conditions, and each modification is generally the product of a specific enzyme encoded by a gene inherent in the bacterial chromosome. Each resistance methyltransferase is also encoded by a specific gene. In actinomycetes, resistance genes are generally an integral part of a chromosome region concerned with antibiotic production, whereas in pathogenic bacteria resistance genes are often acquired on plasmids or transposons. The requirement in bacteria for a specific enzyme for each modification contrasts with the mechanisms in eukaryotic cells. Pseudouridinylation reactions and 2´-O-methylations, which make up the bulk of eukaryotic rRNA modifications, are guided by a variety of snoRNAs that function together with a limited set of enzymes (Kiss 2002; Ofengand 2002; Decatur and Fournier 2003). An overview of eukaryotic rRNA modification is provided by Yu et al. in this volume. The locations of the rRNA modifications can be charted on the crystallographic structures of the ribosome (Ban et al. 2000; Schlünzen et al. 2000; Wimberly et al. 2000; Harms et al. 2001; Yusupov et al. 2001), and cluster in regions concerned with the essential ribosome functions (Brimacombe et al. 1993; Ofengand and Rudd 2000; Decatur and Fournier 2002). For the most part, the post-transcriptional modifications fine-tune the function of rRNA in protein synthesis, and this has been demonstrated by the superior performance of authentic rRNAs compared to their unmodified 16S (Krzyzosiak et al. 
1987) and 23S counterparts (Green and Noller 1999; Khaitovich et al. 1999). Although the resistance methyltransferases only benefit rRNA function when the bacterium is challenged by an antibiotic, the advantage afforded by a resistance methylation under these conditions is enormous, and can facilitate growth at antibiotic concentrations several orders of magnitude higher than required to inhibit strains that lack this methylation.

3 The antibiotic resistance rRNA methyltransferases The control of expression of the resistance genes in different pathogenic and drug-producing bacteria can vary from constitutive systems to those under tight regulation where the methyltransferase is induced only when detectable levels of the drug are present. Well-studied examples of the latter systems are the inducible Erm methyltransferases (Weisblum 1995b); these are encoded on bicistronic mRNAs in which a leader sequence and erm fold into secondary structures that occlude the erm start codon and Shine-Dalgarno sequence. The presence of small quantities of macrolide drug induces ribosome stalling within the upstream leader cistron, causing rearrangement of the secondary structure and freeing the start of the erm sequence for translational initiation. Expression of resistance is also affected by the substrate requirements of the various rRNA methyltransferases. Resistance methylations that have been studied in detail on 16S rRNA tend to occur after assembly of the 30S subunit. For instance, methylation of 16S rRNA G1405 and A1408 (Escherichia coli rRNA nucleotide numbering is used throughout this review) occurs only on the mature 30S subunit (Skeggs et al. 1985; Thompson et al. 1985) where these nucleotides are displayed at the decoding region on the subunit interface (Schlünzen et al. 2000; Wimberly et al. 2000). This would indicate that the target nucleotides are accessible on the surface of the small subunit, and that for methylation to take place the nucleotides need to be presented in higher-order structures that are absent in the free rRNA. In contrast, free 23S rRNA (Fig. 1) prior to complete assembly with the r-proteins is the preferred substrate for the resistance methylations that have been studied in detail. 
Ribosomal protein L11 obscures 23S rRNA nucleotide A1067 (Bechthold and Floss 1994), which is the target for the Tsr methyltransferase, and nucleotides G745, G748, and A2058 become inaccessible to the RlmAI (RrmA) (Hansen et al. 2001), RlmAII (TlrB) (Liu et al. 2000), and Erm methyltransferases (Skinner et al. 1983; Vester and Douthwaite 1994) after assembly of 23S rRNA into mature 50S subunits. There has, however, been a report of methylation of staphylococcal 50S precursor particles by ErmC (Champney et al. 2003), and in this case, the 23S rRNA has presumably not yet been fully folded to block off access to nucleotide A2058. The requirement for a free (or partially free) 23S rRNA substrate has two major consequences. First, these methyltransferases have only a limited window of opportunity to modify their nucleotide targets before the 23S rRNA assembles with the r-proteins. Second, switching from a sensitive to a resistant state requires the synthesis of new 50S ribosomal subunits. There has been no indication that the resistance methylations interfere with the assembly of 23S rRNA into subunits.


Fig. 1. Schematic outlines of 16S and 23S rRNA secondary structures showing the sites and types of the methylations conferring resistance to the antibiotics indicated. The rRNA ‘house-keeping’ modifications that are not linked with antibiotic resistance are not shown (see text). The roman numerals indicate the domains of the rRNAs.


The biochemical and genetic data on drug binding (reviewed by Fourmy et al. 2003), which have recently been complemented by crystallographic structures (Brodersen et al. 2000; Carter et al. 2000; Schlünzen et al. 2001; Hansen et al. 2002, 2003; Harms et al. 2004), indicate that the methylated nucleotides are an integral part of the drug binding sites and presumably confer resistance by blocking contacts essential for drug interaction. As yet, no cases have been recorded of allosteric mechanisms where methylation causes resistance from a location outside the drug binding site. The biological costs associated with methylation of these functional rRNA regions have not been accurately measured, although the induction mechanisms employed by many bacteria (Weisblum 1995b), including many current clinical strains (Giovanetti et al. 1999; Min et al. 2003), to repress erm expression when the methyltransferase is not needed suggest that the costs might be significant. All the rRNA resistance methyltransferases studied to date use S-adenosyl-L-methionine (AdoMet) as the methyl group donor and contain the three conserved motifs involved in AdoMet binding (Kagan and Clarke 1994). This functional region of the rRNA methyltransferases has distinct similarity to Rossmann-fold structures found in proteins that bind other adenosine-based cofactors such as ATP and NAD (Schubert et al. 2003). The fold of the AdoMet binding region has been used by Schubert et al. (2003) to categorize methyltransferases, and they group the enzymes that methylate rRNA bases into the functionally diverse Class I of proteins, and the 2´-O-methyltransferases into Class IV. The rRNA methyltransferases are largely heterogeneous in other parts of their structures, which presumably include the sequences that enable the enzymes to recognize their own distinctive rRNA targets.

4 Resistance to antibiotics targeting the small subunit rRNA The principal function of the 30S ribosomal subunit is to decipher the genetic information encoded within the mRNA, and this process is hindered by drugs including tetracycline, pactamycin, and aminoglycosides (Vázquez 1979; Gale et al. 1981). The aminoglycosides are a diverse group of compounds, which loosely includes spectinomycin, streptomycin, hygromycin B, kasugamycin, and gentamicin/paromomycin-type antibiotics. While these compounds are derivatives of amino sugars, they are sufficiently dissimilar in structure, site of interaction on the 30S subunit and mode of action to be classified separately. Spectinomycin, streptomycin, and hygromycin B (and tetracycline) are not considered further here, as there have been no reported links between resistance to these drugs and rRNA methylation (although there is extensive literature on resistance conferred by r-protein and rRNA mutations, drug inactivation and efflux mechanisms).


4.1 The case of kasugamycin resistance Kasugamycin resistance can be conferred by a lack of modification at the 3´-terminal hairpin loop in 16S rRNA where nucleotides A1518 and A1519 are otherwise N6, N6-dimethylated (Fig. 1). These highly conserved house-keeping methylations are lost by inactivation of the rsmA (ksgA) gene. Although kasugamycin resistance contrasts with the other resistance mechanisms, which involve the acquisition of an additional methylation (Table 1), RsmA has many features in common with the resistance methyltransferases. Like the gentamicin/paromomycin resistance methyltransferases discussed in the next section, RsmA cannot methylate free 16S rRNA (Helser et al. 1972), and prefers assembled 30S as its substrate (Poldermans et al. 1979). The crystal structure of RsmA reveals that the protein is arranged in two domains, where the catalytic N-terminal domain shows extensive structural correspondence to the ErmC´ methyltransferase (O'Farrell et al. 2004), which belongs to another group of N6, N6-adenosine dimethyltransferases discussed in detail below. RsmA is one of the most highly conserved rRNA methyltransferases, with homologues in all wild-type bacteria that have been studied in detail (Van Buul et al. 1983; Van Buul and van Knippenberg 1985; Rozenski et al. 1999). Genome searches indicate that RsmA orthologues are present in the archaea, as well as in eukaryotes where they modify cytoplasmic ribosomes (Lafontaine et al. 1994). In mitochondria, RsmA orthologues function in the additional role of transcription factor (Seidel-Rogol et al. 2003). Whereas inactivation of RsmA is tolerated in bacteria, it seems to be a lethal event in eukaryotes (Lafontaine et al. 1994). 4.2 Resistance by methylation of the small subunit rRNA Methyltransferases that fit the stricter definition of resistance determinants have been identified for pactamycin, as well as for the gentamicin and paromomycin classes of aminoglycosides. 
The resistance mechanism for pactamycin has been studied in the drug producer, Streptomyces pactum. Briefly, this bacterium protects its own ribosomes via the Pct methyltransferase (Fig. 1) that modifies the N1 position of nucleotide A964 (Ballesta and Cundliffe 1991), and presumably thereby blocks drug binding. The gentamicin and paromomycin aminoglycosides are considered in greater detail here as these drugs have more clinical relevance and have been studied more exhaustively. Aminoglycoside compounds in both gentamicin (which includes kanamycin) and paromomycin classes (including neomycin) are based on a common neamine structure consisting of two amino sugar rings (I and II), which direct the binding of the drugs to the same location within the ribosome decoding site (Fourmy et al. 1996; Yoshizawa et al. 1998). The classes differ in the manner in which ring II (2-deoxystreptamine) is substituted with additional sugar groups (reviewed by Fourmy et al. 2003). The decoding site of the ribosome is situated towards the 3´-end of the 16S rRNA where a string of noncanonical base pairs distorts an otherwise regular helix.


The distortion in the RNA major groove assimilates the gentamicin/paromomycin compounds and enables their neamine moiety to make several specific interactions with the noncanonical pairs (Fig. 2A). Ring I interacts with nucleotides A1408 and A1493, while ring II engages the U1406-U1495 and G1407-G1494 pairs (Fourmy et al. 1996). The NMR structure of this RNA region (Fig. 2A) shows that the neamine portions of both gentamicin and paromomycin compounds are superimposable, and make similar contacts with the 16S rRNA decoding region (Fourmy et al. 1996; Yoshizawa et al. 1998). These findings are supported by crystal structures of aminoglycosides bound to the 30S subunit (Carter et al. 2000). The drug-nucleotide contacts have been further refined in higher resolution crystal structures of this isolated RNA region bound to paromomycin (Vicens and Westhof 2001), tobramycin (Vicens and Westhof 2002), and geneticin (Vicens and Westhof 2003) (the latter two drugs being structurally close to gentamicin) and reveal a conserved hydrogen bond from the aminoglycoside ring I to the Watson-Crick face of A1408. Aminoglycoside contacts are disrupted after methylation at the N1 position of nucleotide A1408 by the KamA methyltransferase, resulting in resistance to neamine and kanamycin (Beauclerk and Cundliffe 1987). Although A1408 methylation hinders the placement of ring I into the aminoglycoside binding site, this modification confers no appreciable resistance to neomycin or paromomycin (Cundliffe 1990). This suggests that other contacts made by these compounds (see below) compensate for the loss of the A1408 interaction.

Fig. 2. The rRNA structures at the sites of antibiotic interaction. The rRNA is outlined in grey showing the riboses (orange windows), guanine and cytosine bases (green windows), and adenine and uracil bases (blue windows). The magenta spheres represent the methylations that confer antibiotic resistance. A: The aminoglycoside antibiotics gentamicin (red) and paromomycin (yellow) bound to their overlapping sites within the 16S rRNA decoding site; methylation by Grm and KamA is indicated at nucleotides G1405 and A1408, respectively. The figure is based on the NMR structures of paromomycin (Fourmy et al. 1996) and gentamicin (Yoshizawa et al. 1998) bound to this RNA region. B: The orthosomycin antibiotic site formed by nucleotides in 23S rRNA helices 89 and 91, with most of the bases removed for clarity. The sites of the EmtA and AviRb methylations at nucleotides G2470 and U2479 in helix 89, and of the AviRa methylation at G2535 in helix 91 are indicated. The figure is adapted from crystallographic coordinates of the 50S subunit (Ban et al. 2000; Schlünzen et al. 2000; Harms et al. 2001; Yusupov et al. 2001). No crystallographic data are available for orthosomycin binding, although our present best guess is that avilamycin and evernimicin bind to the pocket delineated here by the three magenta spheres. C: The macrolide tylosin (blue) bound to the 23S rRNA MLSB site where it interacts with nucleotides G748 and A2058. Sites of RlmAII (TlrB) and ErmN (TlrD) monomethylation are indicated. The figure is adapted from the crystallographic structure of tylosin bound to the 50S ribosomal subunit (Hansen et al. 2002).


The Grm methyltransferase, found in the gentamicin-producing bacterium Micromonospora purpurea, modifies the N7 position of G1405 and also confers resistance to kanamycin (and gentamicin), but not to neamine, neomycin, or paromomycin (Thompson et al. 1985; Beauclerk and Cundliffe 1987). The molecular mechanism for the specificity of this resistance is more clear-cut. In the gentamicin/kanamycin compounds, position 6 of deoxystreptamine (ring II) is substituted by an additional amino sugar (ring III), which is directed towards G1405 in the binding site (Yoshizawa et al. 1998). Methylation of G1405 would clearly cause a steric clash with gentamicin (Fig. 2A), reducing drug binding and resulting in resistance. In contrast, neomycin and paromomycin are substituted at position 5 of deoxystreptamine projecting their substituents at a different angle, directed away from G1405, to make contact further down the RNA helix (Fourmy et al. 1996). This leaves ample room within the binding site to accommodate neamine, neomycin, or paromomycin despite the presence of a methyl group on G1405 (Fig. 2A).
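The specificity pattern described above for the two decoding-site methylations can be restated as a small lookup table. The following Python sketch is illustrative only: the class name `Methylation` and the function `phenotype` are hypothetical, and the entries simply transcribe the phenotypes stated in the text. Enzyme-drug combinations that the text does not mention are reported as "not stated" rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Methylation:
    """One resistance methylation in the 16S rRNA decoding site (hypothetical record type)."""
    enzyme: str
    nucleotide: str            # E. coli 16S rRNA numbering
    position: str              # atom that receives the methyl group
    resistant: frozenset       # drugs to which the text reports resistance
    not_resistant: frozenset   # drugs for which the text reports no appreciable resistance


# Entries transcribed from the text; nothing beyond the stated phenotypes is assumed.
SITES = (
    Methylation("KamA", "A1408", "N1",
                frozenset({"neamine", "kanamycin"}),
                frozenset({"neomycin", "paromomycin"})),
    Methylation("Grm", "G1405", "N7",
                frozenset({"kanamycin", "gentamicin"}),
                frozenset({"neamine", "neomycin", "paromomycin"})),
)


def phenotype(enzyme: str, drug: str) -> str:
    """Return the phenotype reported in the text for an enzyme/drug pair."""
    for site in SITES:
        if site.enzyme == enzyme:
            if drug in site.resistant:
                return "resistant"
            if drug in site.not_resistant:
                return "no appreciable resistance"
            return "not stated"
    raise KeyError(f"unknown methyltransferase: {enzyme}")


if __name__ == "__main__":
    for enzyme in ("KamA", "Grm"):
        for drug in ("neamine", "kanamycin", "gentamicin", "neomycin", "paromomycin"):
            print(f"{enzyme:5s} {drug:12s} {phenotype(enzyme, drug)}")
```

Laid out this way, the table makes the contrast explicit: both methylations protect against kanamycin, but only the G1405 methylation is described as protecting against gentamicin, and only the A1408 methylation as protecting against neamine.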

5 Resistance by methylation of the large subunit rRNA The main functions of the 50S ribosomal subunit are the formation of peptide bonds and the control of GTP hydrolysis. Both of these are multi-step processes, in which each step is a potential point of inhibition for the numerous antibiotics that target the 50S subunit (Vázquez 1979; Gale et al. 1981; Spahn and Prescott 1996; Fourmy et al. 2003). As in the case of the 30S-specific drugs, various types of resistance mechanisms exist for all of these antibiotics. We limit our attention here to resistance that is conferred by enzymatic modification of the rRNA, and categorize the relevant antibiotics into three groups. The first group is composed of thiopeptide antibiotics; the second group is made up of the orthosomycin drugs; and the third is a chemically diverse group that comprises the macrolide, lincosamide, and streptogramin B (MLSB) compounds. Each of these groups of drugs binds to its own discrete site formed by the tertiary folding of the 23S rRNA. Within each of these three sites, one or more nucleotides can be modified by rRNA methyltransferases to confer resistance that is specific for the associated group of drugs. 5.1 Thiopeptide antibiotics The best studied of the thiopeptide compounds is thiostrepton, which inhibits a number of ribosomal functions all of which are linked to GTPase activity. Thiostrepton blocks turnover of GTP (Rodnina et al. 1999) on elongation factors EF-Tu and EF-G. The drug elicits a non-productive stimulation of initiation factor IF2-dependent GTP turnover (Brandi et al. 2004) and it inhibits production of the ribosomal stringency factor ppGpp (Cundliffe 1990). The GTPase-associated region of 23S rRNA, where thiostrepton and related compounds such as micrococcin bind, is located between nucleotides 1050 and 1110 and also contains the binding site for r-protein L11 (Thompson et al. 1979; Egebjerg et al. 1989; Ryan et al. 1991; Rosendahl and Douthwaite 1993). Thiostrepton makes direct contact with the rRNA (Rosendahl and Douthwaite 1994), and drug binding is greatly stimulated by L11 (Thompson et al. 1979; Cundliffe 1990; Rosendahl and Douthwaite 1993; Porse et al. 1999). As well as being resistant to thiostrepton, L11-minus mutants are defective in several functions including release factor interaction (Van Dyke and Murgola 2003). The crystal structures of this region (Conn et al. 1999; Wimberly et al. 1999) indicate how the local rRNA conformation is influenced by L11 to create what appears to be a binding pocket for the drug. The producer of thiostrepton, Streptomyces azureus, protects its own ribosomes by expression of the Tsr methyltransferase, which methylates the 2´-O-ribose of A1067 within the drug binding site (Thompson et al. 1982). Methylation by Tsr reduces the binding constant for thiostrepton on the ribosome by at least six orders of magnitude (Cundliffe 1990), constituting an extremely effective means of resistance. 5.2 Orthosomycin antibiotics The orthosomycin antibiotics are oligosaccharide compounds that bind exclusively to the 50S subunit (McNicholas et al. 2000; Belova et al. 2001) at a site that is physically distinct from that of other groups of 50S-targeting drugs (Mankin 2001). The orthosomycin site is close to where initiation factor IF2 interacts (La Teana et al. 2001; Brandi et al. 2004), suggesting that the drugs might perturb the initiation stage of protein synthesis. The best characterized orthosomycin compounds are evernimicin (also called SCH27899 or ziracin), which has shown promise in clinical trials against serious infectious diseases (Foster and Rybak 1999), and the structurally similar avilamycin A. 
Unfortunately, avilamycin has been extensively used as a growth promoter of livestock, putting a limit to its clinical usefulness (and also to that of evernimicin). Numerous field isolates of Enterococcus faecalis, E. faecium, Streptococcus pneumoniae and Staphylococcus aureus (Aarestrup and Jensen 2000; Adrian et al. 2000a, 2000b; McNicholas et al. 2000) and laboratory strains of Halobacterium halobium (Belova et al. 2001; Kofoed and Vester 2002) have developed orthosomycin resistance due to 23S rRNA and r-protein L16 mutations. One isolate of Enterococcus possessed particularly effective and genetically transferable resistance, the determinant of which was shown to encode a methyltransferase (dubbed EmtA) that specifically modifies nucleotide G2470 in helix 89 of 23S rRNA (Mann et al. 2001) (Fig. 2B). Independent studies on the avilamycin-producing bacterium, Streptomyces viridochromogenes, showed that this organism possesses three determinants to protect itself from the toxicity of its own metabolite. One determinant is concerned with drug efflux, while the other two, aviRa and aviRb, encode rRNA methyltransferases that respectively confer low and high level resistance to avilamycin A (Weitnauer et al. 2001). Expression of aviRa results in methylation of the N1 position of 23S rRNA nucleotide G2535, while aviRb confers resistance via methylation of the 2´-O-ribose on nucleotide U2479 (Treede et al. 2003). The AviRa and AviRb methylation sites are within 23S rRNA helices 91 and 89, respectively, and are positioned close to each other in the 50S subunit (Fig. 2B). The EmtA site at G2470 is immediately adjacent to the AviRb methylation within the rRNA secondary structure (Fig. 1), and the sites of the rRNA resistance mutations also cluster within this region. The proximity of the rRNA methylations and mutations suggests that they lie within the orthosomycin binding site and confer resistance by altering key sites of drug contact (Treede et al. 2003). The substituted residue in r-protein L16 neighbours this region, and presumably does not directly contact the drug but confers resistance by altering the local rRNA structure. 5.2.1 AviR methyltransferase substrates Studies on the substrate requirements for the AviRa and AviRb enzymes show that both methylate naked 23S rRNA (Treede et al. 2003). Moreover, the enzymes specifically recognize and methylate small RNA transcripts of 87 nucleotides (for AviRb) and 187 nucleotides (for AviRa) that contain little more than the secondary structures immediately adjacent to the target nucleotides (Treede et al. 2003), and these RNA substrates could undoubtedly be pruned down further in size without loss of enzyme specificity. AviRa has been crystallized, and the structure has been computationally docked onto the 50S subunit interface adjacent to G2535 (Mosbacher et al. 2003). This model suggests a snug fit of AviRa onto the subunit, although rearrangement in the local rRNA structure would be required before contact between G2535 and the AviRa active site could be achieved. As the free 23S rRNA functions as a substrate for AviRa, and attempts to methylate the rRNA within the assembled subunit have not been successful, it remains unclear whether the putative 50S subunit-AviRa interaction has physiological relevance. 
5.3 MLSB antibiotics The MLSB antibiotics bind to overlapping sites within the 50S ribosomal subunit tunnel close to the catalytic centre where peptide bonds are formed (the peptidyl transferase centre). During protein synthesis in the absence of antibiotics, the peptidyl transferase centre sequentially couples amino acids to the growing peptide chain, which lengthens through the 50S tunnel to emerge at the back of the subunit. This process is blocked by the MLSB drugs, which directly inhibit catalysis at the peptidyl transferase centre and/or act as a physical barrier to the growth of the peptide chain within the tunnel. The lincosamides and streptogramin B drugs bind closest to the peptidyl transferase centre (Schlünzen et al. 2001) and have the primary effect of inhibiting this process (Vázquez 1979; Gale et al. 1981). From their location further down the tunnel, the 16-membered ring macrolides (Hansen et al. 2002) also inhibit peptidyl transferase to varying degrees (Poulsen et al. 2000), while the smaller 14-membered ring macrolides do not appear to reach into the peptidyl transferase centre (Schlünzen et al. 2001), and act solely by blocking the path of the peptide chain (Tenson et al. 2003).


The MLSB antibiotics make numerous nucleotide interactions within their binding site, some of which are specific for MLSB subgroups, although all the drugs make substantial contact with nucleotide A2058 (Schlünzen et al. 2001; Hansen et al. 2002; Harms et al. 2004). This nucleotide is the target for the Erm methyltransferases that are found in numerous pathogenic as well as drug-producing strains of bacteria. The members of the Erm family are diverse (Roberts et al. 1999) but they are all specific for the N6 position of A2058 where they add either one or two methyl groups (Skinner et al. 1983; Weisblum 1995a). The monomethyltransferases confer high resistance to lincosamides, but lower levels of resistance to macrolides and streptogramin B antibiotics (Jenkins et al. 1989), while the dimethyltransferases confer high level resistance to all the MLSB drugs (Leclercq and Courvalin 1991; Weisblum 1995a). 5.3.1 Functional domains in Erm methyltransferases The tertiary structures of two dimethyltransferases, ErmAM (ErmB) and ErmC´, have been solved by NMR (Yu et al. 1997) and crystallography (Bussiere et al. 1998), respectively. The structures of the Erm methyltransferases are extremely similar, despite their sequence diversity, and are organized into two domains. The N-terminal domain binds AdoMet (Schluckebier et al. 1999) and contains the catalytic site (Maravic et al. 2003b). The C-terminal domain has been implicated in RNA binding, as has the interdomain cleft where there is a predominance of positively charged residues. Scanning of the ErmC´ sequence by alanine mutagenesis indicated that the RNA recognition motif is on the N-terminal side of the interdomain cleft (Maravic et al. 2003a). Furthermore, a recently discovered addition to the Erm family, ErmMT in Mycobacterium tuberculosis, has a truncated C-terminal domain but is still capable of methylating its rRNA and conferring drug resistance (Buriánková et al. 2004). 
While ErmMT diminishes the importance assigned to the C-terminal region, this domain could still have a role in stabilizing Erm binding and target specificity. The crystal structure of ErmC´ resembles that of RsmA (KsgA) (O'Farrell et al. 2004), and this is particularly striking in the N-terminal domain, where the similarity extends to include the distribution of positive charges in the interdomain cleft. The similarities in architecture, topology, and function of these N6, N6-adenosine methyltransferases suggest that they have evolved from a common structure. Further studies and comparisons of these two groups of enzyme will hopefully reveal how they achieve rRNA target specificity. At present, no high-resolution structure is available for Erm or RsmA together with their respective RNA targets, or for any other methyltransferase-RNA complex. 5.3.2 General mechanism of MLSB resistance The crystal structures of MLSB antibiotics bound to ribosomal subunits (Schlünzen et al. 2001; Hansen et al. 2002; Harms et al. 2004) give us a clearer idea of how Erm-mediated methylation confers resistance. In the crystal structures, the N6 of A2058 is clearly directed towards the lumen of the peptide tunnel, where it contributes to the binding of the MLSB drugs. The addition of methyl groups would have the dual effect of preventing the electrostatic interactions with A2058, which are required for binding, as well as sterically hindering the antibiotic fit into the site (Poehlsgaard and Douthwaite 2003). Such effects satisfactorily explain resistance by Erm, and there is neither necessity nor evidence to suggest that a conformational change occurs in the rRNA (as is commonly invoked in many accounts of the Erm resistance mechanism).

6 Synergistic effects of dual rRNA methylations

Until recently, all the recorded cases of antibiotic resistance conferred by nucleotide methylation involved modification of a single nucleotide within the antibiotic site. This picture changed when it was shown that the mechanisms used by drug-producing Streptomyces species to confer resistance to the macrolide tylosin (Liu and Douthwaite 2002c) and the orthosomycin antibiotic avilamycin (Treede et al. 2003) involve methylation of two nucleotides that are located at opposite sides of the drug binding pockets. In the case of the tylosin-producer S. fradiae, resistance is conferred by monomethylation at the N6 of nucleotide A2058 (directed by ErmN, formerly named TlrD) acting in synergy with a methylation at the N1 of nucleotide G748 by RlmAII (formerly TlrB) (Liu and Douthwaite 2002c; Douthwaite et al. 2004). The N1 of G748 points into the lumen of the 50S ribosomal subunit tunnel, and faces nucleotide A2058 on the opposite side of the tunnel (Fig. 2C). These nucleotides are located within 15 Å of one another, and are integral components of the narrowest region of the ribosome tunnel (Ban et al. 2000; Harms et al. 2001) where the MLSB antibiotics bind. Neither methylation on its own confers appreciable tylosin resistance. The distinctive manner in which tylosin binds within the MLSB site offers explanations for why neither methylation on its own confers resistance, how they function together synergistically, and why this form of resistance is specific to tylosin. A single methyl group at A2058 can rotate freely around the adenine C6-N6 bond, and in some orientations the methyl group would interfere with drug binding by encroaching into the space required for tylosin (or any other macrolide) to fit into its site. Tylosin distinguishes itself from the other macrolides by possessing a mycinose sugar that interacts with nucleotide 748 (Hansen et al. 2002), and the mycinose-G748 interaction could enable tylosin to gain a foothold in its binding site while a fit is accommodated by rotating the single methyl group at A2058 out of the way. Methylation at G748 would cause the loss of the tylosin interaction here, and thus would lead to tylosin resistance when A2058 is monomethylated. However, methylation of G748 on its own would be of little consequence if A2058, which is the major interaction site for tylosin binding, were unmodified. The S. fradiae tylosin-producer uses both belt and suspenders: in addition to RlmAII and ErmN, it also possesses the ErmS dimethyltransferase. ErmS confers
strong resistance on its own (Zalacain and Cundliffe 1989) and is possibly only expressed at a late stage of growth when tylosin levels are high. Returning to the orthosomycin antibiotics described above, AviRa is required together with AviRb to achieve full protection against avilamycin in in vitro translation assays, despite the low resistance conferred by AviRa on its own (Weitnauer et al. 2001). This would suggest that the S. viridochromogenes avilamycin producer needs both these determinants to attain adequate protection against its own antibiotic. Although there are no crystal data on orthosomycin drugs within their binding site, the available structures tell us that the AviRa and AviRb methylation sites are located 11 Å apart in the 50S subunit (Ban et al. 2000; Harms et al. 2001) (Fig. 2B). It seems feasible that the distance between the AviRa and AviRb sites matches the geometry of avilamycin, and that these two methylations function together in a synergistic manner (Treede et al. 2003) similar to that for tylosin resistance in S. fradiae.
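
The qualitative logic of tylosin resistance in S. fradiae described in this section can be summarized as a small lookup. The following Python sketch is only an illustration: the function name, its arguments, and the two phenotype labels are hypothetical, and the rules simply restate the observations cited above (neither single methylation confers appreciable resistance; monomethylation at A2058 together with methylation at G748 does; dimethylation at A2058 by ErmS is sufficient on its own).

```python
# Illustrative summary of the tylosin-resistance logic described in the text.
# The function name and the phenotype labels are hypothetical, not taken
# from any published software.

def tylosin_phenotype(a2058_methyls: int, g748_methylated: bool) -> str:
    """Return a qualitative tylosin phenotype for an rRNA methylation state.

    a2058_methyls   -- number of methyl groups at the N6 of A2058 (0, 1 or 2)
    g748_methylated -- True if the N1 of G748 is methylated
    """
    if a2058_methyls not in (0, 1, 2):
        raise ValueError("the N6 of A2058 carries 0, 1 or 2 methyl groups")
    if a2058_methyls == 2:
        # Dimethylation (e.g. by ErmS) confers strong resistance on its own.
        return "resistant"
    if a2058_methyls == 1 and g748_methylated:
        # Monomethylation at A2058 plus methylation at G748 act in synergy.
        return "resistant"
    # Neither single methylation confers appreciable resistance.
    return "no appreciable resistance"


for methyls in (0, 1, 2):
    for g748 in (False, True):
        print(methyls, g748, tylosin_phenotype(methyls, g748))
```

The lookup is deliberately qualitative; it does not attempt to model resistance levels or the timing of ErmS expression.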

7 Conclusion and future perspectives

The mechanisms of resistance outlined in this review are on the increase in pathogenic bacteria, and they have rendered many commonly used antibiotics therapeutically ineffective. This problem has stimulated the search for solutions, and these tend to be heading in three main directions. The first direction involves the further development of existing drugs so that they might bind to the ribosome despite the presence of methylated rRNA residues. There have been some important advances in this area, including the introduction of the ketolide antibiotics (Bryskier 2000) that have been developed from the macrolide erythromycin and bind to the same MLSB site. The first clinically approved version of the ketolides, telithromycin, shows greatly improved binding to A2058 monomethylated ribosomes compared to the parent compound (Liu and Douthwaite 2002a), but is unfortunately still strongly deterred by dimethylation at A2058. A second type of approach involves the rational design of novel compounds that would act against new ribosomal targets. This approach requires high-resolution crystallographic structures of the ribosome, from which novel target sites can be identified, and their size, shape, and surface charge/hydrophobicity can be calculated. These data would then be used by medicinal chemists to create compounds that fit into the novel targets on the bacterial ribosome and block its function. This second approach is still in the developmental stage. A third line of attack involves the use of inhibitors directed against the resistance methyltransferases. The idea here is to prevent the methyltransferases from modifying the bacterial rRNA, and thus effectively reinstate the therapeutic power of the older antibiotics. Inhibition of Erm methyltransferases has been achieved using compounds that mimic the AdoMet cofactor (Hajduk et al. 1999; Hanessian and Sgarbi 2000). While this approach has considerable potential, present difficulties lie in avoiding concomitant inhibition of the patient’s methyltransferases, and inhibitors with this level of specificity still await discovery.

Acknowledgements We thank Birte Vester for discussions about the manuscript, and Jacob Poehlsgaard for generously creating Fig. 2. We gratefully acknowledge support from the European Commission's Fifth Framework Program (grant QLK2CT2000-00935) and the Nucleic Acid Centre of the Danish Grundforskningsfond.

References Aarestrup FM, Jensen LB (2000) Presence of variations in ribosomal protein L16 corresponding to susceptibility of enterococci to oligosaccharides (avilamycin and evernimicin). Antimicrob Agents Chemother 44:3425-3427 Adrian PV, Mendrick C, Loebenberg D, McNicholas P, Shaw KJ, Klugman KP, Hare RS, Black TA (2000a) Evernimicin (SCH27899) inhibits a novel ribosome target site: analysis of 23S ribosomal DNA mutants. Antimicrob Agents Chemother 44:3101-3106 Adrian PV, Zhao W, Black TA, Shaw KJ, Hare RS, Klugman KP (2000b) Mutations in ribosomal protein L16 conferring reduced susceptibility to evernimicin (SCH27899): implications for mechanism of action. Antimicrob Agents Chemother 44:732-738 Andersen TE, Porse BT, Kirpekar F (2004) A novel partial modification at 2501 in Escherichia coli 23S ribosomal RNA. RNA 10:907-913 Ballesta JP, Cundliffe E (1991) Site-specific methylation of 16S rRNA caused by pct, a pactamycin resistance determinant from the producing organism, Streptomyces pactum. J Bacteriol 173:7213-7218 Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289:905-920 Beauclerk AA, Cundliffe E (1987) Sites of action of two ribosomal RNA methylases responsible for resistance to aminoglycosides. J Mol Biol 193:661-671 Bechthold A, Floss HG (1994) Overexpression of the thiostrepton-resistance gene from Streptomyces azureus in Escherichia coli and characterization of recognition sites of the 23S rRNA A1067 2'-methyltransferase in the guanosine triphosphatase center of 23S ribosomal RNA. Eur J Biochem 224:431-437 Belova L, Tenson T, Xiong L, McNicholas PM, Mankin AS (2001) A novel site of antibiotic action in the ribosome: interaction of evernimicin with the large ribosomal subunit. Proc Natl Acad Sci USA 98:3726-3731 Bosling J, Poulsen SM, Vester B, Long KS (2003) Resistance to the peptidyl transferase inhibitor tiamulin caused by mutation of ribosomal protein L3. 
Antimicrob Agents Chemother 47:2892-2896 Brandi L, Marzi S, Fabbretti A, Fleischer C, Hill WE, Gualerzi CO, Lodmell JS (2004) The translation initiation functions of IF2: Targets for thiostrepton inhibition. J Mol Biol 335:881-894 Brimacombe R, Mitchell P, Osswald M, Stade K, Bochkariov D (1993) Clustering of modified nucleotides at the functional center of bacterial ribosomal RNA. FASEB Journal 7:161-167

Brodersen DE, Clemons WM Jr, Carter AP, Morgan-Warren RJ, Wimberly BT, Ramakrishnan V (2000) The structural basis for the action of the antibiotics tetracycline, pactamycin, and hygromycin B on the 30S ribosomal subunit. Cell 103:1143-1154 Bryskier A (2000) Ketolides - telithromycin, an example of a new class of antibacterial agents. Clin Microbiol Infect 6:661-669 Buriánková K, Doucet-Populaire F, Dorson O, Gondran A, Ghnassia JC, Weiser J, Pernodet JL (2004) Molecular basis of intrinsic macrolide resistance in the Mycobacterium tuberculosis complex. Antimicrob Agents Chemother 48:143-150 Bussiere DE, Muchmore SW, Dealwis CG, Schluckebier G, Nienaber VL, Edalji RP, Walter KA, Ladror US, Holzman TF, Abad-Zapatero C (1998) Crystal structure of ErmC', an rRNA methyltransferase which mediates antibiotic resistance in bacteria. Biochemistry 37:7103-7112 Carter AP, Clemons WM, Brodersen DE, Morgan-Warren RJ, Wimberly BT, Ramakrishnan V (2000) Functional insights from the structure of the 30S ribosomal subunit and its interactions with antibiotics. Nature 407:340-348 Champney WS, Chittum HS, Tober CL (2003) A 50S ribosomal subunit precursor particle is a substrate for the ErmC methyltransferase in Staphylococcus aureus cells. Curr Microbiol 46:453-460 Chittum HS, Champney WS (1994) Ribosomal protein gene sequence changes in erythromycin-resistant mutants of Escherichia coli. J Bacteriol 176:6192-6198 Conn GL, Draper DE, Lattman EE, Gittis AG (1999) Crystal structure of a conserved ribosomal protein-RNA complex. Science 284:1171-1174 Cundliffe E (1990) Recognition sites for antibiotics within rRNA. In: Hill WE, Dahlberg A, Garrett RA, Moore PB, Schlessinger D, Warner JR (eds) The Ribosome: Structure, Function and Evolution. 
American Society for Microbiology, Washington DC, pp 479-490 Das K, Acton T, Chiang Y, Shih L, Arnold E, Montelione GT (2004) Crystal structure of RlmAI: implications for understanding the 23S rRNA G745/G748-methylation at the macrolide antibiotic-binding site. Proc Natl Acad Sci USA 101:4041-4046 Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351 Decatur WA, Fournier MJ (2003) RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278:695-698 Douthwaite S, Crain PF, Liu M, Poehlsgaard J (2004) The tylosin resistance methyltransferase RlmAII (TlrB) modifies the N-1 position of 23S rRNA nucleotide G748. J Mol Biol 337:1073-1077 Egebjerg J, Douthwaite S, Garrett RA (1989) Antibiotic interactions at the GTPase-associated centre within Escherichia coli 23S rRNA. EMBO J 8:607-611 Farrell DJ, Douthwaite S, Morrissey I, Bakker S, Poehlsgaard J, Jakobsen L, Felmingham D (2003) Macrolide resistance by ribosomal mutation in clinical isolates of Streptococcus pneumoniae from the PROTEKT 1999-2000 study. Antimicrob Agents Chemother 47:1777-1783 Foster DR, Rybak MJ (1999) Pharmacologic and bacteriologic properties of SCH-27899 (Ziracin), an investigational antibiotic from the everninomicin family. Pharmacotherapy 19:1111-1117 Fourmy D, Recht MI, Blanchard SC, Puglisi JD (1996) Structure of the A site of Escherichia coli 16S ribosomal RNA complexed with an aminoglycoside antibiotic. Science 274:1367-1371

Fourmy D, Yoshizawa S, Douthwaite S (2003) Antibiotics as indicators of the functional components of the ribosome. In: Lapointe J, Brakier-Gingras L (eds) Translation Mechanisms. Landes Bioscience, USA, pp 429-442 Gabashvili IS, Gregory ST, Valle M, Grassucci R, Worbs M, Wahl MC, Dahlberg AE, Frank J (2001) The polypeptide tunnel system in the ribosome and its gating in erythromycin resistance mutants of L4 and L22. Mol Cell 8:181-188 Gale EF, Cundliffe E, Reynolds PE, Richmond MH, Waring MJ (1981) The Molecular Basis of Antibiotic Action. John Wiley and Sons, London Galimand M, Courvalin P, Lambert T (2003) Plasmid-mediated high-level resistance to aminoglycosides in Enterobacteriaceae due to 16S rRNA methylation. Antimicrob Agents Chemother 47:2565-2571 Giovanetti E, Montanari MP, Mingoia M, Varaldo PE (1999) Phenotypes and genotypes of erythromycin-resistant Streptococcus pyogenes strains in Italy and heterogeneity of inducibly resistant strains. Antimicrob Agents Chemother 43:1935-1940 Green R, Noller HF (1997) Ribosomes and translation. Annu Rev Biochem 66:679-716 Green R, Noller HF (1999) Reconstitution of functional 50S ribosomes from in vitro transcripts of Bacillus stearothermophilus 23S rRNA. Biochemistry 38:1772-1779 Gregory ST, Dahlberg AE (1999) Erythromycin resistance mutations in ribosomal proteins L22 and L4 perturb the higher order structure of 23S ribosomal RNA. J Mol Biol 289:827-834 Gustafsson C, Persson BC (1998) Identification of the rrmA gene encoding the 23S rRNA m1G745 methyltransferase in Escherichia coli and characterization of an m1G745-deficient mutant. J Bacteriol 180:359-365 Hajduk PJ, Dinges J, Schkeryantz JM, Janowick D, Kaminski M, Tufano M, Augeri DJ, Petros A, Nienaber V, Zhong P, Hammond R, Coen M, Beutel B, Katz L, Fesik SW (1999) Novel inhibitors of Erm methyltransferases from NMR and parallel synthesis. 
J Med Chem 42:3852-3859 Hanessian S, Sgarbi PW (2000) Design and synthesis of mimics of S-adenosyl-L-homocysteine as potential inhibitors of erythromycin methyltransferases. Bioorg Med Chem Lett 10:433-437 Hansen JL, Ippolito JA, Ban N, Nissen P, Moore PB, Steitz TA (2002) The structures of four macrolide antibiotics bound to the large ribosomal subunit. Mol Cell 10:117-128 Hansen JL, Moore PB, Steitz TA (2003) Structures of five antibiotics bound to the peptidyl transferase center of the large ribosomal subunit. J Mol Biol 330:1061-1075 Hansen LH, Kirpekar F, Douthwaite S (2001) Recognition of nucleotide G745 in 23S ribosomal RNA by the RrmA methyltransferase. J Mol Biol 310:1001-1010 Harms J, Schluenzen F, Zarivach R, Bashan A, Gat S, Agmon I, Bartels H, Franceschi F, Yonath A (2001) High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell 107:679-688 Harms JM, Schlünzen F, Fucini P, Bartels H, Yonath A (2004) Alteration at the peptidyl transferase centre of the ribosome induced by synergistic action of the streptogramins dalfopristin and quinupristin. BMC Biology 2:4-13 Helser TL, Davies JE, Dahlberg JE (1972) Mechanism of kasugamycin resistance in Escherichia coli. Nature (London) New Biol 235:6-9 Jenkins G, Zalacain M, Cundliffe E (1989) Inducible ribosomal RNA methylation in Streptomyces lividans, conferring resistance to lincomycin. J Gen Microbiol 135:3281-3288

Kagan RM, Clarke S (1994) Widespread occurrence of three motifs in diverse S-adenosylmethionine-dependent methyltransferases suggests a common structure for these enzymes. Arch Biochem Biophys 310:417-427 Khaitovich P, Tenson T, Kloss P, Mankin AS (1999) Reconstitution of functionally active Thermus aquaticus large ribosomal subunits with in vitro-transcribed rRNA. Biochemistry 38:1780-1788 Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109:145-148 Kofoed CB, Vester B (2002) Interaction of avilamycin with ribosomes and resistance caused by mutations in 23S rRNA. Antimicrob Agents Chemother 46:3339-3342 Krzyzosiak W, Denman R, Nurse K, Hellmann W, Boubik M, Gehrke CW, Agris PF, Ofengand J (1987) In vitro synthesis of 16S ribosomal RNA containing single base changes and assembly into functional 30S ribosome. Biochemistry 26:2353-2364 La Teana A, Gualerzi CO, Dahlberg AE (2001) Initiation factor IF 2 binds to the alpha-sarcin loop and helix 89 of Escherichia coli 23S ribosomal RNA. RNA 7:1173-1179 Lafontaine DL, Delcour J, Glasser AL, Desgres J, Vandenhaute J (1994) The DIM1 gene responsible for the conserved m62Am62A dimethylation in the 3'-terminal loop of 18S rRNA is essential in yeast. 241:492-497 Leclercq R, Courvalin P (1991) Bacterial resistance to macrolide, lincosamide, and streptogramin antibiotics by target modification. Antimicrob Agents Chemother 35:1267-1272 Liu M, Douthwaite S (2002a) Activity of the ketolide antibiotic telithromycin is refractory to Erm monomethylation of bacterial rRNA. Antimicrob Agents Chemother 46:1629-1633 Liu M, Douthwaite S (2002b) Methylation at nucleotide G745 or G748 in 23S rRNA distinguishes Gram-negative from Gram-positive bacteria. Mol Microbiol 44:195-204 Liu M, Douthwaite S (2002c) Resistance to the macrolide antibiotic tylosin is conferred by single methylations at 23S rRNA nucleotides G748 and A2058 acting in synergy. 
Proc Natl Acad Sci USA 99:14658-14663 Liu M, Kirpekar F, van Wezel GP, Douthwaite S (2000) The tylosin resistance gene tlrB of Streptomyces fradiae encodes a methyltransferase that targets G748 in 23S rRNA. Mol Microbiol 37:811-820 Mankin AS (2001) Ribosomal antibiotics. Mol Biol 35:509-520 Mann PA, Xiong L, Mankin AS, Chau AS, Mendrick CA, Najarian DJ, Cramer CA, Loebenberg D, Coates E, Murgolo NJ, Aarestrup FM, Goering RV, Black TA, Hare RS, McNicholas PM (2001) EmtA, a rRNA methyltransferase conferring high-level evernimicin resistance. Mol Microbiol 41:1349-1356 Maravic G, Bujnicki JM, Feder M, Pongor S, Flögel M (2003a) Alanine-scanning mutagenesis of the predicted rRNA-binding domain of ErmC' redefines the substrate-binding site and suggests a model for protein-RNA interactions. Nucleic Acids Res 31:4941-4949 Maravic G, Feder M, Pongor S, Flögel M, Bujnicki JM (2003b) Mutational analysis defines the roles of conserved amino acid residues in the predicted catalytic pocket of the rRNA:m6A methyltransferase ErmC'. J Mol Biol 332:99-109 Maravic G, Flögel M (2004) RNA methylation and antibiotic resistance: an overview. Periodicum Biologorum 106:135-140 McNicholas PM, Najarian DJ, Mann PA, Hesk D, Hare RS, Shaw KJ, Black TA (2000) Evernimicin binds exclusively to the 50S ribosomal subunit and inhibits translation in

cell-free systems derived from both gram-positive and gram-negative bacteria. Antimicrob Agents Chemother 44:1121-1126 Min YH, Jeong JH, Choi YJ, Yun HJ, Lee K, Shim MJ, Kwak JH, Choi EC (2003) Heterogeneity of macrolide-lincosamide-streptogramin B resistance phenotypes in enterococci. Antimicrob Agents Chemother 47:3415-3420 Mosbacher TG, Bechthold A, Schulz GE (2003) Crystal structure of the avilamycin resistance-conferring methyltransferase AviRa from Streptomyces viridochromogenes. J Mol Biol 329:147-157 Nissen P, Hansen J, Ban N, Moore PB, Steitz TA (2000) The structural basis of ribosome activity in peptide bond synthesis. Science 289:920-930 O'Farrell HC, Scarsdale JN, Rife JP (2004) Crystal structure of KsgA, a universally conserved rRNA adenine dimethyltransferase in Escherichia coli. J Mol Biol 339:337-353 Ofengand J (2002) Ribosomal RNA pseudouridines and pseudouridine synthases. FEBS Lett 514:17-25 Ofengand J, Rudd KE (2000) Bacterial, archaeal and organellar rRNA pseudouridines and methylated nucleosides and their enzymes. In: Garrett RA, Douthwaite S, Liljas A, Matheson AT, Moore PB, Noller HF (eds) The Ribosome: Structure, Function, Antibiotics, and Cellular Interactions. American Society for Microbiology, Washington, DC, pp 175-189 Poehlsgaard J, Douthwaite S (2003) Macrolide antibiotic interaction and resistance on the bacterial ribosome. Curr Opin Invest Drugs 4:140-148 Poldermans B, Roza L, Van Knippenberg PH (1979) Studies on the function of two adjacent N6,N6-dimethyladenosines near the 3' end of 16S ribosomal RNA of Escherichia coli. III. Purification and properties of the methylating enzyme and methylase-30S interactions. J Biol Chem 254:9094-9100 Porse BT, Cundliffe E, Garrett RA (1999) The antibiotic micrococcin acts on protein L11 at the ribosomal GTPase centre. 
J Mol Biol 287:33-45 Poulsen SM, Kofoed C, Vester B (2000) Inhibition of the ribosomal peptidyl transferase reaction by the mycarose moiety of the antibiotics carbomycin, spiramycin and tylosin. J Mol Biol 304:471-481 Roberts MC, Sutcliffe J, Courvalin P, Jensen LB, Rood J, Seppälä H (1999) Nomenclature for macrolide and macrolide-lincomycin-streptogramin B resistance determinants. Antimicrobial Agents and Chemotherapy 43:2823-2830 Rodnina MV, Savelsbergh A, Matassova NB, Katunin VI, Semenkov YP, Wintermeyer W (1999) Thiostrepton inhibits the turnover but not the GTPase of elongation factor G on the ribosome. Proc Natl Acad Sci USA 96:9586-9590 Rosendahl G, Douthwaite S (1993) Ribosomal proteins L11 and L10.(L12)4 and the antibiotic thiostrepton interact with overlapping regions of the 23S rRNA backbone in the ribosomal GTPase centre. J Mol Biol 234:1013-1020 Rosendahl G, Douthwaite S (1994) The antibiotics micrococcin and thiostrepton interact directly with 23S rRNA nucleotides 1067A and 1095A. Nucleic Acids Res 22:357-363 Rozenski J, Crain PF, McCloskey JA (1999) The RNA Modification Database: 1999 update. Nucl Acids Res 27:196-197 Ryan PC, Lu M, Draper DE (1991) Recognition of the highly conserved GTPase center of 23S ribosomal RNA by ribosomal protein L11 and the antibiotic thiostrepton. J Mol Biol 221:1257-1268

Schluckebier G, Zhong P, Stewart KD, Kavanaugh TJ, Abad-Zapatero C (1999) The 2.2 Å structure of the rRNA methyltransferase ErmC' and its complexes with cofactor and cofactor analogs: Implications for the reaction mechanism. J Mol Biol 289:277-291 Schlünzen F, Tocilj A, Zarivach R, Harms J, Gluehmann M, Janell D, Bashan A, Bartels H, Agmon I, Franceschi F, Yonath A (2000) Structure of functionally activated small ribosomal subunit at 3.3 angstroms resolution. Cell 102:615-623 Schlünzen F, Zarivach R, Harms J, Bashan A, Tocilj A, Albrecht R, Yonath A, Franceschi F (2001) Structural basis for the interaction of antibiotics with the peptidyl transferase centre in eubacteria. Nature 413:814-821 Schubert HL, Blumenthal RM, Cheng X (2003) Many paths to methyltransfer: a chronicle of convergence. Trends Biochem Sci 28:329-335 Seidel-Rogol BL, McCulloch V, Shadel GS (2003) Human mitochondrial transcription factor B1 methylates ribosomal RNA at a conserved stem-loop. Nat Genet 33:23-24 Skeggs PA, Thompson J, Cundliffe E (1985) Methylation of 16S ribosomal RNA and resistance to aminoglycoside antibiotics in clones of Streptomyces lividans carrying DNA from Streptomyces tenjimariensis. Mol Gen Genet 200:415-421 Skinner R, Cundliffe E, Schmidt FJ (1983) Site of action of a ribosomal RNA methylase responsible for resistance to erythromycin and other antibiotics. J Biol Chem 258:12702-12706 Spahn CM, Prescott CD (1996) Throwing a spanner in the works: antibiotics and the translation apparatus. J Mol Med 74:423-439 Sparling PF (1970) Kasugamycin resistance: 30S ribosomal mutation with an unusual location on the Escherichia coli chromosome. Science 167:56-58 Tait-Kamradt A, Davies T, Cronan M, Jacobs MR, Appelbaum PC, Sutcliffe J (2000) Mutations in 23S rRNA and ribosomal protein L4 account for resistance in pneumococcal strains selected in vitro by macrolide passage. 
Antimicrob Agents Chemother 44:2118-2125 Tenson T, Lovmar M, Ehrenberg M (2003) The mechanism of action of macrolides, lincosamides and streptogramin B reveals the nascent peptide exit path in the ribosome. J Mol Biol 330:1005-1014 Thompson J, Cundliffe E, Stark M (1979) Binding of thiostrepton to a complex of 23S rRNA with ribosomal protein L11. Eur J Biochem 98:261-265 Thompson J, Schmidt F, Cundliffe E (1982) Site of action of a ribosomal RNA methylase conferring resistance to thiostrepton. J Biol Chem 257:7915-7917 Thompson J, Skeggs PA, Cundliffe E (1985) Methylation of 16S ribosomal RNA and resistance to the aminoglycoside antibiotics gentamicin and kanamycin determined by DNA from the gentamicin-producer, Micromonospora purpurea. Mol Gen Genet 201:168-173 Treede I, Jakobsen L, Kirpekar F, Vester B, Weitnauer G, A. B, Douthwaite S (2003) The avilamycin resistance determinants AviRa and AviRb methylate 23S rRNA at the guanosine 2535 base and the uridine 2479 ribose. Mol Microbiol 49:309-318 Van Buul CP, Damm JB, Van Knippenberg PH (1983) Kasugamycin resistant mutants of Bacillus stearothermophilus lacking the enzyme for the methylation of two adjacent adenosines in 16S ribosomal RNA. Mol Gen Genet 189:475-478 Van Buul CP, van Knippenberg PH (1985) Nucleotide sequence of the ksgA gene of Escherichia coli: comparison of methyltransferases effecting dimethylation of adenosine in ribosomal RNA. Gene 38:65-72

Van Dyke N, Murgola EJ (2003) Site of functional interaction of release factor 1 with the ribosome. J Mol Biol 330:9-13 Vázquez D (1979) Inhibitors of Protein Biosynthesis. Springer-Verlag, Berlin Vester B, Douthwaite S (1994) Domain V of 23S rRNA contains all the structural elements necessary for recognition by the ErmE methyltransferase. J Bacteriol 176:6999-7004 Vicens Q, Westhof E (2001) Crystal structure of paromomycin docked into the eubacterial ribosomal decoding A site. Structure (Camb) 9:647-658 Vicens Q, Westhof E (2002) Crystal structure of a complex between the aminoglycoside tobramycin and an oligonucleotide containing the ribosomal decoding A site. Chem Biol 9:747-755 Vicens Q, Westhof E (2003) Crystal structure of geneticin bound to a bacterial 16S ribosomal RNA A site oligonucleotide. J Mol Biol 326:1175-1188 Weisblum B (1995a) Erythromycin resistance by ribosome modification. Antimicrob Agents Chemother 39:577-585 Weisblum B (1995b) Insights into erythromycin action from studies of its activity as inducer of resistance. Antimicrob Agents Chemother 39:797-805 Weitnauer G, Gaisser S, Trefzer A, Stockert S, Westrich L, Quiros LM, Mendez C, Salas JA, Bechthold A (2001) An ATP-binding cassette transporter and two rRNA methyltransferases are involved in resistance to avilamycin in the producer organism Streptomyces viridochromogenes Tu57. Antimicrob Agents Chemother 45:690-695 Wimberly BT, Brodersen DE, Clemons WMJ, Morgan-Warren RJ, Carter AP, Vonrhein C, Hartsch T, Ramakrishnan V (2000) Structure of the 30S ribosomal subunit. Nature 407:327-339 Wimberly BT, Guymon R, McCutcheon JP, White SW, Ramakrishnan V (1999) A detailed view of a ribosomal active site: the structure of the L11-RNA complex. Cell 97:491-502 Yokoyama K, Doi Y, Yamane K, Kurokawa H, Shibata N, Shibayama K, Yagi T, Kato H, Arakawa Y (2003) Acquisition of 16S rRNA methylase gene in Pseudomonas aeruginosa. 
Lancet 362:1888-1893 Yoshizawa S, Fourmy D, Puglisi JD (1998) Structural origins of gentamicin antibiotic action. EMBO J 17:6437-6448 Yu L, Petros AM, Schnuchel A, Zhong P, Severin JM, Walter K, Holzman TF, Fesik SW (1997) Solution structure of an rRNA methyltransferase (ErmAM) that confers macrolide-lincosamide-streptogramin antibiotic resistance. Nat Struct Biol 4:483-489 Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JH, Noller HF (2001) Crystal structure of the ribosome at 5.5 Å resolution. Science 292:883-896 Zalacain M, Cundliffe E (1989) Methylation of 23S rRNA caused by tlrA (ermSF), a tylosin resistance determinant from Streptomyces fradiae. J Bacteriol 171:4254-4260

Douthwaite, Stephen Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark Email: [email protected]

Fourmy, Dominique Laboratoire de RMN, ICSN-CNRS, 1 ave de la terrasse, 91190 Gif-sur-Yvette, France Yoshizawa, Satoko Laboratoire de RMN, ICSN-CNRS, 1 ave de la terrasse, 91190 Gif-sur-Yvette, France

Translational Recoding and RNA Modifications Olivier Namy, François Lecointe, Henri Grosjean, and Jean-Pierre Rousset

Abstract During protein synthesis, codons in mRNA are translated sequentially in frame on the ribosome following strict decoding rules. This process is usually very accurate. However, in some cases, recoding events occur at selected codons, leading to a high frequency of frameshifting or stop codon readthrough. The factors influencing these noncanonical decoding events are very diverse; among them are the codon usage and context, the presence of a stable mRNA secondary structure downstream of the decoding sites and the type and relative abundance of normally modified tRNA. Here, we discuss the role of certain modified nucleotides of tRNAs in a few cases of frameshifting and readthrough that occur in Bacteria and Eukarya. While in some cases the effect of a given modified nucleotide in a tRNA is to increase accuracy of the recoding process, in a few other cases the reverse has been observed. This review illustrates the power of using well characterized recoding systems, coupled with specific defects of RNA modification enzymes to assay for translational fidelity under in vivo conditions.

1 Introduction

1.1 Recoding events

During the complex translation elongation process, codons in mRNAs are translated on the ribosome by aminoacylated tRNAs following strict decoding rules. This process is usually very accurate, the average frequency of miscoding being estimated at about 5 × 10⁻⁴ per codon or even less (Buckingham and Grosjean 1986; Kurland 1992; Kurland and Gallant 1996). However, over the last decade, data have demonstrated that reading the genetic code may be more flexible than initially anticipated and that in certain cases the frequency of unconventional decoding can be as high as 40% or even more (Grentzmann et al. 1998). These alternative readings of the genetic code have been called ‘recoding’ (Gesteland et al. 1992). This process corresponds to a subversion of normal decoding rules, leading to the synthesis of an unpredicted polypeptide carrying different biological functions. Such recoding phenomena comprise several translational events including readthrough of stop codons, frameshifting, and ribosomal hopping (Fig. 1, see also Baranov et al. 2002, 2003; Namy et al. 2004). Recoding events are always in competition with the standard decoding process and essentially depend on special

Topics in Current Genetics H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106847 / Published online: 27 January 2005 © Springer-Verlag Berlin Heidelberg 2005

Fig. 1. Schematic representation of the different recoding events. During readthrough, a normal tRNA reads the stop codon, allowing protein synthesis to proceed to the next in-frame termination codon. Most of the +1 and -1 programmed translational frameshifting events also lead to the synthesis of an elongated protein, by escaping a stop codon. In the unique example of ribosome hopping (gene 60 of phage T4), the recoding event also ends with the bypass of a stop codon. Recoding events are thus, most of the time, necessary to produce an elongated protein carrying new functional domains, such as reverse transcriptase activity in retroviruses and replicase activity in plant viruses.

Translational Recoding and RNA Modifications

sequences and structures on the mRNA (“recoding signal” in cis) as well as on canonical components of the translation machinery acting in trans. However, in the special case of recoding the UGA stop codon to selenocysteine or the UAG codon to pyrrolysine, a specific non-canonical tRNA (tRNASer-sec or tRNALys-Pyl) and also a special elongation factor and SECIS-binding protein or a special lysyl-tRNA synthetase are needed (Hatfield and Gladyshev 2002; Driscoll and Copeland 2003; Blight et al. 2004). Besides these last two cases, translational recoding events are found mainly in small autonomous elements such as bacteriophages, viruses, or transposons, although the expression of a few cellular genes has been demonstrated to be controlled by this means (reviewed in Namy et al. 2004). In all cases, they allow the synthesis of two related polypeptides from the same mRNA, the one resulting from the recoding process usually being less abundant than the shorter polypeptide. This is typically the case in retroviruses, where the GAG protein is synthesized by regular decoding while the polymerase domain is expressed as a GAG-POL fusion protein through a recoding event (Farabaugh 1996).
1.2 The stimulatory recoding signals
The efficiency of a recoding process depends on various elements of the translation machinery. The “cis-recoding signals” in an mRNA include a particular sequence where the recoding event takes place, which involves the ribosomal A- and/or P-site codon(s), and additional sequence information that is present upstream and/or downstream and increases the efficiency of the process (“stimulatory signals”). For example, the presence of either a pseudoknot or a stable stem-loop downstream of a shifty site strongly stimulates -1 frameshifting efficiency (Tzeng et al. 1992). Likewise, in prokaryotes (Bacteria and Archaea), the interactions between a Shine-Dalgarno (SD)-like sequence located upstream of the frameshifting site and the rRNA serve as a stimulatory signal for -1 and +1 frameshifting events (Larsen et al. 1995; Marquez et al. 2004). The “trans-recoding elements” include the type and availability of certain aminoacyl-tRNAs complexed with their GTP-elongation factor, the presence of fully active competing termination factors and possibly also some peculiar structural features of the rRNA and/or of the ribosome itself (Farabaugh 1996; Gesteland and Atkins 1996; Atkins et al. 2000). For example, the relative abundance of individual tRNA species, their decoding efficiency and/or their intrinsic capability to slip on certain mRNA sequences are important factors that induce the ribosome-mRNA-tRNA machinery to occasionally bypass a stop codon or read an alternate frame. Since the tRNA population (type and relative abundance of each individual normally modified isoacceptor) varies greatly between organisms, especially between the three domains of life (reviewed in Marck and Grosjean 2002), the probability of recoding at certain mRNA sequences is usually species-specific, although for -1 frameshifting and readthrough, some recoding signals can operate in heterologous species (Stahl et al. 1995; Cassan and Rousset 2001; Leger et al. 2004). Therefore, in a given organism, a ‘subtle combination’ of various cis- and trans-stimulatory signals can force the translation machinery to escape normal decoding rules, leading to readthrough or frameshifting phenomena. The occurrence and efficiency of such translational recoding processes ultimately depend on how these various cis- and trans-elements have co-evolved in order to work together in a synergistic way.
1.3 Modified nucleotides in RNA and decoding
An important distinctive structural feature of RNA (tRNA and rRNA) is the presence of a significant proportion of post-transcriptional modifications of nucleotides. Out of over 100 different structures reported to date, more than 80 modifications are present in tRNA and about 20 in rRNA (Sprinzl et al. 1998; Rozenski et al. 1999; McCloskey and Rozenski 2005). The pattern of modification (type and location) depends on the RNA molecule considered as well as on the organism or the organelle from which it originates. In tRNA, the most characteristic and often hypermodified nucleotides are present in the anticodon loop and stem (positions 27-40; Fig. 2 parts A and B). These modified nucleotides contribute to the built-in feature of the anticodon branch that ultimately determines the decoding properties (efficiency and accuracy) of the tRNA molecule during translation on the ribosome (reviewed in Agris 1996, 2004; Davis 1998). A majority of the hypermodified nucleotides (such as Q, mnm5s2U) are exclusively present in position 34 (the wobble base in the anticodon), while others (such as m6t6A, ms2i6A or the Y base) are exclusively present in position 37 (3’-adjacent to the anticodon). Simpler modified nucleotides (such as Ψ, Gm, Um, s2C, m3C or m5C) are present


elsewhere in the anticodon branch. They are needed to modulate the flexibility and the preferential 3’-stacked conformation adopted by the anticodon loop when it binds to the complementary codon (extended anticodon theory, reviewed in Yarus 1982; Dao et al. 1994). In this context, the ubiquitous purine-37, especially its hypermodified derivatives, plays a major role in modulating the stability of the codon-anticodon interaction by a dangling-end type of base stacking (see Fig. 2C, reviewed in Bubienko et al. 1983; Grosjean et al. 1998). Also, because most modified nucleotides-37 cannot base pair in a Watson-Crick mode, their presence 3’-adjacent to the anticodon restricts the tRNA to base pair to the in-frame codon-anticodon triplet pair, thus limiting (but not completely avoiding, see below) the

Fig. 2 (overleaf). Type and location of modified nucleotides in the anticodon stem and loop of tRNAs. Part A is a schematic representation of the three-dimensional architecture of tRNA. The numbering of nucleotide positions is that universally adopted. The anticodon nucleotides corresponding to positions 34, 35, and 36 are shown in square boxes in the anticodon hairpin representation (Part B). Symbols for modified nucleotides are those defined in Rozenski et al. (1999). The information is derived from the tRNA data bank of Sprinzl et al. (1998). Almost all the hypermodified nucleotides characterized so far occur exclusively in position 34 (the so-called wobble base) or in position 37 (3’-adjacent to the anticodon). In Part C, the 3’-stacked conformation of the anticodon branch is schematically represented. Bases adjacent to the anticodon (denoted Z) are often modified (see Part B) and play a role in the flexibility and hence the preferential conformation adopted by the anticodon when it binds to a complementary codon. The ribosomal milieu is also a major factor. Purine-37 (R), 3’ of the anticodon, is often hypermodified. It cannot base-pair with mRNA and plays a role in reading frame maintenance as well as in the stabilization of the base pair between the third base-36 of the anticodon and the first codon base (I), as schematized by an arrow. Certain modified bases, such as t6A, ms2i6A or the Y base, are particularly efficient because of their stacking potential. The wobble base-34 (W) can form a noncanonical base pair with the third base of the codon (III). It also stabilizes the pairing in the middle position of the codon-anticodon complex. Therefore, correct decoding may depend on a short double helix formed between the “two out of three” complementary codon-anticodon bases, “sandwiched” between stacked, but not necessarily complementary, nucleotides present in their immediate context.

risk of frameshifting during translation (reviewed in Agris 2004). Likewise, the nucleotide at position 34 of the anticodon reads the third codon base, and it is the only anticodon position that allows non-Watson-Crick, or wobble, base pairing during a “normal” decoding process. The characteristic and often unique modified nucleotides at position 34 function to restrict or extend pairing (or base opposition) between anticodon base-34 and the third base of the codon and hence regulate the decoding pattern of individual isoacceptor tRNAs (Yokoyama and Nishimura 1995; Takai and Yokoyama 2003; see also the chapter in this volume by Suzuki). As for purine-37, the type of modification of nucleotide-34 (on the base and/or the ribose) can also modulate the stacking interaction with the adjacent Watson-Crick middle base pair formed between position 35 of the anticodon and position 2 of the codon (Fig. 2C; reviewed in Grosjean et al. 1998). It is noteworthy that the level of certain modified nucleotides in tRNAs, especially in the anticodon branch, such as Q34, ms2i6A37, or Y37, depends on cell growth or stress conditions, as well as on the availability of the cofactor(s) needed for enzymatic formation of the modified nucleotide during tRNA biogenesis. Depending on the influence of this modified nucleotide on the recoding process, its presence or absence in the tRNA may affect and possibly regulate the level of expression of the ‘recoding-dependent’ protein (discussed in Buck and Ames 1984; Persson 1993; Winkler 1998; Björk et al. 1999). An analogous situation exists for the expression of proteins from certain bacterial mRNAs by an ‘attenuation-type’ regulation mechanism. In this last case, the presence or absence of certain modified nucleotides in tRNA (such as Ψ at positions 38-39-40 in tRNAHis or ms2i6A37 in tRNATrp) determines whether the translation machinery


will pass through a row of several adjacent identical codons (for histidine or tryptophan, respectively) and thereby produce or not a protein that is required to overcome the stress experienced by the bacteria (reviewed in Landick et al. 1996).
1.4 Complexity of the decoding process within the ribosome
Today, three tRNA binding sites are accepted as a universal feature of ribosomes: an acceptor A-site where the aminoacyl-tRNA complexed with elongation factor-GTP checks the adequacy of codon-anticodon pairing, a peptidyl P-site 5’-adjacent to the A-site codon where a peptidyl-tRNA is positioned, and an exit E-site where the previous peptidyl-tRNA, now deacylated, is waiting to escape the ribosome for recycling into a new round of the aminoacylation-aminoacyl transfer process (reviewed in Burkhardt et al. 1998). As indicated above, the modified nucleotides in tRNAs and especially in the anticodon branch are of utmost importance for the modulation (accuracy, efficiency, regulation) of codon-anticodon interaction within the protein synthesizing machinery as well as for maintenance of the correct reading frame during translation. However, the “normal” decoding process (Rodnina and Wintermeyer 2001; Noller et al. 2002; Ogle et al. 2003; Steitz and Moore 2003) does not depend solely on codon-anticodon interactions at the A-site, but also on codon-anticodon interactions at the P-site and possibly at the E-site (Schmeing et al. 2003). It also depends on many other factors of the translation machinery, such as the interactions of the tRNA with the rRNA, which also contains many modified nucleotides, especially in the decoding as well as in the peptidyl centers (Decatur and Fournier 2002; Ofengand and Delcampo 2005; McCloskey and Rozenski 2005), and possibly on the interactions with certain ribosomal proteins as well as between the two adjacent tRNAs located in the A-site and the P-site and/or in the P-site and the E-site of the ribosome (Nierhaus et al. 2000).
The dynamic interplay of these diverse types of interactions between all partners of the ribosomal machinery along the translation process often makes it difficult to evaluate the effect of a single modified nucleotide in a tRNA (or rRNA) on one particular stage of the stepwise in-frame translation process. Whatever the mechanistic details of this “normal” ribosomal elongation process, the time taken by each individual step along the mRNA is of utmost importance for the accurate reading of the genetic code (Rodnina et al. 2000, 2002). Indeed, as illustrated below in section 2, most if not all of the many kinds of recoding (and occasionally premature termination events, not discussed here) occur during so-called ‘pauses’ of the ribosome along the mRNA.
1.5 Testing the roles of modified nucleotides of RNA in recoding
Since recoding corresponds to an exacerbated propensity to make translational errors, it constitutes a choice target to check the effect of RNA modifications on the decoding capacities of the cell. Taking advantage of the complete sequence of many genomes now available, including those of the bacterium Escherichia coli


and the yeast Saccharomyces cerevisiae, together with the powerful genetic tools available to create and study E. coli or yeast mutants, a systematic investigation aimed at identifying the role of particular modified nucleotides of tRNA in recoding events is now possible. The almost complete set of genes encoding tRNA modification enzymes in both E. coli and S. cerevisiae is now being determined (see Bujnicki et al. 2004; De Crécy-Lagard 2004 and the chapter in this volume by Johansson and Bystrom). The fact that deletion/inactivation of most of the genes encoding tRNA modification enzymes usually does not affect cell growth, at least when tested individually, strongly suggests that their roles concern mainly fine tuning of the tRNA molecules, rather than an essential function of the corresponding modification enzymes and/or of the modified tRNA molecules (discussed in Hopper and Phizicky 2003). Moreover, the use of mutants, associated with various reporter systems (natural or synthetic), allows a precise quantification of recoding events in E. coli and S. cerevisiae (Stahl et al. 1995; Grentzmann et al. 1998; Paul et al. 2001; Harger and Dinman 2003). This experimental approach permits in principle the analysis of the decoding potential of cells in which a single modification enzyme is inactivated or absent. However, this powerful approach suffers from several drawbacks: i) modification enzymes are usually not specific for a single isoacceptor tRNA nor for a single position in different tRNA species (Motorin and Grosjean 1999). In other words, when an effect is observed, it can be difficult (or even impossible) to determine which one(s) of the undermodified isoacceptor tRNA species is (are) involved in the recoding process, or which of the several positions where the modified nucleotide is normally present is (are) critical.
To overcome this problem, one should preferentially test those recoding events that depend only on a single, or at least a limited number of, isoacceptor tRNAs with a well-known pattern of modified nucleotides; ii) deletion/inactivation of a gene coding for a given modification enzyme may lead to pleiotropic effects depending on how the enzyme is interconnected with the expression of other cellular processes, including the activities of other tRNA modification enzymes. This is the case for enzymes belonging to enzymatic metabolons leading to the formation of several hypermodified bases in tRNAs, usually present in positions 34 (wobble position) and 37 (3’-adjacent to the anticodon, see above). While no evidence of such interconnectivity has yet been demonstrated between RNA modification enzymes catalyzing reactions at different locations in a tRNA molecule, this possibility cannot be ruled out and should be kept in mind; iii) lack of a given modified nucleotide in a tRNA may also lead to subtle changes in the multiple functions of the tRNA molecules, such as passage through the nuclear pore in eukaryotic cells (Grosshans et al. 2001), and specific interactions with aminoacyl-tRNA synthetases (reviewed in Giegé et al. 1998; Beuning and Musier-Forsyth 1999) or with various factors (Forster et al. 1993; Huang et al. 2005). It can also lead the undermodified tRNA to become a target for the surveillance system and be selectively degraded by the exosome machinery (Kadaba et al. 2004; see also the chapter in this volume by Anderson and Droogmans). Carefully designed controls have to be performed to test whether the undermodified tRNA remains stable enough in the cell and continues to fulfill its role in mRNA translation; iv) some modification enzymes could be involved in another function


than tRNA modification, such as a “chaperone-like” activity during rRNA biosynthesis (Lafontaine et al. 1998). In this latter case, knowledge of the essential amino acid(s) of the active site of the enzyme makes it possible to design specific mutations resulting in the production of stable proteins devoid of the RNA modification activity but still endowed with other function(s). Despite the many difficulties encountered in attributing a specific function to a given modified nucleotide in the tRNA of E. coli or yeast strains defective in one of the several tRNA modification enzymes, interesting results concerning the role of tRNA modifications in recoding events have recently been published, mostly with S. cerevisiae and E. coli as model systems. In this review, we have excluded most of the abundant data concerning the role of modified nucleotides in natural spontaneous (non-programmed) readthrough, as well as those involving mutated suppressor tRNAs or abnormal termination factors (for details concerning this aspect of decoding, consult the excellent reviews by Björk 1995; Murgola 1985, 1995; Curran 1998; Agris 2004).

2 Influence of modified tRNA nucleotides in frameshifting
2.1 Programmed +1 frameshifting in bacteria
In bacteria, the case of the prfB gene of E. coli, encoding the release factor 2 (RF2), which recognizes the UAA and UGA codons, is the best understood phenomenon (Craigen and Caskey 1986). The prfB open reading frame is interrupted by a UGA stop codon 26 codons downstream of the initiation codon. However, the coding sequence corresponding to the long C-terminal part of the protein continues immediately in the +1 reading frame, for a further 340 codons (Fig. 3). This recoding event is used as an autoregulatory mechanism controlling the abundance of full-length RF2 in the cell (Craigen and Caskey 1986; Adamski et al. 1993). In the presence of high RF2 concentrations, the competition between termination and frameshifting is changed in favor of termination, leading to a decrease of the RF2 concentration in the cell. When RF2 becomes limiting, frameshifting begins to dominate, thus increasing the cellular RF2 level. Frameshifting occurs at the slippery heptanucleotide sequence CUU.UGA.C (the zero frame is represented) located at the junction of the zero and +1 ORFs and allows expression of the active RF2 protein. The frameshifting event requires the peptidyl-tRNALeu (anticodon 5’GAG3’ with an m1G37 3’-adjacent to the anticodon), which normally decodes the zero-frame Leu-codon CUU in the P-site of the ribosome, now to slip by a single base toward the 3’ end and to miscode the +1 frame Phe-codon UUU. It also involves an unorthodox G*U base pair between the third anticodon base of the peptidyl-tRNALeu and the first base of the new +1 Phe-codon UUU. This +1 frameshift process depends on a Shine-Dalgarno-like (SD) sequence positioned upstream of the slippery sequence, which interferes with the tRNA in the E-site (see below). The transient interaction between the 16S rRNA


Fig. 3. RF2 +1 frameshifting in E. coli. A UGA stop codon interrupts the prfB gene at position 26. In the presence of a limiting concentration of RF2 protein, a pause of the ribosome is induced. In association with the Shine-Dalgarno-like sequence, which increases the probability that the E-site tRNA will be ejected from the ribosome, the pause allows the leucine tRNAGAG (carrying the m1G modification at position 37) to slip one nucleotide downstream (toward the 3’ end). During the next step of elongation, the Asp-tRNAGUC (not modified at position 37) is incorporated in the +1 frame, leading to a +1 frameshifting event that allows the synthesis of full-length RF2. If RF2 is not in limiting amounts, it recognizes the UGA termination codon and protein synthesis stops. This represents an elegant autoregulatory mechanism controlling the abundance of RF2 in a large number of bacteria.

and such an SD-like sequence in the mRNA slows down progression of the ribosome on the mRNA, allowing the decoding machinery to better sense the availability of RF2. When RF2 is in low abundance, the peptidyl-tRNALeu has more time (a better chance) to slip rightward to the +1 Phe codon UUU (Fig. 3). Urbonavicius and collaborators (2001) directly demonstrated the role of modifications in position 37 in improving reading frame maintenance on frameshifting sites derived from prfB, using E. coli and Salmonella typhimurium mutants deficient in m1G37 or ms2io6A37 modifications. The same study also demonstrated the role of modifications at position 34 (Q34, mnm5s2U34) and Ψ38-40 in maintenance of the reading frame. Recently, premature release of the E-site tRNA from the ribosome has been demonstrated to be coupled with high-level +1 frameshifting at the prfB gene (Marquez et al. 2004). This study showed that in an in vitro reconstituted system,


the presence of the E-site tRNA can prevent the +1 frameshifting event. Indeed, when the internal UGA is at the A-site, the SD-like sequence is separated from the peptidyl-tRNA at the P-site by only two nucleotides and thus competes with the first nucleotide of the E-site codon. The result is a steric clash between the SD-antiSD interaction of the 16S rRNA and the codon-anticodon interaction at the E-site. This situation allows an easier release of the E-site tRNA and consequently favors the subsequent frameshifting event. This study demonstrates for the first time the importance of the E-site tRNA for reading frame maintenance. As for the tRNA in the A- and P-sites, it remains to be determined whether the stability of the E-site tRNA also depends on its modification status, which would in turn modulate frameshifting efficiency, at least in Bacteria.
2.2 Programmed +1 frameshifting in Eukarya
In eukaryotes, several genes have been reported to be expressed through a programmed +1 frameshifting event (reviewed in Farabaugh 1996; Namy et al. 2004). Transposon Ty1 and Ty3 frameshifting, together with the ornithine decarboxylase antizyme of higher eukaryotes, are probably the best studied examples. In all these cases, a “hungry” codon in the A-site, due to limitation in the abundance or decoding capacity of aminoacyl-tRNAs or a severe functional defect in the tRNA (Gallant and Lindsley 1992), is responsible for the ribosomal pause and increases the probability of slippage, at a specific heptanucleotide sequence, of the peptidyl-tRNA from the P-site into the +1 out-of-frame codon. Depending on the cooperation of various stimulating elements, the efficiency of this recoding process in S. cerevisiae ranges from a few percent to 90%. These elements (see also section 1.2) are: the peculiar mRNA slippery sequence, the decoding property of the corresponding tRNAs and the presence and/or absence of certain modified nucleotides in these tRNAs (see below).
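The reading-frame bookkeeping at the prfB site of section 2.1 can be made concrete with a minimal Python sketch. It reads the heptanucleotide CUU.UGA.C first in the zero frame and then with a one-nucleotide slip toward the 3’ end after the first codon. The function names and the four-entry codon table are illustrative assumptions introduced here; the sketch tracks only which triplets are read and does not model RF2 competition, the SD-like sequence, or tRNA modification.

```python
# Illustrative sketch only: frame-0 versus +1 reading of the prfB shift site.
# The table holds just the codons named in the text (standard genetic code).
CODON_TABLE = {"CUU": "Leu", "UUU": "Phe", "GAC": "Asp", "UGA": "Stop"}

PRFB_SITE = "CUUUGAC"  # CUU.UGA.C, written in the zero frame


def split_codons(mrna, start=0):
    """Cut mrna into complete triplets, beginning at index `start`."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def read_with_plus1_slip(mrna, slip_after):
    """Read `slip_after` codons in frame 0, skip one nucleotide, read on."""
    head = split_codons(mrna[:3 * slip_after])
    tail = split_codons(mrna, start=3 * slip_after + 1)
    return head + tail


def translate(codons):
    """Map codons to amino-acid names using the small table above."""
    return [CODON_TABLE[c] for c in codons]


if __name__ == "__main__":
    # Zero frame: Leu, then the internal UGA stop codon.
    print(translate(split_codons(PRFB_SITE)))
    # After the slip, the P-site tRNA sits on UUU and the next codon is GAC.
    print("P-site codon after slip:", PRFB_SITE[1:4])
    print(translate(read_with_plus1_slip(PRFB_SITE, slip_after=1)))
```

In this representation the peptide still carries Leu at the shift site (the tRNALeu is already in the P-site when it slips), and the next residue is Asp from the +1 frame codon GAC, in agreement with Fig. 3.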
Two models have been proposed to explain such a high level of Ty1 frameshifting. In the first model (Stahl et al. 2001), the peptidyl-tRNALeu (UAG), originally located at the P-site Leu-CUU codon, was proposed to bind the out-of-frame +1 Leu-UUA codon by an unorthodox base pairing (a G-U pair between the third anticodon base and the first codon base, see Fig. 4). This noncanonical phenomenon has a certain probability of occurring, because: i) the normal in-frame AGG codon corresponds to a slow decoding tRNAArg (anticodon U*CU, with t6A37 and Ψ39, where U* stands for mcm5U), thus inducing a pause of the ribosome traveling along the mRNA; ii) the overlapping +1 codon GGC corresponds to an abundant and efficient tRNAGly (GCC with A37Ψ38), able to make three GC pairs; iii) naturally occurring tRNALeu (UAG with m1G37 and Ψ39), in contrast with all other U34-containing tRNAs in S. cerevisiae, has an unmodified U34 in the wobble position of the anticodon and is able to recognize all six leucine codons (including UUA), at least in vitro (Weissenbach et al. 1977). In the second, slightly different alternative model (Hansen et al. 2003), the same peptidyl-tRNALeu is proposed to slip on the mRNA without fully pairing with the


Fig. 4. Ty1 +1 frameshifting in S. cerevisiae. The leucine-tRNAUAG in the P-site can re-pair in the +1 position. This event depends on the low availability and decoding capacity (slow decoding) of the arginine tRNAU*CU (where U* stands for mcm5U34). The polypeptide synthesized after frameshifting carries the polymerase activity, whereas the short polypeptide has a GAG-like function. The efficiencies of the two events (normal decoding and +1 frameshifting) are indicated in %. Pseudouridines (Ψ) are at position 39 in tRNALeu and tRNAArg, but at position 38 in tRNAGly.

new +1 codon; the important feature remains the ability of the incoming abundant tRNAGly (GCC with A37Ψ38) to outcompete the inefficient tRNAArg (U*CU, t6A37) by decoding the +1 codon instead of the in-frame codon. Whatever the exact mechanism is, the common important features for most programmed +1 frameshifting in Eukarya are an unstable codon/anticodon base pairing in the P-site together with a peptidyl-tRNA having a special ability to slip, a codon pausing the ribosome in the A-site due to the existence of an inefficient or low-abundance decoding tRNA, and a G/C-rich codon in the +1 overlapping frame corresponding to an abundant tRNA. In this context, one can expect some modified nucleotides in the anticodon branch of the tRNA (and possibly in the decoding site of the rRNA; Decatur and Fournier 2002) to influence in one way or another the efficiency of the recoding process. Recently, we have shown that the absence of pseudouridine (Ψ) at position 38 or 39 in the anticodon arm of yeast tRNAs decreases frameshifting efficiency at the Ty1 site. Indeed, disruption of the PUS3 gene, coding for tRNA (Ψ38/Ψ39) pseudouridine synthase in S. cerevisiae, or point mutation in the PUS3


gene that inactivates the gene product, leads to an almost twofold reduction of frameshifting at the Ty1 slippery sequence (Lecointe et al. 2002). Ψ39-containing tRNALeu(UAG) thus appears better adapted to “recoding” at the Ty1 slippery sequence, probably because it makes a more stable codon-anticodon interaction in the out-of-frame codon than the tRNALeu lacking Ψ39 (Davis et al. 1998; Yarian et al. 1999). Interestingly, deletion of the PUS4 gene, involved in pseudouridine formation at position 55, as well as of the TRM4 gene, involved in 5-methylcytosine formation at position 48 of tRNALeu(UAG), has no noticeable effect on Ty1 +1 frameshifting (Lecointe et al. 2002). Thus, the efficiency of +1 programmed frameshifting depends on selected modified nucleotides, which in the present case are located at the end of the anticodon arm.
2.3 Programmed -1 frameshifting in Bacteria and Eukarya
Among recoding events, -1 programmed frameshifting is probably the most frequently encountered. It is found in eukaryotes, bacteria, viruses (mostly retroviruses), phages, and also in transposable elements (IS). In viruses, programmed frameshifting is generally used to express the viral polyprotein Gag-Pol. However, a few cellular genes both in prokaryotes and eukaryotes are known to use a -1 frameshifting event for their expression (Baranov et al. 2002; Namy et al. 2004). The -1 type of frameshifting appears to occur either by tandem or single tRNA slippage depending on the tRNAs and the slippery sequence involved (Napthine et al. 2003). One of the major determinants influencing the efficiency of -1 frameshifting is the presence of a stimulatory signal located downstream or upstream of the slippery codons. This cis-acting element acts at least in part by slowing ribosome progression along the mRNA, allowing a better chance for the tandem tRNAs to slip backwards, depending of course on the adequacy of the new -1 mRNA hexanucleotide sequence for base pairing with tRNA anticodons (Jacks et al. 
1988). This could be either an upstream Shine-Dalgarno-like sequence (as in +1 frameshifting, see section 2.1 above) or a downstream secondary structure, which could be either a stem-loop or, more often, a pseudoknot (see Fig. 5 for a typical -1 frameshifting site). The influence of the stacking potential of the base immediately adjacent to the base 3’ of tandem shift codons on -1 ribosomal frameshifting has also been demonstrated, thus indicating that the stability (lifetime) of the codon-anticodon interaction in the new -1 frame is an important parameter (Bertrand et al. 2002). Besides these mRNA structural determinants in cis, the question of whether only certain types of tRNAs are able to shift backward, and the role of certain modified nucleotides in these tRNAs, have been controversial. First, soon after the discovery of the -1 translational frameshifting sites in various mammalian retroviruses, such as Human Immunodeficiency virus, Human T-cell lymphotropic virus 1, and Bovine Leukemia virus, a correlation was made with the observation that aminoacyl-tRNAs required to translate the slippery sequence


Fig. 5. -1 frameshifting site from Beet western yellows virus (BWYV). A shifty heptamer (H) of the form X.XXY.YYZ allows tandem slippage of the tRNAs in the P- and A-sites. The special nature of the heptamer sequence allows “re-pairing” of the two tRNAs in the -1 frame. Although the heptamer sequence is the causative element, a downstream stimulatory secondary structure, in general a pseudoknot (PK) is necessary to reach a high efficiency of frameshifting. The length and sequence of the region between the pseudoknot and the slippery sequence (spacer) is also a critical parameter to achieve a maximal level of frameshifting.

were always hypomodified in infected cells, i.e. the degree of modification of certain nucleotides in tRNAs, such as Q34 in several tRNAs and Y37 in tRNAPhe, was severely reduced (Hatfield et al. 1989). However, several laboratories have analyzed the role of the Q-containing tRNAs and the conclusion was that when the asparaginyl-tRNA (anticodon QUU and t6A37) decodes the A-site codon AAC, the presence or absence of Q34 (and consequently the presence of the precursor G34) in the mammalian tRNA has no significant effect on -1 frameshifting at the slippery sequence U.UUA.AAC, either in vitro or in mammalian cells (Cassan et al. 1994; Reil et al. 1994; Marczinke et al. 2000; Carlson et al. 2000). Second, using a rabbit reticulocyte lysate and the sequences A.AAU.UUU and U7 as model systems, Hatfield and collaborators (Carlson et al. 2001) demonstrated that rabbit reticulocyte tRNAPhe (anticodon GmAA and Y37) bearing an m1G37 instead of the hypermodified Y-base-37 (wybutosine-37) stimulates -1 frameshifting threefold. No such effect was observed with the U7 slippery sequence under the same experimental conditions. However, when tRNAPhe(GmAA) from the yeast S. cerevisiae was tested in vivo on another slippery sequence, U.UUU.UUA (instead of A.AAU.UUU as above), in a yeast trm5 mutant defective in the formation of m1G37 (m1G37 formation is the first biosynthetic step of Y37 formation), no significant difference in the level of frameshifting was observed as compared with the same analyses performed using the wild-type strain (Urbonavicius et al. 2003). If we assume that the two types of test systems are comparable, the result may indicate that whether the slippery undermodified tRNA is in the P-site (as


in the in vitro reticulocyte experiment) or in the A-site (as in the S. cerevisiae yeast system) can make a difference. Moreover, the absence of effect observed by Carlson et al. (2001) with the U7 frameshifting sequence, as tested in the in vitro reticulocyte test system, would indicate that undermethylated tRNAs located in the A-site can neutralize/counteract the effect of the same undermethylated tRNAPhe in the P-site, probably because the global stability/lifetime of the tandem tRNAPhe bound in the -1 frame of the slippery U7 sequence is too low. In bacteria, one of the most frequent -1 frameshift-prone sequences is U.UUA.AAA/G (nucleotides involved with tRNA pairing are underlined). It involves a specific tRNALys bearing an anticodon U*UU flanked by t6A37 (U* stands for 5-methylaminomethyl-2-thiouridine-34) complementary to codons AAA/G in the A-decoding site. Among the E. coli mutants that affect the activity of the different enzymes involved in the multistep formation of this hypermodified uridine-34, only mnmE (trmE), which catalyzes an early step in the formation of the methylaminomethyl (mnm) group on C5 of U34, has an effect on -1 frameshifting efficiency. In this mutant, in which the tRNALys harbors only the thiolated uridine (s2U34, instead of the fully modified mnm5s2U34), -1 frameshifting at the U.UUA.AAA/G slippery sequence is stimulated twofold (Brierley et al. 1997). However, an independent recent report indicates that a twofold decrease in -1 frameshifting efficiency is observed when the same mnmE mutant E. coli strain is used to test another slippery sequence, A.AAA.AAC, where the tRNALys is located in the P-decoding site (Urbonavicius et al. 2003). Again, the apparent discrepancy between the two sets of data may result from the fact that in one case the tRNALys (U*UU.t6A) reading the AAA codon was initially located in the ribosomal A-site, while in the second case the same tRNALys was initially present in the P-site.
This illustrates the fact that different requirements are involved in codon/anticodon base pairing depending on whether the concerned tRNA is present in the ribosomal A-site or the P-site. Unfortunately, no data has been reported with A7 and the E. coli hypomodified tRNALys in order to verify, as in the U7 sequence tested above with undermodified tRNAPhe, whether the presence of the same hypomodified tRNA in both the P- and the A-sites would neutralize/counteract the effect observed in each individual situation. The importance of the hypermodified uridine-34 in preventing tRNA slippage during the elongation process in E. coli was also demonstrated by Brégeon et al. (2001). Using the artificial frameshifting site GAG.AGA.G within the βgalactosidase ORF expressed in E. coli, coupled with mutagenesis experiments, it was beautifully demonstrated that characteristic +2 frameshifting events (apparent -1 frameshift) occurred in mutants affecting the genes coding for one or two of the several enzymes involved in the formation of methylaminomethyl group of U34 in tRNAGlu(U*UC.m2A37, where U* stands for mnm5s2U34), namely gidA and mnmE. Last but not least, while most studies have concerned only de/recoding properties of tRNA in either the A- or the P-decoding site within a given slippery sequence, recent results from our laboratory indicate the importance of at least one modified nucleotide in the anticodon branch of a tRNA located in the E-site during -1 frameshifting (Bekaert and Rousset 2005). Based on a clear-cut correlation


between the existence of an efficient -1 frameshifting event and the presence of a tRNA carrying a Ψ at position 39 in the P-site of the decoding cassette, we experimentally verified with a PUS3-deleted yeast strain that the presence of the precursor U39 instead of Ψ39 in the tRNA decreases the efficiency of -1 frameshifting at several slippery sequences by a factor of 2, as compared with the situation in a wild type yeast strain. This observation is reminiscent of the role of the tRNA in the E-site in +1 frameshifting of the E. coli prfB gene, although the mechanism could be different (see above in section 2.1 and Marquez et al. 2004). Whatever the molecular basis of these phenomena, earlier studies that considered only the identity of the tRNAs in the A- and P-sites of the ribosome to interpret -1 and +1 frameshifting should be reconsidered.
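The shifty heptamer X.XXY.YYZ described in Fig. 5 and discussed throughout this section can be located in a sequence with a short script. The following Python sketch is only an illustration: the function name `find_shifty_heptamers`, the `orf_start` parameter and the example sequence are assumptions introduced here, and the search covers the heptamer alone, not the spacer or the downstream stimulatory structure. Following the examples in the text (U7, U.UUU.UUA), X, Y and Z are allowed to be identical.

```python
import re

# X.XXY.YYZ: three identical bases, three identical bases, any base.
# The lookahead lets overlapping candidates be reported.
HEPTAMER = re.compile(r"(?=(([ACGU])\2\2([ACGU])\3\3[ACGU]))")


def find_shifty_heptamers(mrna, orf_start=0):
    """Return candidate -1 slippery heptamers that sit in the zero frame.

    mrna      -- RNA sequence (A, C, G, U), written 5' to 3'
    orf_start -- index of the first base of the zero-frame ORF (assumption:
                 the caller knows the reading frame)
    """
    hits = []
    for match in HEPTAMER.finditer(mrna):
        i = match.start()
        # The zero-frame codons are XXY and YYZ, so the codon boundary
        # falls one base after the start of the heptamer.
        if (i + 1 - orf_start) % 3 != 0:
            continue
        heptamer = match.group(1)
        hits.append({
            "index": i,
            "heptamer": heptamer,
            # P-site and A-site codons before slippage
            "zero_frame": (heptamer[1:4], heptamer[4:7]),
            # codons re-paired by the two tRNAs after the -1 shift
            "minus_one_frame": (heptamer[0:3], heptamer[3:6]),
        })
    return hits


if __name__ == "__main__":
    # Hypothetical ORF containing the U.UUA.AAC heptamer mentioned in the text
    for hit in find_shifty_heptamers("AUGGCUUUAAACGGC"):
        print(hit)
```

The frame check matters because the same seven nucleotides in another frame do not place XXY and YYZ in the P- and A-sites.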

3 Modified nucleotides in tRNA also affect stop codon readthrough efficiency When a stop codon is presented in the A-site of the ribosome, specific release factors bind to the ribosome and trigger the hydrolysis of the peptidyl chain of the peptidyl-tRNA that is present in the P-site. The termination process is usually very accurate, the probability of a readthrough event being estimated to be as low as 0.001-0.005% in both Bacteria and Eukarya. The machinery that determines such an efficient process is rather complex (for reviews see Murgola et al. 2000; Wilson et al. 2000; Bertram et al. 2001). Again, as discussed above for frameshifting, a combination of several factors in cis and in trans of the stop codon (readthrough signals) can considerably influence the accuracy of the termination process and enhance the propensity of a given stop codon to be read (in fact miscoded) by a normal elongator tRNA instead of the expected release factor. Among these factors are the stop codon itself, UGA being more 'leaky' than UAG, which in turn terminates less efficiently than UAA (Lovett et al. 1991; Manuvakhova et al. 2000; Bidou et al. 2004), the surrounding nucleotide context, up to 6 nucleotides downstream and 2 nucleotides upstream of the stop codon (Bonetti et al. 1995; Bertram et al. 2001; Namy et al. 2001; Harrell et al. 2002; Tork et al. 2004) and the presence (type and abundance) of tRNAs able to decode a given stop codon (Chittum et al. 1998). The ability of these natural suppressor tRNAs to compete with the release factor by reading a stop codon also depends very much on their modified nucleotide content, especially in the anticodon branch (Beier and Grimm 2001). After stop codon readthrough, translation continues in the original reading frame and results in synthesis of a longer protein with potentially new biochemical properties.
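The consequence of readthrough stated above (translation continues in the original frame and yields a longer protein) can be illustrated with a toy translation routine. This is a minimal sketch under stated assumptions: the function `translate`, its `readthrough_aa` parameter and the short mRNA are hypothetical, only the first stop codon is allowed to be suppressed, and nothing about efficiency or context is modelled.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, readthrough_aa=None):
    """Translate an mRNA from its first base, in frame.

    If readthrough_aa is given, the first stop codon is decoded as that
    amino acid (a natural suppressor tRNA wins over the release factor)
    and translation continues in the same frame until the next stop.
    """
    protein = []
    suppressed = False
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            if readthrough_aa is None or suppressed:
                break  # normal termination
            aa = readthrough_aa  # stop codon misread by an elongator tRNA
            suppressed = True
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical mRNA carrying the CAA.UAG.CAA.GCA context of Fig. 6
    mrna = "AUGCAAUAGCAAGCAUAA"
    print(translate(mrna))                      # termination at UAG
    print(translate(mrna, readthrough_aa="Y"))  # UAG read by tRNATyr
```

With the example sequence, normal termination gives a two-residue product, whereas suppression of UAG by tyrosine extends it up to the next in-frame stop codon.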
Programmed readthrough of stop codons is used by a number of plant viruses to express their replicase domain in the form of a fusion protein (Beier and Grimm 2001). Again, as described above for viral frameshifting, most of the stop codon readthrough events described are found in viruses infecting eukaryotic cells and are used to control gene expression (Blum et al. 1989; Feng et al. 1992; Zerfass and Beier 1992a; Li and Rice 1993). Several examples of stop codon readthrough


events have been recently identified in eukaryotic cellular genes, and with the growing number of fully sequenced genomes becoming available, more examples are expected to be identified (reviewed in Namy et al. 2004). In bacteria, excluding the very efficient selenocysteine and pyrrolysine incorporation in response to a stop codon, no obvious programmed stop codon readthrough has been identified to date. In Eukaryotes, several naturally occurring cytoplasmic tRNAs have been shown to recognize stop codons involved in programmed translational readthrough events (see references below). In all cases, stop codon recognition implies non-orthodox base pairing between the second or the third base of the anticodon and the first or second base of the codon. The probability of such miscoding is highly dependent on the presence or absence of modified nucleotides in the anticodon (first and/or second position) and/or in position 37, adjacent to the third base-36 of the anticodon: i. Eukaryotic tRNATyr normally decodes exclusively UAC/U codons, except in certain cells where it can also efficiently read both the UAG and UAA stop codons, despite a GxG or GxA clash between the first wobble base of the anticodon and the third base of the codon. This phenomenon, originally discovered by studying translation of the Tobacco mosaic virus (TMV), was later also observed in many other plant viruses for which the expression of the polymerase domain depends on efficient readthrough of a characteristic stop codon (Beier et al. 1984). The anticodon of tRNATyr from tobacco and wheat leaves is GΨA with m1G37, whereas in tRNATyr of wheat germ, the anticodon is doubly modified into QΨA with m1G37. Interestingly, only tRNATyr (GΨA with m1G37) from tobacco and wheat leaves can efficiently translate the stop codon of TMV in vitro, but not tRNATyr (QΨA) from wheat germ. Likewise, natural tRNATyr of S. 
cerevisiae (anticodon GΨA), that is naturally devoid of Q34 but contains Ψ35 in the middle of the anticodon and i6A37 instead of m1G37, was shown to be an efficient suppressor of the TMV UAG stop codon, while the same tRNATyr in which the GΨA anticodon was replaced by unmodified GUA became incompetent for UAG stop codon reading in the TMV context (Zerfass and Beier 1992b). These results clearly indicate that the Ψ35 modification is a major determinant for tRNATyr to suppress the UAG stop codon, and that the Q modification in position 34 of the same tRNA counteracts the property of Ψ35. ii. Eukaryotic cytoplasmic tRNAGln are known to be able to suppress UAG and UAA stop codons either in vitro or in vivo (Pure et al. 1985; Kuchino et al. 1987; Kuchino and Muramatsu 1996; Hoja et al. 1998; Namy et al. 2002). Two isoacceptors with either CUG or UmUG anticodons are found in mammalian cells (Sprinzl et al. 1998). These tRNAGln can each read one of the two UAG/UAA stop codons, thus involving a noncanonical G*U wobble base pair between the third anticodon base (position 36) and the first codon base. It is noteworthy that the sequenced tRNAGln from mouse and tobacco carry an unmodified A in position 37 (instead of the usual m1G37 adjacent to a G36), which is believed to favor such an unconventional G*U wobble base pair as stated above (Weissenbach and Grosjean 1981). Moreover, one of the two tRNAGln isoacceptors harbors a 2'-O-methylribose at position 34, which was demonstrated to strengthen base pairing between codon and anticodon (Satoh et al. 2000). iii. Likewise, two natural tRNATrp suppressors of the UGA stop codon, with a CmCA anticodon, have been isolated from plants, one cytoplasmic and one from the chloroplast. The cytoplasmic tRNA carries an m1G at position 37, whereas its chloroplast counterpart has either an i6A or ms2i6A derivative at position 37 (Beier and Grimm 2001). It has been shown that both the 2'-O-methylribose and the isopentenyl derivative of A37 stabilize codon/anticodon interactions (Houssier and Grosjean 1985; Satoh et al. 2000; reviewed in Grosjean et al. 1998), thus allowing the unconventional CmxA base pairing between the first wobble anticodon position and the third codon base. Interestingly, in vertebrate reticulocytes, β-globin is naturally extended beyond its UGA stop codon by multiple suppressions and translational reading gaps. Identification of the amino acids inserted in response to UGA has shown the presence of serine, tryptophan, cysteine, and arginine (Chittum et al. 1998). Also, three peptides result from translational reading gaps as they lack an amino acid or amino acids corresponding to UGA and/or one or two of the immediate downstream codons. Clearly, bypass of a stop codon may involve several “natural” tRNA suppressors, some of which are probably better suppressors than others, depending in part on the sequence of their anticodon but also on their modified nucleotide content. Figure 6A illustrates the case where the stop codon UAG in a context similar to the TMV readthrough site has been shown to be misread in S. cerevisiae by tRNA isoacceptors corresponding to Tyr, Lys, and Trp, yet with different efficiencies (Fearon et al. 1994).
iv. So far, only in plants has a cytoplasmic suppressor tRNAArg bearing the anticodon U*CG (where U* stands for mcm5U and/or mcm5s2U) been identified as an efficient natural UGA suppressor in the PEMV (Pea enation mosaic virus) context in a wheat germ extract (Baum and Beier 1998). Readthrough of the UGA codon of Sindbis virus has been observed in cultured cells of chicken, human, and insect. The possibility exists that the tRNAArg (U*CG, where U* stands for mcm5U and/or mcm5s2U) which is present in these cells is responsible for the UGA readthrough; however, direct evidence is still lacking (Takkinen 1986; Li and Rice 1989). v. The presence or the absence of a modified nucleotide not only in the anticodon, but also in another position of the tRNA molecule can control suppressor efficiency. We have recently shown that lack of pseudouridylation at positions 38 or 39 of the anticodon branch decreases readthrough efficiency of stop codons in S. cerevisiae. Indeed, deletion of the PUS3 gene, responsible for the formation of pseudouridines at these positions, affects readthrough of all three stop codons placed in the TMV context. Because all three known natural suppressors of stop codons in S. cerevisiae, i.e. tRNATrp(CmCA.A37), tRNATyr(GΨA.i6A37) and tRNALys(CUU.t6A37) (Fig. 6A and 6B; Fearon et al. 1994) harbor a pseudouridine at position 39, it was


concluded that this modification improves the decoding efficiency of stop codons, probably because the Ψ39-containing tRNA allows a stronger interaction between the codon and the anticodon (Davis et al. 1998; Yarian et al. 1999), a situation that is particularly important for miscoding at stop codons by natural tRNAs. vi. Among various types of recoding events that occur at a stop codon, cotranslational incorporation of selenocysteine in response to UGA is certainly the most spectacular and the best studied one. It requires a specialized tRNASerSec bearing a U*mCA.i6A37 anticodon complementary to UGA (where U* stands for mcm5U34 in mammalian tRNA and a nonmodified U34 in E. coli), a specific elongation factor and a protein that recognizes a secondary structure in the mRNA designated SECIS, located downstream of the UGA codon. Altogether, these cis- and trans-acting factors drive efficient incorporation of selenocysteine by the tRNASerSec at the UGA stop codon (for review see Walczak et al. 1996; Tujebajeva et al. 2000; Hatfield and Gladyshev 2002; Driscoll and Copeland 2003). Interestingly, even though the anticodon of tRNASerSec is strictly complementary to the UGA stop codon (as a true suppressor tRNA), both the isopentenyl group on adenosine-37 (i6A37) and the 2'-O-methyl group on U34 were nevertheless demonstrated to be important for efficient reading of the UGA within the recoding cassette (Warner et al. 2000; Jameson and Diamond 2004). The possibility also exists that the level of 2'-O-methylation of U34 in response to a specific metabolic stress, such as limitation of available free selenium or selenium derivatives, plays a role in regulating the expression of a group of important cellular selenocysteine-containing proteins (Jameson and Diamond 2004). The chapter by Rubio and Alfonzo in this volume describes another interesting case of UGA stop codon recoding, which occurs in mitochondria of the trypanosomatid Leishmania tarentolae. 
In this case, a nuclear encoded tRNATrp (anticodon CCA.i6A37) is first imported into the mitochondria where a fraction of the imported tRNA (about 50%) becomes modified at positions 32 and 33 into 2'-O-methyl- and 2-thiolated uridine derivatives respectively, before C34 at the anticodon is edited into U34 and subsequently modified again into 2'-O-methyl-U34. The resulting multi-step modified/edited tRNATrp (with the modified anticodon UmCA) is now a true UGA suppressor, making canonical Watson-Crick base pairs between codon and anticodon, exactly as with tRNASerSec involved in selenocysteine incorporation, where the 2'-O-methylation of U34 also plays a major role in the efficiency of decoding (see above).

4 Conclusions and Perspectives Codon recognition by tRNAs plays a central role in decoding the genetic message. This process occurs on the ribosome and involves many steps and components of the translation machinery. Obviously, complexity is required to maintain high


Fig. 6. Stop codon readthrough. Part A: Several natural suppressor tRNAs are able to read the UAG stop codon. Among them, three have been identified by protein sequencing in S. cerevisiae, using the CAA.UAG.CAA.GCA readthrough context, similar to the well known TMV readthrough site (stop codon is underlined). Relative proportion of each of the amino acids identified is indicated in brackets. A common feature of all these tRNAs is the presence of a mismatch (noncanonical base pairing) in the codon/anticodon base pair (GxG, UxU or CxA), as well as the presence of a Ψ residue at position 39. The absence of this Ψ39 reduces the efficiency of readthrough in a yeast system (see text). In part B, are shown the anticodon of natural tRNAs of S. cerevisiae that can theoretically make a “genetically incorrect” mismatch with one of the bases of the UAG stop codon. These bases are indicated in bold and underlined. Grey boxes correspond to the anticodons of tRNAs that were demonstrated to induce readthrough in a TMV context (Part A).

fidelity and efficiency of the decoding process. In the present review, we summarized what is known about the role of modified nucleotides in tRNA involved in ‘recoding’ processes (mainly frameshifting and stop codon readthrough). These recoding events can be very efficient and have been selected and optimized during evolution. Comparative studies of how most codons in mRNAs are faithfully decoded in frame while in certain contexts selected codons are efficiently recoded (miscoded) should shed light on the molecular basis of translational accuracy. They may also help identify the elements within tRNAs and rRNAs that allow a ribosome either to shift or to bypass a termination codon in organisms of the three domains of Life (Bacteria, Archaea, and Eukarya). 4.1 Decoding rules of recoding process are special The main difference between a highly efficient recoding process at specific recoding signals and a highly efficient decoding event during translation of mRNA is that the rules underlying the interactions between a codon and an anticodon on the ribosome are different. According to the genetic code, interactions between codon and anticodon during normal mRNA translation involve strict Watson-Crick base pairing between the first two bases of the codon and the last two bases of the anticodon, and only the first base of the anticodon can accommodate (wobble with) a different third base of the codon, following the so-called “wobble rules” (Crick 1966; reviewed in Agris 2004). In contrast, recoding phenomena involving natural suppressor tRNAs often imply noncanonical base pairing, usually between the first base of the codon and the third base of the anticodon (with or without normal wobble base pairing on the other side of the codon-anticodon interaction) or, in very few cases, a unique mismatch in the middle of the codon-anticodon pair. This fundamental difference may explain why some features, such as the presence or absence of a given modified nucleotide in the anticodon of a tRNA, may differently affect the efficiency of the normal decoding process as estimated in ‘classic genetic experiments’ and the efficiency of the recoding process.
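The contrast between normal decoding and recoding described in section 4.1 can be made concrete by labelling each codon-anticodon base pair. The sketch below is only an illustration under simplifying assumptions: the function `classify_pairing` and its labels are hypothetical names, only Crick's original wobble pairs (G with U, U with G, and inosine with U, C or A) are accepted at the wobble position, and modified nucleotides are written as their parent base (for example Ψ as U), so their effects on pairing are not modelled.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# (anticodon base 34, codon base 3) pairs of the classic wobble rules
WOBBLE = {("G", "U"), ("U", "G"), ("I", "U"), ("I", "C"), ("I", "A")}


def classify_pairing(codon, anticodon):
    """Label the three base pairs of a codon-anticodon duplex.

    Both sequences are written 5' to 3', so codon base 1 faces anticodon
    base 36 (index 2) and codon base 3 faces anticodon base 34 (index 0).
    Returns one label per codon position: "WC", "wobble" or "noncanonical".
    """
    labels = []
    for codon_pos in (1, 2, 3):
        codon_base = codon[codon_pos - 1]
        anticodon_base = anticodon[3 - codon_pos]
        pair = (anticodon_base, codon_base)
        if pair in WATSON_CRICK:
            labels.append("WC")
        elif codon_pos == 3 and pair in WOBBLE:
            labels.append("wobble")
        else:
            labels.append("noncanonical")
    return labels


if __name__ == "__main__":
    # tRNATyr (anticodon written GUA) on a cognate codon and on UAG
    print("UAU / GUA:", classify_pairing("UAU", "GUA"))
    print("UAG / GUA:", classify_pairing("UAG", "GUA"))
    # tRNAGln (anticodon CUG) reading the UAG stop codon
    print("UAG / CUG:", classify_pairing("UAG", "CUG"))
```

In this scheme the tRNAGln example shows the G*U pair at the first codon position, and the tRNATyr example shows the GxG clash at the third, the two kinds of non-orthodox pairing discussed in the text.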
4.2 Trans-recoding elements are complex and difficult to identify To date much effort to understand the mechanism of recoding events has concentrated on the elucidation of the cis-elements that direct the ribosome to bypass a stop codon, or to shift out of the normal decoding frame. This led to the identification of several types of recoding and stimulatory signals such as slippery sequences, codon context, and secondary structures downstream of the recoding cassette as well as upstream SD-like sequences in Bacteria. In all systems studied so far, a pause during decoding is a critical element of recoding efficiency (Farabaugh 1996). Identification of the trans-acting elements that stimulate the same recoding events has been more challenging. In a given cell, elements that may affect de/recoding are: i) in the ribosome itself, with the rRNA containing many characteristic modified nucleotides, mainly in the decoding- and the peptidyl-sites; ii) in specific normally modified isoacceptor tRNAs (type and abundance); iii) in proteins and/or specific factors acting during the frameshifting or termination process. As far as RNA modification is concerned, sequencing tRNAs and rRNAs from many organisms has revealed their complexity and variability (type, location). De-


tailed biochemical and genetic analyses have revealed a modulation of the level of certain modifications in tRNAs in response to physiological stresses or biochemical constraints. In other words, a given nucleotide in an RNA (tRNA or rRNA) is not necessarily fully modified, as may appear from inspection of RNA modification databanks. Such information is, however, difficult to obtain and even impossible unless specific analyses designed to address this important question are presented. Only in few cases, has this type of analysis been done (Persson 1993; Yu et al. 1997; Björk and Rasmuson 1998; Winkler 1998). Also, ribosomes and translation factors from organisms of each of the three domains of Life are not identical and important differences exist. The majority of these trans-elements are therefore species-, sometimes cell-specific and not necessarily interchangeable in reconstituted in vitro and in vivo heterologous systems. However, several good model systems (natural or synthetic with artificial reporter mRNA) have been developed to study various aspects of the recoding process in Bacteria (E. coli or S. typhimurium) and Eukarya (mammalian cells in culture or S. cerevisiae) transformed or not with ad hoc plasmids. The use of rabbit reticulocyte lysates and wheat germ extracts has also been instrumental in developing tools to study recoding processes. Using these tools, it has been shown that wellcharacterized recoding systems, that are efficient in the homologous translation system, when transposed to another type of cell (such as between yeast and bacteria), no longer work or at least with much lower efficiency. This is the case of the Ty1 frameshifting site that is completely inefficient in mammalian cells (Stahl et al. 1995). Therefore, what we know from one type of cell may not necessarily apply to another type of cell. However, in other cases, recoding signals from one organism work equally well in heterologous systems. 
Although the same precise mechanism is not necessarily involved in different species, this implies that the major determinants of recoding have been conserved during evolution. Indeed, the readthrough signal of TMV, initially characterized in plant cells, also works in mouse cells and even better in yeast (Skuzeski et al. 1991; Stahl et al. 1995; Cassan and Rousset 2001) and similarly, at least some retroviral -1 frameshift sites also act in E. coli (Horsfield et al. 1995; Leger et al. 2004). At a given recoding signal for frameshifting or stop codon readthrough, different alternative or even competitive mechanism(s) may exist. In the few cases analyzed in which a stop codon was recoded into a natural amino acid, it was shown that multisuppression phenomena exist, albeit with different relative efficiencies, one of the natural ‘suppressor’ tRNAs being always more efficient than the other(s) (Feng et al. 1989; Fearon et al. 1994; Chittum et al. 1998; see also Fig. 6A). Likewise, during codon reading, competition probably occurs between different tRNAs able to misread a given codon. Therefore, when a tRNA usually playing a major role in recoding becomes less efficient because of a defect in its modified nucleotide content, or because of a parameter that affects its cellular concentration, alternate natural suppressor tRNA(s) may take over recoding, thereby confusing the final interpretation of the data. Precise identification of the tRNA(s) involved in each of the recoding events identified so far is no easy task. Only in a few cases has the precise trans-acting


tRNA(s) responsible for a recoding event been identified. Given the rapid development of crystallization techniques that allow a detailed molecular view, at a resolution of a few angstroms, of the ribosome complexed with tRNA in the A-, P- and/or E-site(s) and associated with a fragment of mRNA, one can hope that the 3D-structure of a ribosome stalled at a characteristic recoding signal will soon be solved. This might reveal important clues as to how a miscoding type of recoding process, as opposed to a normal correct decoding process, occurs on the ribosome. A first step in this direction was taken recently by Yusupov and collaborators, who showed how an mRNA regulatory domain upstream of a coding region can regulate the progression of the ribosome along the mRNA (Yusupova et al. 2001; Noller et al. 2002; M Yusupov, personal communication). 4.3 Role of modified nucleotides in both tRNA and rRNA As is clear from the data reviewed above, the presence or absence of a given modified base in a tRNA may either stimulate or reduce the efficiency of recoding. This apparently contradictory dual effect depends on the system considered and the position of the shifty/miscoding tRNA not only in the A-site and/or P-site but also in the E-site of the ribosome. Indeed, recent studies point to the importance of a fully modified tRNA in the E-site for -1 frameshifting (Bekaert and Rousset 2005). To have a clear view of how tRNA modification influences various recoding events, much more work must be done. Indeed, most information available on the role of modified nucleotides in tRNA concerns “normal” decoding processes and not recoding events, which, as stated above, are two different (although related) processes. In these studies, reporter systems used in a given biological model system should be carefully designed to address precise questions. 
The comparison of the results obtained with different reporter systems as well as in different biological systems should be made with much caution. Discrepancies in the scientific literature probably result in part from different reporters used in different laboratories. As a number of nucleotides are modified in each tRNA, it may well be that a defect in only one modified nucleotide in a tRNA is not sufficient to observe a measurable effect. The possibility exists that when combined with other modifications, located at other sites on the tRNA molecule, a more pronounced effect would be observed. Although most of the modifications that have been shown to affect recoding are located at or near the anticodon loop, more distant modifications within the tRNA molecule could also be important. Thanks to the progress that has been made concerning the structure of prokaryotic and eukaryotic ribosomes, the precise localization of modified sites in rRNAs demonstrated that most of these sites are located in important functional regions of the ribosome (Decatur and Fournier 2002; Ofengand 2002). In particular, in S. cerevisiae, 9 modified nucleotides are found near the A-, P- or E-site in the small subunit and 16 in the large subunit of the ribosome. These are likely candidates for a potential role in decoding accuracy. Since most single rRNA modifications appear to be dispensable for ribosome function, experimental studies on the effects


of rRNA modifications are likely to be feasible. The same tools that have been used to analyze tRNA modifications can now be used to study the role of rRNA modifications in recoding and decoding accuracy. Recent results demonstrate that the lack of U2552 methylation in the 23S rRNA of rrmJ-deficient E. coli strains leads to a decrease in programmed +1 and -1 translational frameshifting and in readthrough of UAA and UGA stop codons. This suggests that the interaction between aminoacyl-tRNA and U2552 is involved in the selection of the correct tRNA at the ribosomal A-site (Widerak et al. 2005). Finally, another possibility is that some effects might be dependent on the specific association of tRNA modifications with modifications of the rRNA. Associating tRNA and rRNA modification deficiencies in the same cell might reveal unexpected synergy between the role of tRNA and rRNA in decoding. Last but not least, to date, evidence for a functional recoding process in Archaea is very limited. The only case reported so far concerns a potential programmed -1 frameshifting in the α-L-fucosidase of Sulfolobus solfataricus (Cobucci-Ponzano et al. 2003). More systematic exploration of recoding in organisms of this third domain of Life, in particular of the importance of modified nucleotides in both tRNA and rRNA (see e.g. Edmonds et al. 1991; McCloskey et al. 2001), may reveal additional parameters and/or mechanisms of recoding, and possibly the evolutionary origin of recoding processes that are common to organisms of the three biological domains.

Acknowledgements We are grateful to Michaël Bekaert for help with the figures and to Anne-Lise Haenni for critical reading of the manuscript. This work was supported by the “Association pour la Recherche sur le Cancer” to JPR and HG and by the “Association Française contre les Myopathies” to JPR. Support from the CNRS (Programme Interdépartemental de Géomicrobiologie des Environnements Extrêmes, Geomex 2002-2003) to HG is also acknowledged. We thank Glenn Björk, Maurille Fournier and Jaunius Urbonavicius for stimulating discussions and suggestions.

References Adamski FM, Donly BC, Tate WP (1993) Competition between frameshifting, termination and suppression at the frameshift site in the Escherichia coli release factor-2 mRNA. Nucleic Acids Res 21:5074-5078 Agris PF (1996) The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function. Prog Nucleic Acid Res Mol Biol 53:79-129 Agris PF (2004) Decoding the genome: a modified view. Nucleic Acids Res 32:223-238 Atkins JF, Herr AJ, Massire C, O'Connor M, Ivanov I, Gesteland R (2000) Poking a hole in the sanctity of the triplet code: Inferences for framing. In: R. A. Garrett SRD, A. Liljas, A. T. Matheson, P. B. Moore, H. F. Noller (ed) The ribosome: Structure, function, antibiotics and cellular interactions. ASM Press, Washington, D.C. pp 369-384 Baranov PV, Gurvich OL, Hammer AW, Gesteland RF, Atkins JF (2003) RECODE 2003. Nucleic Acids Res 31:87-89 Baranov PV, Gesteland RF, Atkins JF (2002) Recoding: translational bifurcations in gene expression. Gene 286:187-201 Baum M, Beier H (1998) Wheat cytoplasmic arginine tRNA isoacceptor with a U*CG anticodon is an efficient UGA suppressor in vitro. Nucleic Acids Res 26:1390-1395 Beier H, Barciszewska M, Krupp G, Mitnacht R, Gross HJ (1984) UAG readthrough during TMV RNA translation: Isolation and sequence of two tRNAsTyr with suppressor activity from tobacco plants. EMBO J 3:351-356 Beier H, Grimm M (2001) Misreading of termination codons in eukaryotes by natural nonsense suppressor tRNAs. Nucleic Acids Res 29:4767-4782 Bekaert M, Rousset JP (2005) An extended signal involved in eukaryotic -1 frameshifting operates through modification of the E site tRNA. Mol Cell 17:61-68 Bertram G, Innes S, Minella O, Richardson J, Stansfield I (2001) Endless possibilities: translation termination and stop codon recognition. Microbiology 147:255-269 Bertrand C, Prere MF, Gesteland RF, Atkins JF, Fayet O (2002) Influence of the stacking potential of the base 3' of tandem shift codons on -1 ribosomal frameshifting used for gene expression. RNA 8:16-28 Beuning PJ, Musier-Forsyth K (1999) Transfer RNA recognition by aminoacyl-tRNA synthetases. Biopolymers 52:1-28 Bidou L, Hatin I, Perez N, Allamand V, Panthier JJ, Rousset JP (2004) Premature stop codons involved in muscular dystrophies show a broad spectrum of readthrough efficiencies in response to gentamicin treatment. Gene Ther 11:619-627 Björk GR (1995) Genetic dissection of synthesis and function of modified nucleosides in bacterial transfer RNA. 
Prog Nucleic Acid Res Mol Biol 50:263-338 Björk GR, Durand JM, Hagervall TG, Leipuviene R, Lundgren HK, Nilsson K, Chen P, Qian Q, Urbonavicius J (1999) Transfer RNA modification: influence on translational frameshifting and metabolism. FEBS Lett 452:47-51 Björk GR, Rasmuson T (1998) Links between tRNA modification and metabolism and modified nucleosides as tumor markers. In: Grosjean H, Benne R (eds) Modification and editing of RNA, ASM press. Washington D.C. pp 471-491 Blight SK, Larue RC, Mahapatra A, Longstaff DG, Chang E, Zhao G, Kang PT, GreenChurch KB, Chan MK, Krzycki JA (2004) Direct charging of tRNA(CUA) with pyrrolysine in vitro and in vivo. Nature 431:333-335 Blum H, Gross HJ, Beier H (1989) The expression of the TMV-specific 30-kDa protein in tobacco protoplasts is strongly and selectively enhanced by actinomycin. Virology 169:51-61 Bonetti B, Fu LW, Moon J, Bedwell DM (1995) The efficiency of translation termination is determined by a synergistic interplay between upstream and downstream sequences in Saccharomyces cerevisiae. J Mol Biol 251:334-345 Brégeon D, Colot V, Radman M, Taddei F (2001) Translational misreading: a tRNA modification counteracts a +2 ribosomal frameshift. Genes Dev 15:2295-2306 Brierley I, Meredith MR, Bloys AJ, Hagervall TG (1997) Expression of a coronavirus ribosomal frameshift signal in Escherichia coli: influence of tRNA anticodon modification on frameshifting. J Mol Biol 270:360-373

Translational Recoding and RNA Modifications 25 Bubienko E, Cruz P, Thomason JF, Borer PN (1983) Nearest-neighbor effects in the structure and function of nucleic acids. Prog Nucleic Acid Res Mol Biol 30:41-90 Buck M, Ames BN (1984) A modified nucleotide in tRNA as a possible regulator of aerobiosis: synthesis of cis-2-methyl-thioribosylzeatin in the tRNA of Salmonella. Cell 36:523-531 Buckingham RH, Grosjean H (1986) The accuracy of mRNA-tRNA recognition. In: Kirkwood TB, Rosenberger RF, Galas DJ (eds) Accuracy in Molecular Processes. Chapman and Hall Ltd, London, pp 83-115 Bujnicki JM, Droogmans L, Grosjean H, K. PS, Lapeyre B (2004) Bioinformatics-guided identification and experimental characterization of novel RNA methyltransferases. In: Bujnicki JM (ed) Practical Bioinformatics. Springer-Verlag, Berlin, Heidelberg pp 139-168 Burkhardt N, Junemann R, Spahn CM, Nierhaus KH (1998) Ribosomal tRNA binding sites: three-site models of translation. Crit Rev Biochem Mol Biol 33:95-149 Carlson BA, Mushinski JF, Henderson DW, Kwon SY, Crain PF, Lee BJ, Hatfield DL (2001) 1-Methylguanosine in place of Y base at position 37 in phenylalanine tRNA is responsible for its shiftiness in retroviral ribosomal frameshifting. Virology 279:130135 Carlson BA, Kwon SY, Lee BJ, Hatfield D (2000) Yeast asparagine (Asn) tRNA without Q base promotes eukaryotic frameshifting more efficiently than mammalian Asn tRNAs with or without Q base. Mol Cells 10:113-118 Cassan M, Delaunay N, Vaquero C, Rousset JP (1994) Translational frameshifting at the gag-pol junction of human immunodeficiency virus type 1 is not increased in infected T-lymphoid cells. J Virol 68:1501-1508 Cassan M, Rousset JP (2001) UAG readthrough in mammalian cells: Effect of upstream and downstream stop codon contexts reveal different signals. 
BMC Mol Biol 2:3 Chittum HS, Lane WS, Carlson BA, Roller PP, Lung FD, Lee BJ, Hatfield DL (1998) Rabbit beta-globin is extended beyond its UGA stop codon by multiple suppressions and translational reading gaps. Biochemistry 37:10866-10870 Cobucci-Ponzano B, Trincone A, Giordano A, Rossi M, Moracci M (2003) Identification of an archaeal alpha-L-fucosidase encoded by an interrupted gene. Production of a functional enzyme by mutations mimicking programmed -1 frameshifting. J Biol Chem 278:14622-14631 Craigen WJ, Caskey CT (1986) Expression of peptide chain release factor 2 requires highefficiency frameshift. Nature 322:273-275 Crick FH (1966) Codon--anticodon pairing: the wobble hypothesis. J Mol Biol 19(2):54855 Curran JF (1998) Modified nucleosides in translation. In: Grosjean H, Benne R (eds) Modification and editing of RNA, ASM Press, Washington, D.C., pp493-516 Dao V, Guenther R, Malkiewicz A, Nawrot B, Sochacka E, Kraszewski A, Jankowska J, Everett K, Agris PF (1994) Ribosome binding of DNA analogs of tRNA requires base modifications and supports the "extended anticodon". Proc Natl Acad Sci USA 91:2125-2129 Davis DR (1998) Biophysical and conformational properties of modified nucleosides in RNA (nuclear magnetic resonance studies). In: Grosjean H, Benne R (eds) Modification and editing of RNA, ASM Press, Washington, DC pp 85-102

26 Olivier Namy, François Lecointe, Henri Grosjean, and Jean-Pierre Rousset Davis DR, Veltri CA, Nielsen L (1998) An RNA model system for investigation of pseudouridine stabilization of the codon-anticodon interaction in tRNALys, tRNAHis and tRNATyr. J Biomol Struct Dyn 15:1121-1132 De Crécy-Lagard V (2004) Finding missing tRNA modification genes: a comparative goldmine. In: Bujnicki JM (ed) Practical Bioinformatics. Springer-Verlag, Berlin, Heidelberg pp 169-190 Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351 Driscoll DM, Copeland PR (2003) Mechanism and regulation of selenoprotein synthesis. Annu Rev Nutr 23:17-40 Edmonds CG, Crain PF, Gupta R, Hashizume T, Hocart CH, Kowalak JA, Pomerantz SC, Stetter KO, McCloskey JA (1991) Posttranscriptional modification of tRNA in thermophilic archaea (Archaebacteria). J Bacteriol 173:3138-3148 Farabaugh PJ (1996) Programmed translational frameshifting. Annu Rev Genet 30:507-528 Fearon K, McClendon V, Bonetti B, Bedwell DM (1994) Premature translation termination mutations are efficiently suppressed in a highly conserved region of yeast Ste6p, a member of the ATP-binding cassette (ABC) transporter family. J Biol Chem 269:17802-17808 Feng YX, Hatfield DL, Rein A, Levin JG (1989) Translational readthrough of the murine leukemia virus gag gene amber codon does not require virus-induced alteration of tRNA. J Virol 63:2405-2410 Feng YX, Yuan H, Rein A, Levin JG (1992) Bipartite signal for read-through suppression in murine leukemia virus mRNA: an eight-nucleotide purine-rich sequence immediately downstream of the gag termination codon followed by an RNA pseudoknot. J Virol 66:5127-5132 Forster C, Chakraburtty K, Sprinzl M (1993) Discrimination between initiation and elongation of protein biosynthesis in yeast: identity assured by a nucleotide modification in the initiator tRNA. 
Nucleic Acids Res 21:5679-5683 Gallant JA, Lindsley D (1992) Leftward ribosome frameshifting at a hungry codon. J Mol Biol 223:31-40 Gesteland RF, Atkins JF (1996) Recoding: dynamic reprogramming of translation. Annu Rev Biochem 65:741-768 Gesteland RF, Weiss RB, Atkins JF (1992) Recoding: Reprogrammed genetic decoding. Science 257:1640-1643 Giegé R, Sissler M, Florentz C (1998) Universal rules and idiosyncratic features in tRNA identity. Nucleic Acids Res 26:5017-5035 Grentzmann G, Ingram JA, Kelly PJ, Gesteland RF, Atkins JF (1998) A dual-luciferase reporter system for studying recoding signals. RNA 4:479-486 Grosjean H, Benne R (eds) (1998) Modification and editing of RNA, ASM press edn. ASM Press, Washington, D.C. Grosjean H, Houssier C, Romby P, Marquet R (1998) Modulatory role of modified nucleotides in RNA loop-loop interactions. In: Grosjean H, Benne R (eds) Modification and editing of RNA, ASM press, Washington D.C., pp 113-133 Grosshans H, Lecointe F, Grosjean H, Hurt E, Simos G (2001) Pus1p-dependent tRNA pseudouridinylation becomes essential when tRNA biogenesis is compromised in yeast. J Biol Chem 276:46333-46339 Hansen TM, Baranov PV, Ivanov IP, Gesteland RF, Atkins JF (2003) Maintenance of the correct open reading frame by the ribosome. EMBO Rep 4:499-504

Translational Recoding and RNA Modifications 27 Harger JW, Dinman JD (2003) An in vivo dual-luciferase assay system for studying translational recoding in the yeast Saccharomyces cerevisiae. RNA 9:1019-1024 Harrell L, Melcher U, Atkins JF (2002) Predominance of six different hexanucleotide recoding signals 3' of read-through stop codons. Nucleic Acids Res 30:2011-2017 Hatfield D, Feng YX, Lee BJ, Rein A, Levin JG, Oroszlan S (1989) Chromatographic analysis of the aminoacyl-tRNAs which are required for translation of codons at and around the ribosomal frameshift sites of HIV, HTLV-1, and BLV. Virology 173:736742 Hatfield DL, Gladyshev VN (2002) How selenium has altered our understanding of the genetic Hoja U, Wellein C, Greiner E, Schweizer E (1998) Pleiotropic phenotype of acetyl-CoAcarboxylase-defective yeast cells-viability of a BPL1-amber mutation depending on its readthrough by normal tRNA(Gln)(CAG). Eur J Biochem 254:520-526 Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. Genes Dev 17:162-180 Horsfield JA, Wilson DN, Mannering SA, Adamski FM, Tate WP (1995) Prokaryotic ribosomes recode the HIV-1 gag-pol-1 frameshift sequence by an E/P site posttranslocation simultaneous slippage mechanism. Nucleic Acids Res 23:1487-1494 Houssier C, Grosjean H (1985) Temperature jump relaxation studies on the interactions between transfer RNAs with complementary anticodons. The effect of modified bases adjacent to the anticodon triplet. J Biomol Struct Dyn 3:387-408 Huang B, Johansson JO, Byström AS (2005) An early step in wobble uridine tRNA modification requires the elongator complex. RNA (in press) Jacks T, Madhani HD, Masiarz HD, Varmus HE (1988) Signals for ribosomal frameshifting in the Rous sarcoma virus gag-pol region. Cell 55:447-458 Jameson RR, Diamond AM (2004) A regulatory role for Sec tRNA[Ser]Sec in selenoprotein synthesis. 
RNA 10:1142-1152 Kadaba S, Krueger A, Trice T, Krecic AM, Hinnebusch AG, Anderson J (2004) Nuclear surveillance and degradation of hypomodified initiator tRNAMet in S. cerevisiae. Genes Dev 18:1227-1240 Kuchino Y, Beier H, Akita N, Nishimura S (1987) Natural UAG suppressor glutamine tRNA is elevated in mouse cells infected with Moloney murine leukemia virus. Proc Natl Acad Sci USA 84:2668-2672 Kuchino Y, Muramatsu T (1996) Nonsense suppression in mammalian cells. Biochimie 78:1007-1015 Kurland C, Gallant J (1996) Errors of heterologous protein expression. Curr Opin Biotechnol 7:489-493 Kurland CG (1992) Translational accuracy and the fitness of bacteria. Annu Rev Genet 26:29-50 Lafontaine DL, Preiss T, Tollervey D (1998) Yeast 18S rRNA dimethylase Dim1p: a quality control mechanism in ribosome synthesis? Mol Cell Biol 18:2360-2370 Landick R, Turnbough CL, Yanofsky C (1996) Transcription attenuation. In: F.C. Neidhardt RC, E.C.C. Lin, K.B. Low, B. Magasanik, J.L. Ingraham, W.S. Reznikoff, M. Riley, M. Schaeffer, H.E. Umbarger (ed) Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edition. ASM Press, Washington, D.C. pp1263-1286 Larsen B, Peden J, Matsufuji S, Matsufuji T, Brady K, Maldonado R, Wills NM, Fayet O, Atkins JF, Gesteland RF (1995) Upstream stimulators for recoding. Biochem Cell Biol 73:1123-1129

28 Olivier Namy, François Lecointe, Henri Grosjean, and Jean-Pierre Rousset Lecointe F, Namy O, Hatin I, Simos G, Rousset JP, Grosjean H (2002) Lack of pseudouridine 38/39 in the anticodon arm of yeast cytoplasmic tRNA decreases in vivo recoding efficiency. J Biol Chem 277:30445-30453 Leger M, Sidani S, Brakier-Gingras L (2004) A reassessment of the response of the bacterial ribosome to the frameshift stimulatory signal of the human immunodeficiency virus type 1. RNA 10:1225-1235 Li G, Rice CM (1993) The signal for translational readthrough of a UGA codon in Sindbis virus RNA involves a single cytidine residue immediately downstream of the termination codon. J Virol 67:5062-5067 Li GP, Rice CM (1989) Mutagenesis of the in-frame opal termination codon preceding nsP4 of Sindbis virus: studies of translational readthrough and its effect on virus replication. J Virol 63:1326-1337 Lovett PS, Ambulos NP Jr, Mulbry W, Noguchi N, Rogers EJ (1991) UGA can be decoded as tryptophan at low efficiency in Bacillus subtilis. J Bacteriol 173:1810-1812 Manuvakhova M, Keeling K, Bedwell DM (2000) Aminoglycoside antibiotics mediate context-dependent suppression of termination codons in a mammalian translation system. RNA 6:1044-1055 Marck C, Grosjean H (2002) tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA 8:1189-1232 Marczinke B, Hagervall T, Brierley I (2000) The Q-base of asparaginyl-tRNA is dispensable for efficient -1 ribosomal frameshifting in eukaryotes. J Mol Biol 295:179-191 Marquez V, Wilson DN, Tate WP, Triana-Alonso F, Nierhaus KH (2004) Maintaining the ribosomal reading frame: the influence of the E site during translational regulation of release factor 2. Cell 118:45-55 McCloskey JA, Rozenski J (2005) The small subunit rRNA modification database. 
Nucleic Acids Res (in press) McCloskey JA, Graham DE, Zhou S, Crain PF, Ibba M, Konisky J, Soll D, Olsen GJ (2001) Post-transcriptional modification in archaeal tRNAs: identities and phylogenetic relations of nucleotides from mesophilic and hyperthermophilic Methanococcales. Nucleic Acids Res 29:4699-4706 Motorin Y, Grosjean H (1999) Multisite-specific tRNA:m5C-methyltransferase (Trm4) in yeast Saccharomyces cerevisiae: identification of the gene and substrate specificity of the enzyme. RNA 5:1105-1118 Murgola EJ (1985) tRNA, suppression, and the code. In: Inc AR (ed) Annual Review of Genetics, pp 57-80 Murgola EJ (1995) Translational suppression: when two wrongs do make a right. In: D. Söll and UL RajBhandary (eds) tRNA: structure, biosynthesis and function. ASM Press, Washington, D.C. pp 491-509 Murgola EJ, Arkov AL, Chernyaeva NS, Hedenstierna KOF, Pagel FT (2000) rRNA functional sites and structures for peptide chain termination. In: R. A. Garrett SRD, A. Liljas, A. T. Matheson, P. B. Moore, H. F. Noller (eds) The ribosome: Structure, function, antibiotics and cellular interactions. ASM Press, Washington, D.C. pp509-518 Namy O, Hatin I, Stahl G, Liu H, Barnay S, Bidou L, Rousset JP (2002) Gene overexpression as a tool for identifying new trans-acting factors involved in translation termination in Saccharomyces cerevisiae. Genetics 161:585-594 Namy O, Hatin I, Rousset JP (2001) Impact of the six nucleotides downstream of the stop codon on translation termination. EMBO Rep 2:787-793

Translational Recoding and RNA Modifications 29 Namy O, Rousset JP, Napthine S, Brierley I (2004) Reprogrammed genetic decoding in cellular gene expression. Mol Cell 13:157-168 Napthine S, Vidakovic M, Girnary R, Namy O, Brierley I (2003) Prokaryotic-style frameshifting in a plant translation system: conservation of an unusual single-tRNA slippage event. EMBO J 22:3941-3950 Nierhaus KH, Spahn CM, Burkhardt N, Dabrowski M, Diedrich G, Einfeldt E, Kamp D, Marquez V, Patzke S, Schafer MA, Stelzl U, Blaha G, Willumeit R, B. SH (2000) Ribosomal elongation cycle. In: R. A. Garrett SRD, A. Liljas, A. T. Matheson, P. B. Moore and H. F. Noller (ed) The ribosome: Structure, function, antibiotics and cellular interactions. ASM Press, Washington, D.C. Noller HF, Yusupov MM, Yusupova GZ, Baucom A, Cate JH (2002) Translocation of tRNA during protein synthesis. FEBS Lett 514:11-16 Ofengand J (2002) Ribosomal RNA pseudouridines and pseudouridine synthases. FEBS Lett 514:17-25 Ofengand J, Del Campo M (2005) Modified nucleotides of Escherichia coli ribosomal RNA. In: F.C. Neidhardt RC, E.C.C. Lin, K.B. Low, B. Magasanik, J.L. Ingraham, W.S. Reznikoff, M. Riley, M. Schaeffer, H.E. Umbarger eds Escherichia coli and Salmonella: Cellular and Molecular Biology, 3rd edition. ASM Press, Washington, D.C., (in press) Ogle JM, Carter AP, Ramakrishnan V (2003) Insights into the decoding mechanism from recent ribosome structures. Trends Biochem Sci 28:259-266 Paul CP, Barry JK, Dinesh-Kumar SP, Brault V, Miller WA (2001) A sequence required for -1 ribosomal frameshifting located four kilobases downstream of the frameshift site. J Mol Biol 310:987-999 Persson BC (1993) Modification of tRNA as a regulatory device. Mol Microbiol 8:10111016 Pure GA, Robinson GW, Naumovski L, Friedberg EC (1985) Partial suppression of an ochre mutation in Saccharomyces cerevisiae by multicopy plasmids containing a normal yeast tRNAGln gene. 
J Mol Biol 183:31-42 Reil H, Hoxter M, Moosmayer D, Pauli G, Hauser H (1994) CD4 expressing human 293 cells as a tool for studies in HIV-1 replication: the efficiency of translational frameshifting is not altered by HIV-1 infection. Virology 205:371-375 Rodnina MV, Daviter T, Gromadski K, Wintermeyer W (2002) Structural dynamics of ribosomal RNA during decoding on the ribosome. Biochimie 84:745-754 Rodnina MV, Pape T, Savelsbergh A, Mohr D, Matassova NB, Wintermeyer W (2000) Mechanisms of partial reactions of the elongation cycle catalyzed by elongation factors Tu and G. In: R. A. Garrett SRD, A. Liljas, A. T. Matheson, P. B. Moore, H. F. Noller (eds) The ribosome: Structure, function, antibiotics and cellular interactions. ASM Press, Washington, D.C. pp301-317 Rodnina MV, Wintermeyer W (2001) Ribosome fidelity: tRNA discrimination, proofreading and induced fit. Trends Biochem Sci 26:124-130 Rozenski J, Crain PF, McCloskey JA (1999) The RNA modification database: 1999 update. Nucleic Acids Res 27:196-197 Satoh A, Takai K, Ouchi R, Yokoyama S, Takaku H (2000) Effects of anticodon 2'-Omethylations on tRNA codon recognition in an Escherichia coli cell-free translation. RNA 6:680-686 Schmeing TM, Moore PB, Steitz TA (2003) Structures of deacylated tRNA mimics bound to the E site of the large ribosomal subunit. RNA 9:1345-1352

30 Olivier Namy, François Lecointe, Henri Grosjean, and Jean-Pierre Rousset Skuzeski JM, Nichols LM, Gesteland RF, Atkins JF (1991) The signal for a leaky UAG stop codon in several plant viruses includes the two downstream codons. J Mol Biol 218:365-373 Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153 Stahl G, Ben Salem S, Li Z, McCarty G, Raman A, Shah M, Farabaugh PJ (2001) Programmed +1 translational frameshifting in the yeast Saccharomyces cerevisiae results from disruption of translational error correction. Cold Spring Harb Symp Quant Biol 66:249-258 Stahl G, Bidou L, Rousset JP, Cassan M (1995) Versatile vectors to study recoding: conservation of rules between yeast and mammalian cells. Nucleic Acids Res 23:15571560 Steitz TA, Moore PB (2003) RNA, the first macromolecular catalyst: the ribosome is a ribozyme. Trends Biochem Sci 28:411-418 Takai K, Yokoyama S (2003) Roles of 5-substituents of tRNA wobble uridines in the recognition of purine-ending codons. Nucleic Acids Res 31:6383-6391 Takkinen K (1986) Complete nucleotide sequence of the nonstructural protein genes of Semliki Forest virus. Nucleic Acids Res 14:5667-5682 Tork S, Hatin I, Rousset JP, Fabret C (2004) The major 5' determinant in stop codon readthrough involves two adjacent adenines. Nucleic Acids Res 32:415-421 Tujebajeva RM, Copeland PR, Xu XM, Carlson BA, Harney JW, Driscoll DM, Hatfield DL, Berry MJ (2000) Decoding apparatus for eukaryotic selenocysteine insertion. EMBO Rep 1:158-163 Tzeng TH, Tu CL, Bruenn JA (1992) Ribosomal frameshifting requires a pseudoknot in the Saccharomyces cerevisiae double-stranded RNA virus. J Virol 66:999-1006 Urbonavicius J, Stahl G, Durand JM, Ben Salem SN, Qian Q, Farabaugh PJ, Bjork GR (2003) Transfer RNA modifications that alter +1 frameshifting in general fail to affect -1 frameshifting. 
RNA 9:760-768 Urbonavicius J, Qian Q, Durand JM, Hagervall TG, Björk GR (2001) Improvement of reading frame maintenance is a common function for several tRNA modifications. EMBO J 20:4863-4873 Walczak R, Westhof E, Carbon P, Krol A (1996) A novel RNA structural motif in the selenocysteine insertion element of eukaryotic selenoprotein mRNAs. RNA 2:367-379 Warner GJ, Berry MJ, Moustafa ME, Carlson BA, Hatfield DL, Faust JR (2000) Inhibition of selenoprotein synthesis by selenocysteine tRNASec lacking isopentenyladenosine. J Biol Chem 275:28110-28119 Weissenbach J, Dirheimer G, Falcoff R, Sanceau J, Falcoff E (1977) Yeast tRNALeu (anticodon UAG) translates all six leucine codons in extracts from interferon treated cells. FEBS Lett 82:71-76 Weissenbach J, Grosjean H (1981) Effect of threonylcarbamoyl modification (t6A) in yeast tRNA Arg III on codon-anticodon and anticodon-anticodon interactions. A thermodynamic and kinetic evaluation. Eur J Biochem 116:207-213 Widerak M, Kern R, Malki A, Richarme G (2005) U2552 methylation at the ribosomal Asite is a negative modulator of translational accuracy. Gene (in press) Wilson DN, Dalphin ME, Pel HJ, Major LL, Mansell JB, Tate W (2000) Factor-mediated termination of protein synthesis: a welcome return to the mainstream of translation. In: R. A. Garrett SRD, A. Liljas, A. T. Matheson, P. B. Moore, H. F. Noller (eds) The ri-

Translational Recoding and RNA Modifications 31 bosome: Structure, function, antibiotics and cellular interactions. ASM Press, Washington, D.C. pp 495-508 Winkler ME (1998) Genetics and regulation of base modification in the tRNA and rRNA of prokaryotes and eukaryotes. In: Grosjean H, Benne R (eds) Modification and editing of RNA, ASM Press, Washington, D.C. pp 441-469 Yarian CS, Basti MM, Cain RJ, Ansari G, Guenther RH, Sochacka E, Czerwinska G, Malkiewicz A, Agris PF (1999) Structural and functional roles of the N1- and N3protons of psi at tRNA's position 39. Nucleic Acids Res 27:3543-3549 Yarus M (1982) Translational efficiency of transfer RNA's: uses of an extended anticodon. Science 218:646-652 Yokoyama S, Nishimura S (1995) Modified nucleosides and codon recognition. In: D. Söll and UL RajBhandary (eds) tRNA: structure, biosynthesis and function. ASM Press, Washington, D.C. pp 207-223 Yu YT, Shu MD, Steitz JA (1997) A new method for detecting sites of 2’-O-methylation in RNA molecules. RNA 3:324-331 Yusupova GZ, Yusupov MM, Cate JH, Noller HF (2001) The path of messenger RNA through the ribosome. Cell 106:233-241 Zerfass K, Beier H (1992a) The leaky UGA termination codon of tobacco rattle virus RNA is suppressed by tobacco chloroplast and cytoplasmic tRNAs(Trp) with CmCA anticodon. EMBO J 11:4167-4173 Zerfass K, Beier H (1992b) Pseudouridine in the anticodon G-psi-A of plant cytoplasmic tRNA(Tyr) is required for UAG and UAA suppression in the TMV-specific context. Nucleic Acids Res 20:5911-5918

Grosjean, Henri Laboratoire d’Enzymologie et Biochimie Structurales, CNRS, F-91198 Gif-sur-Yvette, France
Lecointe, François Laboratoire d’Enzymologie et Biochimie Structurales, CNRS, F-91198 Gif-sur-Yvette, France
Namy, Olivier Institut de Génétique et Microbiologie, Université Paris-Sud, F-91405 Orsay cedex, France
Rousset, Jean-Pierre Institut de Génétique et Microbiologie, Université Paris-Sud, F-91405 Orsay cedex, France


Data bases
http://recode.genetics.utah.edu/
http://medstat.med.utah.edu/RNAmods
http://www.unibareuth.de/departments/biochimie/rna
http://medstat.med.utah.edu/SSUmods/
ftp://ncbi.nlm.nih.gov/genbank/genomes/
http://mbcr.bcm.tmc.edu/smallRNA/smallrna.html

Adenosine to inosine RNA editing in animal cells Barry Hoopengardner, Mary A. O’Connell, Robert Reenan, and Liam P. Keegan

Abstract Major advances in the understanding of adenosine deaminases acting on RNA (ADARs) have come from the generation of ADAR mutant animals. In mice, ADAR1 is a widely expressed essential gene and loss of function in embryos leads to apoptosis through unknown mechanisms in many different cell types. Mammalian ADAR2 is required primarily to edit glutamate receptor transcripts in the nervous system. The Drosophila melanogaster genome contains one Adar gene; mutant flies are normal in morphology and lifespan, but severely compromised neurologically and behaviourally. In C. elegans, double mutants in Adr1 and Adr2 genes are viable with chemosensory defects that appear to arise from interactions between RNA editing and RNA interference. ADARs also extensively deaminate long double-stranded (ds) RNA in a process that has been proposed to have antiviral effects. Genome sequences have facilitated progress in identifying edited RNAs. The majority of the twenty-three edited transcripts identified in Drosophila encode proteins involved in rapid chemical and electrical neurotransmission and extensive editing of embedded Alu RNAs has been found.

1 Introduction: ADAR RNA editing in vertebrates

ADARs (Fig. 1) were first discovered in Xenopus laevis. Early efforts to extend antisense RNA injection experiments in oocytes with later injections to silence maternal transcripts expressed in activated eggs failed to produce gene silencing. It was found that silencing in activated eggs failed because the double-stranded (ds)RNA intermediate formed by pairing of the injected antisense RNA with the target transcript was unstable (Bass and Weintraub 1987; Rebagliati and Melton 1987). The instability was due to an activity released from the nucleus at germinal vesicle breakdown when the oocyte completes meiosis. When these experiments were conducted it was thought that stable dsRNA would produce silencing by inhibiting translation. The enzyme responsible was thought to have prevented silencing by an apparent helicase-like action that led to separation of RNA strands. The released target RNA was assumed to be still translatable. Further work showed that the resulting single-stranded (ss)RNA was altered and did not reanneal due to conversion of up to half of the adenosines in

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106651 / Published online: 20 January 2005 © Springer-Verlag Berlin Heidelberg 2005


Fig. 1. Structures of ADAR genes from animal model organisms. Deaminase domains, double-stranded RNA binding domains (dsRBDs), and Z-DNA-binding domains are indicated. The glutamate in the deaminase active site is labelled E and the motifs in the deaminase domain that chelate zinc in the active site are labelled I to III.

dsRNA to inosine by an adenosine deaminase acting on RNA (ADAR) (Bass and Weintraub 1988). Inosine forms only one hydrogen bond with uracil, weakening the RNA strand pairing (Fig. 2B).


ADAR activity was also found in cultured mammalian cells (Wagner and Nishikura 1988). ADAR was purified using as an assay the conversion of radiolabelled adenosine in long synthetic dsRNA to inosine (Hough and Bass 1994; Kim et al. 1994a; O'Connell and Keller 1994). In this assay, the reacted dsRNA is digested to mononucleotides, which are separated by thin layer chromatography to quantitate the inosine. The gene was cloned using oligonucleotide primers based on peptide sequence (Kim et al. 1994b; O'Connell et al. 1995) and found to encode a protein comprising three dsRNA binding domains and an adenosine deaminase catalytic domain (Fig. 1). RT-PCR, cloning and sequencing of ADAR-treated dsRNA showed that adenosines are replaced by guanosines in the edited cDNA product, since inosine can form two hydrogen bonds with cytosine (Bass and Weintraub 1988) (Fig. 2A) and prefers this base-pairing partner during cDNA synthesis. Adenosine to guanosine conversions in other cDNAs made from dsRNA viruses were recognised as probable examples of ADAR-mediated RNA editing (Bass et al. 1989). The first example of site-specific RNA editing was a conversion of a single adenosine to guanosine within the GluR-B transcript encoding subunit B of the glutamate-gated ion channel receptors. Editing leads to a glutamine (Q) codon (CAG) being converted to an arginine (R) codon (CIG) (Sommer et al. 1991). RNA editing at this GluR-B Q/R site is mediated by an intramolecular dsRNA structure formed between the exon and an editing site complementary sequence (ECS) in a flanking intron (Fig. 2) (Higuchi et al. 1993). Editing also occurs at a second site, the R/G site, elsewhere in the GluR-B transcript and the predicted RNA structure at this site is shown as an example in Fig. 2B.
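The recoding consequence of Q/R site editing can be illustrated with a short sketch. It assumes only what is stated above, namely that inosine is read as guanosine, so that the edited codon CIG is decoded like CGG. The function names and the deliberately partial codon table are hypothetical and serve only this illustration.

```python
# Minimal sketch: inosine (I) is treated as guanosine (G) when a codon is decoded.
# The table is intentionally partial and covers only the Q/R site codons.
CODON_TABLE = {
    "CAG": "Q",  # glutamine, genome-encoded codon at the GluR-B Q/R site
    "CGG": "R",  # arginine, the codon that CIG is read as
}


def edit_adenosine(codon: str, position: int) -> str:
    """Return the codon with the adenosine at a 0-based position deaminated to inosine."""
    if codon[position] != "A":
        raise ValueError("only adenosine can be deaminated to inosine")
    return codon[:position] + "I" + codon[position + 1:]


def decode(codon: str) -> str:
    """Decode a codon, reading any inosine as guanosine."""
    return CODON_TABLE[codon.upper().replace("I", "G")]


unedited = "CAG"
edited = edit_adenosine(unedited, 1)
print(unedited, decode(unedited), "->", edited, decode(edited))
```

The same substitution of G for I is what makes edited sites appear as adenosine to guanosine changes when cDNA is sequenced.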
The first ADAR activity purified from mammalian cells did not efficiently catalyse editing at the GluR-B Q/R site, although it did show site-specific editing at another position within the intron in this transcript (Fig. 2). This led to the identification of a second ADAR, named ADAR2, which catalysed GluR-B Q/R site editing more efficiently in vitro than ADAR1 (Melcher et al. 1996a; O'Connell et al. 1997). Like ADAR1, ADAR2 converts adenosine to inosine in long dsRNA and was also purified using this non-specific editing activity (O'Connell et al. 1997). Site-specific editing in vitro is measured by reacting ADAR with a minisubstrate containing the editing site and ECS, followed by ADAR removal and primer extension past the edited A in the presence of dideoxythymidine. The radiolabelled primer extension products are resolved by PAGE, and the proportion of the product extending past the edited A to the next A in the sequence gives the editing efficiency. The edited substrate can also be amplified by RT-PCR and sequenced, but unless a large number of individual cDNAs are cloned and sequenced this quantitates editing less accurately. RT-PCR and sequencing are used to quantitate editing levels in vivo. Both vertebrate ADAR1 and ADAR2 enzymes are capable of highly site-specific RNA editing and will also edit up to half of all the adenosines in long dsRNA. The enzymes have different but partially overlapping site specificities in vitro. ADARs require at least seventeen base pairs of dsRNA for editing and it appears that any stretch of dsRNA of this length or longer is a potential target, whether it is formed by intermolecular or intramolecular RNA pairing. In addition,


Fig. 2. (A) Adenosine to inosine deamination. The reaction and the base-pairing consequences. (B) Structure of a short RNA substrate sufficient to support site-specific RNA editing by ADAR2 and by ADAR1 at the exonic GluR-B R/G site. At other editing sites the intronic pairing partner may be up to kilobases from the edited exon.

the RNA pairing within the substrate does not have to be perfect as there are usually mismatches between edited sites and ECS elements. Also, site-specific RNA editing events are usually less than 100% efficient; GluR-B Q/R site editing is the exception. Only a small number of site-specific editing events are known and it is presumed that by far the greater proportion of inosine arises by relatively nonspecific editing in longer dsRNA stretches. A major goal of functional studies on ADAR1 and ADAR2 has been to study the significance of different site-specific editing events and to determine whether the recoding role of site-specific editing or the non-specific editing of long dsRNA is more important for each ADAR.
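The two ways of quantitating site-specific editing described earlier, the primer extension readout and the sequencing of individual cDNA clones, both reduce to a simple proportion. The following is a minimal sketch under stated assumptions: band intensities are taken as already background-corrected numbers, clone calls are given as one base per clone at the editing site, and the function names are hypothetical rather than part of any published protocol.

```python
def efficiency_from_primer_extension(read_through: float, terminated: float) -> float:
    """Editing efficiency from two band intensities.

    read_through: signal of the product that extends past the edited site
                  to the next A (edited template).
    terminated:   signal of the product that stops at the editing site
                  (unedited template).
    """
    total = read_through + terminated
    if total <= 0:
        raise ValueError("no signal to quantitate")
    return read_through / total


def efficiency_from_clones(calls: str) -> float:
    """Editing efficiency from cDNA clones: one base call per clone at the site.

    G is counted as edited (inosine read as guanosine), A as unedited;
    any other character is ignored as an uninformative call.
    """
    edited = calls.upper().count("G")
    unedited = calls.upper().count("A")
    if edited + unedited == 0:
        raise ValueError("no informative clones")
    return edited / (edited + unedited)


print(efficiency_from_primer_extension(75.0, 25.0))
print(efficiency_from_clones("AGGGA"))
```

With only a handful of clones the second estimate moves in coarse steps, which is one way to see why sequencing quantitates editing less accurately unless many individual cDNAs are analysed.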


1.1 Functional studies on vertebrate ADAR2

Glutamate is the major type of excitatory transmitter in the vertebrate brain and functional studies on vertebrate ADARs focussed initially on the role of ADAR2 in the editing of GluR-B Q/R and R/G sites (Melcher et al. 1996b). Q/R site editing also occurs in GluR-5 and GluR-6 transcripts that encode pore-forming subunits of glutamate receptors of the kainate-sensitive class (Herb et al. 1996). The GluR-B R/G site in particular has been used for in vitro studies because a seventy-two base pair hairpin substrate supports editing here (Fig. 2B) (Ohman et al. 2000), whereas the GluR-B Q/R site is within a very long predicted RNA structure and the minimal Q/R site substrate has only recently been defined (Stephens et al. 2004). R/G site editing also occurs in GluR-C and GluR-D transcripts encoding subunits of AMPA receptors (Lomeli et al. 1994). Five sites in the metabotropic, G protein-coupled, serotonin receptor 5-HT2C (Burns et al. 1997) are also edited. The unedited GluR-B(Q) version of the subunit has increased calcium permeability when expressed in electrophysiological test systems (Burnashev et al. 1992), and the ADAR2 knockout mice show increased calcium permeability in hippocampal neurons, localised hippocampal neurodegeneration and seizure susceptibility (Higuchi et al. 2000). The same phenotype arises when the ECS that directs editing at the GluR-B Q/R site is deleted from the GluR-B gene in mice (Brusa et al. 1995). Altering the genomic GluR-B gene to encode the edited GluR-B(R) isoform and crossing this to ADAR2 knockout mice shows that the ADAR2 mutant phenotype is rescued to produce viable mice, suggesting that this is the most significant target of ADAR2. The mice survive weaning and appear to be normal but more subtle abnormalities that might result from loss of editing at other sites cannot be excluded (Higuchi et al. 2000).
A continuing enigma is that the genome-encoded GluR-B(Q) isoform has no apparent functional significance (Kask et al. 1998), since this site is edited with essentially 100% efficiency. The evolution of this RNA editing event has been investigated (Kung et al. 2001) and it appears to have arisen in Gnathostome fishes together with the acquisition of the intron that contains the ECS for this editing event. More primitive fishes have the R isoform encoded in the genome. Editing at this site presumably confers some advantage by using ADAR to regulate channel function in response to signals that have not yet been discovered. More recent work has further underlined the importance of editing the GluR-B Q/R site in controlling glutamate receptor function. The arginine introduced by editing in the ion channel pore not only controls the calcium permeability of the assembled receptor but also reduces the rate of receptor assembly and transport to synapses in cultured hippocampal neurons (Greger et al. 2002, 2003). The GluR-B subunit forms heterotetramers with GluR-A, GluR-C, and GluR-D subunits, which do not undergo editing at the Q/R site, have glutamine (Q) in the pore and, in the absence of GluR-B(R), assemble more rapidly to form receptors with higher calcium permeability. However, GluR-B is more highly expressed than the other subunits, occurs in central brain regions almost exclusively in the edited form and appears to establish a pool in the ER of partially assembled receptors that may limit transfer of the less abundant subunit types to synapses. By this means, the edited GluR-B(R) form dominates the assembly and ion permeability of AMPA receptors. In some brain regions, such as the cerebellum and the spinal cord, GluR-B expression is lower and, even though virtually 100% of the GluR-B transcript is edited, some calcium-permeable AMPA receptors are assembled (Kawahara et al. 2003).

Amyotrophic lateral sclerosis (ALS, also known as Lou Gehrig’s disease) is a neuromuscular degenerative illness in ageing human populations. Muscles undergo sclerosis associated with death of motor neurons in spinal cord and brain. Glutamate excitotoxicity has long been suspected to contribute (Shaw and Ince 1997), and mutations in a glutamate transporter, EAAT2, cause this disease in some families (Shaw and Eggett 2000). Recently it has been shown that RNA editing at the GluR-B Q/R site is reduced in the spinal motor neurons of ALS patients (Kawahara et al. 2004), suggesting that increased calcium permeability of AMPA receptors may contribute to motor neuron death. These patients have the sporadic form of ALS and do not have mutations in the glutamate transporter. The effect on RNA editing may be a result of feedback from alterations in neurotransmitter levels. Continuing work on editing of the metabotropic serotonin receptor 5-HT2C also focuses on regulation of editing by alterations in extracellular levels of neurotransmitter (Niswender et al. 1999). The commonly used antidepressant Prozac increases serotonin levels at synapses by inhibiting a transporter involved in serotonin reuptake. Editing at the C’ site in 5-HT2C is reduced in brains of depressed suicide victims and increased in mice by Prozac treatment (Niswender et al. 2001; Gurevich et al. 2002a, 2002b). Establishing the mechanism by which variations in neurotransmitter levels affect ADAR activity now appears to be the most interesting question.
Studies on glutamate receptor editing in brain material taken from patients with treatment-resistant temporal lobe epilepsy show coordinated increases or decreases in editing at Q/R and R/G sites in individual patients (Vollmar et al. 2004). One possible mechanism of regulating ADAR activity is through dimerization. Several groups have shown that both ADAR1 and ADAR2 must form dimers on substrate RNAs for catalysis to occur (Jaikaran et al. 2002; Cho et al. 2003; Gallo et al. 2003). This process could be open to regulation by intracellular signalling systems.

1.2 Functional studies on vertebrate ADAR1

The ADAR1 gene encodes a predominant short form that is primarily nuclear (Kawakubo and Samuel 2000) and an interferon-inducible cytoplasmic form that is 296 amino acids longer at the amino terminus (George and Samuel 1999). The long form of ADAR1 is a shuttling protein that enters the nucleus and is exported back to the cytoplasm (Fig. 3), due to a nuclear export sequence located in the first Z-DNA-binding domain (Poulsen et al. 2001). Inhibiting nuclear export with leptomycin causes the long form of ADAR1 to accumulate in the nucleus. The short form of ADAR1 is predominantly nuclear and, like ADAR2, it accumulates in the nucleolus (Fig. 3) (Desterro et al. 2003). The nucleolus may be a site of sequestration for ADARs, as transfection of cells with a plasmid expressing the GluR-B Q/R

Fig. 3. Subcellular localizations of vertebrate ADAR proteins.

minisubstrate, which has sites edited by both ADAR1 and ADAR2, leads to relocalization of both proteins out of the nucleolus and into the nucleoplasm. The extra 296 amino acids in the long form of ADAR1 contain a Z-DNA-binding domain in addition to the one that is present at the N-terminus of the short nuclear form of ADAR1 (Herbert et al. 1998). The Z-DNA-binding domains bind Z-DNA formed in vitro, and ADAR1 was found in a screen performed to find proteins that bind to Z-DNA. A crystal structure of the Z-DNA-binding domains of ADAR1 bound to DNA has been obtained (Schwartz et al. 1999). To reconcile the role of an RNA processing enzyme with Z-DNA binding, it has been proposed that ADAR1 recognises Z-DNA formed near transcription start sites or during transcription due to local changes in supercoiling (Herbert 1996). It is possible that the Z-DNA-binding domains have roles in protein-protein interaction, as they show homology to a region of vaccinia virus E3L protein that interacts with dsRNA-activated protein kinase R (PKR). The transcriptional induction of the longer cytoplasmic form of ADAR1 from a separate promoter by interferon suggests an anti-viral role. Some viral dsRNAs are found only in the cytoplasm, and cytoplasmic ADAR1 edits cytoplasmic dsRNA very efficiently (Wong et al. 2003). When ADAR2 is redirected to the cytoplasm by addition of a nuclear export signal, it also efficiently edits cytoplasmic substrates. Whether cytoplasmic ADAR1 is subject to other types of regulation or autoregulation remains to be determined.

Table 1. Comparison of ADAR loss of function phenotypes in different animal models.

The fate of the highly modified dsRNA that results from deamination by ADAR has been investigated. Edited dsRNAs generated by nuclear and cytoplasmic ADARs have different fates. Infection with polyoma virus leads to production of dsRNA due to transcription around the circular DNA genome in both directions at later stages of infection. This dsRNA is extensively deaminated and is recognised by an abundant protein complex containing p56/NonA and the polypyrimidine tract binding protein PTB (Zhang and Carmichael 2001). This RNA does not leave the nucleus but accumulates there during a one hour experiment (Kumar and Carmichael 1997). What happens to this material in the longer term has not been reported. In cytoplasmic extracts, edited dsRNA is targeted by a specific nuclease that cleaves at symmetrical I-U base pairs (Scadden and Smith 2001b) and may initiate degradation, although further degradation of this material has not been shown in extracts. It has recently been shown that two adjacent I-U base pairs are sufficient to create a region of pronounced instability in dsRNA (Serra et al. 2004).

ADAR1 knockout mice have recently been described that are heterozygous viable and homozygous embryonic lethal at embryonic day 12.5 (Table 1) (Hartner et al. 2004; Wang et al. 2004). Dying embryos are visibly pale due to failure to establish haematopoiesis in the liver. However, death at this stage is unlikely to arise solely from loss of haematopoiesis. Apoptotic cells are seen in vertebrae and heart as well as in liver. ADAR1 is more widely and strongly expressed in mesoderm-derived tissues than ADAR2. Mouse embryo fibroblasts cultured from ADAR1-/- embryos undergo apoptosis in response to serum starvation, suggesting that loss of ADAR1 has similar pro-apoptotic effects on a wide range of different cells (Wang et al. 2004).
Apoptosis in cultured cells lacking ADAR1 may be a useful phenotype for further work, but it does not point to any particular type of editing target since the number of defects that can lead to apoptosis is large. Apoptosis is not relieved by mutations in protein kinase R (PKR), implying that the lethality is not mediated through this pathway. Combining ADAR1 and ADAR2 mutations and circumventing the early lethality of ADAR1 by culturing ES cells from E11.5 brains for three weeks, through to neural differentiation, allows the contributions of the ADARs to site-specific editing at individual sites to be established, although only the serotonin receptor editing sites have been fully reported (Hartner et al. 2004). Double mutants eliminate all editing at all serotonin receptor sites. This confirms that the considerable residual editing seen at most sites in the ADAR2-/- mice was due to ADAR1. Sites A and B within the editing site cluster in the serotonin receptor are specific for ADAR1, and ADAR2 does not edit them even in ADAR1-/- neurons. This shows that ADAR1 does indeed participate in site-specific editing in vivo and has a distinct specificity. ADAR2 does edit sites C, C’, and D in the cluster, however, and these sites have a more significant effect on G-protein coupling than the ADAR1 sites. Therefore, there is still no transcript known in which a recoding role for ADAR1 is clearly critical. Since ADAR1 clearly engages in some site-specific RNA editing, it is not clear whether this or some general role in targeting dsRNA is the source of the apoptotic phenotype in ADAR1-/- embryos. ADAR1 does edit the GluR-B Q/R site in the absence of ADAR2, but in normal animals competition between the two ADARs could render the ADAR1 contribution less significant (Higuchi et al. 2000). Generating ADAR mutant mice that produce ADAR proteins with catalytic site mutations may be necessary to resolve competition effects and to determine whether ADARs have roles as RNA-binding proteins over and above their catalytic roles.

In contrast to the lethality obtained with mouse ADAR1 knockouts, point mutations leading to truncation of ADAR1 in humans produce a mild heterozygous dominant phenotype (Miyamura et al. 2003; Zhang et al. 2004). Dominant autosomal dyschromatosis symmetrica hereditaria (DSH) affects skin pigmentation on the backs of both hands and feet from birth. Patients have apparently normal physical and mental abilities. The symmetrical effects suggest that the phenotype may relate to alterations in melanocyte precursors in the neural crest during embryonic development. Transcripts edited by ADAR1 could mediate both this effect and the apoptosis seen in ADAR1-/- mice, but these transcripts remain to be identified.
1.3 Other ADAR genes in vertebrates

The evolution of ADAR genes has been recently reviewed (Keegan et al. 2004). Vertebrates have an ADAR3 gene that is very similar to ADAR2, with expression strictly limited to the brain (Melcher et al. 1996b; Chen et al. 2000). ADAR3 protein shows no non-specific activity on dsRNA, nor is it active on any ADAR2 substrate tested, even though essential residues in the deaminase domain are conserved. ADAR3 may have an unknown, highly specific substrate or might interact in some way with ADAR2. Vertebrates also have an ADAR-like gene called TENR that is expressed in the male germline (Hough and Bass 1997). It lacks deaminase domain residues involved in zinc chelation and catalysis, and the gene has not been studied functionally.

1.4 Searches for edited transcripts in human cells

The proportion of inosine among the canonical nucleotides in hydrolyzed polyA+ RNA from mammalian tissues has been measured (Paul and Bass 1998). All the nucleotides were labelled with 32P and the labelled IMP was purified through a series of TLC steps, using UV visualization of spiked unlabelled IMP to follow, recover and quantitate the trace of labelled IMP. Inosine was estimated at one in 17,000 nucleotides in brain and one in 32,000 nucleotides in heart. This is a surprisingly high level of inosine, equivalent to one inosine in every sixth transcript in brain if the average transcript size is taken as 3 kb (17,000/3,000 ≈ 6), and at least one hundredfold more inosine than is accounted for by the known site-specific editing events. The finding has strongly encouraged searches for RNA editing events. A chemical screen was devised to identify inosine positions in transcripts in polyA+ RNA (Morse and Bass 1997). RNase T1 normally cleaves 3’ to either guanosine or inosine. Guanosine has an exocyclic amino group not present in inosine that allows guanosine to form the third hydrogen bond with cytosine (Fig. 2A). Glyoxal targets this exocyclic amino group on guanosine to form a covalent adduct that blocks RNase T1 cleavage 3’ of guanosine, so that only inosine positions remain cleavable. Glyoxalated polyA+ RNA was either digested with RNase T1 or the RNase T1 was omitted. The cleavage products were amplified using a complex PCR strategy and candidate ADAR substrates were identified as RNase T1-dependent bands on differential display gels. Cloning and sequencing these products identified instances in which inosine occurs at a position where adenosine is found in genomic sequence (Morse and Bass 1999; Morse et al. 2002). The edited transcripts identified in this way in RNA from C. elegans and from human cells (Morse and Bass 1999; Morse et al. 
2002) all contained large numbers of editing events; no transcripts having single site-specific editing events were detected, suggesting practical limitations of this daunting chemical search technique. Several examples were long RNA hairpins found in 3’ UTR regions of transcripts, such as inverted repeats of Alu elements (Morse et al. 2002). Alu elements make up 10% of the human genome and repeated sequences constitute 40% of the genome. Many copies of Alu repeats occur in polyA+ RNA, often in transcripts with retained introns and other aberrant transcripts. Two groups have now taken advantage of full genome sequences and large sets of better-quality cDNA sequences in public databases to find editing events by searching for mismatches between cDNA and the corresponding genome sequence. The two groups use different strategies to distinguish likely RNA editing events from single nucleotide polymorphisms (SNPs) between genomic DNA sequences of different individuals. Levanon et al. (2004) sought genome to cDNA mismatches only within potential RNA hairpins in transcripts. By aligning the reverse complements of human ESTs and cDNA sequences with the corresponding genomic region, they identified potential edited dsRNA regions that have more than 85% identity over 32 base pairs or longer. Kim et al. (2004) aligned cDNA and genomic DNA directly and selected for multiple mismatches close together, which is not likely to be the case with SNPs. After extensive cleaning procedures, which probably also removed many edited sequences, the two groups found over a thousand genes that appear to be edited, most of which are presumably the same in both screens. Levanon et al. confirmed experimentally that about 90% are indeed edited. In both screens approximately 90% of sites occur within Alu repeats, with editing at multiple adjacent bases. Kim et al. point out that most of the RNA editing events are detected in nonstandard transcripts, for example, a retained intron in the APOBEC3G transcript that contains oppositely oriented Alu sequences. This transcript will probably be subject to degradation due to nonsense-mediated decay, and Kim et al. suggest that editing may help to mark it for degradation. The Alu hairpin in this transcript is also a potential target for DICER cleavage. The non-standard transcript does allow the edited Alus to be detected in bioinformatic screens, but this intron is also correctly spliced out since this gene encodes a functional protein. The edited Alus are likely to be edited also in introns in pre-mRNA that are subsequently correctly spliced out.
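
The mismatch-clustering idea behind the second screen can be illustrated with a short sketch. This is not the published pipeline of either group: it assumes the genomic and cDNA sequences have already been aligned to equal length without gaps, and the function names, the 30-base window and the minimum of three sites are hypothetical values chosen only for illustration. Inosine is read as guanosine during reverse transcription, so editing appears as A in the genome and G in the cDNA.

```python
from typing import List


def a_to_g_mismatches(genome: str, cdna: str) -> List[int]:
    """Return 0-based positions where the genome has A and the cDNA has G.

    Both sequences are assumed to be pre-aligned, gap-free and of equal length.
    """
    if len(genome) != len(cdna):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        i
        for i, (g, c) in enumerate(zip(genome.upper(), cdna.upper()))
        if g == "A" and c == "G"
    ]


def clustered_sites(positions: List[int], window: int = 30, min_sites: int = 3) -> List[int]:
    """Keep mismatches that have at least `min_sites` mismatches (counting
    themselves) within `window` bases on either side.

    Isolated mismatches are dropped because a lone A/G difference is more
    likely to be a SNP than an editing event. Both thresholds are
    illustrative assumptions, not published parameters.
    """
    kept = []
    for p in positions:
        neighbours = [q for q in positions if abs(q - p) <= window]
        if len(neighbours) >= min_sites:
            kept.append(p)
    return kept


# Toy example: five closely spaced A-to-G mismatches and one isolated mismatch.
genome = "CAT" * 5 + "C" * 100 + "CAT"
cdna = "CGT" * 5 + "C" * 100 + "CGT"
sites = a_to_g_mismatches(genome, cdna)
print(clustered_sites(sites))  # only the five clustered sites are kept
```

A real screen would additionally have to handle gapped alignments, strand orientation (editing on the opposite strand appears as T to C) and sequencing errors, which is part of what the cleaning procedures mentioned above address.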

2 RNA editing in Drosophila

Drosophila melanogaster has a single Adar gene located on the X chromosome (Palladino et al. 2000a). dADAR contains conserved structural motifs such as two dsRBDs and an adenosine deaminase catalytic domain. Many different isoforms can be generated from the same locus via two alternative promoters (promoter -4a, -4b) and the alternative splicing of four exons. Expression occurs from the -4a promoter at all stages of development; expression from the -4b promoter is detectable in pupal and adult flies. Inclusion or exclusion of alternative exon 3a (111 bases) changes the spacing between the dsRNA-binding domains of dADAR. Transcripts arising from the -4b promoter lack exon 3a (Palladino et al. 2000a). dADAR functions as a dimer; each homodimer may have a different target specificity or level of activity, and heterodimeric combinations may yield additional functional diversity (Gallo et al. 2003). The Drosophila melanogaster Adar transcript itself is a target of editing at a single editing site within the deaminase domain. Editing at this site is developmentally regulated (Palladino et al. 2000a). Editing generates an ADAR isoform with reduced activity and this appears to be a mechanism for negative autoregulation (L. Keegan et al. submitted). The Adar transcript is enriched in the nervous system of the embryo (Palladino et al. 2000a). Null mutants of Adar are morphologically normal but show marked abnormalities in posture, and move and mate only with great difficulty (Table 1) (Palladino et al. 2000b). In addition, Adar mutations confer temperature-sensitive paralysis and obsessive grooming behaviour. Amazingly, even though numerous neurological and behavioural deficits are manifest, these mutants have a normal lifespan when kept with meticulous care and in the absence of any competition for resources.
Flies lacking editing exhibit frequent seizures that increase in severity with age and head sections show marked neurodegeneration with increasing

Table 2. Known targets of RNA editing identified in Drosophila primarily encode ion channel subunits and synaptic fusion proteins. Edited positions in transcripts are given with reference to specific splicing isoforms.

Ontology and accession number | Common name | Edited position(s) in reference transcript | Edits | Reference(s)

Voltage-gated ion channel
CG1522 | cacophony (cac) | - | 11 | (Smith et al. 1998)
CG9907 | para | - | 11 | (Hanrahan et al. 1998, 2000; Palladino et al. 2000a)
CG9071-RA | DSC (NaCP60E) | 3920 (D. pseudoobscura-specific DSC edit, 3995; see Fig. 4) | 1 | (Hoopengardner et al. 2003)
CG15899-RB (CG4222) | Ca++ channel, T-type | 3612 | 1 | ibid
CG4894-RD | Ca alpha1D (DmCa1D) | 2061, 2083, 2097, 2098, 2139 | 5 | ibid
CG12295-RB | a2d L-type subunit | 1982, 2042, 2153 | 3 | ibid
CG12348-RB | Shaker (Sh) | 1072, 1073, 1620, 1800, 1875, 1882 | 6 | ibid
CG10952-RA | ether-a-go-go (eag) | 1864, 2107, 2159, 2163, 2177, 2560 | 6 | ibid
CG10693-RB | slowpoke (slo) | 1225, 3364 | 2 | ibid
CG1066-RA | Shab | 1746, 1747, 1927, 1979, 2040, 2041 | 6 | (Bhalla et al. 2004) (R. Reenan, unpublished)

Ligand-gated ion channel
CG7535 | DrosGluCl | - | 4 | (Semenov and Pak 1999)
CG4128 | nAcRalpha30D (Dalpha6) | - | 6 | (Grauso et al. 2002)
CG4498-RA | nAcRalpha34E | 1872, 1873, 2020, 2023, 2028, 2037, 2049 | 7 | (Hoopengardner et al. 2003)
CG11348-RA | nAcRbeta64B | 384, 385, 435, 436 | 4 | ibid
CG6798-RA | nAcRbeta96A | 963, 968 | 2 | ibid
CG10537-RA | Rdl | 728, 735, 1218, 1251, 1448, 1449 | 6 | ibid

Synaptic release machinery
- | synaptotagmin (syt) | 1175, 1212, 1223, 1291 | 4 | (Hoopengardner et al. 2003)
CG2999-RA | unc-13 | 7673 | 1 | ibid
CG40306-RB (CG12473) | stoned B (stnB) | 6226 | 1 | ibid
CG32490-RA | complexin (cpx) | 709, 722, 723 | 3 | ibid
CG2520-RA | lap | 1529 | 1 | ibid

RNA editing
CG12598 | dADAR | - | 1 | (Palladino et al. 2000a)

RNA binding
CG3312 | Rnp-4f | - | variable* | (Petschek et al. 1996; Peters et al. 2003)

Total genes = 23 | Total edits = 92*

vacuolization in the brain as well as retinal degeneration with age (Palladino et al. 2000b). This suggests that A-to-I RNA editing can exert subtle and compounded effects upon nervous system function, leading to changes in behaviour.

2.1 Edited transcripts in Drosophila: from serendipity to systematic identification

Known targets of A-to-I pre-mRNA editing are almost exclusively nervous system-specific; such target classes include voltage-gated ion channels, ligand-gated ion channels, and components of the synaptic release machinery (Table 2) (Hoopengardner et al. 2003). As in vertebrates, the first A-to-I edited transcripts in Drosophila were discovered serendipitously as A to G discrepancies between cloned cDNAs and the corresponding genomic sequence. These fortunate discoveries include transcripts of the Rnp-4F gene (31% of adenosines in a 3.2 kb transcript, 263 sites) (Petschek et al. 1996), the voltage-gated calcium channel subunit gene cacophony (11 sites) (Smith et al. 1998), the voltage-gated sodium channel subunit gene paralytic (11 sites) (Hanrahan et al. 2000), the glutamate-gated chloride channel subunit gene

Fig. 4. Prediction and experimental confirmation of an RNA editing site in Drosophila. (A) Scale representation of the D. melanogaster DSC1 (CG9071) Na+-channel transcription unit in the region of the RNA editing site. Boxes represent exons and the line represents introns. Exon 15 contains the editing site. (B) Bar graph showing percent sequence identity for the exons of DSC1 transcript CG9071-RA. Exon numbering corresponds with CG accession number. Exon 15 is the edited exon; exons 12, 13, 14, 16, and 17 are the flanking exons shown in (A). (C) Sequence analysis of DSC1 RT-PCR amplification products from whole-fly RNA. Shown are electropherograms of products generated from dADAR+ (top), dADAR– (middle), and D. pseudoobscura flies (bottom). Sequences are labelled above the electropherogram in codon triplets with editing sites indicated by mixed A/G signals. Amino acid sequences for each codon are shown above the nucleotide sequence with the amino acid change shown for the editing site by an arrow. (D) An additional editing site seen in D. pseudoobscura. Electropherogram from D. pseudoobscura RT-PCR products (top) and D. pseudoobscura genomic DNA amplification products (middle). No editing was seen in D. melanogaster dADAR+ flies (bottom).

DrosGluCl (4 sites) (Semenov and Pak 1999), the adenosine deaminase dADAR gene itself (1 site) (Palladino et al. 2000a) and the nicotinic acetylcholine receptor gene Dalpha5 (7 sites) (Grauso et al. 2002). A systematic approach for the detection of A-to-I editing sites has been developed by comparing genome sequences between two related but reasonably divergent species, D. melanogaster and D. pseudoobscura (Hoopengardner et al. 2003) (Fig. 4). ADAR enzymes recognize an imperfect duplex substrate structure comprising the editing site and an editing site complementary sequence (ECS); this method of enzyme target recognition precludes the identification of consensus sequence motifs common to editing sites (Higuchi et al. 1993). However, comparative sequence analysis can be used to identify editing sites shared between species by identifying highly conserved exons as RNA-editing signatures. These areas of substantial sequence conservation reflect retention of a dsRNA substrate structure. Several ontological gene classes have been examined for RNA-editing signatures, for a total of 914 genes examined via this method. The gene classes in which such signatures were found included voltage-gated ion channels, ligand-gated ion channels (11 positive, n = 135) and synaptic release machinery (4 positive, n = 102). Several additional gene classes were screened for editing signatures and included all known Drosophila G protein-coupled receptors (n = 178) and transcription factors (n = 499), as well as more than 100 genes from various ontological groupings such as circadian rhythm, learning, and memory (B. Hoopengardner, unpublished). Genes of these classes with potential editing signatures were examined by direct sequencing of RT-PCR products in Adar mutants; all such candidates were found to lack RNA editing.
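
The exon-conservation signature described above can be sketched in a few lines. This is only an illustration of the principle, not the procedure used in the cited study: it assumes each exon from the two species has already been aligned to equal length without gaps, and the function names, the toy sequences and the 95% threshold are hypothetical.

```python
from typing import Dict, List, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    gap-free sequences of equal length."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def candidate_exons(
    exon_pairs: Dict[str, Tuple[str, str]], threshold: float = 95.0
) -> List[str]:
    """Return the names of exons whose cross-species identity meets the
    threshold. Unusually high nucleotide conservation, including at
    synonymous positions, is treated as a possible RNA-editing signature.
    The threshold is an illustrative assumption."""
    return [
        name
        for name, (a, b) in exon_pairs.items()
        if percent_identity(a, b) >= threshold
    ]


# Toy sequences (hypothetical): one diverged exon and one fully conserved exon.
exons = {
    "exon14": ("ATGGCCAAGT", "ATGGCTAAAT"),
    "exon15": ("ATGGCCAAGT", "ATGGCCAAGT"),
}
print(percent_identity(*exons["exon14"]))
print(candidate_exons(exons))
```

Candidates flagged in this way still require experimental confirmation, for example by sequencing RT-PCR products from wild-type and Adar mutant flies as described in the text, because conservation can also arise for other reasons.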
The slow evolutionary change in these duplex regions argues for selective constraint upon them, and false positives were often due to conservation of alternatively spliced exons. In addition, although RNA editing might potentiate formation of new species through effects on behaviour, the detection of species-specific RNA editing remains problematic. The apparent majority of transcripts edited by dADAR encode ion channels, including both voltage-gated and ligand-gated channels. Voltage-gated channels that are edited include sodium channels, calcium channels, potassium channels, and chloride channels (Table 2, references therein). Ligand-gated ion channels that are edited by ADAR include nicotinic acetylcholine receptor subunits and the gamma-aminobutyric acid (GABA) receptor Resistance to Dieldrin (Rdl) (Table 2, references therein) but do not include the homologs of the vertebrate glutamate receptors or the metabotropic serotonin receptor 5-HT2C. Editing sites have also been found in transcripts encoding several components of the synaptic release machinery, including Synaptotagmin (4 sites), D-unc-13 (1 site), Stoned B (1 site), complexin (3 sites), and lap (1 site). Such sites likely reflect the necessity for neurons to respond rapidly to signalling events via the release of neurotransmitters into the synaptic cleft; changes that increase the rate of vesicle fusion would be advantageous. Two transcripts encoding RNA-binding proteins are also edited. One is the Adar transcript itself. The other is the Rnp-4F transcript, which encodes an RNA-binding protein involved in recycling of spliceosomes (Petschek et al. 1996; Peters et al. 2003). The editing reported for the Rnp-4F transcript is extensive, with one transcript containing 31% of exonic adenosines converted to guanosine; this editing is believed to be the result of extensive base-pairing with a readthrough transcript from the sas-10 gene. Rnp-4F is expressed in many embryonic tissues and in the adult head (Petschek et al. 1996; Peters et al. 2003).

Eventually, it should be possible to carry out the homology-based search on every gene in the Drosophila genome. Furthermore, this search method can be extended to vertebrates. A human homologue (KCNA1) of Drosophila SHAKER was identified as a target of RNA editing by comparisons of human, rat, and mouse Kv1.1 sequences and the resultant discovery of extensive exonic sequence conservation (Hoopengardner et al. 2003). Intriguingly, although the human and Drosophila editing events target the same codon, they have arisen by convergent evolution of different RNA structures (Bhalla et al. 2004).

3 RNA editing in squid

The squid giant axon is a classical material for the study of ion channel function. Cloning of squid genes encoding voltage-gated potassium channels showed that these transcripts are very highly edited (Patton et al. 1997; Rosenthal and Bezanilla 2002). Edited channel subunits have been shown to assemble less efficiently than unedited subunits. It has been proposed that editing in squid may be correlated with temperature differences between shallow and deep water habitats, as editing levels appear to be higher in cold-adapted squid. Editing often leads to introduction of amino acids with smaller side chains, as seen in anti-freeze proteins of fish (Rosenthal and Bezanilla 2002). Squid RNA editing is intriguing because so many sites are edited and the squid ADAR is extremely active in vitro (J. Rosenthal and M. O’Connell, unpublished).

4 RNA editing in C. elegans

C. elegans has two ADAR genes. Adr1 has two dsRBDs and conserves deaminase domain residues required for editing, whereas Adr2 has just one dsRBD and lacks a conserved glutamate in the deaminase domain that is required for activity (Hough et al. 1999; Tonkin et al. 2002). The Adr1 and Adr2 genes in C. elegans are too divergent from vertebrate ADARs to be described as either ADAR1-like or ADAR2-like (Keegan et al. 2004), and the names do not imply any such relationship. Serendipity did not identify edited transcripts in C. elegans. Edited transcripts here have been identified by the inosine-specific glyoxal screening method described above (Morse and Bass 1997, 1999; Morse et al. 2002). Several edited transcripts were long RNA hairpins found in 3’ UTR regions of transcripts, some of which were expressed in neurons. ADR1 is expected to be the primary catalytic subunit based on the deaminase domain sequence, and Adr1 single mutants and Adr1, Adr2 double mutants show defects in chemotaxis and loss of editing at all known sites (Tonkin et al. 2002). Mutants in Adr2 alone show some slight defects, however, suggesting that Adr2 is required to cooperate with Adr1 for full activity. Adr1, Adr2 double mutants were shown to increase transgene silencing in C. elegans (Tonkin and Bass 2003). It was suggested that this arises because editing normally antagonizes RNA interference. This may be the main role of RNA editing in C. elegans, since mutations preventing RNA silencing in C. elegans alleviate the chemotaxis defect in Adr1, Adr2 double mutants. Therefore, the chemotaxis defect might arise because some target transcript requires editing to protect it against RNA silencing. These findings have prompted one suggestion for the function of RNA editing in human Alu RNA hairpins, i.e. that editing destabilizes these RNA hairpins to prevent them being cleaved by DICER and triggering further RNAi responses (Kim et al. 2004). Very highly edited dsRNA generated by ADAR treatment in vitro has been shown to be a very inefficient substrate for DICER cleavage in vitro (Scadden and Smith 2001a). However, the levels of editing seen in vivo are often not as high as in the in vitro edited dsRNA. MicroRNAs have been shown to be edited (Luciano et al. 2004), and possibly the dsRNA products of DICER cleavage can be further edited. This could affect their incorporation into RISC complexes, or the siRNA in the resulting RISC complexes might no longer correctly recognize the target RNA. It is not clear that the inosine in the dsRNA, rather than, for instance, binding competition or protein–protein interaction between ADAR itself and RNA silencing proteins, is the basis for this effect in vivo.

Conclusion

In each of the major animal model organisms there are now complete ADAR loss of function mutants available for study. This has allowed the roles of the multiple vertebrate ADARs to be distinguished. Important questions remain about the regulation and the biological purpose of editing in ion channel transcripts by ADAR2. Comparisons between ADAR loss of function mutants in different species suggest that ADAR2 and Drosophila Adar are functionally similar (L. Keegan, unpublished) in targeting nervous system transcripts. Although the target transcripts are not exactly the same, the overall function of ADAR2-type editing is likely to be conserved between flies and vertebrates. Comparison between mouse and suitable vertebrate genome sequences and other bioinformatic methods should allow the full complement of conserved vertebrate site-specific targets to be defined. At present, the list of known ion-channel target transcripts is much larger in Drosophila. The discovery of previously unknown editing sites in intensely studied proteins like Drosophila SHAKER and human KCNA1 shows just how easily the range of vertebrate targets could still be underestimated (Hoopengardner et al. 2003).


The neurodegeneration that occurs in the hippocampus in ADAR2 mutant mice, in motor neurons in ALS and more widely in Adar mutant flies also deserves investigation. The suggested role of RNA editing in neurodegeneration in motor neuron disease must be confirmed. Riluzole, which interferes with glutamate neurotransmission, is currently used to treat ALS. Studies on glutamate receptor transcript editing in motor neurons could lead to a better understanding of neuron death by glutamate excitotoxicity and to improved drugs to slow neuron death. The key targets of ADAR1 are still unknown, nor is it known whether the targets should be sought among nuclear or cytoplasmic RNAs. The ADAR1 knockouts described so far do not distinguish between potentially different roles of the longer cytoplasmic and the shorter nuclear form of ADAR1. Differential inactivation of these isoforms is possible and may help to resolve this, together with further tissue-specific ADAR1 inactivation studies. It is not known whether the critical editing events are site-specific RNA editing events or generalised editing on long dsRNA. There are no ADAR mutations that separate the two classes of editing, although lowering ADAR protein levels might affect one type of editing more than the other. ADAR1 shows similarities to PKR in the exon structure of its dsRNA binding domains and could have evolved in the chordate lineage as an aspect of innate immunity to foreign dsRNA and interferon responses. It is possible that the interaction between RNA editing and RNA interference discovered in C. elegans is a remnant of such an innate immunity. Searching for interactions between ADAR1 and RNA interference is clearly a promising approach. Editing events in Alu sequences and other hairpin RNAs must now be experimentally tested for the possible effects that have been proposed, such as inhibition of RNAi, or nuclear retention and destruction of aberrant transcripts. 
These editing events are probably not restricted to non-standard transcripts and they may represent just the tip of an iceberg of editing that also occurs in normally spliced introns and within pre-mRNAs. There are indications that editing may facilitate splicing at the GluR-B Q/R site, and several of the known examples of editing in glutamate or serotonin receptors involve predicted RNA hairpins that cover splice sites. It would be interesting to identify cases for further study where the newly predicted RNA hairpins overlap or abut splice sites. Perhaps certain introns are less efficiently spliced in ADAR mutant animals? In Drosophila and C. elegans an important issue is to identify the phenocritical editing events, the ones that most influence the editing loss and gain of function phenotypes. This is important for maximizing the experimental value of these powerful genetic systems. Testing the edited forms of a number of transcripts for rescue of aspects of the Adar mutant phenotype is possible in Drosophila and in C. elegans. Drugs to decrease RNA editing might be designed by introducing an effective adenosine base analog inhibitor (Haudenschild et al. 2004) at the edited site in dsRNA mimics of characterized editing sites, if these can then be effectively introduced in the body. Drugs targeting dsRNA structures might affect RNA editing. Drugs to increase ADAR to affect editing at sites in ion channel transcripts may be possible when we better understand whether external neurotransmitters modulate editing and how they do so. RNA editing is sometimes altered in patients undergoing drug treatment for diseases where the connection to RNA editing might not have been understood only a few years ago (Harris and Hajduk 1992). Given the impact that editing of serotonin receptors may have on the mental health of patients and the apparent sensitivity of RNA editing levels to alteration, side effects of other drugs on RNA editing may be an important issue. There seems to be an ever-widening range of further work to be done on ADAR-mediated RNA editing. This is in addition to roasting some old chestnuts like the function and targets of ADAR1. The new bioinformatic screens in particular have opened a Pandora’s Box of RNA secondary structure and base modification in transcripts.

Acknowledgments Both L.P.K. and M.O’C. receive funding from the Medical Research Council. We would like to thank Sandy Bruce for drawing some of the figures.

References
Bass BL, Weintraub H (1987) A developmentally regulated activity that unwinds RNA duplexes. Cell 48:607-613
Bass BL, Weintraub H (1988) An unwinding activity that covalently modifies its double-stranded RNA substrate. Cell 55:1089-1098
Bass BL, Weintraub H, Cattaneo R, Billeter MA (1989) Biased hypermutation of viral RNA genomes could be due to unwinding/modification of double-stranded RNA. Cell 56:331
Bhalla T, Rosenthal JJC, Holmgren M, Reenan R (2004) Control of human potassium channel inactivation by editing of a small mRNA hairpin. Nature Struct Biol (in press)
Brusa R, Zimmermann F, Koh D-S, Feldmeyer D, Gass P, Seeburg PH, Sprengel R (1995) Early-onset epilepsy and postnatal lethality associated with editing-deficient GluR-B allele in mice. Science 270:1677-1680
Burnashev N, Monyer H, Seeburg PH, Sakmann B (1992) Divalent ion permeability of AMPA receptor channels is dominated by the edited form of a single subunit. Neuron 8:189-198
Burns CM, Chu H, Rueter SM, Hutchinson LK, Canton H, Sanders-Bush E, Emeson RB (1997) Regulation of serotonin-2C receptor G-protein coupling by RNA editing. Nature 387:303-308
Chen CX, Cho DS, Wang Q, Lai F, Carter KC, Nishikura K (2000) A third member of the RNA-specific adenosine deaminase gene family, ADAR3, contains both single- and double-stranded RNA binding domains. RNA 6:755-767
Cho DS, Yang W, Lee JT, Shiekhattar R, Murray JM, Nishikura K (2003) Requirement of dimerization for RNA editing activity of adenosine deaminases acting on RNA. J Biol Chem 278:17093-17102

Desterro JM, Keegan LP, Lafarga M, Berciano MT, O'Connell M, Carmo-Fonseca M (2003) Dynamic association of RNA-editing enzymes with the nucleolus. J Cell Sci 116:1805-1818
Gallo A, Keegan LP, Ring GM, O'Connell MA (2003) An ADAR that edits transcripts encoding ion channel subunits functions as a dimer. EMBO J 22:3421-3430
George CX, Samuel CE (1999) Human RNA-specific adenosine deaminase ADAR1 transcripts possess alternative exon 1 structures that initiate from different promoters, one constitutively active and the other interferon inducible. Proc Natl Acad Sci USA 96:4621-4626
Grauso M, Reenan RA, Culetto E, Sattelle DB (2002) Novel putative nicotinic acetylcholine receptor subunit genes, Dalpha5, Dalpha6 and Dalpha7, in Drosophila melanogaster identify a new and highly conserved target of adenosine deaminase acting on RNA-mediated A-to-I pre-mRNA editing. Genetics 160:1519-1533
Greger IH, Khatri L, Kong X, Ziff EB (2003) AMPA receptor tetramerization is mediated by q/r editing. Neuron 40:763-774
Greger IH, Khatri L, Ziff EB (2002) RNA editing at arg607 controls AMPA receptor exit from the endoplasmic reticulum. Neuron 34:759-772
Gurevich I, Englander MT, Adlersberg M, Siegal NB, Schmauss C (2002a) Modulation of serotonin 2C receptor editing by sustained changes in serotonergic neurotransmission. J Neurosci 22:10529-10532
Gurevich I, Tamir H, Arango V, Dwork AJ, Mann JJ, Schmauss C (2002b) Altered editing of serotonin 2C receptor pre-mRNA in the prefrontal cortex of depressed suicide victims. Neuron 34:349-356
Hanrahan CJ, Palladino MJ, Bonneau LJ, Reenan RA (1998) RNA editing of a Drosophila sodium channel gene. Ann NY Acad Sci 868:51-66
Hanrahan CJ, Palladino MJ, Ganetzky B, Reenan RA (2000) RNA editing of the Drosophila para Na(+) channel transcript. Evolutionary conservation and developmental regulation. Genetics 155:1149-1160
Harris ME, Hajduk SL (1992) Kinetoplastid RNA editing: in vitro formation of cytochrome b gRNA-mRNA chimeras from synthetic substrate RNAs. Cell 68:1091-1099
Hartner JC, Schmittwolf C, Kispert A, Muller AM, Higuchi M, Seeburg PH (2004) Liver disintegration in the mouse embryo caused by deficiency in the RNA-editing enzyme ADAR1. J Biol Chem 279:4894-4902
Haudenschild BL, Maydanovych O, Veliz EA, Macbeth MR, Bass BL, Beal PA (2004) A transition state analogue for an RNA-editing reaction. J Am Chem Soc 126:11213-11219
Herb A, Higuchi M, Sprengel R, Seeburg PH (1996) Q/R site editing in kainate receptor GluR5 and GluR6 pre-mRNAs requires distant intronic sequences. Proc Natl Acad Sci USA 93:1875-1880
Herbert A (1996) RNA editing, introns and evolution. Trends Genet 12:6-9
Herbert A, Schade M, Lowenhaupt K, Alfken J, Schwartz T, Shlyakhtenko LS, Lyubchenko YL, Rich A (1998) The Zalpha domain from human ADAR1 binds to the Z-DNA conformer of many different sequences. Nucleic Acids Res 26:3486-3493
Higuchi M, Maas S, Single FN, Hartner J, Rozov A, Burnashev N, Feldmeyer D, Sprengel R, Seeburg PH (2000) Point mutation in an AMPA receptor gene rescues lethality in mice deficient in the RNA-editing enzyme ADAR2. Nature 406:78-81

Higuchi M, Single FN, Köhler M, Sommer B, Sprengel R, Seeburg PH (1993) RNA editing of AMPA receptor subunit GluR-B: A base-paired intron-exon structure determines position and efficiency. Cell 75:1361-1370
Hoopengardner B, Bhalla T, Staber C, Reenan R (2003) Nervous system targets of RNA editing identified by comparative genomics. Science 301:832-836
Hough RF, Bass BL (1994) Purification of the Xenopus laevis dsRNA adenosine deaminase. J Biol Chem 269:9933-9939
Hough RF, Bass BL (1997) Analysis of Xenopus dsRNA adenosine deaminase cDNAs reveals similarities to DNA methyltransferases. RNA 3:356-370
Hough RF, Lingam AT, Bass BL (1999) Caenorhabditis elegans mRNAs that encode a protein similar to ADARs derive from an operon containing six genes. Nucleic Acids Res 27:3424-3432
Jaikaran DC, Collins CH, MacMillan AM (2002) Adenosine to inosine editing by ADAR2 requires formation of a ternary complex on the GluR-B R/G site. J Biol Chem 277:37624-37629
Kask K, Zamanillo D, Rozov A, Burnashev N, Sprengel R, Seeburg PH (1998) The AMPA receptor subunit GluR-B in its Q/R site-unedited form is not essential for brain development and function. Proc Natl Acad Sci USA 95:13777-13782
Kawahara Y, Ito K, Sun H, Aizawa H, Kanazawa I, Kwak S (2004) Glutamate receptors: RNA editing and death of motor neurons. Nature 427:801
Kawahara Y, Kwak S, Sun H, Ito K, Hashida H, Aizawa H, Jeong SY, Kanazawa I (2003) Human spinal motoneurons express low relative abundance of GluR2 mRNA: an implication for excitotoxicity in ALS. J Neurochem 85:680-689
Kawakubo K, Samuel CE (2000) Human RNA-specific adenosine deaminase (ADAR1) gene specifies transcripts that initiate from a constitutively active alternative promoter. Gene 258:165-172
Keegan LP, Leroy A, Sproul D, O'Connell MA (2004) Adenosine deaminases acting on RNA (ADARs): RNA-editing enzymes. Genome Biol 5:209
Kim DD, Kim TT, Walsh T, Kobayashi Y, Matise TC, Buyske S, Gabriel A (2004) Widespread RNA editing of embedded alu elements in the human transcriptome. Genome Res 14:1719-1725
Kim U, Garner TL, Sanford T, Speicher D, Murray JM, Nishikura K (1994a) Purification and characterization of double-stranded RNA adenosine deaminase from bovine nuclear extracts. J Biol Chem 269:13480-13489
Kim U, Wang Y, Sanford T, Zeng Y, Nishikura K (1994b) Molecular cloning of cDNAs for double-stranded RNA adenosine deaminase, a candidate enzyme for nuclear RNA editing. Proc Natl Acad Sci USA 91:11457-11461
Kumar M, Carmichael GC (1997) Nuclear antisense RNA induces extensive adenosine modifications and nuclear retention of target transcripts. Proc Natl Acad Sci USA 94:3542-3547
Kung SS, Chen YC, Lin WH, Chen CC, Chow WY (2001) Q/R RNA editing of the AMPA receptor subunit 2 (GRIA2) transcript evolves no later than the appearance of cartilaginous fishes. FEBS Lett 509:277-281
Levanon EY, Eisenberg E, Yelin R, Nemzer S, Hallegger M, Shemesh R, Fligelman ZY, Shoshan A, Pollock SR, Sztybel D, Olshansky M, Rechavi G, Jantsch MF (2004) Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nat Biotechnol 22:1001-1005

Lomeli H, Mosbacher J, Melcher T, Höger T, Geiger JR, Kuner T, Monyer H, Higuchi M, Bach A, Seeburg PH (1994) Control of kinetic properties of AMPA receptor channels by nuclear RNA editing. Science 266:1709-1713
Luciano DJ, Mirsky H, Vendetti NJ, Maas S (2004) RNA editing of a miRNA precursor. RNA 10:1174-1177
Melcher T, Maas S, Herb A, Sprengel R, Higuchi M, Seeburg PH (1996a) RED2, a brain specific member of the RNA-specific adenosine deaminase family. J Biol Chem 271:31795-31798
Melcher T, Maas S, Herb A, Sprengel R, Seeburg PH, Higuchi M (1996b) A mammalian RNA editing enzyme. Nature 379:460-464
Miyamura Y, Suzuki T, Kono M, Inagaki K, Ito S, Suzuki N, Tomita Y (2003) Mutations of the RNA-specific adenosine deaminase gene (DSRAD) are involved in dyschromatosis symmetrica hereditaria. Am J Hum Genet 73:693-699
Morse DP, Aruscavage PJ, Bass BL (2002) RNA hairpins in noncoding regions of human brain and Caenorhabditis elegans mRNA are edited by adenosine deaminases that act on RNA. Proc Natl Acad Sci USA 99:7906-7911
Morse DP, Bass BL (1997) Detection of inosine in messenger RNA by inosine-specific cleavage. Biochemistry 36:8429-8434
Morse DP, Bass BL (1999) Long RNA hairpins that contain inosine are present in Caenorhabditis elegans poly(A)+ RNA. Proc Natl Acad Sci USA 96:6048-6053
Niswender CM, Copeland SC, Herrick-Davis K, Emeson RB, Sanders-Bush E (1999) RNA editing of the human serotonin 5-hydroxytryptamine 2C receptor silences constitutive activity. J Biol Chem 274:9472-9478
Niswender CM, Herrick-Davis K, Dilley GE, Meltzer HY, Overholser JC, Stockmeier CA, Emeson RB, Sanders-Bush E (2001) RNA editing of the human serotonin 5-HT2C receptor. Alterations in suicide and implications for serotonergic pharmacotherapy. Neuropsychopharmacology 24:478-491
O'Connell MA, Gerber A, Keller W (1997) Purification of human double-stranded RNA-specific editase1 (hRed1), involved in editing of brain glutamate receptor B pre-mRNA. J Biol Chem 272:473-478
O'Connell MA, Keller W (1994) Purification and properties of double-stranded RNA-specific adenosine deaminase from calf thymus. Proc Natl Acad Sci USA 91:10596-10600
O'Connell MA, Krause S, Higuchi M, Hsuan JJ, Totty NF, Jenny A, Keller W (1995) Cloning of cDNAs encoding mammalian double-stranded RNA-specific adenosine deaminase. Mol Cell Biol 15:1389-1397
Ohman M, Kallman AM, Bass BL (2000) In vitro analysis of the binding of ADAR2 to the pre-mRNA encoding the GluR-B R/G site. RNA 6:687-697
Palladino MJ, Keegan LP, O'Connell MA, Reenan RA (2000a) dADAR, a Drosophila double-stranded RNA-specific adenosine deaminase is highly developmentally regulated and is itself a target for RNA editing. RNA 6:1004-1018
Palladino MJ, Keegan LP, O'Connell MA, Reenan RA (2000b) A-to-I pre-mRNA editing in Drosophila is primarily involved in adult nervous system function and integrity. Cell 102:437-449
Patton DE, Silva T, Bezanilla F (1997) RNA editing generates a diverse array of transcripts encoding squid Kv2 K+ channels with altered functional properties. Neuron 19:711-722

Paul M, Bass BL (1998) Inosine exists in mRNA at tissue-specific levels and is most abundant in brain mRNA. EMBO J 17:1120-1127
Peters NT, Rohrbach JA, Zalewski BA, Byrkett CM, Vaughn JC (2003) RNA editing and regulation of Drosophila 4f-rnp expression by sas-10 antisense readthrough mRNA transcripts. RNA 9:698-710
Petschek JP, Mermer MJ, Scheckelhoff MR, Simone AA, Vaughn JC (1996) RNA editing in Drosophila 4f-rnp gene nuclear transcripts by multiple A-to-G conversions. J Mol Biol 259:885-890
Poulsen H, Nilsson J, Damgaard CK, Egebjerg J, Kjems J (2001) CRM1 mediates the export of ADAR1 through a nuclear export signal within the Z-DNA binding domain. Mol Cell Biol 21:7862-7871
Rebagliati MR, Melton DA (1987) Antisense RNA injections in fertilized frog eggs reveal an RNA duplex unwinding activity. Cell 48:599-605
Rosenthal JJ, Bezanilla F (2002) Extensive editing of mRNAs for the squid delayed rectifier K(+) channel regulates subunit tetramerization. Neuron 34:743-757
Scadden AD, Smith CW (2001a) RNAi is antagonized by A-->I hyper-editing. EMBO Rep 2:1107-1111
Scadden AD, Smith CW (2001b) Specific cleavage of hyper-edited dsRNAs. EMBO J 20:4243-4252
Schwartz T, Rould MA, Lowenhaupt K, Herbert A, Rich A (1999) Crystal structure of the Zalpha domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284:1841-1845
Semenov EP, Pak WL (1999) Diversification of Drosophila chloride channel gene by multiple posttranscriptional mRNA modifications. J Neurochem 72:66-72
Serra MJ, Smolter PE, Westhof E (2004) Pronounced instability of tandem IU base pairs in RNA. Nucleic Acids Res 32:1824-1828
Shaw PJ, Eggett CJ (2000) Molecular factors underlying selective vulnerability of motor neurons to neurodegeneration in amyotrophic lateral sclerosis. J Neurol 247 Suppl 1:I17-27
Shaw PJ, Ince PG (1997) Glutamate, excitotoxicity and amyotrophic lateral sclerosis. J Neurol 244 Suppl 2:S3-S14
Smith LA, Peixoto AA, Hall JC (1998) RNA editing in the Drosophila DMCA1A calcium-channel alpha 1 subunit transcript. J Neurogenet 12:227-240
Sommer B, Köhler M, Sprengel R, Seeburg PH (1991) RNA editing in brain controls a determinant of ion flow in glutamate-gated channels. Cell 67:11-19
Stephens OM, Haudenschild BL, Beal PA (2004) The binding selectivity of ADAR2's dsRBMs contributes to RNA-editing selectivity. Chem Biol 11:1239-1250
Tonkin LA, Bass BL (2003) Mutations in RNAi rescue aberrant chemotaxis of ADAR mutants. Science 302:1725
Tonkin LA, Saccomanno L, Morse DP, Brodigan T, Krause M, Bass BL (2002) RNA editing by ADARs is important for normal behavior in Caenorhabditis elegans. EMBO J 21:6025-6035
Vollmar W, Gloger J, Berger E, Kortenbruck G, Kohling R, Speckmann EJ, Musshoff U (2004) RNA editing (R/G site) and flip-flop splicing of the AMPA receptor subunit GluR2 in nervous tissue of epilepsy patients. Neurobiol Dis 15:371-379
Wagner RW, Nishikura K (1988) Cell cycle expression of RNA duplex unwindase activity in mammalian cells. Mol Cell Biol 8:770-777

Wang Q, Miyakoda M, Yang W, Khillan J, Stachura DL, Weiss MJ, Nishikura K (2004) Stress-induced apoptosis associated with null mutation of ADAR1 RNA editing deaminase gene. J Biol Chem 279:4952-4961
Wong SK, Sato S, Lazinski DW (2003) Elevated activity of the large form of ADAR1 in vivo: very efficient RNA editing occurs in the cytoplasm. RNA 9:586-598
Zhang XJ, He PP, Li M, He CD, Yan KL, Cui Y, Yang S, Zhang KY, Gao M, Chen JJ, Li CR, Jin L, Chen HD, Xu SJ, Huang W (2004) Seven novel mutations of the ADAR gene in Chinese families and sporadic patients with dyschromatosis symmetrica hereditaria (DSH). Hum Mutat 23:629-630
Zhang Z, Carmichael GG (2001) The fate of dsRNA in the nucleus. A p54(nrb)-containing complex mediates the nuclear retention of promiscuously A-to-I edited RNAs. Cell 106:465-475

Hoopengardner, Barry
Department of Genetics and Developmental Biology, University of Connecticut Health Center, 263 Farmington Avenue, Farmington, Connecticut 06030, USA
Keegan, Liam P.
MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, U.K.
O’Connell, Mary A.
MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, U.K.
Reenan, Robert
Department of Genetics and Developmental Biology, University of Connecticut Health Center, 263 Farmington Avenue, Farmington, Connecticut 06030, USA

Mammalian C to U editing Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden

Abstract The sequencing of genomes from higher organisms demonstrated that the number and complexity of expressed mRNA sequences and proteins exceed the quantity of predicted genes. This disparity has been attributed to a variety of cellular mechanisms including the use of alternative promoters, mRNA splice sites and/or polyadenylation sites. Additionally, single nucleotide modifications within RNA, and more recently DNA, can generate diversity in protein expression. C to U or dC to dU modification at specific sites within RNA or DNA can arise from targeted editing activities rather than spontaneous mutation and is catalyzed by APOBEC-1 or related zinc-dependent cytidine deaminases. The function and substrate specificity are known for only five of the ten deaminases in the APOBEC-1 Related Protein family. Hence, exciting discoveries are predicted regarding the role of editing enzymes as modifiers of protein expression in normal physiology, in conferring resistance to invading pathogens, and possibly in activities underlying human disease.

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b105432 / Published online: 14 December 2004 © Springer-Verlag Berlin Heidelberg 2005

1 Introduction This chapter addresses the function of APOBEC-1 and the family of related mammalian cytidine deaminases. Many of these proteins have the ability to deaminate free nucleosides or nucleotides, as well as the capacity to convert cytidine to uridine in RNA or deoxycytidine to deoxyuridine in DNA. The focus will be on mammalian apolipoprotein B (apoB) C to U mRNA editing, a nuclear RNA processing event that mediates cytidine to uridine conversion and which occurs to a limited extent cotranscriptionally but largely coincident with, or subsequent to, pre-mRNA splicing (Lau 1991; Sowden et al. 1996b; Sowden and Smith 2001). The catalytic and auxiliary proteins involved in this process will be described along with a scheme for the regulation of apoB mRNA editing. A model for the structure of APOBEC-1 is presented that has predictive value for the structure and function of Activation Induced Deaminase (AID) and APOBEC-3G (CEM15; see Marquet and Dardel, this volume), which are members of the APOBEC-1 Related Protein family whose respective DNA editing activities are required for diversification and expression of immunoglobulins (Reynaud et al. 2003), and disruption of retroviral, e.g. HIV-1, infectivity (Sheehy et al. 2002; Mangeat et al. 2003). The finding that other members of the family (Mian et al. 1998; Jarmuz et al. 2002; Wedekind et al. 2003) exhibit DNA deaminase activity (Harris et al. 2002, 2003; Lecossier et al. 2003; Mangeat et al. 2003; Zhang et al. 2003; Zheng et al. 2004; Wiegand et al. 2004) suggests a broader role for mammalian C to U editing enzymes in biological processes than previously considered. Although relatively few editing events have been characterized in mammals, they can have profound effects on: the function of transmembrane receptors and ion channels (Reenan 2001), erythropoiesis and inflammation (Beghini et al. 2000; Yang et al. 2003; Hartner et al. 2004), cardiovascular disease (Kozarsky et al. 1996; Yang et al. 2002), cancer (Anant et al. 2002; Harris et al. 2002; Cappione et al. 1997; Okazaki et al. 2003; Machida et al. 2004), and upon the life cycle of viruses (Sheehy et al. 2002; Wong and Lazinski 2002; Harris et al. 2003; Lecossier et al. 2003; Macnaughton et al. 2003; Mangeat et al. 2003; Zhang et al. 2003; Turelli et al. 2004; Yu et al. 2004; Zheng et al. 2004). Editing activity affects protein expression by altering nucleotides that change codon sense or by producing translation initiation or stop codons (reviewed in Gott and Emeson 2000; Keegan et al. 2001, 2004; Reenan 2001; Wedekind et al. 2003) or by modifying the nucleotides necessary for pre-mRNA splice site selection (Rueter et al. 1999; Palladino et al. 2000a; Keegan et al. 2004). For additional information on these topics, the reader is directed to Hoopengardner, this volume.

2 Site-specific apoB mRNA editing: the basic facts ApoB mRNA is edited within the epithelial cells (enterocytes) that line the small intestines of all mammals and in the liver (hepatocytes) of some species (Chen et al. 1987; Powell et al. 1987; Davidson et al. 1988; Greeve et al. 1993). Editing at cytidine 6666 of this mRNA converts a CAA glutamine codon to a UAA stop codon, thereby enabling both full length (apoB100) and truncated (apoB48) variants of apoB protein to be expressed from a single gene. ApoB48 is stored in the enterocyte and assembled with dietary lipids as the structural protein core of chylomicrons. These are secreted into the lymphatic ducts draining the small intestine and enter the blood stream, from which they are rapidly taken up by the liver. Chylomicron-derived lipids are reassembled in the liver on apoB100 protein as very low density lipoproteins (VLDL), which are secreted into the circulation for peripheral tissue utilization. In several mammals, apoB mRNA editing also occurs in liver (Greeve et al. 1993) where, unlike intestine, apoB mRNA editing is regulated to determine the proportion of edited apoB mRNA as well as the amount of secreted B48 VLDL (Sparks et al. 1981). Hepatic VLDL are assembled and secreted with apoB100 and apoB48 protein cores. B100 VLDL are digested by peripheral lipases, converting them to protein- and cholesterol-rich low density lipoproteins (LDL), whose elevated abundance in blood is an atherosclerotic risk factor (Corsetti et al. 2003). ApoB48 VLDL is cleared from the blood more rapidly than apoB100 VLDL and is not metabolized to LDL (Chan 1992). For this reason, hepatic apoB mRNA editing has been considered as a means of reducing the risk of atherogenic disease.
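The consequence of the CAA to UAA change can be sketched in a few lines of code. The example below is a minimal illustration, assuming a short hypothetical open reading frame that embeds the 21-nucleotide editing cassette discussed in this chapter; the flanking codons, the function names and the position index are invented for the example and are not the real apoB coordinates. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def edit_c_to_u(rna: str, position: int) -> str:
    """Return a copy of rna with the cytidine at a 0-based position deaminated to U."""
    if rna[position] != "C":
        raise ValueError(f"expected C at position {position}, found {rna[position]}")
    return rna[:position] + "U" + rna[position + 1:]


def translate(rna: str) -> str:
    """Translate from the first base in frame 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy ORF: AUG CUG AUA CAA UUU GAU CAG UAU AGC UAA
toy_orf = "AUGCUGAUACAAUUUGAUCAGUAUAGCUAA"
edited_orf = edit_c_to_u(toy_orf, 9)      # C of the CAA codon (toy index)

full_length = translate(toy_orf)          # reads through the CAA codon
truncated = translate(edited_orf)         # stops at the new UAA codon
```

In this toy case a single deamination shortens the product, which mirrors in miniature how one gene can yield both apoB100 and apoB48.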


APOBEC-1 is the sole cytidine deaminase responsible for editing apoB mRNA (Hirano et al. 1996; Nakamuta et al. 1996). Although APOBEC-1 can bind and deaminate free cytidine nucleoside or nucleotide substrates, as well as bind weakly to AU-rich RNA sequences (Anant et al. 1995; Anant and Davidson 2000), it cannot bind specifically to apoB RNA, nor can it edit apoB RNA under physiological temperature and salt concentrations (Teng and Davidson 1992; Driscoll and Zhang 1994; Chester et al. 2004). In cells, site-specific apoB mRNA editing requires an editing complex (or C to U editosome) consisting minimally of an APOBEC-1 homodimer (Lau et al. 1994; Oka et al. 1997; Navaratnam et al. 1998) interacting with a single-stranded RNA-binding protein known as APOBEC-1 Complementation Factor (ACF), which binds to the mooring sequence of the apoB mRNA (Lellek et al. 2000; Mehta et al. 2000; Blanc et al. 2001a; Dance et al. 2002). In addition, APOBEC-1 and ACF must traffic from the cytoplasm to the nucleus (Yang and Smith 1997; Sowden 2002; Blanc et al. 2003; Chester et al. 2003; Sowden et al. 2004) where editing takes place, prior to the export of spliced apoB mRNA (Lau 1991; Yang et al. 2000; Sowden et al. 2001).

3 Characteristics of the RNA substrate A critical cis-acting element for apoB mRNA editing is the mooring sequence (5’-UGAUCAGUAUA-3’, nts 6671-6681), which is the 3’-most element of a 21 nt tripartite motif that also includes an enhancer element (UGAUA, immediately 5’ of C6666) and a spacer element (AAUU) between C6666 and the mooring sequence (Table 1). Translocation of the mooring sequence downstream of an otherwise unedited C within apoB mRNA or within heterologous RNAs was sufficient to direct editing activity to 5’ cytidines (Backus and Smith 1991; Driscoll et al. 1993; Backus et al. 1994). A sequence similar to the mooring sequence is located 3’ of a less frequently edited site at nucleotide C6802 within apoB mRNA (Navaratnam et al. 1991). The mooring sequence was required for promiscuous editing of several cytidines within apoB mRNA (some as far as 50 nucleotides 5’ of the mooring sequence) (Sowden et al. 1996a, 1998) and for hyperediting of cytidines in heterologous mRNAs (Yamanaka et al. 1995, 1997) under conditions where APOBEC-1 was overexpressed in rat hepatoma cells or transgenic mouse livers, respectively. Mutagenesis of the tripartite motif demonstrated that there was lax sequence specificity in the enhancer and spacer elements (Chen et al. 1990), although spacer length was critical (Backus and Smith 1992). In contrast, only a limited number of nucleotide changes within the mooring sequence were tolerated (Backus and Smith 1992; Driscoll et al. 1993; Backus et al. 1994). The 5’ portion of the mooring sequence (UGAU) is critical, as it was present whenever RNAs supported editing activity. Chicken apoB mRNA, which has a 5’ AGAG in this position, does not support editing (Teng and Davidson 1992) (Table 1). Weighted matrix sequence modeling of the mooring sequence and its tolerated variations identified 80 candidate editing substrates from 3860 human ESTs sampled, though none


Table 1. Examples of mooring sequences with their associated editing efficiency. The 21 nucleotide apolipoprotein B tripartite editing cassette comprises an enhancer element and a spacer region flanking the cytidine to be edited and most importantly an eleven nucleotide mooring sequence. The apoB mooring sequence of a number of species is shown together with other mRNAs that have been shown or predicted to be targets of the APOBEC-1 editosome (see text for details). Mutations in the 5’ end of the mooring sequence revealed this region to be most critical for ACF binding whilst the 3’ end has a more lax sequence requirement (Backus and Smith 1992; Smith 1993). The exact editing efficiency of the cytidine in mouse NAT-1 was undetermined, but of 42 cytidines in that region 13 (31%) were hyperedited (Yamanaka et al. 1997). # H. Smith and M. Forsythe, unpublished data. Sequence name

% editing

Sequence

human apoB guinea pig apoB pig apoB mouse apoB rat apoB apoB alternate site (6802) chicken apoB homolog human NF1 rat NF1 mouse NAT-1 mouse tyrosine kinase TEC human Rb (C2221) human Rb (C1744/C1745) human Rb (C2103/C2104) mouse fatty acid synthase mouse P1 protein mouse prostaglandin synthase homolog human RAG-1 human MTAP

85-100 85-100 85-100 85-100 85-100 8-12 0 4-17 0 31∗ 1 0# 0/0# 0/0# 0 0 0#

Enhancer Spacer UGAUA C AAUU UGAUg C AAUU UGAUA C AAUU cGAUA C AAUU cGAUA C AAUU AaAaA C AAUc ca UggUg C AaaU UauUA C gAAUU g UGuUA C gAAUc a UugcA C Aaaa c AcgUu C ugga UcAaA C gugUu u AaggA C cAac a Uggac C AaaU u Ccggg C Agc Uaugg C AggU aU GagcA C Agga U

Mooring UGAUCAGUAUA UGAUCAuUAUA UGAUCAGUAUA UGAUCAGUAUA UGAUCAGUAUA UGAUCuacAUu aGAgCAGUAcA UGAUCAcaucc UuAUCAcaucc UGAUCAGUuUg UGAUCAGUAcA UGAUCAaagaA UGAUCAccuUg UGAUguGUucc UGAUCAGUAUA UGAUCAGUAUA UGAcCAGUAUA

0# 0#

UcuUu C Auugu C

UGAUCAGUuUA UGAUCAGUucA

uAUa a AuUa u

supported editing in cell lines or tissues (Table 1; D. Landsman, M. Forsythe, and H. Smith, unpublished data). Consequently, although the mooring sequence is essential for editing, it alone is not a sufficient predictor of potential C to U editing substrates.

The length and sequence of RNA flanking the tripartite motif contributed to the efficiency of editing at C6666 (Davies et al. 1989; Backus et al. 1994). Short apoB reporter RNAs were not efficiently edited (Davies et al. 1989; Backus and Smith 1991) while longer heterologous RNAs containing the tripartite motif were edited with higher efficiency, albeit never as well as similar length transcripts containing apoB RNA flanking sequence (Backus et al. 1994). ApoB mRNA contains a disproportionately high number of UGAU motifs in the 250 nts surrounding C6666 (Smith 1993). Some or all of these motifs bound editing factors, as cytidines were edited when inserted 5' of these sites (Sowden et al. 1996a, 1998). Hence, additional flanking sequence may be necessary beyond the mooring sequence to recruit protein factors to the editing site and enhance editing activity (Smith 1993).

Alternatively, flanking RNA sequence may improve editing efficiency through formation of a specific secondary structure. ApoB RNA sequences both proximal and distal to C6666 (Smith 1993; Hersberger and Innerarity 1998; Hersberger et al. 1999) were predicted to form stem loops. These structures involve highly AU-rich (70%) sequences and, therefore, numerous thermodynamically unstable conformers were predicted (Smith 1993). Predicted secondary structures of apoB RNAs from several species were compared to their editing efficiencies and provided support for the role of RNA secondary structure as an enhancer of editing site utilization (Hersberger et al. 1999). Single and double stranded RNA nuclease cleavage assays of apoB RNA also suggested secondary structure formation in the vicinity of C6666 (Richardson et al. 1998). The nebulous nature of a secondary structure requirement may be a result of the analyses having been performed on naked RNA. Although both APOBEC-1 and ACF only bind single-stranded RNA, their interactions with the editing site may culminate in the formation or stabilization of a required secondary structure (Smith 1993; Chester et al. 2004).
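The point that a mooring sequence alone does not predict editing can be illustrated with a few rows of Table 1. The sketch below counts, for each listed mooring sequence, the positions that differ from the human apoB mooring sequence and prints that count next to the reported editing efficiency. The row values are transcribed from Table 1; the choice of human apoB as the reference and the function names are assumptions made for illustration only.

```python
# Reference mooring sequence: human apoB (Table 1). Using it as the
# reference is an assumption of this sketch.
REFERENCE = "UGAUCAGUAUA"

# (name, mooring sequence, reported % editing), transcribed from Table 1.
# Lowercase letters are kept as printed in the table.
ROWS = [
    ("human apoB", "UGAUCAGUAUA", "85-100"),
    ("guinea pig apoB", "UGAUCAuUAUA", "85-100"),
    ("human NF1", "UGAUCAcaucc", "4-17"),
    ("mouse fatty acid synthase", "UGAUCAGUAUA", "0"),
    ("mouse P1 protein", "UGAUCAGUAUA", "0"),
]


def mismatches(seq: str, ref: str = REFERENCE) -> int:
    """Count positions where seq differs from ref, ignoring letter case."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq.upper(), ref.upper()))


def summarize(rows=ROWS):
    """Return (name, mismatch count, reported editing) for each row."""
    return [(name, mismatches(seq), editing) for name, seq, editing in rows]


if __name__ == "__main__":
    for name, n, editing in summarize():
        print(f"{name:28s} mismatches={n}  editing={editing}%")
```

In this small sample, two mouse mRNAs carry a mooring sequence identical to that of apoB yet show no editing, while human NF1 is edited despite five mismatches, which is consistent with the conclusion drawn in the text.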

4 APOBEC-1

4.1 Requirement of APOBEC-1 for C to U mRNA editing

APOBEC-1 was isolated from a functional complementation assay in which S100 extracts from Xenopus oocytes injected with fractionated rat intestinal poly A+ RNA were mixed with chicken enterocyte extracts and editing determined in vitro. A poly A+ fraction positive for editing was converted into a cDNA library and, by means of sib selection, a cDNA was isolated that encoded a 229 amino acid (~27 kDa) protein (Teng et al. 1993). APOBEC-1 was expressed in several tissues, albeit at very low levels compared to liver and intestine, the only two tissues that supported significant apoB mRNA editing (Teng et al. 1993; Driscoll and Zhang 1994; Nakamuta et al. 1995; Qian et al. 1997). Comparison of APOBEC-1 genes between rodents and human revealed the absence of a liver specific promoter region in the human APOBEC-1 gene, resulting in the inability of human liver to express APOBEC-1 and edit apoB mRNA (Hirano et al. 1997; Greeve et al. 1998a). This observation focused attention on the enzyme as a potential human gene therapy agent (Apostel et al. 2002; Yang et al. 2002; Qian et al. 1998). Apobec-1 knockout mice did not edit apoB mRNA (Hirano et al. 1996; Nakamuta et al. 1996) but otherwise these mice were healthy, suggesting that expression of APOBEC-1 in different tissues was not essential.

370 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden

Fig. 1. The zinc dependent deaminase (ZDD) of APOBEC-1 Related Proteins. The conserved residues within the ZDD (Mian et al. 1998) of the APOBEC-1 Related Proteins discussed in this chapter are aligned and shown in bold type within boxes. CEM15 (APOBEC3G) contains two ZDDs and each, the N-terminal and C-terminal, is aligned separately. The ZDDs exhibit generally conserved primary sequence spacing and comprise three zinc ligands (either His or Cys), a glutamic acid that serves as a proton shuttle in the deamination reaction and a proline residue that coordinates the leaving group. Mutagenesis studies highlighted the importance of these residues in the function of AID and also APOBEC-3G (Muramatsu et al. 2000; Shindo et al. 2003).
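The ZDD signature (a His or Cys zinc ligand, one residue, the catalytic Glu, a variable spacer, then Pro-Cys-x-x-Cys) lends itself to a simple pattern search. The following is a minimal sketch, not a validated motif scanner. The spacer bounds of 20 to 40 residues are an assumption chosen so that the 28-residue spacing implied by the APOBEC-1 residue numbers in the text (His61, Glu63, Pro92, Cys93, Cys96) falls inside the range, and the test sequence is a synthetic poly-alanine construct, not the real APOBEC-1 sequence.

```python
import re

# Signature from the text: [H/C]-x-E, a variable spacer, then P-C-x-x-C.
# The spacer bounds (20-40 residues) are an assumption of this sketch.
ZDD_PATTERN = re.compile(r"[HC].E.{20,40}PC..C")


def find_zdd(protein: str):
    """Return (1-based start, matched text) for the first ZDD-like match, or None."""
    m = ZDD_PATTERN.search(protein)
    if m is None:
        return None
    return m.start() + 1, m.group()


def toy_sequence() -> str:
    """Build a synthetic 120-residue sequence (not a real protein) that places
    H61, E63, P92, C93 and C96 in a poly-alanine background."""
    residues = ["A"] * 120
    for pos, aa in {61: "H", 63: "E", 92: "P", 93: "C", 96: "C"}.items():
        residues[pos - 1] = aa
    return "".join(residues)


if __name__ == "__main__":
    print(find_zdd(toy_sequence()))
```

A match on primary sequence only suggests a candidate ZDD; as the caption notes, function was established by mutagenesis.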

4.2 Catalytic residues, RNA binding and oligomerization of APOBEC-1

A comparison of several mammalian APOBEC-1 amino acid sequences revealed high levels of conservation (Driscoll and Zhang 1994; Giannoni et al. 1994; Lau 1994; Yamanaka et al. 1994; Nakamuta et al. 1995; Anant et al. 1998). The presence of residues [H/C]xE xn PCxxC implied the existence of a zinc dependent deaminase (ZDD) motif (Mian et al. 1998), suggesting that the catalytic mechanism of APOBEC-1 is homologous in form and function to a variety of nucleoside and nucleotide deaminases (Fig. 1) (Driscoll et al. 1993; Anant et al. 1998; Wedekind et al. 2003). His61, Cys93, and Cys96 are essential as Zn(II) ligands and Glu63 serves as a catalytic proton shuttle (Driscoll and Zhang 1994; Anant et al. 1995; MacGinnitie et al. 1995; Carlow et al. 1999; Teng et al. 1999), whereas Pro92 is believed to coordinate the leaving group by analogy to bacterial deaminases (Betts et al. 1994; Xiang et al. 1996). Residues His61, Glu63, Phe66, Phe87, and Cys93 within the catalytic domain were required for APOBEC-1’s modest (Kd ~450 nM; Anant and Davidson 2000) and nonspecific RNA binding activity (Anant et al. 1995; Navaratnam et al. 1995) (Fig. 2). Recent studies have suggested that residues within the N- and C-termini of APOBEC-1 contribute to RNA binding as well (Chester et al. 2003; Xie et al. 2004). APOBEC-1 has nucleoside and nucleotide deaminase activity in vitro, but when combined with ACF it edits a specific RNA substrate (MacGinnitie et al. 1995; Navaratnam et al. 1995). The ability of ACF to confer substrate specificity was evaluated in vitro and was commensurate with a 12-fold enhancement of Km in editing (Chester et al. 2004), although Vmax increased only 2.5-fold. Interestingly, in the absence of ACF, APOBEC-1 edited apoB RNA at temperatures >37°C (Chester et al. 2004). However, comparisons of V/K effects at 30°C versus


Fig. 2. Functional domains of APOBEC-1 and ACF. The catalytic domain of APOBEC-1 is characterized by a zinc-dependent deaminase domain (ZDD) whose conserved residues are shown in the upper APOBEC-1 model (and see Fig. 1). Residues F66 and F87 in the catalytic domain bind AU-rich RNA with weak affinity (Navaratnam et al. 1995). The leucine rich region (LRR; amino acids 173-210) has been implicated in homo-dimerization, ACF interaction, and is required for editing (Teng et al. 1999), although structural analysis suggests that the LRR forms the hydrophobic core of the protein monomer (Betts et al. 1994; Carter 1998; Xie et al. 2004). A number of domains including a bipartite SV40-like NLS, an M domain and a leucine rich nuclear export sequence are responsible for the subcellular localization of APOBEC-1 (shown in the lower APOBEC-1 model), although they should not be considered mutually exclusive (Yang and Smith 1997; Yang et al. 2001; Chester et al. 2003). ACF64 complements APOBEC-1 through its RNA binding and APOBEC-1 interaction activities (Blanc et al. 2001a; Mehta and Driscoll 2002). The single-stranded RNA recognition motifs (RRMs) were delimited by motif searches (http://www.expasy.org) as well as by comparison to known RRM structures. The RRMs are required for mooring sequence-specific RNA binding and together with their flanking sequences are required for APOBEC-1 complementation. A short domain (ANS) of ACF mediates nuclear localization of ACF (Blanc et al. 2001a). The function of the putative double-stranded RNA binding motif (DRBM) is unknown. ACF65, an alternative RNA splice variant of ACF64, includes an eight amino acid insertion at residue 380 and complements RNA editing by APOBEC-1 as effectively as ACF64 (Dance et al. 2002). ACF45 and ACF43 are rat specific alternative RNA splice forms of ACF whose C-termini after amino acid 380 differ from ACF64/65 (Sowden et al. 2004). They have different functional characteristics with respect to APOBEC-1 complementation of RNA editing (see text for details).

41°C demonstrated the greatest ratios (i.e. 2600 compared to 1500) in the presence of ACF at the lower temperature. In the absence of ACF, V/K values were reduced thirty-fold and ten-fold, respectively. These data suggested that the optimal substrate for editing exists at 30°C in the presence of ACF. Although there are several interpretations of these results, one plausible explanation is that the function of ACF is to reorganize the RNA in such a manner that it binds productively to the APOBEC-1 active site (Chester et al. 2004). This reorganization would be predicted to occur at higher temperatures as a “melting out” of secondary structure in the absence of ACF, and is corroborated by the observation that V/K values are higher at 41°C than 30°C (130 compared to 81). These data are consistent with the prescribed role of RNA recognition motifs, such as those within ACF, which bind to single-stranded regions of RNA (Mehta and Driscoll 2002; Xie et al. 2004). The interplay between editing enzyme and auxiliary factor is likely also applicable to other members of the APOBEC family.

Homodimerization of APOBEC-1 is essential for function as demonstrated by transgenic studies in which the expression of dimerization competent, but catalytically inactive mutants of APOBEC-1 had a dominant negative effect (Oka et al. 1997). Comparative modeling of the APOBEC-1 structure suggested head-to-head dimerization is essential because a functional catalytic site requires contributions from two subunits (Xie et al. 2004). APOBEC-1 C-terminal leucines (residues 169-184; PQYPPLWMMLYLALEL) were required for dimerization and have been presumed essential for ACF binding (Lau et al. 1994; Navaratnam et al. 1998; Teng et al. 1999). This region also contains a potent nuclear export/cytoplasmic retention signal (NES/CRS) that dominated the nuclear localization signal (NLS) of SV40 T antigen in chimeric reporter proteins (Yang and Smith 1997; Dance et al. 2001; Yang et al. 2001; Chester et al. 2003). Thus, the C-terminus contains a multi-functional domain that regulates protein interactions, subcellular localization and editing activity. Similar conclusions have been drawn for AID in which separate subdomains have been ascribed to subcellular localization, class switch recombination and somatic hypermutation (Brar et al. 2004; Ito et al. 2004).
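The V/K comparison discussed earlier in this section can be checked with a few lines of arithmetic. The sketch below uses the four values quoted in the text and assumes, as the passage reads, that 2600 and 1500 are the values with ACF at 30°C and 41°C and that 81 and 130 are the corresponding values without ACF. The dictionary keys and function names are illustrative.

```python
# V/K values quoted in the text (units as reported by the source).
# Assigning 81 and 130 to the ACF-free condition is an assumption
# based on how the passage reads.
V_OVER_K = {
    ("30C", "+ACF"): 2600,
    ("41C", "+ACF"): 1500,
    ("30C", "-ACF"): 81,
    ("41C", "-ACF"): 130,
}


def acf_fold_effect(temp: str) -> float:
    """Ratio of V/K with ACF to V/K without ACF at one temperature."""
    return V_OVER_K[(temp, "+ACF")] / V_OVER_K[(temp, "-ACF")]


def temperature_effect(acf: str) -> float:
    """Ratio of V/K at 41 C to V/K at 30 C for one ACF condition."""
    return V_OVER_K[("41C", acf)] / V_OVER_K[("30C", acf)]


if __name__ == "__main__":
    for temp in ("30C", "41C"):
        print(f"{temp}: ACF raises V/K about {acf_fold_effect(temp):.0f}-fold")
    for acf in ("+ACF", "-ACF"):
        print(f"{acf}: V/K(41C) / V/K(30C) = {temperature_effect(acf):.2f}")
```

Under these assumptions the ACF effect is about 32-fold at 30°C and about 12-fold at 41°C, in line with the "thirty-fold" and "ten-fold" figures, and raising the temperature helps only when ACF is absent.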
4.3 Post-translational modification of APOBEC-1

Post-translational modification of APOBEC-1 may also be important for protein-protein interactions, subcellular distribution, or assembly of nuclear editosomes. Site-directed mutants of predicted protein kinase Cθ phosphorylation sites within APOBEC-1 have been created that affected editing activity. The Ser47Asp or Ser72Ala constructs stimulated apoB mRNA editing in rat hepatoma cells, whereas Ser47Ala or Ser72Asp were inhibitory (Chen et al. 2001). Although Cθ is expressed predominantly in skeletal muscle, its overexpression in hepatoma cells stimulated editing. Whether APOBEC-1 is phosphorylated in cells remains to be demonstrated. However, an analysis of AID revealed it was phosphorylated in situ (Chaudhuri et al. 2004). Enhanced deaminase activity on sites of single-stranded DNA associated with transcription and its ability to interact with Replication Protein A (a presumed auxiliary factor that promotes somatic hypermutation) were associated with the phosphorylated form of AID. Regulation of activity through phosphorylation, therefore, may be a recurring theme amongst APOBEC Related Proteins.


4.4 The conserved deaminase fold

Phylogenetic and structural comparisons of cytidine deaminases (CDAs) revealed that those involved in pyrimidine metabolism are likely to be closely related to cytidine deaminases acting on RNA (CDARs) as well as adenosine deaminases acting on RNA (ADARs; see Chapter 12) and tRNA (ADATs), but not bacterial cytosine deaminase (CD) or adenosine deaminases acting on free nucleotides (Maas and Rich 2000; Wedekind et al. 2003), which may have evolved divergently (Betts et al. 1994; Ireton et al. 2003). The first indication that the cytidine deaminase fold of pyrimidine metabolism might be sufficient for RNA editing came from the discovery that a cytidine deaminase from Saccharomyces cerevisiae (ScCDD1) mediated specific editing of reporter apoB mRNA (Dance et al. 2001). The structure determination of ScCDD1 represented the first example of a eukaryotic enzyme of the CDA family (Xie et al. 2004) and demonstrated that the subunit tertiary fold (Fig. 3a) exhibited high homology to the catalytic domains of bacterial CDAs (Johansson et al. 2002), as well as ScCD (Ireton et al. 2003; Ko et al. 2003). The topology shared by each structure was a triangular β-sheet of five strands flanked by three conserved α-helices (Fig. 3b). Key differences exist between CDA and ScCD catalytic domains that account for substrate specificities for cytidine versus cytosine. The ScCD structure exhibited a bulky multi-helical ‘flap’ that folds over the active site entrance (Fig. 3c). This structure excludes substrates larger than a single nucleotide base. Similarly, the CDA from E. coli exhibited a long, structured linker that passed over the active site (Fig. 3c), allowing activity on cytidine, but not on RNA (Xie et al. 2004). The flap of ScCDD1 was a much shorter coil (Fig. 3c) that allowed access to nucleosides (Kurtz et al. 1999), as well as larger RNA substrates (Dance et al. 2001; Xie et al. 2004). The precedent that deaminases embellish their C-termini to confer distinct substrate selectivity, while maintaining a conserved polypeptide fold, was highly relevant in consideration of the possible fold and function of APOBEC-1 and its related proteins. It is also significant that the C-terminus of AID confers CSR activity (Barreto et al. 2003; Ta et al. 2003), which can be considered a positive evolutionary innovation of a gene-duplicated flap region.

4.5 Comparative models of APOBEC-1 and AID

The structure determination of ScCDD1 provided a tenable connection between the enzymes of pyrimidine metabolism and members of the APOBEC Related Protein family (Xie et al. 2004). By use of alignments of known oligomeric CDA crystal structures (Fig. 3a), as well as isolated subunits of ScCD, comparative models were calculated for APOBEC-1 (Fig. 3d) and AID that led to some general predictions about their structure and function (Xie et al. 2004). These features will be discussed and contrasted to a previous model for APOBEC-1 based on the E. coli CDA (Navaratnam et al. 1998). Ultimately, the merits of both models can only be addressed through high-resolution experimental structure determinations, which are unavailable at present.


The newer models explain the need for dimerization (Lau et al. 1994; Oka et al. 1997; Ta et al. 2003) as a means to organize two polypeptide chains into a single catalytic domain. This form is observed for ScCDD1 and EcCDA (Fig. 3c) in which the symmetric domain arrangement assures that the N-terminal catalytic domain (Fig. 3d, red subunit) receives essential trans-acting loops from the dyad related molecule, such as L2 (compare Fig. 3c to Fig. 3d, orange subunit) and the interdomain flap (Fig. 3d, cyan coil between violet and orange subunits). Further consideration of known CDA structures (Fig. 3c) indicates that these intermolecular structural elements would be essential for ribose binding and catalytic activity (Betts et al. 1994; Johansson et al. 2002; Xie et al. 2004). As such, the subunit interface of any comparative model would be expected to be analogously assembled and hence robust. Calculations of buried surface area for the recent APOBEC-1 model (Fig. 3d) showed that 9,000 Å2 of a total 24,000 Å2 is sequestered from solvent upon dimerization (i.e. between the respective red & purple and violet & orange subunits in Fig. 3d). These results contrast with a previous model of APOBEC-1 in which a ‘novel’ double-stranded RNA binding cleft interrupts the subunit interface (Navaratnam et al. 1998). Although this model


Fig. 3. Schematic depictions of cytidine deaminase structures. (a) The Cα backbone of the dimeric EcCDA (blue and cyan) superimposed upon tetrameric ScCDD1 (red, violet, purple and orange). The pairwise rms deviation was 1.32 Å (28% sequence identity). A central “linker” (yellow) joins the domains of the dimer, whereas each subunit of the tetramer terminates with a “flap” (representative green, hatched oval) at the active site. The boxed region (dotted line) is shown in greater detail in (c). (b) Topology diagram for the conserved cytidine deaminase domains derived from bacterial and yeast crystal structures in (a) and detailed in (Xie et al. 2004). The dimeric enzymes comprise a tripartite fold in which two αβ domains are joined by a linker. The N-terminal catalytic domain (NTCD) harbors the ZDD motif that coordinates Zn2+. The non-catalytic C-terminal domain (NCCTD) exhibits the same fold as the N-terminal domain. The tetrameric enzymes, such as CDD1, comprise two identical polypeptide chains that terminate in a ‘flap’; hence the linker does not exist. (c) Active site superposition of the respective linker (E. coli, yellow) or flap structures (ScCD, semi-transparent helical flap or CDD1, labeled “COOH” colored cyan). The Cα coordinates from the latter structures were superposed on ScCDD1, and only their flaps/linkers are depicted. The pairwise superposition resulting from CDD1 Cα comparisons to the ScCD structure is 1.42 Å rmsd (16% identity). The backbone agreement is significantly better than predicted by sequence identity alone (Chothia and Lesk 1986). (d) Ribbon diagram of the dimeric APOBEC-1 comparative model. Individual polypeptide chains are colored red and purple (NTCD and NCCTD); violet and orange (NTCD and NCCTD). Each NTCD is connected to a NCCTD by a flap (cyan). The NTCD harbors a ZDD motif that coordinates Zn2+ (green sphere).
A model for apoB mRNA (comprising the sequence 5’-GAUAU6666AA-3’) bound to both catalytic centers is depicted as a stick model (yellow – front, and black -- behind). Note that the flap (cyan) sequesters the RNA substrate (yellow) bound within the catalytic cleft.

describes a possible mode for double-stranded RNA binding, the fundamental CDA topology (Fig. 3b) was altered to bind substrate, resulting in a loss of nearly 4,500 Å2 of buried surface area at the dimer interface. Such a loss of buried area is energetically unfavorable for an oligomeric protein of this size (Miller et al. 1987). A second aspect of the comparative models of APOBEC-1 and AID is that each predicts two catalytically competent active sites exist on opposite faces of the dimer (Fig. 3d). Hence, each active site is symmetric and would not require allosteric activation for substrate binding as proposed (Carter 1998). These results differ from the prior APOBEC-1 model that predicted an essential role for each active site in binding C6666 and an unspecified downstream U (Navaratnam et al. 1998; Carter 1998); thereby, functioning in an asymmetric manner. At present, no evidence exists for cooperativity in the kinetics of APOBEC-1 editing or other CDAs such as that from B. subtilis (Carlow et al. 1999). A third feature of the new models is that each catalytic site accommodates single-stranded, but not double-stranded nucleic acids (Fig. 3d). This observation is based upon efforts to model a double-stranded substrate into active sites. However, a single bulged base did not extend deeply enough to be deaminated (Xie et al. 2004). Modeling demonstrated that a minimal unstructured substrate of length seven nucleotides could be accommodated by the active sites and that the deaminated C could occupy a site spatially identical to cytidine analogs observed in


crystal structures (compare Fig. 3b to Fig. 3d). For APOBEC-1, the results of modeling suggested that the 11-nt mooring sequence could be sufficiently exposed such that the apoB mRNA would accommodate the RNA recognition motifs (RRMs) (Mehta et al. 2000) of ACF. Finally, the comparative models make key predictions regarding the biological activity of APOBEC-1 Related Proteins (Wedekind et al. 2003; Xie et al. 2004). The requirement for AID in class switch recombination (CSR), somatic hypermutation (SHM) and gene conversion (Muramatsu et al. 2000; Revy et al. 2000; Arakawa et al. 2002) and APOBEC-3G (CEM15) as well as APOBEC-3B and APOBEC-3F, in suppression of lentiviral infectivity (Sheehy et al. 2002; Wiegand et al. 2004) has raised questions of how substrates are targeted. Mechanisms have been proposed for AID in which either RNA or DNA represents the primary substrate (Kinoshita et al. 1999; Barreto et al. 2003; Doi et al. 2003) while APOBEC3G and presumably –3B and –3F, target single-stranded first-strand viral cDNA (Harris et al. 2003; Wiegand et al. 2004; Bishop et al. 2004a; Zheng et al. 2004 and see Chapter 14). Interestingly AID, APOBEC-1, and APOBEC-3G deaminated dC in DNA (Harris et al. 2002; Petersen-Mahrt 2002; Petersen-Mahrt and Neuberger 2003) and APOBEC-3G was shown to bind non-specifically to RNA (Svarovskaia et al. 2004). Furthermore, despite over 70% amino acid sequence identity, rat, but not human APOBEC-1, edited the RNA of HIV-1 (Bishop et al. 2004). However, APOBEC-1 could not substitute for AID in CSR or SHM (Eto et al. 2003) and AID could not substitute for APOBEC-1 in apoB mRNA editing (Muramatsu et al. 1999) suggesting each enzyme targets its own substrate by a distinct protein-specific mechanism. Consistent with this possibility is the observation that deamination of dC in E. coli genomic DNA occurred at sites unique to each protein tested (Harris et al. 2002; Petersen-Mahrt et al. 2002). 
Modeling of preferred deamination substrates (Fig. 3b) in the active sites of APOBEC-1 and AID indicated either RNA or DNA substrates could be accommodated by both enzymes (Xie et al. 2004). These observations have implications for other APOBEC-1 Related Proteins, as well as the identification of their physiological targets.

4.6 APOBEC-1 and dC to dU DNA mutation

The APOBEC-1 Related Protein family member, AID, has a similar amino acid sequence to APOBEC-1 and the residues within the zinc dependent deaminase domain are critical for function (Fig. 1). The high frequency of dC to dU mutations in DNA associated with somatic hypermutation and class switch recombination suggested that AID might act as a cytidine deaminase on DNA. To test this hypothesis, AID and APOBEC-1, as well as other APOBEC-1 Related Proteins including APOBEC-3G (CEM15), were overexpressed in E. coli and several genetic markers were measured as a means to detect dC to dU mutations in bacterial genomic DNA (Harris et al. 2002; Petersen-Mahrt et al. 2002). An elevated mutation frequency in bacterial strains deficient in DNA uracil N-glycosylase (UNG), an enzyme that removes uracil and thereby creates apyrimidinic sites that are repaired through nucleotide insertion, clearly demonstrated that AID, APOBEC-1, and APOBEC-3G deaminated dC to dU in this heterologous prokaryotic system. Subsequent studies have revealed that AID, APOBEC-1, and APOBEC-3G supported dC to dU modification on single-stranded DNA in vitro (Petersen-Mahrt et al. 2002; Harris et al. 2003; Beale et al. 2004; Suspene et al. 2004). The studies in E. coli failed to identify a consensus motif for each deaminase (Harris et al. 2003) but nearest neighbor preferences for nucleotides immediately 5' of the target cytidine (dT for APOBEC-1, dA/dG for AID and dC for APOBEC-3G) were observed (Beale et al. 2004). The cognate enzyme did not deaminate all of the dC sites with appropriate flanking sequences, suggesting that while the site selectivity of APOBEC-1 Related Protein activity on ssDNA may be determined by the enzyme's intrinsic nucleotide context preferences, other contexts such as chromatin structure (Woo et al. 2003), transcription complexes (Chaudhuri et al. 2003; Sohail et al. 2003), or reverse transcription complexes (Yu et al. 2004) may ultimately determine target site modification.

4.7 APOBEC-1 and neoplasia

The above studies stimulated speculation that APOBEC-1 overexpression might, in addition to causing promiscuous RNA editing of apoB and other mRNAs (Table 1) (Yamanaka et al. 1995; Sowden et al. 1996a; Yamanaka et al. 1997; Sowden et al. 1998), lead to DNA mutation and the induction of neoplasia (reviewed in Wedekind et al. 2003). Indeed, liver-specific transgenic overexpression of APOBEC-1 induced liver carcinoma and dysplastic disease (Yamanaka et al. 1995), although the cancer phenotype was attributed to hyperediting of the mRNA encoding the translation factor eIF4G that resulted in the loss of the control of general translation initiation (Yamanaka et al. 1997).
Also, elevated levels of APOBEC-1 mRNA were observed in several human gastric, pancreatic, and colonic tumors; however, the significance of this in terms of the disease is unclear as an absence of C to U mRNA editing was reported in human carcinomas (Lee et al. 1998; Greeve et al. 1999). In another example, editing of a cytidine in the mRNA encoding the neurofibromatosis type 1 (NF1) protein led to a malignant phenotype (Skuse et al. 1996; Cappione et al. 1997). This mRNA supported editing despite containing imperfect mooring sequence and spacer elements (Table 1). Editing changes a CGA codon to a STOP codon at position 3916 of the NF1 mRNA and predictably inactivates its tumor suppressor function. In NF1 tumors, an association was observed between C to U editing, an increase in alternative splicing of a downstream exon and the expression of APOBEC-1 mRNA, which is normally restricted to intestinal cells (Mukhopadhyay et al. 2002). Increased expression of other members of the APOBEC-1 Related Protein family has also been implicated in the cancer phenotype, notably APOBEC-3G in several breast, uterine, and kidney tumors (Harris et al. 2002) (Table 3). AID overexpression also led to oncogene activation (Okazaki et al. 2003) and the dysregulation of antibody expression seen in leukemias (Oppezzo et al. 2003). Dysregulation of AID activity was suggested as the mechanism by which hepatitis C virus induced genomic mutations and neoplasia (Machida et al. 2004). These findings suggest that, if APOBEC-1 and its related proteins were to have DNA editing (mutation) activity in mammalian cells, their activities must be restricted to specific sites within certain genes. For example, in contrast to its family member AID, overexpression of APOBEC-1 did not induce DNA mutations in immunoglobulin genes of mammalian B lymphocytes (Eto et al. 2003). It is, therefore, likely that subcellular localization and auxiliary protein activity will be of paramount importance in the regulation of the RNA versus DNA activities of APOBEC-1 and its related proteins.
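The nearest-neighbor preferences reported in section 4.6 (dT for APOBEC-1, dA/dG for AID and dC for APOBEC-3G immediately 5' of the target cytidine) can be expressed as a simple lookup over a single-stranded DNA sequence. The sketch below only flags cytidines whose 5' neighbor matches a reported preference. It is an illustration, not a predictor: the text stresses that such sites are not all deaminated and that chromatin, transcription or reverse transcription context may decide the outcome. The function name and the example sequence are hypothetical.

```python
# 5' nearest-neighbor preferences quoted in section 4.6 (Beale et al. 2004).
PREFERRED_5PRIME = {
    "APOBEC-1": {"T"},
    "AID": {"A", "G"},
    "APOBEC-3G": {"C"},
}


def candidate_sites(ssdna: str):
    """Yield (0-based index of dC, 5' neighbor, enzymes whose preference matches).

    A cytidine at the first position has no 5' neighbor and is skipped.
    """
    seq = ssdna.upper()
    for i in range(1, len(seq)):
        if seq[i] != "C":
            continue
        neighbor = seq[i - 1]
        enzymes = sorted(
            name for name, bases in PREFERRED_5PRIME.items() if neighbor in bases
        )
        yield i, neighbor, enzymes


if __name__ == "__main__":
    # Hypothetical sequence chosen so that each preference occurs once.
    for index, neighbor, enzymes in candidate_sites("ATCGACCA"):
        print(index, neighbor, enzymes)
```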

5 Auxiliary proteins

5.1 Emergence of the C to U editosome concept

Initially two hypotheses existed for the mechanism of apoB mRNA editing. The autonomous deaminase hypothesis presumed that the specificity to bind and deaminate C6666 resided within a single protein. This concept was based on data in which editing activity was recovered from low molecular mass biochemical fractions of intestinal cells and the initiation of editing in these fractions occurred without a kinetic lag (suggesting complex assembly was not required) (Greeve et al. 1991). Moreover, mutagenesis of apoB sequences within 1-4 nucleotides flanking C6666 (both 5' and 3') had marginal effects on editing, suggesting that there was no sequence-specific RNA-binding protein and, therefore, higher order assemblies were not involved (Chen et al. 1990). The mooring sequence hypothesis was based on the discovery that cis-acting sequences beyond 1-4 nucleotides flanking C6666 (specifically the mooring sequence, Table 1) were essential for editing (Backus and Smith 1991). A multiprotein complex, or editosome, was hypothesized based on the observation that short, editing-site reporter RNAs formed RNA-protein complexes in hepatic and intestinal extracts with an approximately 11S sedimentation that matured over time to 27S, and that in vitro assembled complexes harbored edited and unedited apoB RNA (Backus and Smith 1991, 1992; Harris et al. 1993). A kinetic lag was observed that was inversely proportional to the level of editing activity in the extracts (Backus and Smith 1991; Harris et al. 1993). Complex formation and editing were mooring sequence-dependent, required incubation at 30°C and were inhibited by addition of vanadium ribonucleosides, criteria consistent with the selective formation of an editosome (Backus and Smith 1991).
Mooring-sequence selective RNA-binding proteins identified by ultraviolet (UV) cross-linking to apoB mRNA comprised a ~66 kDa (p66) factor and a cluster of proteins with an average molecular mass of 44 kDa (p44). These proteins were observed in whole extracts (Lau et al. 1990; Driscoll et al. 1993; Harris et al. 1993; Navaratnam et al. 1993b; Mehta et al. 1996; Mehta and Driscoll 1998), as well as in partially purified editosomes (Harris et al. 1993; Richardson et al. 1998). Their role in apoB mRNA editing was also suggested by their occurrence as 60S complexes that reorganized to 27S editosomes upon incubation with RNA editing substrates and in S100 whole cell extracts from tissues that supported in vivo apoB mRNA editing (Harris et al. 1993). Moreover, p66/p44 co-purified with 6-His-tagged APOBEC-1 isolated from apobec-1 transfected McArdle cells. p66 remained associated with 6-His-tagged APOBEC-1 following 1 M NaCl washes, but p44 was almost completely removed, resulting in reduced editing activity. Reconstitution of the salt soluble and insoluble fractions restored editing activity (Yang et al. 1997), suggesting that editosome assembly from extracts required both proteins.

5.2 The editosome

The identification of APOBEC-1 (Teng et al. 1993), which alone could not edit apoB mRNA, provided incontrovertible evidence that apoB mRNA editing required more than one protein. APOBEC-1 complementing activity (auxiliary proteins) could be demonstrated in extracts from a broad range of tissues and cell types even though many of these sources did not express APOBEC-1 (Teng et al. 1993; Driscoll and Zhang 1994; Giannoni et al. 1994; Inui 1994). Among these, only APOBEC-1 Complementation Factor (ACF) was sufficient to complement APOBEC-1 in mooring sequence-dependent editing of apoB mRNA in vitro (Lellek et al. 2000; Mehta et al. 2000), in yeast (G. Dance and H. Smith, unpublished findings), in Drosophila S2 cells (Sowden et al. 2004) and in mammalian cells (Blanc et al. 2001a).

5.3 APOBEC-1 complementation factor (ACF)

The laboratory of Dr. Donna Driscoll purified biochemically a p66-like, UV crosslinking protein from baboon kidney. Peptide sequencing and database comparisons identified two human Expressed Sequence Tags (ESTs) that facilitated cloning of ACF cDNA (Mehta et al. 2000). ACF contained three RRMs, had the ability to bind to APOBEC-1, and was necessary and sufficient for complementing APOBEC-1 in vitro editing activity (Fig. 2).
A second complementation factor, APOBEC-1 Stimulating Protein (ASP), was identified in rat liver, and a corresponding human cDNA was cloned (Lellek et al. 2000). ASP was identical to ACF except for an 8 amino acid insertion. Human ACF and ASP are encoded by a single gene on chromosome 10 and arise through alternative pre-mRNA splicing promoted by the insulin-regulated splicing factor SRp40 (Henderson et al. 2001; Dance et al. 2002). The functional significance of the two nearly identical proteins, renamed ACF64 and ACF65 based on their predicted molecular masses, is unclear as each complemented APOBEC-1 equally well (Dance et al. 2002). Acf64/65 mRNAs were expressed in all tissues examined including those that do not express APOBEC-1 or edit apoB mRNA, which suggested that ACF64/65 may have additional cellular functions.


Several alternatively spliced mRNA variants of human acf were identified in EST databases, although few have been validated (Chester et al. 2000; Anant et al. 2001a; Dance et al. 2002). Two additional rat-specific variants arising from alternative acf pre-mRNA splicing and 3’ end formation were uniquely expressed in liver and intestine (Sowden et al. 2004). These mRNAs encoded 43 and 45 kDa proteins (henceforth termed ACF43 and ACF45) identical to the N-terminal two thirds of ACF64/65 (380 out of 586 amino acids) and, therefore, contained all three RRMs (Fig. 2). The N-terminus of ACF64 contained the domains required for RNA and APOBEC-1 binding/complementing activity: (i) amino acids 1-129 were required for APOBEC-1 binding, (ii) RRM1 (57-132) and RRM2 (137-216) contributed the most to apoB mRNA binding, (iii) RRM3 (231-301) enhanced apoB RNA binding activity, and (iv) sequences following RRM3 (331-385) enhanced apoB RNA binding (Blanc et al. 2001a; Mehta and Driscoll 2002). ACF43 and ACF45 bound APOBEC-1 and apoB mRNA, but were less competent than ACF64/65 in complementing editing activity. ACF43 interacted with APOBEC-1 as strongly as ACF64 and was compatible with apoB mRNA binding by ACF64 and ACF45. ACF45 interacted weakly with APOBEC-1 and complemented editing activity least well, but had a higher affinity for apoB mRNA. In RNA binding competition assays, ACF45 inhibited ACF64 and ACF43 apoB mRNA binding. Consistent with the ability of ACF43/45 to bind apoB RNA is the finding that the C-terminus of ACF64 was not essential for complementation activity.

5.4 Other auxiliary proteins

Several other auxiliary proteins have been identified and characterized through yeast two hybrid analyses (Lau et al. 1997, 2001, 2001a; Greeve et al. 1998b; Anant et al. 2001, 2001b; Blanc et al. 2001b; Lau and Chan 2003), purification through APOBEC-1 affinity (Yang et al. 1997), apoB RNA binding (Lau et al. 1990; Harris and Smith 1992; Navaratnam et al.
1993b; Mehta and Driscoll 1998; Richardson et al. 1998; Steinburg et al. 1999; Mehta et al. 2000), antibody production (Schock et al. 1996), and classical biochemical fractionation (Teng and Davidson 1992; Mehta et al. 1996; Mehta and Driscoll 1998) (Table 2, reviewed in Anant and Davidson 2002; Wedekind et al. 2003). The candidates bound to APOBEC-1 and/or apoB mRNA and either enhanced or inhibited editing activity when overexpressed with APOBEC-1 in vitro or in cell-based systems. Interestingly, CUGBP-2 and GRY-RBP are homologous to the amino-terminal two-thirds of ACF and hence, when overexpressed, may fortuitously affect editing in a manner similar to ACF43 or ACF45. Whether the expression of the non-ACF auxiliary proteins is modulated by physiological perturbations known to affect apoB mRNA editing is unknown. ABBP-2, BAG-4, and GRY-RBP, as well as ACF, influenced the intracellular distribution of APOBEC-1 when overexpressed (Blanc et al. 2001a, 2001b; Lau et al. 2001a; Lau and Chan 2003).

Mammalian C to U editing 381

Table 2. Proteins influencing apoB mRNA editing. The identification and role of the listed proteins are described in the text.

Protein                                   Effect on Editing
ACF64/65 (p66)                            Essential; Binding APOBEC-1/RNA
p100                                      Stimulates RNA binding
p55                                       Stimulates RNA binding
GRY-RBP                                   Conditional +/-; Binding APOBEC-1/ACF/RNA
CUGBP-2                                   Inhibits; Binding APOBEC-1/RNA
ACF43/45 (p44)                            Conditional +/-; Binding APOBEC-1/RNA
HnRNP A/B (ABBP-1)                        Stimulates; Binding APOBEC-1/RNA
HnRNP C & D                               Inhibits; Binding APOBEC-1/RNA
KSRP                                      Stimulates; Alternative Splicing Factor
Hsp70 (ABBP-2)                            Stimulates; Chaperone
αI2 serum protease inhibitor (p240)       Inhibits; Protein sequestrant
BAG-4                                     Inhibits; Protein sequestrant

6 Subcellular distribution of editing factors

Under physiological conditions, apoB mRNA editing occurs in the nucleus (Lau et al. 1991; Dance et al. 2000), yet the distribution of ACF and APOBEC-1 in the cell is dynamic, with their nuclear abundance increasing whenever editing was stimulated (Sowden et al. 2002, 2004). Interactions between the N-terminal 80 amino acids of APOBEC-1 and importin-α (Chester et al. 2003), and of ACF with transportin-2 (Blanc et al. 2003) (both key nuclear pore-associated transport proteins), as well as the requirement of energy and RNA synthesis for editing factor shuttling, argued for a dynamic and regulated process. This highlights the importance of understanding the determinants governing the subcellular distribution of editing factors.

6.1 APOBEC-1

As endogenous APOBEC-1 cannot be detected, localization studies have relied on ectopically expressed and epitope tagged APOBEC-1 cDNAs (Dance et al. 2001; Blanc et al. 2003). These as well as heterokaryon studies (Blanc et al. 2003; Chester et al. 2003) have indicated that APOBEC-1 traffics between the cytoplasm and nucleus. Editing factor subcellular trafficking in the regulation of nuclear apoB mRNA editing activity has broad implications as it may be a general characteristic of APOBEC-1 Related Proteins. For example, AID has been shown to mediate nuclear DNA mutation during CSR only if it can traffic to and from the cytoplasm (Brar et al. 2004; Ito et al. 2004). The factors that contribute to the nuclear distribution of APOBEC-1 have not been completely characterized. A stretch of basic amino acids in the N-terminus of


APOBEC-1 is similar to the bipartite Nuclear Localization Signal (NLS) of SV40 T antigen (Yang and Smith 1997; Dance et al. 2001; Chester et al. 2003) and point mutations of the key basic residues therein (residues R15-17, R30 and R33) abolished nuclear localization. These mutations also inhibited in vitro editing activity (Teng et al. 1999), suggesting that protein-protein and/or protein-RNA interactions at the N-terminus are fundamental to several levels of APOBEC-1 function. The N-terminal domain, however, was incapable of functioning alone as an NLS in chimeric protein contexts (the defining characteristic of an NLS). Deletion of a central M domain (residues 97-172) inhibited nuclear localization, and while not alone sufficient as an NLS, the M domain complemented the N-terminal domain in nuclear import of APOBEC-1 (Yang and Smith 1997; Yang et al. 2001). The function of these two domains as an NLS was restricted to APOBEC-1 as they could not direct nuclear import of heterologous proteins with a mass in excess of 60 kDa (Yang and Smith 1997). The catalytic domain (Yang and Smith 1997) and residues involved in RNA binding (Chester et al. 2003) are not required for nuclear localization (Fig. 2). The cryptic nature of these localization determinants suggests that the subcellular distribution of APOBEC-1 might not be determined solely by APOBEC-1 itself, but rather through its interaction with ACF or other editosomal proteins. Whether APOBEC-1 or ACF can independently direct their own nuclear localization is controversial. While in some analyses APOBEC-1 exhibited autonomous nuclear import capacity and an ability to mediate the nuclear import of ACF (Chester et al. 2003), ACF has also been reported to have autonomous nuclear import capacity that may serve to chaperone APOBEC-1 to the nucleus (Dance et al. 2001; Blanc et al. 2003).
What is apparent at this point in the controversy is that coexpression of APOBEC-1 and ACF led to an increased nuclear localization of both proteins compared to that observed when either protein was expressed alone (Blanc et al. 2003; Chester et al. 2003).

6.2 ACF

Immunofluorescence microscopy of endogenous ACF64 in McArdle cells revealed a nuclear and cytoplasmic distribution, and electron micrographs of rat liver demonstrated ACF at the borders of nuclear heterochromatin and on the surface of the ER in the cytoplasm (Sowden et al. 2002). Biochemical fractionation suggested that cytoplasmic ACF65/64 (and APOBEC-1) were organized as 60S complexes whereas nuclear ACF65/64 were organized as active 27S editosomes. In contrast, ACF45/43 were predominantly nuclear, enriched in 27S editosomes compared to cytoplasmic 60S complexes, and did not shuttle between the nucleus and the cytoplasm (Sowden et al. 2002; Harris et al. 1993). A novel NLS motif (termed ANS) is located in ACF64, adjacent to RRM3 (Blanc et al. 2003) (Fig. 2).


6.3 Regulation of apoB mRNA stability

Nuclear/cytoplasmic trafficking of editing factors also regulates apoB mRNA stability. Premature translational termination codons (PTCs) introduced through genomic mutations, alternative splicing and/or RNA editing may render an mRNA susceptible to nonsense mediated decay (NMD) if the PTC resides more than 50-55 nucleotides upstream of the terminal exon/exon junction (Zhang et al. 1998a, 1998b). Exon junction complexes (EJCs) mark splice junctions and facilitate mRNA export (Le Hir et al. 2000a, 2000b). RNAs containing PTCs were targeted for 5’ and 3’ exonucleolytic degradation during the ‘pioneer round’ of translation through interactions between EJCs and Upf factors (Mendell et al. 2000; Ishigaki et al. 2001; Lejeune et al. 2002, 2003; Arraiano and Maquat 2003). Editing of apoB mRNA creates a PTC at C6666, approximately 7 kb from the terminal exon/exon junction, yet the apoB48 protein is abundant, suggesting that edited apoB mRNA is not susceptible to NMD. To evaluate this resistance to NMD, splicing competent and translatable chimeric mRNAs containing apoB RNA were expressed in HeLa cells (Chester et al. 2003). APOBEC-1 dependent editing created a premature STOP codon and the chimeric RNAs underwent NMD. Coexpression of ACF64 imparted resistance to NMD on edited RNAs, whereas an RNA-binding defective mutant of ACF64 did not (Chester et al. 2003). This suggested that auxiliary proteins, and their appropriate subcellular localization, enhance the abundance of edited apoB mRNA by complementing APOBEC-1 as well as by stabilizing edited apoB mRNA. Editing of NF-1 mRNA also creates a PTC (Skuse et al. 1996; Cappione et al. 1997; Mukhopadhyay et al. 2002). Western blotting with NF-1 amino-terminal specific antibodies failed to detect a truncated protein product in whole tumor extracts, so whether edited NF1 mRNA is stabilized by ACF or a similar factor remains to be determined (H. Smith, unpublished findings).
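The distance rule described in section 6.3 lends itself to a small worked check. The following Python sketch is illustrative only: the function names are hypothetical, the 55-nucleotide default is simply the upper bound of the 50-55 nucleotide range quoted above, and the CAA codon stands in for the edited codon on the assumption that editing converts CAA to UAA. The rule is treated as a plain distance cutoff and deliberately ignores the protective effect of ACF64.

```python
# Illustrative sketch only; names and the 55-nt default are assumptions,
# not part of the chapter.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def deaminate_c_to_u(rna: str, index: int) -> str:
    """Return a copy of `rna` with the cytidine at 0-based `index` replaced by uridine."""
    if rna[index] != "C":
        raise ValueError(f"expected C at index {index}, found {rna[index]!r}")
    return rna[:index] + "U" + rna[index + 1:]


def creates_stop_codon(codon: str, index: int) -> bool:
    """True if C to U editing at `index` turns a sense codon into a stop codon."""
    return codon not in STOP_CODONS and deaminate_c_to_u(codon, index) in STOP_CODONS


def predicted_nmd_target(ptc_distance_nt: int, threshold_nt: int = 55) -> bool:
    """Distance rule: a PTC more than `threshold_nt` nucleotides upstream of the
    terminal exon/exon junction is predicted to trigger NMD."""
    return ptc_distance_nt > threshold_nt


if __name__ == "__main__":
    print(deaminate_c_to_u("CAA", 0))    # edited codon
    print(creates_stop_codon("CAA", 0))  # does editing introduce a PTC?
    print(predicted_nmd_target(7000))    # PTC roughly 7 kb upstream of the junction
```

By the distance rule alone, a PTC roughly 7 kb upstream of the terminal junction would be flagged as an NMD target, which is the apparent paradox that the ACF64 coexpression experiments address.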

7 Regulation of apoB mRNA editing

ApoB mRNA editing is regulated in a species and tissue specific manner as well as developmentally (Demmer et al. 1986; Chen et al. 1990; Jiao et al. 1990; Huguchi et al. 1992; Patterson et al. 1992; Driscoll et al. 1993; Funahashi et al. 1995). Editing is also modulated by hormonal and metabolic perturbations (Davidson et al. 1988; Inui et al. 1994; Lau et al. 1995; Phung et al. 1996; von Wronski et al. 1998; Sowden et al. 2002, 2004; Mukhopadhyay et al. 2003). APOBEC-1 protein abundance is below the detection limits of currently available antibodies and cannot be assessed directly. However, RT-PCR revealed that apobec-1 mRNA abundance correlated with editing levels in the fasted/refed rat model, wherein fasting inhibited editing and refeeding a low fat, high carbohydrate diet stimulated editing. Conversely, in hypothyroid rats and ethanol fed rats APOBEC-1 abundance was unaltered despite increased editing (Inui et al. 1994; Lau et al. 1995). These findings suggested that


Fig. 4. ACF isoform expression and properties suggest a model for the metabolic regulation of apoB mRNA editing. The relative abundance of editing factors in arbitrary units estimated from scanning densitometry of western blots (Sowden et al. 2004), and for APOBEC-1, alterations in mRNA expression levels (Funahashi et al. 1995) are listed as inset table A. The relative strengths of APOBEC-1 interaction, apoB UV crosslinking activities and complementation activities of each ACF isoform (Blanc et al. 2001a; Mehta and Driscoll 2002; Sowden et al. 2004) are listed as inset table B. ACF64 and ACF65 have equal complementation activity when expressed at equivalent levels (Dance et al. 2002) so for simplicity, ACF64/65 are shown as ACF64. Cartoons of hepatocytes in three different metabolic states are shown. Arrows between hepatocytes signify the reversibility of each metabolic state. Within each cell, nuclear abundance of ACF64 is determined by nucleocytoplasmic trafficking that is indicated by arrows crossing the nucleus. ACF45/43 are proposed not to traffic. The length and thickness of each arrow signifies the net distribution of each editing factor. ACF isoform interactions with APOBEC-1 (green ovals) that may lead to nuclear co-import from the cytoplasm are indicated by arrows. Arrows are used to indicate ACF43/45 interactions with ACF64-APOBEC-1 complexes and apoB mRNA in net editosome disassembly (fasted), assembly (refed), or editosome recycling (basal).

apobec-1 mRNA expression was regulated by insulin but that auxiliary protein activity may also be modulated to effect changes in editing activity. Regulation of auxiliary protein expression was indicated for the developmental induction of apoB mRNA editing in the small intestines (Funahashi et al. 1995), for the reduction of editing by an agonist of the peroxisome proliferator-activated receptor α (PPARα) (Fu et al. 2004), and also in the fasted/refed rat liver metabolic model in which alterations in alternative acf mRNA splicing modulate the abundance of ACF isoforms in the cell nucleus (Sowden et al. 2002, 2004). In addition to de novo synthesis, editing factor activities may be regulated at the post-transcriptional level. Increased nuclear import of ACF was observed in both ethanol and insulin stimulated rat liver (Sowden et al. 2002, 2004). Interestingly, enhanced editing activity could be induced by ethanol in hepatocytes without de novo protein or mRNA synthesis (Giangreco et al. 2001). Consistent with the prospect that the interactions and/or activities of pre-existing editing factors could be regulated to modulate editing activity are the possibility that APOBEC-1 is phosphorylated (Chen et al. 2001) and recent findings that ACF65/64 is phosphorylated (D. Lehmann and H. Smith, manuscript in preparation).


In addition, coexpression of ACF isoforms with different functional properties and their altered subcellular localization upon modulation of editing activity suggest that ACF isoforms are interactive components of a regulatory network controlling the amount of edited apoB mRNA. We propose a model (Fig. 4) that depicts three metabolic states in rat hepatocytes: (i) normal (basal; ~0.1 nM insulin and ~65% editing), (ii) fasted (<0.1 nM insulin and ~30% editing), and (iii) refed on a high sucrose diet (10 nM insulin, ~80% editing). Going from the basal to the fasted state reduced the abundance of ACF64 in liver nuclei (Sowden et al. 2004). ACF43/45 RNA binding activity increases under these conditions and competes with ACF for APOBEC-1 and apoB mRNA binding. This results in reduced apoB mRNA editing activity because ACF45 only weakly complements editing and ACF43-APOBEC-1 complexes weakly bind to apoB mRNA in the presence of ACF45/64. An increase in nuclear abundance of ACF64 (upon refeeding or administration of insulin) would shift the binding equilibrium to ACF64-APOBEC-1 complexes and hence promote editing relative to the basal state. Insulin also stimulates APOBEC-1 expression (Phung et al. 1996; von Wronski et al. 1998) enabling ACF43/45 to compete with ACF64 for APOBEC-1 binding and, thereby, enhancing the overall editing effect. Implicit in this model is the prediction that coexpression of ACF isoforms may be essential for recycling editing factors to nascent apoB mRNAs following each catalytic event (editosome turnover). Given the high affinity of ACF64 for APOBEC-1 (which was equivalent to the SV40 T antigen interaction with p53 in yeast two hybrid analysis; Sowden et al. 2004) and its tight binding to apoB mRNA (Kd ~8 nM; Mehta and Driscoll 2002), editosomes may need facilitated disassembly to carry out multiple rounds of editing.
Competition by ACF45 and ACF43 for apoB mRNA and APOBEC-1 (respectively) may provide a means by which ACF64/65-containing editosomes disassemble.
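The competition proposed in this model can be illustrated with a simple equilibrium calculation. The Python sketch below assumes single-site competitive binding in which apoB mRNA is limiting, so that free protein concentrations approximate total concentrations. The function name is hypothetical, the ~8 nM Kd for ACF64 is the value cited above, and the competitor concentration and Kd are placeholder values chosen only to show the direction of the effect; they are not measurements.

```python
# Illustrative sketch only: single-site competitive binding at equilibrium.
# The competitor values used below are placeholders, not data from the chapter.


def fraction_bound(protein_nM: float, kd_nM: float,
                   competitor_nM: float = 0.0,
                   competitor_kd_nM: float = 1.0) -> float:
    """Fraction of RNA sites occupied by the primary protein.

    Assumes one binding site per RNA, mutually exclusive binding of the
    primary protein and the competitor, and free protein ~ total protein.
    """
    if kd_nM <= 0 or competitor_kd_nM <= 0:
        raise ValueError("dissociation constants must be positive")
    primary = protein_nM / kd_nM
    competitor = competitor_nM / competitor_kd_nM
    return primary / (1.0 + primary + competitor)


if __name__ == "__main__":
    acf64_kd_nM = 8.0  # Kd cited in the text for ACF64 binding to apoB mRNA
    alone = fraction_bound(8.0, acf64_kd_nM)
    # Hypothetical competitor (e.g. an ACF45-like isoform) at a placeholder Kd of 4 nM
    competed = fraction_bound(8.0, acf64_kd_nM, competitor_nM=4.0, competitor_kd_nM=4.0)
    print(f"ACF64 occupancy alone: {alone:.2f}")
    print(f"ACF64 occupancy with competitor: {competed:.2f}")
```

With ACF64 at its Kd, half of the RNA sites are occupied; adding a competitor at its own placeholder Kd lowers ACF64 occupancy to one third. Under these assumptions, a modest rise in a higher-affinity isoform would be enough to displace ACF64 from apoB mRNA, which is the qualitative behavior the model invokes for editosome disassembly.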

8 Prospective for APOBEC-1 and APOBEC-1 related proteins

The traditional significance of C to U mRNA editing has been in cardiovascular disease (see section 2). Significantly, lipoprotein analyses revealed that apoB mRNA editing in mammalian liver decreased the VLDL+LDL to HDL ratio (Greeve et al. 1993), which in humans is associated with reduced atherogenesis. Elevated LDL (hypercholesterolemia) was the primary cause of atherosclerosis in 25% of the population (Corsetti et al. 2003). Human liver expresses all the factors necessary for apoB mRNA editing except the editing enzyme APOBEC-1 (Navaratnam et al. 1993a) and consequently does not produce B48. It is likely that hypercholesterolemic patients would experience reduced LDL through the induction of hepatic apoB mRNA editing. Unfortunately, the potential for high levels of ectopic enzyme expression to induce hepatic neoplasia has diminished enthusiasm for this therapeutic approach. More recently, use of a protein transduction methodology in primary hepatocytes has suggested the possibility of delivering transient and controlled doses of APOBEC-1 to the liver of patients at risk for hypercholesterolemia induced atherosclerosis (Yang et al. 2002).

An emerging area of biomedical significance stems from the discovery of eleven human paralogs and three mouse orthologs of APOBEC-1 (Jarmuz et al. 2002; Wedekind et al. 2003; Sawyer et al. 2004) (Table 3). The function of many of these APOBEC-1 Related Proteins is unknown except for AID, which is essential for the generation of immunoglobulin diversity by mediating CSR and SHM (Kinoshita et al. 1999; Muramatsu et al. 2000; Bross et al. 2002) and APOBEC3G (CEM15), -3B and -3F, which act to suppress HIV-1 and HIV-2 infectivity (Sheehy et al. 2002; Zheng et al. 2004; Wiegand et al. 2004; Bishop et al. 2004; Liddament et al. 2004) and possibly AID suppression of hepatitis B infectivity (Turelli et al. 2004; Wieland et al. 2004). The zinc dependent deaminase (ZDD) motif of APOBEC-1 figured prominently in the discovery of the related proteins and in the characterization of their mechanism and substrates (Harris et al. 2002; Lecossier et al. 2003; Wedekind et al. 2003; Zhang et al. 2003; Beale et al. 2004; Xie et al. 2004). APOBEC-1, AID, and APOBEC-3G have deoxycytidine deaminase activity on reporter DNAs (Harris et al. 2002; Petersen-Mahrt 2002; Petersen-Mahrt and Neuberger 2003) indicating their likely biological role (Faili et al. 2002; Harris et al. 2002; Storb and Stavnezer 2002; Reynaud et al. 2003). Comparative structural analyses predicted, however, that they also could bind to and, therefore, possibly edit RNA (Wedekind et al. 2003; Xie et al. 2004). In this regard, there is evidence that de novo protein synthesis was necessary for the initiation of CSR and SHM, suggesting that an essential protein may have only been expressed after AID-dependent editing of mRNA (Doi et al. 2003; Begum et al. 2004; Smith et al. 2004).
The mechanisms that determine the site-specific mutation of DNA or editing of RNA by AID and APOBEC-3G are currently being explored. Given the ability of APOBEC-1 and its related proteins to broadly mutate numerous sites within reporter DNAs, an important question to consider is what prevents this from happening within mammalian cells under normal physiological conditions. One potential targeting mechanism is employed by APOBEC-3G and -3F, which become encapsulated during virion assembly through their interaction with the viral Gag protein (Cen et al. 2004; Alce and Popik 2004) and host or viral RNA (Svarovkaia et al. 2004) and, thereby, achieve the close proximity to the viral RNA genome necessary to mutate nascent cDNA during reverse transcription in the early stages of HIV infection. Homo- and/or heterodimerization of APOBEC-3B, 3F and 3G has the potential to provide the host cell with an adaptive advantage against a broad range of RNA viruses. APOBEC-1, APOBEC-2, AID, and APOBEC-3A are not encapsulated and hence have limited antiviral activity. An auxiliary protein requirement for targeting AID mutational activity to immunoglobulin locus specific sites has been suggested (Barreto et al. 2003; Doi et al. 2003; Ta et al. 2003; Brar et al. 2004; Smith et al. 2004) and recently Replication Protein A was identified as a factor that interacted with AID at sites of SHM (Chaudhuri et al. 2004). Other factors may be part of a unique chromatin structure


assembled at the immunoglobulin locus (Woo et al. 2003). Further experimentation will be needed to determine whether the same or different auxiliary proteins are required for SHM and CSR. Future research will need to address the cellular functions of all of the APOBEC-1 Related Proteins including APOBEC-2, 3A, and 3C as well as identification of their cognate substrates. Unlike -3B, -3G, and -3F, the former proteins contain only a single catalytic domain (Wedekind et al. 2003; Mangeat et al. 2003; Shindo et al. 2003; Zhang et al. 2003), suggesting that they may have different functions. The ability of all these enzymes to catalyze cytidine or deoxycytidine deamination in vitro should be validated in vivo as it is likely that auxiliary proteins and intracellular trafficking control will determine access to substrates within cells. In this regard, an important area of future research will be their tissue-specific expression and the regulation of enzyme activity and cellular localization. Structural studies will be critical for understanding the molecular basis for substrate specificity, enzyme mechanism and regulation as well as aiding the design of drugs to counter their roles in human disease.

Acknowledgements

The authors acknowledge the many contributions of all the investigators in the field. We apologize to those whose specific work may not have been referenced due to space limitations. The authors are grateful to Celeste MacElrevey for technical discussions and to Jenny M. L. Smith for the preparation of Figure 2. The authors' efforts on this chapter and contributions to the field of RNA editing have been supported in part by grants from the National Institutes of Health, The Air Force Office of Scientific Research, The Alcoholic Beverage Medical Research Foundation, The Council for Tobacco Research, The American Heart Association, The Office of Naval Research, and The Howard Hughes Medical Research Institute.

References

Alce TM, Popik W (2004) APOBEC-3G is incorporated into virus-like particles by a direct interaction with HIV-1 Gag nucleocapsid protein. J Biol Chem 279:34083-34806
Anant S, Davidson NO (2000) An AU-rich sequence element (UUUN[A/U]U) downstream of the edited C in apolipoprotein B mRNA is a high-affinity binding site for Apobec-1: binding of Apobec-1 to this motif in the 3' untranslated region of c-myc increases mRNA stability. Mol Cell Biol 20:1982-1992
Anant S, Davidson NO (2002) Identification and regulation of protein components of the apolipoprotein B mRNA editing enzyme. A complex event. Trends Cardiovasc Med 12:311-317
Anant S, Henderson JO, Mukhopadhyay D, Navaratnam N, Kennedy S, Min J, Davidson NO (2001) Novel role for RNA-binding protein CUGBP2 in mammalian RNA editing. CUGBP2 modulates C to U editing of apolipoprotein B mRNA by interacting with

apobec-1 and ACF, the apobec-1 complementation factor. J Biol Chem 276:47338-47351
Anant S, Henderson JO, Mukhopadhyay D, Navaratnam N, Kennedy S, Min J, Davidson NO (2001a) Novel role for RNA-binding protein CUGBP2 in mammalian RNA editing. J Biol Chem 276:47338-47351
Anant S, MacGinnitie AJ, Davidson NO (1995) apobec-1, the catalytic subunit of the mammalian apolipoprotein B mRNA editing enzyme, is a novel RNA-binding protein. J Biol Chem 270:14762-14767
Anant S, Mukhopadhyay D, Hirano K, Brasitus TA, Davidson NO (2002) Apobec-1 transcription in rat colon cancer: decreased apobec-1 protein production through alterations in polysome distribution and mRNA translation associated with upstream AUGs. Biochim Biophys Acta 1575:54-62
Anant S, Mukhopadhyay D, Sankaranand V, Kennedy S, Henderson JO, Davidson NO (2001b) ARCD-1, an apobec-1-related cytidine deaminase, exerts a dominant negative effect on C to U mRNA editing. Am J Physiol Cell Physiol 281:1904-1916
Anant S, Yu H, Davidson NO (1998) Evolutionary origins of the mammalian apolipoprotein B RNA editing enzyme, apobec-1: structural homology inferred from analysis of a cloned chicken small intestinal cytidine deaminase. Biol Chem 379:1075-1081
Apostel F, Dammann R, Pfeifer GP, Greeve J (2002) Reduced expression and increased CpG dinucleotide methylation of the rat APOBEC-1 promoter in transgenic rabbits. Biochim Biophys Acta 1577:384-394
Arakawa H, Hauschild J, Buerstedde JM (2002) Requirement of the activation-induced deaminase (AID) gene for immunoglobulin gene conversion. Science 295:1301-1306
Arraiano CM, Maquat LE (2003) Post-transcriptional control of gene expression: effectors of mRNA decay. Mol Microbiol 49:267-276
Backus JW, Schock D, Smith HC (1994) Only cytidines 5' of the apolipoprotein B mRNA mooring sequence are edited.
Biochim Biophys Acta 1219:1-14
Backus JW, Smith HC (1991) Apolipoprotein B mRNA sequences 3' of the editing site are necessary and sufficient for editing and editosome assembly. Nucl Acids Res 19:6781-6786
Backus JW, Smith HC (1992) Three distinct RNA sequence elements are required for efficient apolipoprotein B (apoB) RNA editing in vitro. Nucl Acids Res 20:6007-6014
Barreto V, Reina-San-Martin B, Ramiro AR, McBride KM, Nussenzweig MC (2003) C-terminal deletion of AID uncouples class switch recombination from somatic hypermutation and gene conversion. Mol Cell 12:501-508
Beale RC, Petersen-Mahrt SK, Watt IN, Harris RS, Rada C, Neuberger MS (2004) Comparison of the differential context-dependence of DNA deamination by APOBEC enzymes: correlation with mutation spectra in vivo. J Mol Biol 337:585-596
Beghini A, Ripamonti CB, Peterlongo P, Roversi G, Cairoli R, Morra E, Larizza L (2000) RNA hyperediting and alternative splicing of hematopoietic cell phosphatase (PTPN6) gene in acute myeloid leukemia. Hum Mol Genet 9:2297-2304
Begum NA, Kinoshita K, Muramatsu M, Nagaoka H, Shinkura R, Honjo T (2004) De novo protein synthesis is required for activation-induced cytidine deaminase-dependent DNA cleavage in class switch recombination. Proc Natl Acad Sci USA 101:13003-13007
Betts L, Xiang S, Short SA, Wolfenden R, Carter CW Jr (1994) Cytidine deaminase. The 2.3 Å crystal structure of an enzyme: transition-state analog complex. J Mol Biol 235:635-656

Bishop KN, Holmes RK, Sheehy AM, Malim MH (2004) APOBEC-mediated editing of viral RNA. Science 305:645
Bishop KN, Holmes RK, Sheehy AM, Davidson NO, Cho S-J, Malim MH (2004a) Cytidine deamination of retroviral DNA by diverse APOBEC proteins. Curr Biol 14:1392-1396
Blanc V, Henderson JO, Kennedy S, Davidson NO (2001a) Mutagenesis of apobec-1 complementation factor reveals distinct domains that modulate RNA binding, protein-protein interaction with apobec-1, and complementation of C to U RNA-editing activity. J Biol Chem 276:46386-46393
Blanc V, Kennedy SM, Davidson NO (2003) A novel nuclear localization signal in the auxiliary domain of apobec-1 complementation factor (ACF) regulates nucleo-cytoplasmic import and shuttling. J Biol Chem 278:41198-41204
Blanc V, Navaratnam N, Henderson JO, Anant S, Kennedy S, Jarmuz A, Scott J, Davidson NO (2001b) Identification of GRY-RBP as an apolipoprotein B RNA-binding protein that interacts with both apobec-1 and apobec-1 complementation factor to modulate C to U editing. J Biol Chem 276:10272-10283
Brar SS, Watson M, Diaz M (2004) Activation-induced cytosine deaminase, AID, is actively exported out of the nucleus but retained by the induction of DNA breaks. J Biol Chem 279:26395-26401
Bross L, Muramatsu M, Kinoshita K, Honjo T, Jacobs H (2002) DNA double-strand breaks: prior to but not sufficient in targeting hypermutation. J Exp Med 195:1187-1192
Cappione AJ, French BL, Skuse GR (1997) A potential role for NF1 mRNA editing in the pathogenesis of NF1 tumors. Am J Hum Genet 60:305-312
Carlow DC, Carter CW Jr, Mejlhede N, Neuhard J, Wolfenden R (1999) Cytidine deaminases from B. subtilis and E. coli: compensating effects of changing zinc coordination and quaternary structure. Biochemistry 38:12258-12265
Carter C (1998) In: Grosjean H, Benne R (ed) Modification and Editing of RNA.
ASM Press, Washington DC
Cen S, Guo F, Niu M, Saadatmand J, Deflassieux J, Kleiman L (2004) The interaction between HIV-1 Gag and APOBEC-3G. J Biol Chem 279:33177-33184
Chan L (1992) Apolipoprotein B, the major protein component of triglyceride-rich and low density lipoproteins. J Biol Chem 267:25621-25624
Chaudhuri J, Khuong C, Alt FW (2004) Replication protein A interacts with AID to promote deamination of somatic hypermutation targets. Nature 430:992-998
Chaudhuri J, Tian M, Khuong C, Chua K, Pinaud E, Alt FW (2003) Transcription-targeted DNA deamination by the AID antibody diversification enzyme. Nature 422:726-730
Chen SH, Habib G, Yang CY, Gu ZW, Lee BR, Weng SA, Silberman SR, Cai SJ, Deslypere JP, Rosseneu M, et al. (1987) Apolipoprotein B-48 is the product of a messenger RNA with an organ-specific in-frame stop codon. Science 238:363-366
Chen SH, Li XX, Liao WS, Wu JH, Chan L (1990) RNA editing of apolipoprotein B mRNA. Sequence specificity determined by in vitro coupled transcription editing. J Biol Chem 265:6811-6816
Chen Z, Eggerman TL, Patterson AP (2001) Phosphorylation is a regulatory mechanism in apolipoprotein B mRNA editing. Biochem J 357:661-672
Chester A, Scott J, Anant S, Navaratnam N (2000) RNA editing: cytidine to uridine conversion in apolipoprotein B mRNA. Biochim Biophys Acta 1494:1-13

Chester A, Somasekaram A, Tzimina M, Jarmuz A, Gisbourne J, O'Keefe R, Scott J, Navaratnam N (2003) The apolipoprotein B mRNA editing complex performs a multifunctional cycle and suppresses nonsense-mediated decay. EMBO J 22:3971-3982
Chester A, Weinreb V, Carter CW, Navaratnam N (2004) Optimization of apolipoprotein B mRNA editing by APOBEC-1 apoenzyme and the role of its auxiliary factor, ACF. RNA 10:1399-1411
Chothia C, Lesk AM (1986) The relation between the divergence of sequence and structure in proteins. EMBO J 5:823-826
Corsetti JP, Zareba W, Moss AJ, Ridker PM, Marder VJ, Rainwater DL, Sparks CE (2003) Metabolic Syndrome best defines the multivariate distribution of blood variables in postinfarction patients. Atherosclerosis 171:351-358
Dance GS, Beemiller P, Yang Y, Mater DV, Mian IS, Smith HC (2001) Identification of the yeast cytidine deaminase CDD1 as an orphan C-->U RNA editase. Nucl Acids Res 29:1772-1780
Dance GS, Sowden MP, Yang Y, Smith HC (2000) APOBEC-1 dependent cytidine to uridine editing of apolipoprotein B RNA in yeast. Nucl Acids Res 28:424-429
Dance GSC, Sowden MP, Cartegni L, Cooper E, Krainer AR, Smith HC (2002) Two proteins essential for apolipoprotein B mRNA editing are expressed from a single gene through alternative splicing. J Biol Chem 277:12703-12709
Davidson NO, Powell LM, Wallis SC, Scott J (1988) Thyroid hormone modulates the introduction of a stop codon in rat liver apolipoprotein B messenger RNA. J Biol Chem 263:13482-13485
Davies MS, Wallis SC, Driscoll DM, Wynne JK, Williams GW, Powell LM, Scott J (1989) Sequence requirements for apolipoprotein B RNA editing in transfected rat hepatoma cells. J Biol Chem 264:13395-13398
Demmer LA, Levin MS, Elovson J, Reuben MA, Lusis AJ, Gordon JI (1986) Tissue-specific expression and developmental regulation of the rat apolipoprotein B gene.
Proc Natl Acad Sci USA 83:8102-8106 Doi T, Kinoshita K, Ikegawa M, Muramatsu M, Honjo T (2003) De novo protein synthesis is required for the activation-induced cytidine deaminase function in class-switch recombination. Proc Natl Acad Sci USA 100:2634-2638 Driscoll DM, Lakhe-Reddy S, Oleksa LM, Martinez D (1993) Induction of RNA editing at heterologous sites by sequences in apolipoprotein B mRNA. Mol Cell Biol 13:72887294 Driscoll DM, Zhang Q (1994) Expression and characterization of p27, the catalytic subunit of the apolipoprotein B mRNA editing enzyme. J Biol Chem 269:19843-19847 Eto T, Kinoshita K, Yoshikawa K, Muramatsu M, Honjo T (2003) RNA-editing cytidine deaminase Apobec-1 is unable to induce somatic hypermutation in mammalian cells. Proc Natl Acad Sci USA 100:12895-12898 Faili A, Aoufouchi S, Gueranger Q, Zober C, Leon A, Bertocci B, Weill JC, Reynaud CA (2002) AID-dependent somatic hypermutation occurs as a DNA single-strand event in the BL2 cell line. Nat Immunol 3:815-821 Fu T, Mukhopadhyay D, Davidson NO, Borensztajn J (2004) The Peroxisome Proliferatoractivated Receptor {alpha} (PPAR{alpha}) agonist ciprofibrate inhibits apolipoprotein B mRNA editing in low density lipoprotein receptor-deficient mice: effects on plasma lipoproteins and the development of atherosclerotic lesions. J Biol Chem 279:2866228669

392 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden Funahashi T, Giannoni F, DePaoli AM, Skarosi SF, Davidson NO, (1995) Tissue-specific, developmental and nutritional regulation of the gene encoding the catalytic subunit of the rat apoB mRNA editing enzyme: functional role in the modulation of apoB mRNA editing. J Lipid Res 36:414-428 Giangreco A, Sowden MP, Mikityansky I, Smith HC (2001) Ethanol stimulates apolipoprotein B mRNA editing in the absence of de novo RNA or protein synthesis. Biochem Biophys Res Commun 289:1162-1167 Giannoni F, Bonen DK, Funahashi T, Hadjiagapiou C, Burant CF, Davidson NO (1994) Complementation of apolipoprotein B mRNA editing by human liver accompanied by secretion of apolipoprotein B48. J Biol Chem 269:5932-5936 Gott JM, Emeson RB (2000) Functions and mechanisms of RNA editing. Annu Rev Genet 34:499-531 Greeve J, Altkemper I, Dieterich JH, Greten H, Windler E (1993) Apolipoprotein B mRNA editing in 12 different mammalian species: hepatic expression is reflected in low concentrations of apoB- containing plasma lipoproteins. J Lipid Res 34:1367-1383 Greeve J, Axelos D, Welker S, Schipper M, Greten H (1998a) Distinct promoters induce APOBEC-1 expression in rat liver and intestine. Arterioscler Thromb Vasc Biol 18:1079-1092 Greeve J, Lellek H, Apostel F, Hundoegger K, Barialai A, Kirsten R, Welker S, Greten H (1999) Absence of APOBEC-1 mediated mRNA editing in human carcinomas. Oncogene 18:6357-6366 Greeve J, Lellek H, Rautenberg P, Greten H (1998b) Inhibition of the apolipoprotein B mRNA editing enzyme-complex by hnRNP C1 protein and 40S hnRNP complexes. Biol Chem 379:1063-1073 Greeve J, Navaratnam N, Scott J (1991) Characterization of the apolipoprotein B mRNA editing enzyme: no similarity to the proposed mechanism of RNA editing in kinetoplastid protozoa. 
Nucl Acids Res 19:3569-3576 Harris RS, Bishop KN, Sheehy AM, Craig HM, Petersen-Mahrt SK, Watt IN, Neuberger MS, Malim MH (2003) DNA deamination mediates innate immunity to retroviral infection. Cell 113:803-809 Harris RS, Petersen-Mahrt SK, Neuberger MS (2002) RNA editing enzyme APOBEC1 and some of its homologs can act as DNA mutators. Mol Cell 10:1247-1253 Harris SG, Sabio I, Mayer E, Steinberg MF, Backus JW, Sparks JD, Sparks CE, Smith HC (1993) Extract-specific heterogeneity in high-order complexes containing apolipoprotein B mRNA editing activity and RNA-binding proteins. J Biol Chem 268:7382-7392 Harris SG, Smith HC (1992) In vitro apolipoprotein B mRNA editing activity can be modulated by fasting and refeeding rats with a high carbohydrate diet. Biochem Biophys Res Commun 183:899-903 Hartner JC, Schmittwolf C, Kispert A, Muller AM, Higuchi M, Seeburg PH (2004) Liver disintegration in the mouse embryo caused by deficiency in the RNA-editing enzyme ADAR1. J Biol Chem 279:4894-4902 Henderson JO, Blanc V, Davidson NO (2001) Isolation, characterization, and developmental regulation of the human apobec-1 complementation factot (ACF) gene. Biochim Biophys Acta 1522:22-30 Hersberger M, Innerarity TL (1998) Two efficiency elements flanking the editing site of cytidine 6666 in the apolipoprotein B mRNA support mooring-dependent editing. J Biol Chem 273:9435-9442

Mammalian C to U editing 393 Hersberger M, Patarroyo-White S, Arnold KS, Innerarity TL (1999) Phylogenetic analysis of the apolipoprotein B mRNA-editing region. Evidence for a secondary structure between the mooring sequence and the 3' efficiency element. J Biol Chem 274:3459034597 Hirano K, Min J, Funahashi T, Baunoch DA, Davidson NO (1997) Characterization of the human apobec-1 gene: expression in gastrointestinal tissues determined by alternative splicing with production of a novel truncated peptide. J Lipid Res 38:847-859 Hirano KI, Young SG, Farese RV, Ng J, Sande E, Warburton C, Powell-Braxton LM, Davidson NO (1996) Targeted disruption of the mouse apobec-1 gene abolishes apolipoprotein B mRNA editing and eliminates apolipoprotein B48. J Biol Chem 271:98879890 Huguchi K, Kitagawa K, Kogishi K, Takeda T (1992) Developmental and age-related changes in apoB mRNA editing in mice. J Lipid Res 33:1753-1764 Inui Y, Giannoni F, Funahashi T, Davidson NO (1994) REPR and complementation factor(s) interact to modulate rat apolipoprotein B mRNA editing in response to alterations in cellular cholesterol flux. J Lipid Res 35:1477-1489 Ireton GC, Black ME, Stoddard BL (2003) The 1.14 Å crystal structure of yeast cytosine deaminase: evolution of nucleotide salvage enzymes and implications for genetic chemotherapy. Structure (Camb) 11:961-972 Ishigaki Y, Li X, Serin G, Maquat LE (2001) Evidence for a pioneer round of mRNA translation: mRNAs subject to nonsense-mediated decay in mammalian cells are bound by CBP80 and CBP20. Cell 106:607-617 Ito S, Nagaoka H, Shinkura R, Begum N, Muramatsu M, Nakata M, Honjo T (2004) Activation-induced cytidine deaminase shuttles between nucleus and cytoplasm like apolipoprotein B mRNA editing catalytic polypeptide 1. Proc Natl Acad Sci USA 101:1975-1980 Jarmuz A, Chester A, Bayliss J, Gisbourne J, Dunham I, Scott J, Navaratnam N (2002) An anthropoid-specific locus of orphan C to U RNA-editing enzymes on chromosome 22. 
Genomics 79:285-296 Jiao S, Moberly JB, Schonfeld G (1990) Editing of apolipoprotein B messenger RNA in differentiated Caco-2 cells. J Lipid Res 31:695-700 Johansson E, Mejlhede N, Neuhard J, Larsen S (2002) Crystal structure of the tetrameric cytidine deaminase from Bacillus subtilis at 2.0 Å resolution. Biochemistry 41:25632570 Keegan LP, Gallo A, O'Connell MA (2001) The many roles of an RNA editor. Nat Rev Genet 2:869-878 Keegan LP, Leroy A, Sproul D, O'Connell MA (2004) Adenosine deaminases acting on RNA (ADARs): RNA-editing enzymes. Genome Biol 5:209 Kinoshita K, Lee CG, Tashiro J, Muramatsu M, Chen XC, Yoshikawa K, Honjo T (1999) Molecular mechanism of immunoglobulin class switch recombination. Cold Spring Harb Symp Quant Biol 64:217-226 Ko TP, Lin JJ, Hu CY, Hsu YH, Wang AH, Liaw SH (2003) Crystal structure of yeast cytosine deaminase. Insights into enzyme mechanism and evolution. J Biol Chem 278:19111-19117 Kozarsky KF, Bone DK, Giannoni F, Funahashi T, Wilson JM, Davidson NO (1996) Hepatic expression of the catalytic subunit of the apolipoprotein B mRNA editing enzyme ameliorates hypercholesterolemia in LDL receptor-deficient rabbits. Hum Gene Therapy 7:943-957

394 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden Kurtz JE, Exinger F, Erbs P, Jund R (1999) New insights into the pyrimidine salvage pathway of Saccharomyces cerevisiae: requirement of six genes for cytidine metabolism. Curr Genet 36:130-136 Lau PP, Cahill DJ, Zhu HJ, Chan L (1995) Ethanol modulates apolipoprotein B mRNA editing in the rat. J Lipid Res 36:2069-2078 Lau PP, Chan L (2003) Involvement of a chaperone regulator, Bcl2-associated Athanogene4 (BAG-4), in apolipoprotein B mRNA editing. J Biol Chem 278:52988-52996 Lau PP, Chang BH, Chan L (2001) Two-hybrid cloning identifies an RNA-binding protein, GRY-RBP, as a component of apobec-1 editosome. Biochem Biophys Res Commun 282:977-983 Lau PP, Chen SH, Wang JC, Chan L (1990) A 40 kilodalton rat liver nuclear protein binds specifically to apolipoprotein B mRNA around the RNA editing site. Nucl Acids Res 18:5817-5821 Lau PP, Zhu H-J, Baldini HA, Charnsangavej C, Chan L (1994) Dimeric structure of a human apo B mRNA editing protein and cloning and chromosomal localization of its gene. Proc Natl Acad Sci USA 91:8522-8526 Lau PP, Villanueva H, Kobayashi K, Nakamuta M, Chang HJ, Chan L (2001a) A DnaJ protein, Apobec-1-binding protein-2, modulates apolipoprotein B mRNA editing. J Biol Chem 276:46445-46452 Lau PP, Xiong W, Zhu H-J, Chen S-H, Chan L (1991) Apo B mRNA editing is an intranuclear event that occurs posttranscriptionally coincident with splicing and polyadenylation. J Biol Chem 266:20550-20554 Lau PP, Zhu HJ, Nakamuta M, Chan L (1997) Cloning of an Apobec-1-binding protein that also interacts with apolipoprotein B mRNA and evidence for its involvement in RNA editing. J Biol Chem 272:1452-1455 Le Hir H, Izaurralde E, Maquat LE, Moore MJ (2000a) The spliceosome deposits multiple proteins 20-24 nucleotides upstream of mRNA exon-exon junctions. 
EMBO J 19:6860-6869 Le Hir H, Moore MJ, Maquat LE (2000b) Pre-mRNA splicing alters mRNP composition: evidence for stable association of proteins at exon-exon junctions. Genes Dev 14:10981108 Lecossier D, Bouchonnet F, Clavel F, Hance AJ (2003) Hypermutation of HIV-1 DNA in the absence of the Vif protein. Science 300:1112 Lee RM, Hirano K, Anant S, Baunoch D, Davidson NO (1998) An alternatively spliced form of apobec-1 messenger RNA is overexpressed in human colon cancer. Gastroenterology 115:1096-1103 Lejeune F, Ishigaki Y, Li X, Maquat LE (2002) The exon junction complex is detected on CBP80-bound but not eIF4E-bound mRNA in mammalian cells: dynamics of mRNP remodeling. EMBO J 21:3536-3545 Lejeune F, Li X, Maquat LE (2003) Nonsense-mediated mRNA decay in mammalian cells involves decapping, deadenylating, and exonuclelolytic activities. Mol Cell 12:675687 Lellek H, Kirsten R, Diehl I, Apostel F, Buck F, Greeve J (2000) Purification and molecular cloning of a novel essential component of the apolipoprotein B mRNA editing enzyme-complex. J Biol Chem 275:19848-19856 Liddament MT, Brown WL, Schumacher AJ, Harris RS (2004) APOBEC-3F properties and hypermutation preferences indicate activity against HIV-1 in vivo. Curr Biol 14:13851391

Mammalian C to U editing 395 Maas S, Rich A (2000) Changing genetic information through RNA editing. Bioessays 22:790-802 MacGinnitie AJ, Anant S, Davidson NO (1995) Mutagenesis of APOBEC-1, the catalytic subunit of the mammalian apolipoprotein B mRNA editing enzyme, reveals distinct domains that mediate cytosine nucleoside deaminase, RNA-binding, and RNA editing activity. J Biol Chem 270:14768-14775 Machida K, Cheng KT, Sung VM, Shimodaira S, Lindsay KL, Levine AM, Lai MY, Lai MM (2004) Hepatitis C virus induces a mutator phenotype: enhanced mutations of immunoglobulin and protooncogenes. Proc Natl Acad Sci USA 101:4262-4267 Macnaughton TB, Li YI, Doughty AL, Lai MM (2003) Hepatitis delta virus RNA encoding the large delta antigen cannot sustain replication due to rapid accumulation of mutations associated with RNA editing. J Virol 77:12048-12056 Mangeat B, Turelli P, Caron G, Friedli M, Perrin L, Trono D (2003) Broad antiretroviral defence by human APOBEC3G through lethal editing of nascent reverse transcripts. Nature 424:99-103 Mehta A, Banerjee S, Driscoll DM (1996) Apobec-1 interacts with a 65-kDa complementing protein to edit apolipoprotein-B mRNA in vitro. J Biol Chem 271:28294-28299 Mehta A, Driscoll DM (1998) A sequence-specific RNA-binding protein complements apobec-1 To edit apolipoprotein B mRNA. Mol Cell Biol 18:4426-4432 Mehta A, Driscoll DM (2002) Identification of domains in APOBEC-1 complementation factor required for RNA binding and apolipoprotein B mRNA editing. RNA 8:69-82 Mehta A, Kinter MT, Sherman NE, Driscoll DM (2000) Molecular cloning of apobec-1 complementation factor, a novel RNA- binding protein involved in the editing of apolipoprotein B mRNA. Mol Cell Biol 20:1846-1854 Mendell JT, Medghalchi SM, Lake RG, Noensie EN, Dietz HC (2000) Novel Upf2p orthologues suggest a functional link between translation initiation and nonsense surveillance complexes. 
Mol Cell Biol 20:8944-8957 Mian IS, Moser MJ, Holley WR, Chatterjee A (1998) Statistical modelling and phylogenetic analysis of a deaminase domain. J Comput Biol 5:57-72 Miller S, Lesk AM, Janin J, Chothia C (1987) The accessible surface area and stability of oligomeric proteins. Nature 328:834-836 Mukhopadhyay D, Anant S, Lee RM, Kennedy S, Viskochil D, Davidson NO (2002) C->U editing of neurofibromatosis 1 mRNA occurs in tumors that express both the type II transcript and apobec-1, the catalytic subunit of the apolipoprotein B mRNA-editing enzyme. Am J Hum Genet 70:38-50 Mukhopadhyay D, Plateroti M, Anant S, Nassir F, Samarut J, Davidson NO (2003) Thyroid hormone regulates hepatic triglyceride mobilization and apolipoprotein B messenger ribonucleic Acid editing in a murine model of congenital hypothyroidism. Endocrinology 144:711-719 Muramatsu M, Kinoshita K, Fagarasan S, Yamada S, Shinkai Y, Honjo T (2000) Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102:553-563 Muramatsu M, Sankaranand VS, Anant S, Sugai M, Kinoshita K, Davidson NO, Honjo T (1999) Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells. J Biol Chem 274:18470-18476 Nakamuta M, Chang BHJ, Zsigmond E, Kobayashi K, Lei H, Ishida BY, Oka K, Li E, Chan L (1996) Complete phenotypic characterization of the apobec-1 knockout mice

396 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden with a wild-type genetic background and a human apolipoprotein B transgenic background, and restoration of apolipoprotein B mRNA editing by somatic gene transfer of Apobec-1. J Biol Chem 271:25981-25988 Nakamuta M, Oka K, Krushkal J, Kobayashi K, Yamamoto M, Li WH, Chan L (1995) Alternative mRNA splicing and differential promoter utilization determine tissue-specific expression of the apolipoprotein B mRNA-editing protein (Apobec1) gene in mice. Structure and evolution of Apobec1 and related nucleoside/nucleotide deaminases. J Biol Chem 270:13042-13056 Navaratnam N, Bhattacharya S, Fujino T, Patel D, Jarmuz AL, Scott J (1995) Evolutionary origins of apoB mRNA editing: catalysis by a cytidine deaminase that has acquired a novel RNA-binding motif at its active site. Cell 81:187-195 Navaratnam N, Patel D, Shah RR, Greeve JC, Powell LM, Knott TJ, Scott J (1991) An additional editing site is present in apolipoprotein B mRNA. Nucl Acids Res 19: 17411744 Navaratnam N, Fujino T, Bayliss J, Jarmuz A, How A, Richardson N, Somasekaram A, Bhattacharya S, Carter C, Scott J (1998) Escherichia coli cytidine deaminase provides a molecular model for ApoB RNA editing and a mechanism for RNA substrate recognition. J Mol Biol 275:695-714 Navaratnam N, Morrison JR, Bhattacharya S, Patel D, Funahashi T, Giannoni F, Teng BB, Davidson NO, Scott J (1993a) The p27 catalytic subunit of the apolipoprotein B mRNA editing enzyme is a cytidine deaminase. J Biol Chem 268:20709-20712 Navaratnam N, Shah R, Patel D, Fay V, Scott J (1993b) Apolipoprotein B mRNA editing is associated with UV crosslinking of proteins to the editing site. 
Proc Natl Acad Sci USA 90:222-226 Oka K, Kobayashi K, Sullivan M, Martinez J, Teng BB, Ishimura-Oka K, Chan L (1997) Tissue-specific inhibition of apolipoprotein B mRNA editing in the liver by adenovirus-mediated transfer of a dominant negative mutant APOBEC-1 leads to increased low density lipoprotein in mice. J Biol Chem 272:1456-1460 Okazaki IM, Hiai H, Kakazu N, Yamada S, Muramatsu M, Kinoshita K, Honjo T (2003) Constitutive expression of AID leads to tumorigenesis. J Exp Med 197:1173-1181 Oppezzo P, Vuillier F, Vasconcelos Y, Dumas G, Magnac C, Payelle-Brogard B, Pritsch O, Dighiero G (2003) Chronic lymphocytic leukemia B cells expressing AID display dissociation between class switch recombination and somatic hypermutation. Blood 101: 4029-4032 Palladino MJ, Keegan LP, O'Connell MA, Reenan RA (2000a) dADAR, a Drosophila double-stranded RNA-specific adenosine deaminase is highly developmentally regulated and is itself a target for RNA editing. RNA 6:1004-1018 Patterson AP, Tennyson GE, Hoeg JM, Sviridov DD, Brewer HB Jr (1992) Ontogenetic regulation of apolipoprotein B mRNA editing during human and rat development in vivo. Arterioscler Thromb 12:468-473 Petersen-Mahrt SK, Harris RS, Neuberger MS (2002) AID mutates E. coli suggesting a DNA deamination mechanism for antibody diversification. Nature 418:99-104 Petersen-Mahrt SK, Neuberger MS (2003) In vitro deamination of cytosine to uracil in single-stranded DNA by apolipoprotein B editing complex catalytic subunit 1 (APOBEC1). J Biol Chem 278:19583-19586 Phung TL, Sowden MP, Sparks JD, Sparks CE, Smith HC (1996) Regulation of hepatic apolipoprotein B RNA editing in the genetically obese Zucker rat. Metabolism 45:1056-1058

Mammalian C to U editing 397 Powell LM, Wallis SC, Pease RJ, Edwards YH, Knott TJ, Scott J (1987) A novel form of tissue-specific RNA processing produces apolipoprotein- B48 in intestine. Cell 50:831-840 Qian X, Balestra ME, Innerarity TL (1997) Two distinct TATA-less promoters direct tissue-specific expression of the rat apo-B editing catalytic polypeptide 1 gene. J Biol Chem 272:18060-18070 Qian X, Balestra ME, Yamanaka S, Boren J, Lee I, Innerarity TL (1998) Low expression of the apolipoprotein B mRNA editing transgene in mice reduces LDL but does not cause liver dysplasia or tumors. Arteriosc Thromb Vasc Biol 18:1013-1020 Reenan RA (2001) The RNA world meets behavior: A-->I pre-mRNA editing in animals. Trends Genet 17:53-56 Revy P, Muto T, Levy Y, Geissmann F, Plebani A, Sanal O, Catalan N, Forveille M, Dufourcq-Labelouse R, Gennery A, Tezcan I, Ersoy F, Kayserili H, Ugazio AG, Brousse N, Muramatsu M, Notarangelo LD, Kinoshita K, Honjo T, Fischer A, Durandy A (2000) Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102:565-575 Reynaud CA, Aoufouchi S, Faili A, Weill JC (2003) What role for AID: mutator, or assembler of the immunoglobulin mutasome? Nat Immunol 4:631-638 Richardson N, Navaratnam N, Scott J (1998) Secondary structure for the apolipoprotein B mRNA editing site. Au-binding proteins interact with a stem loop. J Biol Chem 273:31707-31717 Rueter SM, Dawson TR, Emeson RB (1999) Regulation of alternative splicing by RNA editing. Nature 399:75-80 Sawyer SL, Emerman M, Malik HS (2004) Ancient adaptive evolution of the primate antiviral DNA-editing enzyme APOBEC-3G. PLOS. Biology 2:e275 Schock D, Kuo SR, Steinburg MF, Bolognino M, Sparks JD, Sparks CE, Smith HC (1996) An auxiliary factor containing a 240-kDa protein complex is involved in apolipoprotein B RNA editing. 
Proc Natl Acad Sci USA 93:1097-1102 Sheehy AM, Gaddis NC, Choi JD, Malim MH (2002) Isolation of a human gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 418:646-650 Shindo K, Takaori-Kondo A, Kobayashi M, Abudu A, Fukunaga K, Uchiyama T (2003) The enzymatic activity of CEM15/APOBEC3G is essential for the regulation of the infectivity of HIv-1 virion, but not a sole determinant of its antiviral activity. J Biol Chem 278:44412-44416 Skuse GR, Cappione AJ, Sowden M, Metheny LJ, Smith HC (1996) The neurofibromatosis type I messenger RNA undergoes base-modification RNA editing. Nucl Acids Res 24:478-485 Smith HC (1993) Apolipoprotein B mRNA editing: the sequence to the event. Semin Cell Biol 4:267-278 Smith HC, Bottaro A, Sowden MP, Wedekind JE (2004) Activation induced deaminase: the importance of being specific. Trends Genet 20:224-227 Sohail A, Klapacz J, Samaranayake M, Ullah A, Bhagwat AS (2003) Human activationinduced cytidine deaminase causes transcription-dependent, strand-biased C to U deaminations. Nucl Acids Res 31:2990-2994 Sowden M, Hamm JK, Smith HC (1996a) Overexpression of APOBEC-1 results in mooring sequence-dependent promiscuous RNA editing. J Biol Chem 271:3011-3017 Sowden M, Hamm JK, Spinelli S, Smith HC (1996b) Determinants involved in regulating the proportion of edited apolipoprotein B RNAs. RNA 2:274-288

398 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden Sowden MP, Ballatori N, de Mesy Jensen KL, Hamilton Reed L, Smith HC (2002) The editosome for cytidine to uridine mRNA editing has a native complexity of 27S: identification of intracellular domains containing active and inactive editing factors. J Cell Science 115:1027-1039 Sowden MP, Eagleton MJ, Smith HC (1998) Apolipoprotein B RNA sequence 3' of the mooring sequence and cellular sources of auxiliary factors determine the location and extent of promiscuous editing. Nucl Acids Research 26:1644-1652 Sowden MP, Lehmann DM, Lin X, Smith CO, Smith HC (2004) Identification of novel alternative splice variants of APOBEC-1 complementation factor with different capacities to support ApoB mRNA editing. J Biol Chem 279:197-206 Sowden MP, Smith HC (2001) Commitment of apolipoprotein B RNA to the splicing pathway regulates cytidine-to-uridine editing-site utilization. Biochem J 359:697-705 Sparks CE, Hnatiuk O, Marsh JB (1981) Hepatic and intestinal contribution of two forms of apolipoprotein B to plasma lipoprotein fractions in the rat. Can J Biochem 59:693-699 Steinburg MF, Schock D, Backus JW, Smith HC (1999) Tissue-specific differences in the role of RNA 3' of the apolipoprotein B mRNA mooring sequence in editosome assembly. Biochem Biophys Res Commun 263:81-86 Storb U, Stavnezer J (2002) Immunoglobulin genes: generating diversity with AID and UNG. Curr Biol 12:R725-727 Suspene R, Sommer P, Henry M, Ferris S, Guetard D, Pochet S, Chester A, Navaratnam N, Wain-Hobson S, Vartanian JP (2004) APOBEC3G is a single-stranded DNA cytidine deaminase and functions independently of HIV reverse transcriptase. Nucl Acids Res 32:2421-2429 Svarovskaia ES, Xu H, Mbisa JL, Barr R, Gorelick RJ, Ono A, Freed EO, Hu WS, Pathak VK (2004) Human APOBEC3G is incorporated into HIV-1 virions through interactions with viral and nonviral RNAs. 
J Biol Chem 279:35822-35828 Ta VT, Nagaoka H, Catalan N, Durandy A, Fischer A, Imai K, Nonoyama S, Tashiro J, Ikegawa M, Ito S, Kinoshita K, Muramatsu M, Honjo T (2003) AID mutant analyses indicate requirement for class-switch-specific cofactors. Nat Immunol 4:843-848 Teng B, Burant CF, Davidson NO (1993) Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260:1816-1819 Teng B, Davidson NO (1992) Evolution of intestinal apolipoprotein B mRNA editing. Chicken apolipoprotein B mRNA is not edited, but chicken enterocytes contain in vitro editing enhancement factor(s). J Biol Chem 267:21265-21272 Teng BB, Ochsner S, Zhang Q, Soman KV, Lau PP, Chan L (1999) Mutational analysis of apolipoprotein B mRNA editing enzyme (APOBEC1). Structure-function relationships of RNA editing and dimerization. J Lipid Res 40:623-635 Turelli P, Mangeat B, Jost S, Vianin S, Trono D (2004) Inhibition of hepatitis B virus replication by APOBEC3G. Science 303:1829 von Wronski MA, Hirano KI, Cagen LM, Wilcox HG, Raghow R, Thorngate FE, Heimberg M, Davidson NO, Elam MB (1998) Insulin increases expression of apobec-1, the catalytic subunit of the apolipoprotein B mRNA editing complex in rat hepatocytes. Metabolism 47:869-873 Wedekind JE, Dance GS, Sowden MP, Smith HC (2003) Messenger RNA editing in mammals: new members of the APOBEC family seeking roles in the family business. Trends Genet 19:207-216

Mammalian C to U editing 399 Wiegand HL, Doehle BP, Bogerd HP, Cullen BR (2004) A second human antiretroviral factor, APOBEC-3F, is suppressed by the HIV-1 and HIV-2 Vif proteins. EMBO J 23: 2451-2458 Wieland S, Thimme R, Purcell RH, Chisari FV (2004) Genomic analysis of the host response to hepatitis B virus infection. Proc Natl Acad Sci USA 101:6669-6674 Wong SK, Lazinski DW (2002) Replicating hepatitis delta virus RNA is edited in the nucleus by the small form of ADAR1. Proc Natl Acad Sci USA 99:15118-15123 Woo CJ, Martin A, Scharff MD (2003) Induction of somatic hypermutation is associated with modifications in immunoglobulin variable region chromatin. Immunity 19:479489 Xiang S, Short SA, Wolfenden R, Carter CW Jr (1996) Cytidine deaminase complexed to 3-deazacytidine: a "valence buffer" in zinc enzyme catalysis. Biochemistry 35:13351341 Xie K, Sowden MP, Dance GS, Torelli AT, Smith HC, Wedekind JE (2004) The structure of a yeast RNA-editing deaminase provides insight into the fold and function of activation-induced deaminase and APOBEC-1. Proc Natl Acad Sci USA 101:8114-8119 Yamanaka S, Balestra M, Ferrell L, Fan J, Arnold KS, Taylor S, Taylor JM, Innerarity TL (1995) Apolipoprotein B mRNA editing protein induces hepatocellular carcinoma and dysplasia in transgenic animals. Proc Natl Acad Sci USA 92:8483-8487 Yamanaka S, Poksay KS, Arnold KS, Innerarity TL (1997) A novel translational repressor mRNA is edited extensively in livers containing tumors caused by the transgene expression of the apoB mRNA- editing enzyme. Genes Dev 11:321-333 Yamanaka S, Poksay KS, Balestra ME, Zeng GQ, Innerarity TL (1994) Cloning and mutagenesis of the rabbit ApoB mRNA editing protein. A zinc motif is essential for catalytic activity, and noncatalytic auxiliary factor(s) of the editing complex are widely distributed. 
J Biol Chem 269:21725-21734 Yang JH, Luo X, Nie Y, Su Y, Zhao Q, Kabir K, Zhang D, Rabinovici R (2003) Widespread inosine-containing mRNA in lymphocytes regulated by ADAR1 in response to inflammation. Immunology 109:15-23 Yang Y, Ballatori N, Smith HC (2002) Synthesis and secretion of the atherogenic risk factor apoB100 is reduced through TAT-mediated protein transduction of an mRNA editase into hepatocytes. Molec Pharm 61:269-276 Yang Y, Kovalski K, Smith HC (1997) Partial characterization of the auxiliary factors involved in apolipoprotein B mRNA editing through APOBEC-1 affinity chromatography. J Biol Chem 272:27700-27706 Yang Y, Sowden MP, Yang Y, Smith HC (2001) Intracellular trafficking determinants in APOBEC-1, the catalytic subunit for cytidine to uridine editing of apolipoprotein B mRNA. Exp Cell Res 267:153-164 Yang Y, Smith HC (1997) Multiple protein domains determine the cell type-specific nuclear distribution of the catalytic subunit required for apolipoprotein B mRNA editing. Proc Natl Acad Sci USA 94:13075-13080 Yang Y, Sowden MP, Smith HC (2000) Induction of cytidine to uridine editing on cytoplasmic apolipoprotein B mRNA by overexpressing APOBEC-1. J Biol Chem 275:22663-22669 Yu Q, Konig R, Pillai S, Chiles K, Kearney M, Palmer S, Richman D, Coffin JM, Landau NR (2004) Single-strand specificity of APOBEC3G accounts for minus-strand deamination of the HIV genome. Nat Struct Mol Biol 11:435-442

400 Harold C. Smith, Joseph E. Wedekind, Kefang Xie, and Mark P. Sowden Zhang H, Yang B, Pomerantz RJ, Zhang C, Arunachalam SC, Gao L (2003) The cytidine deaminase CEM15 induces hypermutation in newly synthesized HIV-1 DNA. Nature 424:94-98 Zhang J, Sun X, Qian Y, LaDuca JP, Maquat LE (1998a) At least one intron is required for the nonsense-mediated decay of triosephosphate isomerase mRNA: a possible link between nuclear splicing and cytoplasmic translation. Mol Cell Biol 18:5272-5283 Zhang J, Sun X, Qian Y, Maquat LE (1998b) Intron function in the nonsense-mediated decay of beta-globin mRNA: indications that pre-mRNA splicing in the nucleus can influence mRNA translation in the cytoplasm. RNA 4:801-815 Zheng YH, Irwin D, Kurosu T, Tokunaga K, Sata T, Peterlin BM (2004) Human APOBEC3F is another host factor that blocks human immunodeficiency virus Type 1 replication. J Virol 78:6073-6076

Smith, Harold C. Departments of Biochemistry and Biophysics, Pathology and the Cancer and Environmental Health Science Centers, University of Rochester, 601 Elmwood Avenue, Rochester NY 14642, USA
Wedekind, Joseph E. Departments of Biochemistry and Biophysics, University of Rochester, 601 Elmwood Avenue, Rochester NY 14642, USA
Xie, Kefang Departments of Biochemistry and Biophysics, University of Rochester, 601 Elmwood Avenue, Rochester NY 14642, USA
Sowden, Mark P. Departments of Biochemistry and Biophysics, Pathology, University of Rochester, 601 Elmwood Avenue, Rochester NY 14642, USA

Transfer RNA modifications and DNA editing in HIV-1 reverse transcription Roland Marquet and Frédéric Dardel

Abstract Reverse transcription is a central step in HIV-1 replication that represents a typical case of interplay between viral and cellular factors. HIV-1 diverts a cellular tRNA, tRNALys3, to prime reverse transcription. The post-transcriptional modifications of tRNALys3 are crucial for completion of reverse transcription. In some HIV-1 isolates, they are required for efficient initiation of (-) strand DNA synthesis, and in all strains, methylation of A58 is required to allow productive strand transfer during (+) strand DNA synthesis. On the other hand, some human cell types have evolved an innate antiretroviral mechanism by promoting extensive deamination of the (-) strand DNA during reverse transcription. In the absence of viral defence, this hyper-editing induces DNA degradation and lethal mutagenesis of the viral DNA. However, Vif, one of the HIV-1 “accessory” proteins, is able to inhibit DNA deamination by preventing incorporation of the editing enzymes APOBEC3G and APOBEC3F into the viral particles.

Topics in Current Genetics H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106366 / Published online: 7 January 2005 © Springer-Verlag Berlin Heidelberg 2005

1 Introduction Retroviruses are diploid, with two identical copies of the positive strand viral RNA packaged within the viral particle (Paillart et al. 2004). Upon cell infection, the viral core, composed of a shell of capsid (CA) proteins surrounding the dense assembly of viral RNA and nucleocapsid protein (NC), is disassembled. The viral genome then undergoes an elaborate retrotranscription process involving several strand transfers, as schematised in Figure 1. As a result of this intricate mechanism, the single-stranded genomic RNA is copied into a double-stranded DNA molecule that is actually longer than the RNA template, with long terminal repeats (LTR) duplicated at each end. Within this process, there are several steps involving enzymatic modifications of the nucleotides within the various polynucleotide partners, leading either to modified nucleotides or to editing processes. The cellular tRNA that serves as a primer contains modified ribonucleotides that are important at several steps of the replicative cycle, and the genomic DNA can undergo deamination, a restriction mechanism developed by host cells and against which the virus has evolved a defence mechanism. During the infective cycle, the virus has to cope with these modifications, which would otherwise seriously interfere with its replication. In order to overcome these difficulties, HIV-1 has evolved elaborate strategies that either neutralize the modification process or allow the virus to circumvent its negative effects. In some cases, the cellular nucleotide modifications are even used by the virus to its advantage, either as landmarks for reverse transcription or as mediators of ancillary stabilizing interactions. Three viral factors are at the centre of the corresponding processes: the reverse transcriptase (RT), the nucleocapsid protein (NC) and the accessory protein Vif. The first two proteins interact directly with the modified polynucleotides, whereas Vif blocks the cellular modifying activity. The aim of this review is to provide a detailed perspective of how the virus bypasses the inhibitory effects of cellular nucleotide modifications and editing.
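The consequence of deaminating the (-) strand DNA can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a model of the enzymes: the 8-nucleotide sequence and the function names are hypothetical, the genome is written in the DNA alphabet for simplicity, and every cytosine of the (-) strand is converted to uracil, whereas real editing affects only a fraction of sites. It shows that C-to-U changes on the (-) strand are read as G-to-A changes once the (+) strand is copied from it.

```python
# Toy model (illustrative assumptions only): deamination of (-) strand DNA
# and its read-out as G-to-A changes on the (+) strand.

# Base pairing used when a strand is copied; U (deaminated C) pairs with A.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq):
    """Return the strand synthesised on `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def deaminate(minus_dna, positions=None):
    """Convert C to U on the (-) strand.

    If `positions` is None, every C is deaminated (a deliberate
    simplification); otherwise only the C residues at those indices.
    """
    out = []
    for i, base in enumerate(minus_dna):
        hit = base == "C" and (positions is None or i in positions)
        out.append("U" if hit else base)
    return "".join(out)


def mutations(original, edited):
    """List (index, original base, new base) for every difference."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(original, edited)) if a != b]


if __name__ == "__main__":
    plus = "ATGGGCCA"                      # hypothetical (+) strand sequence
    minus = reverse_complement(plus)       # (-) strand DNA copied from it
    edited_minus = deaminate(minus)        # C -> U on the (-) strand
    edited_plus = reverse_complement(edited_minus)
    print(minus, edited_minus, edited_plus)
    print(mutations(plus, edited_plus))
```

In this toy case only the guanines of the (+) strand change, because they are the positions that face cytosines on the (-) strand; the cytosines of the (+) strand are untouched since the (-) strand carries guanines there.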

2 tRNA modification and HIV-1 reverse transcription The successful completion of viral replication is strictly dependent upon a host RNA molecule, the primer tRNA. Indeed, retroviral reverse transcriptases (RT), like most DNA polymerases, are dependent upon a primer molecule to initiate synthesis of a DNA strand. The 5’ region of the viral genome contains a specific sequence, the primer binding site (PBS), which is complementary to the 3’ end of a given host cytoplasmic tRNA that is recruited and annealed onto the viral genome (reviewed in Marquet et al. 1995; Le Grice 2003; Kleiman et al. 2004) (Fig. 1). For instance, all known mammalian lentiviruses use host tRNALys3 as a primer. This RNA-RNA duplex is recognised by RT, which elongates the (-) DNA strand, thus producing a chimeric tRNA-DNA strand (Fig. 1, Steps 1 and 2). As reverse transcription proceeds, the template RNA is progressively hydrolysed by the RNase H activity of RT, except in polypurine tract regions (PPT), which are used in turn by RT as primers for the synthesis of the (+) DNA strand (Fig. 1, Step 4). After being used for initiation of reverse transcription, the primer tRNA, which is still attached to the end of the minus DNA strand, then serves as template for the synthesis of the DNA counterpart of the PBS sequence (Fig. 1, Step 5). The very early steps of reverse transcription might actually take place within the virion, since the primer tRNA is recruited and packaged within the budding viral particle and placed onto the PBS before the infection of a new cell takes place (Huang et al. 1997). 2.1 Function of the modified nucleotides of tRNA The transfer RNAs of all living organisms contain a number of nucleotides which are post-transcriptionally modified, either at the level of the sugar or of the base. 
Within the context of cellular translation of mRNA, the function of these modifications is mainly twofold: (i) Some modifications contribute significantly to the folding of the molecule, mostly by allowing additional stabilising interactions within the 2D or 3D fold, or sometimes by preventing the formation of competing

Transfer RNA modifications and DNA editing in HIV-1 reverse transcription 3

Fig. 1. Scheme of HIV-1 reverse transcription. The repeated sequence (R), 5’ unique sequence (U5), primer binding site (PBS), polypurine tract (PPT) and 3’ unique sequence (U3) are indicated. RNA regions are shown as thick lines, and DNA as thin lines. Step 1: Annealing of the primer tRNA onto the PBS. Step 2: Minus strand strong stop DNA synthesis. During DNA elongation, RNA template strands are progressively hydrolyzed by the RNase H activity of reverse transcriptase. Step 3: First strand transfer. Step 4: Priming of the plus DNA strand synthesis on the PPT. The m1A58 modified ribonucleotide within the primer is indicated by a star. Step 5: Plus strand strong stop DNA synthesis. Step 6: Second strand transfer. Step 7: Completion of the two DNA strands.

alternate structures (reviewed in Agris 1996). (ii) Within the anticodon loop, modified bases participate in the stability and specificity of the codon-anticodon interaction within the ribosome (reviewed in Grosjean et al. 1998). Some of the tRNAs used as primers by retroviruses contain a large number of modified nucleotides. For instance, mammalian tRNALys3, the HIV-1 and HIV-2 primer, contains 14 post-transcriptional modifications (Raba et al. 1979) (Fig. 2), which are clustered at the core of the 3D fold and in the anticodon loop (Benas et al. 2000). There is now accumulating evidence that these modifications also affect the viral

Fig. 2. Secondary structure and tertiary interactions within human tRNALys3. Base triples and tertiary interactions are indicated as shaded boxes.

replication process at several levels (for an earlier review, see Marquet 1998). From the virus’ point of view, post-transcriptional modifications of the primer tRNA are a mixed blessing: on the one hand, they provide specific recognition and interaction elements for viral partners in the replication process, but on the other hand, they significantly strengthen the native tRNA scaffold structure, making annealing to the viral PBS more difficult. Paralleling the primer tRNA functions in the viral cycle, there are three major levels at which its nucleotide modifications can be expected to play a role in HIV-1 replication: selective incorporation of the primer into the viral particle, initiation of reverse transcription and (+) strand strong stop DNA synthesis. 2.2 Selective uptake of primer tRNA into the viral particle Early work showed that, in vitro, tRNALys3 can interact with RT (Barat et al. 1989; Sallafranque-Andreola et al. 1989) and this suggested that uptake of the primer could occur directly via the reverse transcriptase itself. However, later studies showed that this interaction is probably not selective for tRNALys3, as other tRNAs can bind to RT with similar affinities (Weiss et al. 1992; Arion et al. 1996). The mechanism by which tRNALys3 is selectively incorporated within the HIV-1 virion has now been extensively studied by Kleiman and co-workers and a complete picture is beginning to emerge (for a recent review on this topic, see Kleiman et al.

2004). The two lysine tRNA isoacceptors (tRNALys1,2 and tRNALys3) are selectively packaged into the virus by means of their natural interaction with their cognate aminoacyl-tRNA synthetase, LysRS (Guo et al. 2003; Cen et al. 2004b; Halwani et al. 2004). LysRS is indeed able to bind to the Gag protein, via interactions with the C-terminal part of the capsid, and is actively packaged in vivo in Gag virus-like particles (Javanbakht et al. 2003). The Gag-Pol precursor is also apparently required for the selective uptake of the primer tRNA, through sequences belonging to the RT part of the viral polyprotein (Mak et al. 1994, 1997). Thus, LysRS and tRNA recruitment is likely to be performed via Gag/Gag-Pol, during the budding of the viral particle. tRNALys3 is, thus, presumably selectively incorporated as a nucleoprotein complex with LysRS or a fragment of this protein. tRNA binding by LysRS, but not its aminoacylation activity, is important for its recruitment into the virions (Cen et al. 2004b). Factors affecting viral tRNA uptake are, thus, likely to be the same as those driving the selective recognition of tRNALys3 by its activating enzyme. In this context, Francin and Mirande have measured the affinity for LysRS of both a naked tRNA transcript (Francin et al. 2002), and a recombinant tRNA expressed in E. coli (Francin and Mirande 2003), which carries 8 of the 14 modified nucleotides (Tisné et al. 2000). The naked transcript is readily recognised and aminoacylated by the synthetase, and there is apparently only a twofold difference in Km compared to the modified, recombinant tRNA (3 µM vs. 1.7 µM). Thus, inasmuch as recruitment of the primer tRNA is principally dependent upon its binding to LysRS, there is currently no evidence that post-transcriptional modifications play a significant role in this step of the viral cycle. 
2.3 Initiation of reverse transcription The complex between the viral genomic RNA and the primer involves the annealing of the 18 nucleotides of the PBS to the corresponding complementary sequence at the 3’ end of the tRNA. This implies two major rearrangements within tRNALys3: (i) A strand exchange process during which the acceptor stem and TΨC-stem intramolecular helices are converted into an intermolecular tRNA-PBS helix. (ii) A disruption of the tertiary structure of the tRNA and in particular of the long-range interaction between the D- and TΨC-loops. These tertiary interactions are present in all canonical tRNAs and have been shown to be strongly stabilised by modified nucleotides, in particular by the conserved 5-methyl-uridine 54 (rT) and pseudouridine 55 (Ψ). Methylation at position 54 increases the melting temperature of tRNAMet by 6°C (Davanloo et al. 1979) and Ψ55 forms an additional hydrogen bond to the phosphate backbone via its second imino group. Accordingly, unmodified tRNAVal is significantly less stable than its native counterpart, with the strongest effects at the level of D-loop/TΨC-loop interactions (Derrick and Horowitz 1993). The tRNA tertiary fold is indeed stabilised by an extensive network of tertiary interactions forming the core of the structure in which a large fraction of the modified nucleotides are clustered (Fig. 2).
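The complementarity between the 18 nucleotides of the PBS and the 3’ end of the primer can be checked with a few lines of code. The following is a minimal sketch, not part of the original chapter: the function names are hypothetical, and the two sequences are the commonly cited HIV-1 PBS and tRNALys3 3’-terminal sequence, which are not given in the text above and should be verified against a reference sequence before any real use.

```python
# Illustrative check of PBS / primer 3'-end complementarity (RNA alphabet).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def pairs_with_primer_3prime_end(pbs: str, primer: str) -> bool:
    """True if the PBS is fully Watson-Crick complementary to the
    last len(pbs) nucleotides of the primer (both written 5'->3')."""
    tail = primer[-len(pbs):]
    return len(tail) == len(pbs) and reverse_complement(pbs) == tail


# Assumed sequences (commonly cited, not taken from this chapter).
HIV1_PBS = "UGGCGCCCGAACAGGGAC"          # 18-nt primer binding site
TRNA_LYS3_3PRIME = "GUCCCUGUUCGGGCGCCA"  # 3'-terminal 18 nt of tRNALys3

if __name__ == "__main__":
    print(len(HIV1_PBS))
    print(pairs_with_primer_3prime_end(HIV1_PBS, TRNA_LYS3_3PRIME))
```

The check only covers canonical Watson-Crick pairs; it says nothing about the energetic cost of unwinding the acceptor and TΨC stems, which is the actual barrier discussed in this section.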

2.3.1 The tRNA-vRNA initiation complex In addition to the clear-cut interaction between the 18 nucleotides of the PBS and the 3’ end of the primer, for HIV-1 and tRNALys3, there is a large body of evidence indicating that additional interactions occur between neighbouring sequences in both partners. Most prominently, an interaction has been proposed to take place between an A-rich loop located upstream from the PBS, in the 5’-untranslated region of the viral RNA (U5), and the U-rich anticodon loop of the tRNA. Originally, this interaction was evidenced in vitro by chemical probing on the tRNA/HIV-1 MAL RNA initiation complex (Isel et al. 1993) and further supported by detailed structural analyses (Isel et al. 1995, 1996, 1998, 1999). The complex formed with the naked transcript differs from that formed with the native tRNALys3. The anticodon/A-rich loop interaction does not take place for the naked transcript, as evidenced by the reactivity of the corresponding bases toward chemical probing, and RT elongation from the two complexes differs: the native tRNALys3 complex shows pause sites at nucleotides +3, +5 and +16, the latter being a hallmark of the anticodon/A-rich loop interaction, whereas only a pause at +3 is observed for the complex with the unmodified transcript (Isel et al. 1996). With the unmodified transcript, the efficiency of minus strand strong stop DNA synthesis (Fig. 1, Step 2) is also significantly reduced. Evidence suggesting that this interaction also takes place in vivo in subtype B isolates was provided by studies in which either the A-rich loop or the tRNALys3 anticodon sequences were altered (Huang et al. 1996; Liang et al. 1997) or the anticodon sequence was targeted by antisense oligonucleotides in infected cell cultures (Wei et al. 2000). 
The existence of secondary tRNA/vRNA interactions outside the PBS region is also supported by the observation that it was not possible to switch the primer specificity of HIV-1 by simply replacing the 18-nucleotide PBS sequence, as the mutant viruses quickly reverted to the wild type sequence, complementary to tRNALys3 (Li et al. 1994; Das et al. 1995; Wakefield et al. 1995). In order to obtain stably replicating viruses that used tRNAHis as a primer, additional mutations were required which rendered the A-rich loop complementary to the anticodon of the new primer (Wakefield et al. 1996). Stable viruses which used human tRNALys1,2 (Kang et al. 1999), tRNAHis (Zhang et al. 1996), tRNAPro (Kang et al. 1996), or tRNAMet (Kang and Morrow 1999) could, thus, be obtained by substituting the PBS sequence and simultaneously rendering the A-rich loop sequence complementary to the anticodon of the new primer tRNA. In vitro studies indicated that an interaction between the mutated A-rich loop and the tRNAHis anticodon loop is required for efficient initiation of reverse transcription (Rigourd et al. 2003). However, recent in vitro and in vivo experiments indicated that the secondary primer-vRNA interactions in this artificial system do not reflect the interactions taking place in the wild type virus it is derived from (i.e. the HXB2 isolate) (Goldschmidt et al. 2004). In addition, this strategy did not work for all tested tRNAs (Kang et al. 1996), suggesting that additional structural features within the U5 region of the viral RNA and/or the tRNA are involved in the formation of the initiation complex. Accordingly, the Berkhout group has shown that the structure

and the stability of the U5 hairpin upstream from the PBS influence the efficiency of tRNA annealing and initiation of tRNA-primed reverse transcription (Beerens and Berkhout 2000; Beerens et al. 2000a, 2000b). This already intricate picture is further complicated by the fact that different HIV-1 isolates have different secondary structures in the U5 region, and thus some of the reported results are likely to be context-dependent, which could partly explain some of the conflicting detailed models proposed for the tRNALys3/HIV complex (Isel et al. 1995, 1999; Beerens et al. 2001; Goldschmidt et al. 2003). Recently, it was indeed shown that the A-loop/anticodon interaction does not occur in all virus subtypes and in particular that it is not present in the prototype subtype B strains HXB2/pNL4.3 (Goldschmidt et al. 2004). Thus, some viral isolates apparently require the additional interactions for reverse transcription, while some do not. However, in those viral isolates where additional interactions take place, the modified nucleotides are required to support them. This raises two questions: what is the function of the additional intermolecular interactions outside the PBS, and what is the part played by the tRNA nucleotide modifications in this process? Regarding the first point, two non-exclusive answers can be proposed: (i) The extended network of intermolecular interactions between the tRNA and the viral RNA probably provides additional stability to the complex, which could be required to counterbalance the loss of free energy resulting from the melting of the tRNA tertiary structure. (ii) The tertiary structure acquired by the extended complex could specifically contribute to the recognition by the viral RT. The latter explanation is supported by biochemical and kinetic analyses of the initial extension stages of the natural primer, compared to that of simple oligoribonucleotides (Isel et al. 1996; Lanchy et al. 1996). 
There is a significant body of evidence indicating that the modified nucleotides of tRNALys3 could play a determining role in this step. The first evidence along these lines came from the comparison of native, modified tRNA with naked in vitro transcripts in their abilities to form an extended complex with the viral genomic RNA and to prime (-) strand DNA synthesis (Isel et al. 1993, 1996). These results pointed to a role of the anticodon modifications in the interaction with the viral A-rich loop in the U5 region, at least in those isolates in which this interaction does occur. Recent studies suggested that differences observed between RT priming with natural, modified tRNALys3 and naked transcripts could result from the addition of an extra nucleotide by T7 RNA polymerase during in vitro transcription and not from the base modifications (Miller et al. 2004). Although overextension of the primer can possibly interfere with the formation of the initiation complex, these results cannot account for the difference originally reported by Isel et al. (1993, 1996), where tRNA transcripts were trimmed by the CCA adding enzyme, or for the differences observed with partially modified tRNALys3 obtained in vivo or by chemical oxidation (Isel et al. 1993; Tisné et al. 2000). It is likely that the high RT and primer/template concentrations used by Miller et al. (2004) masked the effects of the tRNALys3 post-transcriptional modifications.

2.3.2 Anticodon modifications The interaction between the A-rich loop and the anticodon of the primer tRNA is indeed likely to be significantly stabilised by the presence of the hypermodified bases, mcm5s2U and ms2t6A at positions 34 and 37, respectively. It has indeed been shown that both 2-thiolation of U34 (Houssier et al. 1988) and N6-threonylcarbamoylation of A37 (Weissenbach and Grosjean 1981) enhance the stability of loop-loop complexes. It was originally proposed that these two modifications could induce an unusual conformation of the anticodon loop that could serve as a determinant for recognition by RT (Agris et al. 1997), although this was later contradicted by both NMR and X-ray structural studies on the E. coli and human tRNA (Benas et al. 2000; Sundaram et al. 2000). The effects of the anticodon modifications on the stability of the interaction between the tRNALys3 anticodon and the HIV-1 A-rich loop have been directly investigated by surface plasmon resonance on a reduced synthetic system (Bajji et al. 2002). Both modifications synergistically stabilise this loop-loop interaction, which nevertheless remains quite weak, even in high salt conditions (Kd ≈ 70 µM in 1 M NaCl). The impact of the individual chemical groups within these two modifications has been investigated, with the major effect being ascribed to the 2-thio group at position 34. This was shown both by selective chemical oxidation (Isel et al. 1993) and by the analysis of sub-modified species of tRNALys3 produced in a recombinant system (Tisné et al. 2000). The apparent function of the modified nucleotides within the anticodon is, thus, to stabilise the loop-loop interaction that is naturally very weak, being composed only of A-U base pairs. This weakness is consistent with the failure to directly observe the corresponding base pairs in NMR experiments (Tisné et al. 2004). 
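To give a feel for how weak a loop-loop interaction with Kd ≈ 70 µM is, the dissociation constant can be converted into a standard binding free energy and into an occupancy at a given partner concentration. The sketch below is only an illustration under stated assumptions: simple 1:1 binding, a 1 M standard state, and a temperature of 37°C, which is an assumption since the text does not state the temperature of the measurement. The function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_kelvin: float = 310.15) -> float:
    """Standard free energy of association in kcal/mol, dG = RT ln(Kd),
    for a 1 M standard state. The default temperature (37 degrees C) is
    an assumption, not a value reported in the text."""
    return R_KCAL * temp_kelvin * math.log(kd_molar)


def fraction_bound(partner_molar: float, kd_molar: float) -> float:
    """Fraction of sites occupied for simple 1:1 binding when the
    partner is in large excess over the sites."""
    return partner_molar / (partner_molar + kd_molar)


if __name__ == "__main__":
    kd = 70e-6  # Kd of about 70 micromolar, as quoted in the text
    print(round(binding_free_energy(kd), 2))   # roughly -5.9 kcal/mol
    print(round(fraction_bound(1e-6, kd), 3))  # occupancy at 1 micromolar partner
```

Under these assumptions the interaction is worth roughly 6 kcal/mol, and at micromolar concentrations of the partner only a small percentage of the loops would be paired, which is in line with the description of the interaction as quite weak.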
This interaction would only be required in certain viral isolates, as a guide interaction early in the dynamical process of primer interaction, as no RNA unfolding is required to form this loop-loop complex, and/or as an additional interaction stabilising the complex in those cases where either the template or the primer structure requires a significant structural rearrangement. The former explanation is supported by the NMR observation that the t6A37 threonyl group of the tRNA is affected in the presence of a viral RNA fragment carrying both the A-rich loop and the PBS, even before the primer tRNA structure opens (Tisné et al. 2004), whereas the latter is supported by the structural analysis of various wild type and mutant viruses including the engineered tRNAHis-specific variants that rely on this loop-anticodon interaction (Goldschmidt et al. 2004). For the latter variants, modification of tRNAHis was however not required to support the loop-anticodon interaction (Rigourd et al. 2003). This is in keeping with the proposal that nucleotide modifications are required to stabilise an otherwise weak interaction, specific to tRNALys3. For the tRNAHis anticodon, this interaction includes two G-C base pairs, which are probably sufficient to stabilise the loop-loop complex.

2.3.3 The tRNALys3 core modifications The involvement of the nucleotide modifications located at the core of the tRNA structure in the formation of the initiation complex is comparatively less well documented than the role of those of the anticodon. As discussed above, within native tRNA structures, the role of the core modified nucleotides, many of which are widely conserved among tRNAs of all species (T, Ψ, m7G, D, m1A), is primarily to contribute to the stability of the fold. Thus, from the virus’ point of view, they are more of a barrier to the annealing process, a negative factor that has to be somehow circumvented. The virus indeed has to perform the annealing of the primer tRNA onto the viral RNA template at physiological temperature (37°C), a temperature at which the native tRNA structure does not spontaneously unfold. This process is dependent upon the nucleic acid chaperone activity of the viral NC protein (Barat et al. 1989; De Rocquigny et al. 1992; Hargittai et al. 2004; reviewed in Rein et al. 1998; Kleiman et al. 2004). The comparative lack of information on the possible negative role of modified nucleotides in the unfolding process stems from the fact that a significant number of in vitro studies of the initiation of HIV-1 reverse transcription used naked tRNA transcripts as primer and/or annealing of heat-denatured primer tRNA and template. The latter procedure is justified by the observation that heat-annealed and NC-annealed complexes are structurally similar and that both are initiation proficient (Brule et al. 2002), but it does not shed light on the dynamic process of annealing, as it bypasses the problem altogether. In the more global framework of tRNA structure, the core modified nucleotides are believed to contribute mostly to the stabilisation of the tertiary rather than of the secondary structure (Sampson and Uhlenbeck 1988; Perret et al. 1990; Kintanar et al. 1994). 
For instance, unmodified yeast tRNAPhe can apparently exist in an intermediate state devoid of tertiary structure, where only the 2D cloverleaf structure is formed, whereas native, fully modified tRNA folds extremely cooperatively into fully formed 3D structures, with little or no 2D intermediates (Maglott et al. 1998). None of the 18 nucleotides of tRNALys3 that are eventually base-paired to the PBS is directly involved in tRNA tertiary structure. However, the complete unwinding of the acceptor and TΨC-stems that form one arm of the L-shaped molecule could have significant consequences on the rest of the 3D fold. Indeed, direct NMR studies of the annealing of tRNALys3 to various synthetic oligoribonucleotides corresponding to the PBS show that upon annealing to the 18 nucleotides of the PBS, most of the tRNA tertiary interactions are lost (Tisné et al. 2004). The same study also showed that the tertiary interactions involving the TΨC- and D-loops act as a resilient structural “lock” which resists strand invasion by the viral RNA. This analysis was performed with a recombinant tRNALys3 that carries T54, Ψ55, m7G46 and dihydrouridines (Tisné et al. 2000) and, thus, is able to form most of the conserved tertiary interactions within the core, although it lacks m1A58, which is also believed to increase tRNA stability (Droogmans et al. 2003): inactivation of the gene for m1A58 methyltransferase in Thermus thermophilus leads to a loss of thermoresistance of this extremophile, which was proposed to result from a reduced stability of the tRNAs that normally carry this modification (see Chapter

by Johansson and Byström). Removal of this structural lock, which is likely to be stabilized by modified nucleotides, required the action of HIV-1 NC and could not be mimicked by polycationic peptides, which were previously shown to be able to promote annealing of unmodified tRNALys3 transcripts (Hargittai et al. 2001). This led to the proposal that the tertiary interactions supported by the modified nucleotides at the hinge of tRNALys3 are specifically recognized and destabilized by the viral NC. 2.4 Plus strand strong stop synthesis Retroviruses have a conserved and elaborate reverse transcription scheme, which produces a double-stranded DNA copy actually longer than the single-stranded RNA template, with long terminal repeats (LTR) duplicated at both ends of the genome (Gilboa et al. 1979). This duplication is achieved in the later stages of viral replication and requires the presence of a key nucleotide modification within the tRNA primer. After extension of the (-) DNA strand up to the PBS sequence (Fig. 1, Step 4), most of the template RNA has been hydrolysed by the RNase H activity of RT, with the exception of the polypurine tract (PPT) which acts as a primer for (+) strand strong stop DNA synthesis. The template for this synthesis is a chimeric tRNA-DNA strand and (+) strand DNA stops 18 nucleotides within the tRNA template, exactly at the 3’ end of the sequence complementary to the PBS. From the observation that all retroviral primer tRNAs carry an m1A58 opposite to this (+) strand strong stop position (shown as a star on Fig. 1, Step 4), Baltimore and co-workers proposed that this modified nucleotide serves as a stop signal for RT (Gilboa et al. 1979). Methylation of the N1 position of the adenosine indeed precludes the formation of a standard Watson-Crick base pair and, hence, the extension of the DNA/RNA duplex by RT. 
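The arithmetic behind this stop signal can be made explicit with a toy model. The sketch below is an illustration only, with hypothetical names: it assumes a 76-nucleotide tRNA in canonical numbering (an assumption, not stated in the text), treats m1A as the only modification that blocks copying, and walks the template from its 3’ end as RT would.

```python
from typing import Mapping

# Modifications assumed, in this toy model, to prevent Watson-Crick pairing.
BLOCKING = {"m1A"}


def copied_before_stop(modifications: Mapping[int, str],
                       trna_length: int = 76) -> int:
    """Count tRNA nucleotides copied by RT before the first blocking
    modification, reading the template from its 3' end (position
    trna_length) toward its 5' end (position 1)."""
    copied = 0
    for position in range(trna_length, 0, -1):
        if modifications.get(position) in BLOCKING:
            break  # RT cannot pair a nucleotide opposite this position
        copied += 1
    return copied


if __name__ == "__main__":
    native = {54: "T", 55: "Ψ", 58: "m1A"}
    unmethylated = {54: "T", 55: "Ψ"}
    print(copied_before_stop(native))        # 18
    print(copied_before_stop(unmethylated))  # 76, full read-through in this model
```

With the block at position 58, exactly positions 59 to 76 are copied, which matches the 18 nucleotides complementary to the PBS. The read-through value for the unmethylated template is an artefact of the simplification that nothing else slows RT down.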
This stop of reverse transcription, which prevents the complete replication of the primer tRNA sequence, is essential to the reverse transcription process, since overextended (+) strand DNA would not be able to pair exactly with the PBS sequence within the (-) DNA strand after the second strand jump (Fig. 1, Step 6). The dangling 3’ end of the (+) strand DNA resulting from read-through polymerisation of the tRNA template would then be unable to prime the synthesis of the rest of the (+) strand DNA (Fig. 1, Steps 6 and 7). The initial model was confirmed experimentally for HIV-1 replication by a number of studies. Original evidence came from experiments with reconstituted in vitro systems mimicking the (+) strand strong stop synthesis and second strand transfer intermediates, using either unmodified tRNA transcripts or native tRNALys3 in the template strand. Detailed analyses of the (+) strand strong stop position, both in in vitro systems and in permeabilised virions, showed that in addition to the stop opposite m1A58, slightly longer products could be obtained, corresponding to a replication arrest opposite Ψ55 in the template tRNA, suggesting that either a fraction of tRNALys3 is undermethylated at position 58 or that RT can somehow partly bypass the mismatch induced by this modification (Ben-Artzi et al. 1996; Auxilien et al. 1999; Wu et al. 1999). Finally, in vivo evidence for an essential role of m1A58 came from the analysis of the infectivity of HIV-1 in a recombinant

CEM cell line expressing a mutant tRNALys3 in which A58 is replaced by a U (Renda et al. 2001), and hence is not methylated by the host methyltransferase. Although these cells simultaneously express the wild type and the mutant tRNALys3, replication of HIV-1 is impaired, and this inhibition correlates with the production of overextended (+) strand strong stop products, in keeping with the hypothesis that m1A58 is the major stop site.

3 DNA editing and HIV-1 reverse transcription Most functions described in this book involve either RNA modifications or editing. However, during the reverse transcription process, HIV-1 is confronted with both phenomena. In the previous sections, we saw how HIV-1 copes with, and finally takes advantage of, the tRNALys3 post-transcriptional modifications. Editing poses more problems to HIV-1 as it constitutes a strong antiviral defence that can cause lethal hypermutation. However, as described in the next sections, HIV-1 not only managed to neutralize this defence, but also apparently converted it into a replicative advantage. 3.1 Vif: a viral infectivity factor that counteracts a cellular restriction factor 3.1.1 Permissive and non-permissive cell lines In addition to the gag, pol, and env genes that are common to all replication-competent retroviruses, lentiviruses, including HIV-1, possess a number of regulatory genes. Among them, vif (Viral Infectivity Factor) is conserved in all lentiviruses, except equine infectious anemia virus (EIAV). Vif, the vif product, is crucial for pathogenic infections of feline immunodeficiency virus in cats (Inoshima et al. 1998), caprine arthritis encephalitis virus in goats (Harmache et al. 1996), simian immunodeficiency virus (SIVmac) in rhesus macaques (Desrosiers et al. 1998), and most likely HIV in humans (Alexander et al. 2002). Vif is required for efficient HIV-1 replication in primary human T-cells, macrophages, and certain CD4+-transformed T-cell lines known as ‘non-permissive’ (CEM, H9, Hut78, PM1), but it is dispensable for replication in other ‘permissive’ T-cell lines (CEMSS, SupT1) and non-lymphoid cells (HeLa, 293T) (Simon et al. 1998b, and references therein). 
Interestingly, the ∆vif phenotype depends on the producer cell, but not on the infected cell: ∆vif virions produced from non-permissive cells are defective whether they infect non-permissive or permissive cells, and ∆vif virions produced from permissive cells can complete a single replication round in non-permissive cells (Strebel et al. 1987; Gabuzda et al. 1992; von Schwedler et al. 1993). However, a biochemical basis for the block to replication in non-permissive cells remained elusive (Dettenhofer et al. 2000; Gaddis et al. 2003).

3.1.2 APOBEC3G inhibits ∆vif HIV replication in non-permissive cells The cell dependency of the ∆vif phenotype indicates that Vif interacts with cellular factor(s). Either non-permissive cells lack a positive factor, or permissive cells lack a negative component. In order to distinguish between these possibilities, permissive and non-permissive cells were fused together. The resulting heterokaryons displayed the non-permissive phenotype, suggesting that the non-permissive cells express a negative factor that is overcome by Vif (Madani and Kabat 1998; Simon et al. 1998a) (Fig. 3). Sheehy et al. used genetically related non-permissive (CEM) and permissive (CEM-SS) cell lines to identify a gene, which they named CEM15, encoding such a negative factor (Sheehy et al. 2002). CEM15 is transcribed in all non-permissive cells tested, independently of HIV-1 infection, whereas minimal or no expression is observed in permissive cell lines. Stable expression of CEM15 in permissive cells selectively inhibits HIV-1 ∆vif, without affecting viral output from the cells (Sheehy et al. 2002). The N- and C-terminal parts of CEM15 possess significant similarity to APOBEC1, the catalytic subunit of the mammalian apolipoprotein B mRNA editing enzyme (Sheehy et al. 2002). CEM15 has been renamed APOBEC3G (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G) (Jarmuz et al. 2002). The APOBEC family includes the cellular cytidine deaminases APOBEC1, APOBEC2, APOBEC3, and the activation-induced deaminase (AID) (see Chapter by Smith et al.). In humans, there are seven APOBEC3 genes or pseudogenes (APOBEC3A to APOBEC3G), whereas rodents have a single APOBEC3 gene that corresponds to APOBEC3G (Jarmuz et al. 2002), and no APOBEC3 ortholog is found in Saccharomyces cerevisiae, Drosophila melanogaster, or Caenorhabditis elegans (Sheehy et al. 2002, and Chapter by Smith et al.). 
Unlike the other members of the family, APOBEC3B, APOBEC3F, and APOBEC3G contain a duplication of the active site, which contains a Cys-His Zn2+ coordination motif characteristic of cytidine deaminases (Jarmuz et al. 2002, and Chapter by Smith et al.). 3.2 Cytosine deamination as an innate antiviral activity 3.2.1 Non-random deamination of (-) strand HIV-1 DNA by APOBEC3G Endogenous APOBEC3G is incorporated into ∆vif HIV-1 virions produced by non-permissive cells (Fig. 3B) (Stopak et al. 2003). Similarly, when overexpressed in permissive cells, APOBEC3G is incorporated into HIV-1 ∆vif (Sheehy et al. 2002, 2003; Mariani et al. 2003; Marin et al. 2003; Yu et al. 2003). Incorporation of APOBEC3G requires the nucleocapsid domain of the HIV-1 Gag precursor and the linker region between the two zinc coordination motifs of APOBEC3G (Alce and Popik 2004; Cen et al. 2004a; Douaisi et al. 2004). Deletion of either the N-terminal or C-terminal region of APOBEC3G prevents its virion localization (Li et al. 2004). The requirement of RNA for incorporation of APOBEC3G in the viral particles is still controversial (Cen et al. 2004a; Douaisi et al. 2004; Svarovskaia et al. 2004).

Fig. 3. Schematic summary of wild type and ∆vif HIV-1 replication in permissive and non-permissive cells. Permissive cells (A) lack an antiviral factor, APOBEC3G, that is present in non-permissive cells (B, C). APOBEC3G is incorporated into ∆vif virions and deaminates viral DNA during reverse transcription (B). Vif inhibits APOBEC3G translation and induces its degradation by the proteasome (C).

Sequencing of the reverse transcription products of ∆vif viruses produced by non-permissive cells revealed hypermutation of HIV-1 DNA (Fig. 3B) (Lecossier et al. 2003; Zhang et al. 2003). Almost all mutations were G to A transitions when read on the (+) DNA strand. A similar result was obtained when APOBEC3G was expressed in permissive cells (Harris et al. 2003; Mangeat et al. 2003; Shindo et al. 2003; Zhang et al. 2003; Yu et al. 2004). All the factors necessary for hypermutation are packaged within the viral particles, since this phenomenon was also observed when reverse transcription was performed inside purified virions (Lecossier et al. 2003; Zhang et al. 2003). These mutations result from cytosine deamination in the (-) strand of HIV-1 DNA by APOBEC3G, which leads to G to A mutations in the (+) DNA strand (Fig. 4). In contrast, APOBEC3G does not edit the genomic RNA (Lecossier et al. 2003; Mangeat et al. 2003; Yu et al. 2004). Minus strand DNA deamination occurs over the length of the virus genome, with a graded frequency in the 5’ to 3’ direction (Yu et al. 2004). Indeed, APOBEC3G is specific for single-stranded DNA (Fig. 4, Step 3) and the probability that a given (-) strand cytosine is deaminated depends upon the length of time it remains single-stranded (Yu et al. 2004). As a consequence, cytosine deamination also takes place in two short regions of the (+) DNA strand that are transiently single-stranded during reverse transcription: the PBS, which is single-stranded after removal of tRNA3Lys by RNase H, before the second strand transfer (Fig. 1, Step 5), and the 5’ copy of U3, which becomes single-stranded during strand displacement synthesis (Fig. 1, between Steps 6 and 7) (Yu et al. 2004). Cytosine deamination is not random and similar editing hotspots are observed in cell culture and in vitro (Harris et al. 2003; Beale et al. 2004; Suspene et al. 2004; Yu et al. 2004).
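The strand relationships described above can be made concrete with a short sketch. It is an illustration only, not a model used in the cited studies: the function names are hypothetical, every matching site is edited deterministically (whereas editing in virions is probabilistic and graded along the genome), and the edited residue is assumed to be the first G of the HGGR hotspot consensus reported in those studies (H = A, C, or T; R = A or G, as read on the (+) strand), in line with the GG-to-AG signature generally attributed to APOBEC3G.

```python
import re

# Watson-Crick partners; U (deaminated C) pairs with A when the strand is copied.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

# G residues in an HGGR context on the (+) strand (H = A, C or T; R = A or G).
# Assumption: the first G of the pair is the one whose partner C is deaminated.
HGGR = re.compile(r"(?<=[ACT])G(?=G[AG])")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PARTNER[base] for base in reversed(strand))


def hotspot_positions(plus_strand):
    """0-based (+) strand positions of G residues lying in an HGGR context."""
    return [m.start() for m in HGGR.finditer(plus_strand)]


def edit_minus_strand(plus_strand, positions):
    """Deaminate the (-) strand C opposite each listed (+) strand G.

    Returns the edited (-) strand and the (+) strand copied from it.
    """
    minus = list(reverse_complement(plus_strand))
    last = len(plus_strand) - 1
    for pos in positions:
        partner = last - pos          # same base pair, counted on the (-) strand
        if minus[partner] != "C":
            raise ValueError(f"position {pos} is not a G on the (+) strand")
        minus[partner] = "U"          # cytosine deamination: C -> U
    edited_minus = "".join(minus)
    return edited_minus, reverse_complement(edited_minus)


plus = "ATGGGA"
sites = hotspot_positions(plus)                    # [2]
edited_minus, edited_plus = edit_minus_strand(plus, sites)
print(edited_minus, edited_plus)                   # TCCUAT ATAGGA
```

Only the (-) strand is modified in this sketch; the G to A change appears on the (+) strand once the edited template has been copied. Passing every G position instead of `sites` would mimic context-independent editing.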
G-rich sequences (as read in the (+) strand) are the preferred targets: TGGG is the most frequently modified tetranucleotide, and the hotspot consensus sequence is HGGR (with H=C, T, or A, and R=A or G; the first G is the mutated residue) (Harris et al. 2003; Beale et al. 2004; Suspene et al. 2004; Yu et al. 2004). In vitro, APOBEC3G deaminates cytosines in single-stranded DNA, but is inactive on DNA-DNA and DNA-RNA hybrids (Suspene et al. 2004; Yu et al. 2004). Single-stranded RNA is not a substrate, even though APOBEC3G binds to it (Suspene et al. 2004; Yu et al. 2004). One study reported that recombinant APOBEC3G expressed in E. coli could deaminate dC (Zhang et al. 2003), but this result has not been confirmed (Suspene et al. 2004).

3.2.2 Consequences of HIV-1 DNA editing

Hyper-editing of the (-) strand HIV-1 DNA has multiple consequences. The ∆vif viruses produced from non-permissive cells fail to integrate proviruses, either because they cannot complete reverse transcription or because the newly synthesized transcripts are unstable (Dornadula et al. 2000, and references therein). Quantitative real-time PCR indicated that deamination of the (-) strand HIV-1 DNA by APOBEC3G does not interfere with completion of reverse transcription, but that the presence of uracil results in degradation of the cDNA prior to integration (Mariani et al. 2003). Moreover, incorporation of uracil into (-) strand DNA affects the specificity of the initiation of (+) strand DNA synthesis in vitro (Klarmann et al. 2003). Interestingly, HIV-1 particles incorporate uracil DNA glycosylase (hUNG2) (Willetts et al. 1999; Priet et al. 2003a). This enzyme removes uracil bases from DNA, as the first step of the base excision repair pathway (Fig. 4, Step 4). The resulting abasic

Fig. 4. Mechanism and consequences of cytosine deamination by APOBEC3G. After synthesis of the first DNA strand (Step 1), and RNase H action (Step 2), APOBEC3G edits the single-stranded (-) DNA (Step 3). Either uracil bases are removed by hUNG2 (Step 4), and an endonuclease cleaves DNA at the abasic sites (Step 5), or the edited (-) strand DNA is copied by reverse transcriptase (Step 6), suppressing initiation codons, and introducing premature stop codons.

sites are probable targets for endonucleolytic cleavage (Fortini et al. 2003) (Fig. 4, Step 5). Uracil excision is not quantitative, and an important fraction of cytosine deamination events is fixed as G to A mutations (Fig. 4, Step 6). A few completed double-stranded reverse transcripts generated by ∆vif viruses from APOBEC3G-producing cells escape degradation, and are integrated in the infected cells (Mariani et al. 2003; Yu et al. 2004). Because TGG is a preferred target of APOBEC3G and editing at this codon generates TAG, TAA or TGA (the three termination codons), nearly all of the open reading frames from ∆vif viruses produced in the presence of APOBEC3G are prematurely terminated (Yu et al. 2004) (Fig. 4). In addition, the initiation


codons of the gag and nef genes, which are followed by a G residue and are preferentially targeted by APOBEC3G, are very frequently mutated (Fig. 4) (Yu et al. 2004). DNA degradation following uracil excision and endonucleolytic cleavage at the abasic site (Fig. 4, Steps 4 and 5) is the result of abortive base excision repair. However, there is evidence for partially successful repair, either before or after integration (Yu et al. 2004). Potentially relevant is the observation that a lysate of wild type virions containing hUNG2 was able to partially correct a G-U mismatch (but not a G-T mismatch) in a DNA-DNA primer-template complex, while a lysate of mutant virions that did not incorporate hUNG2 was not (Priet et al. 2003b).

3.3 Vif neutralizes APOBEC3G

3.3.1 Vif binds APOBEC3G and prevents its incorporation into virions

It was initially reported that incorporation of APOBEC3G into virions was independent of vif expression (Sheehy et al. 2002). However, it is now generally accepted that APOBEC3G encapsidation is dramatically reduced in wild type versus ∆vif virions (Kao et al. 2003; Mariani et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003; Li et al. 2004; Liu et al. 2004; Mehle et al. 2004). Conversely, biologically inactive Vif mutants are unable to prevent APOBEC3G incorporation (Kao et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Shindo et al. 2003; Mehle et al. 2004). Co-immunoprecipitation experiments showed that HIV-1 Vif interacts with human APOBEC3G (Fig. 3C) (Mariani et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003; Liu et al. 2004; Mehle et al. 2004). When expressed in human cells, the single Mus musculus APOBEC3 gene confers strong antiviral activity against ∆vif HIV-1 (Mariani et al. 2003). Unexpectedly, mouse APOBEC3 is equally efficient against HIV-1 vectors expressing Vif, and protects cells from productive HIV-1 replication (Mariani et al. 2003).
APOBEC3G from African green monkey (AGM), rhesus macaque, and chimpanzee are all active against ∆vif HIV-1, and AGM and macaque APOBEC3G, but not chimpanzee APOBEC3G, are also active against Vif expressing HIV-1 vectors (Mariani et al. 2003). HIV-1 Vif is unable to prevent incorporation of mouse and AGM APOBEC3G into HIV-1 particles, and it does not interact with mouse APOBEC3G (Mariani et al. 2003). Conversely, the Vif protein from AGM SIV (SIVagm Vif) cannot complement ∆vif HIV-1 in non-permissive cells (Simon et al. 1995), and it does not prevent incorporation of human APOBEC3G into HIV-1 particles (Liu et al. 2004). Several studies showed that a single amino acid difference in APOBEC3G governs the species specificity of Vif (Bogerd et al. 2004; Mangeat et al. 2004; Schrofelbauer et al. 2004; Xu et al. 2004). Replacement of Asp-128 in human APOBEC3G with the Lys-128 present in AGM APOBEC3G renders the enzyme sensitive to SIVagm Vif and resistant to HIV-1 Vif. The reciprocal Lys to Asp change in AGM APOBEC3G renders it sensitive to HIV-1 Vif and resistant to


SIVagm Vif. This phenotype switch correlates with the ability of Vif to bind APOBEC3G and interfere with its incorporation into virions (Bogerd et al. 2004; Mangeat et al. 2004; Schrofelbauer et al. 2004; Xu et al. 2004). These observations demonstrate that Vif neutralizes APOBEC3G by interacting with this enzyme, thus preventing its incorporation into virions and DNA editing (Fig. 3C).

3.3.2 Vif decreases the cytoplasmic APOBEC3G concentration

Analysis of a series of small deletions in Vif impairing its function revealed that most mutants were unable to bind APOBEC3G (Marin et al. 2003). However, two deletions inactivated Vif without affecting binding to APOBEC3G, indicating that the Vif-APOBEC3G interaction is necessary but not sufficient for function (Marin et al. 2003). Vif decreases the steady-state level of APOBEC3G, and the observation of APOBEC3G fragments on immunoblots suggested that Vif induces degradation of APOBEC3G (Mariani et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003). Two pulse-chase studies failed to detect an effect of Vif on APOBEC3G stability (Kao et al. 2003; Mariani et al. 2003). However, most studies showed a marked decrease of the APOBEC3G half-life in the presence of Vif (Fig. 3C) (Conticello et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003; Yu et al. 2003; Liu et al. 2004; Mehle et al. 2004). Indeed, a fraction of APOBEC3G is degraded so rapidly in the presence of Vif (2 min) that it can easily be missed (Marin et al. 2003). Consistent with these studies, Vif causes poly-ubiquitination of APOBEC3G, and the Vif-induced reduction of APOBEC3G level is blocked by proteasome inhibitors (Conticello et al. 2003; Marin et al. 2003; Sheehy et al. 2003; Stopak et al. 2003; Yu et al. 2003; Liu et al. 2004; Mehle et al. 2004). Vif interacts with Cul5, elongins B and C, and Rbx1 to form a Skp1-cullin-F-box (SCF)-like complex (Yu et al. 2003).
These complexes represent one of the largest families of ubiquitin-protein ligases. Vif mutants that bind APOBEC3G but do not interact with Cul5 are functionally inactive (Yu et al. 2003). In addition to inducing APOBEC3G degradation, Vif also seems to reduce its level by impairing its translation (Fig. 3C) (Kao et al. 2003; Stopak et al. 2003). Whether the Vif-induced decrease of intracellular APOBEC3G level can account for the impaired incorporation into virions by a direct concentration effect, or whether the Vif-APOBEC3G interaction actively excludes the latter from the viral particles, is not completely clear (Mariani et al. 2003; Sheehy et al. 2003). By transfecting wild type and ∆vif proviruses together with varying amounts of APOBEC3G-expressing vector, it is possible to compare virions produced from cells with similar APOBEC3G levels. Under these conditions, wild type virions incorporate less APOBEC3G than ∆vif virions, suggesting that both diminished expression and exclusion can contribute to the inhibition of APOBEC3G packaging into wild type HIV-1 virions (Sheehy et al. 2003).
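The discrepancy between the pulse-chase studies discussed in this section is easier to appreciate with a little arithmetic. The sketch below assumes simple first-order decay of two APOBEC3G pools and treats the 2 min figure as a half-life; the pool sizes, the slow half-life, and the sampling times are hypothetical values chosen only for illustration, not data from the cited work.

```python
def fraction_remaining(t_min, fast_fraction, fast_half_life, slow_half_life):
    """Labelled protein left after t_min minutes of chase (two first-order pools)."""
    fast = fast_fraction * 0.5 ** (t_min / fast_half_life)
    slow = (1.0 - fast_fraction) * 0.5 ** (t_min / slow_half_life)
    return fast + slow


# Hypothetical pools: 40% decays with a 2 min half-life, the rest with 120 min.
for t in (0, 2, 10, 30, 60):
    print(t, round(fraction_remaining(t, 0.4, 2.0, 120.0), 3))
```

With these assumed values, only about 3% of the fast pool is left after 10 min, so a chase sampled only at later time points would record little besides the slow component.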


3.3.3 APOBEC3F: another editing enzyme with antiviral activity

Recent studies showed that APOBEC3G is not the only cytosine deaminase with antiretroviral activity. APOBEC3F is not only a structural homolog of APOBEC3G with two deaminase domains, but also appears functionally very similar to APOBEC3G (Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004; Zheng et al. 2004). APOBEC3F strongly inhibits ∆vif HIV-1 replication and it is neutralized by Vif (Bishop et al. 2004; Wiegand et al. 2004; Zheng et al. 2004), even though it could be partially resistant (Liddament et al. 2004). Vif interacts with APOBEC3F, preventing its incorporation in virions (Wiegand et al. 2004; Zheng et al. 2004), and inducing its proteasomal degradation (Zheng et al. 2004). APOBEC3F and APOBEC3G are widely coexpressed in a range of human tissues, including those that are susceptible to HIV-1 infection (Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004). The two enzymes are capable of independently restricting retroviral infectivity when coexpressed (Liddament et al. 2004), even though they form heteromultimers (Wiegand et al. 2004). Interestingly, APOBEC3F and APOBEC3G are not redundant since they have different target sequences (Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004). HIV-1 hypermutation has been previously detected in vivo (Fitzgibbon et al. 1993; Janini et al. 2001; Vartanian et al. 2002), but it was attributed to unbalanced dNTP pools (Vartanian et al. 1994). It now appears that in vivo HIV-1 hypermutation is the result of APOBEC3F and APOBEC3G-induced deamination (Bishop et al. 2004; Liddament et al. 2004). APOBEC3G is active against HIV-1, SIV, EIAV, and MLV (Harris et al. 2003; Mangeat et al. 2003). APOBEC3F appears to have a narrower range of activity, since it is inactive against MLV (Bishop et al. 2004).
Interestingly, APOBEC3G also has an antiviral activity that does not require DNA editing: both wild type APOBEC3G and catalytically inactive mutants inhibit synthesis of hepatitis B virus (HBV) DNA, apparently by preventing packaging of the pre-genomic RNA (Turelli et al. 2004). In addition, APOBEC3G also produces low levels of HBV DNA deamination in some, but not all, hepatic cell lines (Rosler et al. 2004).

3.3.4 Editing, antiviral defences, and viral evolution

The previous sections showed that APOBEC3F and APOBEC3G confer a wide innate antiviral activity by massively deaminating viral DNA. HIV-1 has evolved the Vif protein that targets these editing enzymes to the proteasome and prevents their incorporation into viral particles (Fig. 3C). However, Vif does not completely inhibit deamination, and APOBEC3G induces non-lethal hypermutation in wild type HIV-1 passaged in long-term culture (Zhang et al. 2003). Unlike human T-cell leukaemia virus (HTLV), which does not possess Vif, the genome of HIV-1 is A-rich (Berkhout and van Hemert 1994; Berkhout et al. 2002; Yu et al. 2004). HIV-1 coding sequences contain up to 36 % A, and 60 % of the third positions of its codons are A. The AGA arginine codon, which is not an APOBEC3G target, is strongly favoured compared to CGN arginine codons (Berkhout and van


Hemert 1994; Berkhout et al. 2002; Yu et al. 2004). These observations suggest that the HIV-1 genome has been moulded by an evolutionary pressure from APOBEC3G and APOBEC3F (Zhang et al. 2003; Yu et al. 2004). In addition, the APOBEC3G/F-induced mutations might increase HIV-1 variability, facilitating escape from the immune system and adaptation to a changing environment, including antiretroviral treatment, even though RT-induced mutations might have a major role in the latter case (Berkhout and De Ronde 2004). Indeed, other retroviruses (e.g. HTLV and MLV) solved the problem of DNA deamination during reverse transcription without evolving a specific protein to counteract the action of APOBEC3G/F. Thus, it is likely that the incomplete protection provided by Vif against hypermutation confers a selective advantage on HIV-1 under some circumstances. In addition to HIV-1, G to A hypermutation was previously observed in SIV (Johnson et al. 1991), caprine arthritis-encephalitis virus (CAEV) (Wain-Hobson et al. 1995), spleen necrosis virus (SNV) (Pathak and Temin 1990), and to a lesser extent HTLV (Mansky 2000) and in the pararetrovirus HBV (Gunther et al. 1997). These observations suggest that DNA editing during reverse transcription might be a widespread antiviral defence mechanism. However, it remains to be proven that hypermutation resulted from editing in all these cases (see above for the HBV case). Finally, deamination of adenines in RNA induces hypermutation of avian ALV and RAV-1 retroviruses, as well as of measles virus, parainfluenza virus 3, respiratory syncytial virus, and polyoma virus (Bass 2002). In most cases, the exact biological consequences of hyper-editing are unknown, but hypermutation likely contributes significantly to viral diversity.
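The composition figures quoted in this section (overall A content and the share of codons ending in A) are simple to compute for any coding sequence. The sketch below is a minimal illustration with hypothetical function names and a made-up toy sequence; it does not reproduce the published HIV-1 values.

```python
def a_content(seq):
    """Fraction of A residues in a nucleotide sequence."""
    seq = seq.upper()
    return seq.count("A") / len(seq)


def third_position_a(cds):
    """Fraction of complete codons whose third position is A."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return sum(codon[2] == "A" for codon in codons) / len(codons)


toy_cds = "ATGAGAAAAGGATAA"   # ATG AGA AAA GGA TAA (made-up example)
print(a_content(toy_cds))          # 0.6
print(third_position_a(toy_cds))   # 0.8
```

Applied to an aligned set of coding sequences, the same two functions would allow the A bias of different retroviral genomes to be compared on an equal footing.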

4 Conclusions

Both tRNA modifications and DNA editing play important roles in HIV-1 reverse transcription. Post-transcriptional modifications of tRNA3Lys, especially in the anticodon loop, are required for efficient initiation of reverse transcription of a subset of HIV-1 isolates. In addition, all HIV-1 isolates require methylation of A58 for productive second strand transfer. This suggests that the corresponding methyltransferase could be an interesting novel target for blocking HIV-1 replication, provided that the inhibition of its activity is not toxic. In Saccharomyces cerevisiae, the corresponding genes are essential (see Chapter by Johansson and Byström), but this has not been determined in mammals. Another possible strategy for limiting HIV-1 replication would be to identify tRNA ligands that further strengthen the stability of the tertiary core, thereby preventing its complete annealing to the viral RNA. Massive cytosine deamination by cellular enzymes constitutes a broad and powerful antiviral defence, against which HIV-1 developed Vif. This viral protein, therefore, constitutes an attractive new target for the development of anti-HIV-1 drugs. Drugs that would prevent the interactions between Vif and either APOBEC3F/G


or Cul5 should allow our APOBEC-mediated innate antiviral defence to win its war against HIV-1. The quest for such drugs will be the goal of a number of scientists in the next few years. Another question awaiting an answer is the normal cellular function(s) of APOBEC3G/F. Indeed, the APOBEC3G gene has been subject to strong positive selection throughout the history of primate evolution, and this selection appears more ancient than modern lentiviruses (Sawyer et al. 2004). In addition, APOBEC3G does not affect transposition of the human LINE-1 retrotransposon, which is considered a remnant of ancient retroviruses (Turelli et al. 2004b). Thus, APOBEC3G likely has a cellular function in addition to its antiviral role. Identifying the APOBEC3G cellular target(s) would be a crucial step towards its elucidation.

Acknowledgements

R Marquet thanks J-C Paillart for careful reading of this manuscript, and B Ehresmann and C Ehresmann for their constant interest. This work was supported by grants from the Agence Nationale de Recherches sur le SIDA (ANRS) to R Marquet and by grants from SIDACTION to F Dardel.

References

Agris PF (1996) The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function. Prog Nucleic Acid Res Mol Biol 53:79-129
Agris PF, Guenther R, Ingram PC, Basti MM, Stuart JW, Sochacka E, Malkiewicz A (1997) Unconventional structure of tRNA(Lys)SUU anticodon explains tRNA's role in bacterial and mammalian ribosomal frameshifting and primer selection by HIV-1. RNA 3:420-428
Alce TM, Popik W (2004) APOBEC3G is incorporated into virus-like particles by a direct interaction with HIV-1 Gag nucleocapsid protein. J Biol Chem 279:34083-34086
Alexander L, Aquino-DeJesus MJ, Chan M, Andiman WA (2002) Inhibition of human immunodeficiency virus type 1 (HIV-1) replication by a two-amino-acid insertion in HIV-1 Vif from a nonprogressing mother and child. J Virol 76:10533-10539
Arion D, Harada R, Li X, Wainberg MA, Parniak MA (1996) HIV-1 reverse transcriptase shows no specificity for the binding of primer tRNA(Lys3). Biochem Biophys Res Commun 225:839-843
Auxilien S, Keith G, Le Grice SF, Darlix JL (1999) Role of post-transcriptional modifications of primer tRNALys,3 in the fidelity and efficacy of plus strand DNA transfer during HIV-1 reverse transcription. J Biol Chem 274:4412-4420
Bajji AC, Sundaram M, Myszka DG, Davis DR (2002) An RNA complex of the HIV-1 A-loop and tRNA(Lys,3) is stabilized by nucleoside modifications. J Am Chem Soc 124:14302-14303

Barat C, Lullien V, Schatz O, Keith G, Nugeyre MT, Gruninger-Leitch F, Barre-Sinoussi F, LeGrice SF, Darlix JL (1989) HIV-1 reverse transcriptase specifically interacts with the anticodon domain of its cognate primer tRNA. EMBO J 8:3279-3285
Bass BL (2002) RNA editing by adenosine deaminases that act on RNA. Annu Rev Biochem 71:817-846
Beale RC, Petersen-Mahrt SK, Watt IN, Harris RS, Rada C, Neuberger MS (2004) Comparison of the differential context-dependence of DNA deamination by APOBEC enzymes: correlation with mutation spectra in vivo. J Mol Biol 337:585-596
Beerens N, Berkhout B (2000) In vitro studies on tRNA annealing and reverse transcription with mutant HIV-1 RNA templates. J Biol Chem 275:15474-15481
Beerens N, Groot F, Berkhout B (2000a) Stabilization of the U5-leader stem in the HIV-1 RNA genome affects initiation and elongation of reverse transcription. Nucleic Acids Res 28:4130-4137
Beerens N, Groot F, Berkhout B (2001) Initiation of HIV-1 reverse transcription is regulated by a primer activation signal. J Biol Chem 276:31247-31256
Beerens N, Klaver B, Berkhout B (2000b) A structured RNA motif is involved in correct placement of the tRNA(3)(Lys) primer onto the human immunodeficiency virus genome. J Virol 74:2227-2238
Ben-Artzi H, Shemesh J, Zeelon E, Amit B, Kleiman L, Gorecki M, Panet A (1996) Molecular analysis of the second template switch during reverse transcription of the HIV RNA template. Biochemistry 35:10549-10557
Benas P, Bec G, Keith G, Marquet R, Ehresmann C, Ehresmann B, Dumas P (2000) The crystal structure of HIV reverse-transcription primer tRNA(Lys,3) shows a canonical anticodon loop. RNA 6:1347-1355
Berkhout B, De Ronde A (2004) APOBEC3G versus reverse transcriptase in the generation of HIV-1 drug-resistance mutations. AIDS 18:1861-1863
Berkhout B, Grigoriev A, Bakker M, Lukashov VV (2002) Codon and amino acid usage in retroviral genomes is consistent with virus-specific nucleotide pressure. AIDS Res Hum Retroviruses 18:133-141
Berkhout B, van Hemert FJ (1994) The unusual nucleotide content of the HIV RNA genome results in a biased amino acid composition of HIV proteins. Nucleic Acids Res 22:1705-1711
Bishop KN, Holmes RK, Sheehy AM, Davidson NO, Cho SJ, Malim MH (2004) Cytidine deamination of retroviral DNA by diverse APOBEC proteins. Curr Biol 14:1392-1396
Bogerd HP, Doehle BP, Wiegand HL, Cullen BR (2004) A single amino acid difference in the host APOBEC3G protein controls the primate species specificity of HIV type 1 virion infectivity factor. Proc Natl Acad Sci USA 101:3770-3774
Brule F, Marquet R, Rong L, Wainberg MA, Roques BP, Le Grice SF, Ehresmann B, Ehresmann C (2002) Structural and functional properties of the HIV-1 RNA-tRNA(Lys)3 primer complex annealed by the nucleocapsid protein: comparison with the heat-annealed complex. RNA 8:8-15
Burnett BP, McHenry CS (1997) Posttranscriptional modification of retroviral primers is required for late stages of DNA replication. Proc Natl Acad Sci USA 94:7210-7215
Cen S, Guo F, Niu M, Saadatmand J, Deflassieux J, Kleiman L (2004a) The interaction between HIV-1 Gag and APOBEC3G. J Biol Chem 279:33177-33184
Cen S, Javanbakht H, Niu M, Kleiman L (2004b) Ability of wild-type and mutant lysyl-tRNA synthetase to facilitate tRNA(Lys) incorporation into human immunodeficiency virus type 1. J Virol 78:1595-1601

Conticello SG, Harris RS, Neuberger MS (2003) The Vif protein of HIV triggers degradation of the human antiretroviral DNA deaminase APOBEC3G. Curr Biol 13:2009-2013
Das AT, Klaver B, Berkhout B (1995) Reduced replication of human immunodeficiency virus type 1 mutants that use reverse transcription primers other than the natural tRNA(3Lys). J Virol 69:3090-3097
Davanloo P, Sprinzl M, Watanabe K, Albani M, Kersten H (1979) Role of ribothymidine in the thermal stability of transfer RNA as monitored by proton magnetic resonance. Nucleic Acids Res 6:1571-1581
De Rocquigny H, Gabus C, Vincent A, Fournie-Zaluski MC, Roques B, Darlix JL (1992) Viral RNA annealing activities of human immunodeficiency virus type 1 nucleocapsid protein require only peptide domains outside the zinc fingers. Proc Natl Acad Sci USA 89:6472-6476
Derrick WB, Horowitz J (1993) Probing structural differences between native and in vitro transcribed Escherichia coli valine transfer RNA: evidence for stable base modification-dependent conformers. Nucleic Acids Res 21:4948-4953
Desrosiers RC, Lifson JD, Gibbs JS, Czajak SC, Howe AY, Arthur LO, Johnson RP (1998) Identification of highly attenuated mutants of simian immunodeficiency virus. J Virol 72:1431-1437
Dettenhofer M, Cen S, Carlson BA, Kleiman L, Yu XF (2000) Association of human immunodeficiency virus type 1 Vif with RNA and its role in reverse transcription. J Virol 74:8938-8945
Dornadula G, Yang S, Pomerantz RJ, Zhang H (2000) Partial rescue of the Vif-negative phenotype of mutant human immunodeficiency virus type 1 strains from nonpermissive cells by intravirion reverse transcription. J Virol 74:2594-2602
Douaisi M, Dussart S, Courcoul M, Bessou G, Vigne R, Decroly E (2004) HIV-1 and MLV Gag proteins are sufficient to recruit APOBEC3G into virus-like particles. Biochem Biophys Res Commun 321:566-573
Droogmans L, Roovers M, Bujnicki JM, Tricot C, Hartsh T, Stalon V, Grosjean H (2003) Cloning and characterization of tRNA (m1A58) methyltransferase (TrmI) from Thermus thermophilus HB27, a protein required for cell growth at extreme temperatures. Nucleic Acids Res 31:2148-2156
Fitzgibbon JE, Mazar S, Dubin DT (1993) A new type of G-->A hypermutation affecting human immunodeficiency virus. AIDS Res Hum Retroviruses 9:833-838
Fortini P, Pascucci B, Parlanti E, D'Errico M, Simonelli V, Dogliotti E (2003) The base excision repair: mechanisms and its relevance for cancer susceptibility. Biochimie 85:1053-1071
Francin M, Kaminska M, Kerjan P, Mirande M (2002) The N-terminal domain of mammalian Lysyl-tRNA synthetase is a functional tRNA-binding domain. J Biol Chem 277:1762-1769
Francin M, Mirande M (2003) Functional dissection of the eukaryotic-specific tRNA-interacting factor of lysyl-tRNA synthetase. J Biol Chem 278:1472-1479
Gabuzda DH, Lawrence K, Langhoff E, Terwilliger E, Dorfman T, Haseltine WA, Sodroski J (1992) Role of vif in replication of human immunodeficiency virus type 1 in CD4+ T lymphocytes. J Virol 66:6489-6495
Gaddis NC, Chertova E, Sheehy AM, Henderson LE, Malim MH (2003) Comprehensive investigation of the molecular defect in vif-deficient human immunodeficiency virus type 1 virions. J Virol 77:5810-5820

Gilboa E, Mitra SW, Goff S, Baltimore D (1979) A detailed model of reverse transcription and tests of crucial aspects. Cell 18:93-100
Goldschmidt V, Ehresmann C, Ehresmann B, Marquet R (2003) Does the HIV-1 primer activation signal interact with tRNA3(Lys) during the initiation of reverse transcription? Nucleic Acids Res 31:850-859
Goldschmidt V, Paillart JC, Rigourd M, Ehresmann B, Aubertin AM, Ehresmann C, Marquet R (2004) Structural variability of the initiation complex of HIV-1 reverse transcription. J Biol Chem 279:35923-35931
Grosjean H, Houssier C, Romby P, Marquet R (1998) Modulatory role of modified nucleotides in RNA loop-loop interactions. In: Benne R (ed) Modification and Editing of RNA. ASM Press, Washington, D.C., pp 517-533
Gunther S, Sommer G, Plikat U, Iwanska A, Wain-Hobson S, Will H, Meyerhans A (1997) Naturally occurring hepatitis B virus genomes bearing the hallmarks of retroviral G->A hypermutation. Virology 235:104-108
Guo F, Cen S, Niu M, Javanbakht H, Kleiman L (2003) Specific inhibition of the synthesis of human lysyl-tRNA synthetase results in decreases in tRNA(Lys) incorporation, tRNA(3)(Lys) annealing to viral RNA, and viral infectivity in human immunodeficiency virus type 1. J Virol 77:9817-9822
Halwani R, Cen S, Javanbakht H, Saadatmand J, Kim S, Shiba K, Kleiman L (2004) Cellular distribution of Lysyl-tRNA synthetase and its interaction with Gag during human immunodeficiency virus type 1 assembly. J Virol 78:7553-7564
Hargittai MR, Gorelick RJ, Rouzina I, Musier-Forsyth K (2004) Mechanistic insights into the kinetics of HIV-1 nucleocapsid protein-facilitated tRNA annealing to the primer binding site. J Mol Biol 337:951-968
Hargittai MR, Mangla AT, Gorelick RJ, Musier-Forsyth K (2001) HIV-1 nucleocapsid protein zinc finger structures induce tRNA(Lys,3) structural changes but are not critical for primer/template annealing. J Mol Biol 312:985-997
Harmache A, Russo P, Guiguen F, Vitu C, Vignoni M, Bouyac M, Hieblot C, Pepin M, Vigne R, Suzan M (1996) Requirement of caprine arthritis encephalitis virus vif gene for in vivo replication. Virology 224:246-255
Harris RS, Bishop KN, Sheehy AM, Craig HM, Petersen-Mahrt SK, Watt IN, Neuberger MS, Malim MH (2003) DNA deamination mediates innate immunity to retroviral infection. Cell 113:803-809
Houssier C, Degee P, Nicoghosian K, Grosjean H (1988) Effect of uridine dethiolation in the anticodon triplet of tRNA(Glu) on its association with tRNA(Phe). J Biomol Struct Dyn 5:1259-1266
Huang Y, Shalom A, Li Z, Wang J, Mak J, Wainberg MA, Kleiman L (1996) Effects of modifying the tRNA(3Lys) anticodon on the initiation of human immunodeficiency virus type 1 reverse transcription. J Virol 70:4700-4706
Huang Y, Wang J, Shalom A, Li Z, Khorchid A, Wainberg MA, Kleiman L (1997) Primer tRNA3Lys on the viral genome exists in unextended and two-base extended forms within mature human immunodeficiency virus type 1. J Virol 71:726-728
Inoshima Y, Miyazawa T, Mikami T (1998) In vivo functions of the auxiliary genes and regulatory elements of feline immunodeficiency virus. Vet Microbiol 60:141-153
Isel C, Ehresmann C, Keith G, Ehresmann B, Marquet R (1995) Initiation of reverse transcription of HIV-1: secondary structure of the HIV-1 RNA/tRNA(3Lys) (template/primer). J Mol Biol 247:236-250

Isel C, Keith G, Ehresmann B, Ehresmann C, Marquet R (1998) Mutational analysis of the tRNA3Lys/HIV-1 RNA (primer/template) complex. Nucleic Acids Res 26:1198-1204
Isel C, Lanchy JM, Le Grice SF, Ehresmann C, Ehresmann B, Marquet R (1996) Specific initiation and switch to elongation of human immunodeficiency virus type 1 reverse transcription require the post-transcriptional modifications of primer tRNA3Lys. EMBO J 15:917-924
Isel C, Marquet R, Keith G, Ehresmann C, Ehresmann B (1993) Modified nucleotides of tRNA(3Lys) modulate primer/template loop-loop interaction in the initiation complex of HIV-1 reverse transcription. J Biol Chem 268:25269-25272
Isel C, Westhof E, Massire C, Le Grice SF, Ehresmann B, Ehresmann C, Marquet R (1999) Structural basis for the specificity of the initiation of HIV-1 reverse transcription. EMBO J 18:1038-1048
Janini M, Rogers M, Birx DR, McCutchan FE (2001) Human immunodeficiency virus type 1 DNA sequences genetically damaged by hypermutation are often abundant in patient peripheral blood mononuclear cells and may be generated during near-simultaneous infection and activation of CD4(+) T cells. J Virol 75:7973-7986
Jarmuz A, Chester A, Bayliss J, Gisbourne J, Dunham I, Scott J, Navaratnam N (2002) An anthropoid-specific locus of orphan C to U RNA-editing enzymes on chromosome 22. Genomics 79:285-296
Javanbakht H, Halwani R, Cen S, Saadatmand J, Musier-Forsyth K, Gottlinger H, Kleiman L (2003) The interaction between HIV-1 Gag and human lysyl-tRNA synthetase during viral assembly. J Biol Chem 278:27644-27651
Johnson PR, Hamm TE, Goldstein S, Kitov S, Hirsch VM (1991) The genetic fate of molecularly cloned simian immunodeficiency virus in experimentally infected macaques. Virology 185:217-228
Kang SM, Morrow CD (1999) Genetic analysis of a unique human immunodeficiency virus type 1 (HIV-1) with a primer binding site complementary to tRNAMet supports a role for U5-PBS stem-loop RNA structures in initiation of HIV-1 reverse transcription. J Virol 73:1818-1827
Kang SM, Wakefield JK, Morrow CD (1996) Mutations in both the U5 region and the primer-binding site influence the selection of the tRNA used for the initiation of HIV-1 reverse transcription. Virology 222:401-414
Kang SM, Zhang Z, Morrow CD (1999) Identification of a human immunodeficiency virus type 1 that stably uses tRNALys1,2 rather than tRNALys,3 for initiation of reverse transcription. Virology 257:95-105
Kao S, Khan MA, Miyagi E, Plishka R, Buckler-White A, Strebel K (2003) The human immunodeficiency virus type 1 Vif protein reduces intracellular expression and inhibits packaging of APOBEC3G (CEM15), a cellular inhibitor of virus infectivity. J Virol 77:11398-11407
Kintanar A, Yue D, Horowitz J (1994) Effect of nucleoside modifications on the structure and thermal stability of Escherichia coli valine tRNA. Biochimie 76:1192-1204
Klarmann GJ, Chen X, North TW, Preston BD (2003) Incorporation of uracil into minus strand DNA affects the specificity of plus strand synthesis initiation during lentiviral reverse transcription. J Biol Chem 278:7902-7909
Kleiman L, Halwani R, Javanbakht H (2004) The selective packaging and annealing of primer tRNALys3 in HIV-1. Curr HIV Res 2:163-175

Transfer RNA modifications and DNA editing in HIV-1 reverse transcription 25 Lanchy JM, Ehresmann C, Le Grice SF, Ehresmann B, Marquet R (1996) Binding and kinetic properties of HIV-1 reverse transcriptase markedly differ during initiation and elongation of reverse transcription. EMBO J 15:7178-7187 Le Grice SF (2003) "In the beginning": initiation of minus strand DNA synthesis in retroviruses and LTR-containing retrotransposons. Biochemistry 42:14349-14355 Lecossier D, Bouchonnet F, Clavel F, Hance AJ (2003) Hypermutation of HIV-1 DNA in the absence of the Vif protein. Science 300:1112 Li J, Potash MJ, Volsky DJ (2004) Functional domains of APOBEC3G required for antiviral activity. J Cell Biochem 92:560-572 Li X, Mak J, Arts EJ, Gu Z, Kleiman L, Wainberg MA, Parniak MA (1994) Effects of alterations of primer-binding site sequences on human immunodeficiency virus type 1 replication. J Virol 68:6198-6206 Liang C, Li X, Rong L, Inouye P, Quan Y, Kleiman L, Wainberg MA (1997) The importance of the A-rich loop in human immunodeficiency virus type 1 reverse transcription and infectivity. J Virol 71:5750-5757 Liddament MT, Brown WL, Schumacher AJ, Harris RS (2004) APOBEC3F properties and hypermutation preferences indicate activity against HIV-1 in vivo. Curr Biol 14:13851391 Liu B, Yu X, Luo K, Yu Y, Yu XF (2004) Influence of primate lentiviral Vif and proteasome inhibitors on human immunodeficiency virus type 1 virion packaging of APOBEC3G. J Virol 78:2072-2081 Madani N, Kabat D (1998) An endogenous inhibitor of human immunodeficiency virus in human lymphocytes is overcome by the viral Vif protein. J Virol 72:10251-10255 Maglott EJ, Deo SS, Przykorska A, Glick GD (1998) Conformational transitions of an unmodified tRNA: implications for RNA folding. 
Biochemistry 37:16349-16359 Mak J, Jiang M, Wainberg MA, Hammarskjold ML, Rekosh D, Kleiman L (1994) Role of Pr160gag-pol in mediating the selective incorporation of tRNA(Lys) into human immunodeficiency virus type 1 particles. J Virol 68:2065-2072 Mak J, Khorchid A, Cao Q, Huang Y, Lowy I, Parniak MA, Prasad VR, Wainberg MA, Kleiman L (1997) Effects of mutations in Pr160gag-pol upon tRNA(Lys3) and Pr160gag-plo incorporation into HIV-1. J Mol Biol 265:419-431 Mangeat B, Turelli P, Caron G, Friedli M, Perrin L, Trono D (2003) Broad antiretroviral defence by human APOBEC3G through lethal editing of nascent reverse transcripts. Nature 424:99-103 Mangeat B, Turelli P, Liao S, Trono D (2004) A single amino acid determinant governs the species-specific sensitivity of APOBEC3G to Vif action. J Biol Chem 279:1448114483 Mansky LM (2000) In vivo analysis of human T-cell leukemia virus type 1 reverse transcription accuracy. J Virol 74:9525-9531 Mariani R, Chen D, Schrofelbauer B, Navarro F, Konig R, Bollman B, Munk C, NymarkMcMahon H, Landau NR (2003) Species-specific exclusion of APOBEC3G from HIV-1 virions by Vif. Cell 114:21-31 Marin M, Rose KM, Kozak SL, Kabat D (2003) HIV-1 Vif protein binds the editing enzyme APOBEC3G and induces its degradation. Nat Med 9:1398-1403 Marquet R (1998) Importance of Modified Nucleotides in Replication of Retroviruses, Plant Pararetroviruses, and Retrotransposons. In: Benne R (ed) Modification and Editing of RNA. ASM Press, Washington, D.C., pp 517-533

26 Roland Marquet and Frédéric Dardel Marquet R, Isel C, Ehresmann C, Ehresmann B (1995) tRNAs as primer of reverse transcriptases. Biochimie 77:113-124 Mehle A, Strack B, Ancuta P, Zhang C, McPike M, Gabuzda D (2004) Vif overcomes the innate antiviral activity of APOBEC3G by promoting its degradation in the ubiquitinproteasome pathway. J Biol Chem 279:7792-7798 Miller JT, Khvorova A, Scaringe SA, Le Grice SF (2004) Synthetic tRNALys,3 as the replication primer for the HIV-1HXB2 and HIV-1Mal genomes. Nucleic Acids Res 32:4687-4695 Paillart JC, Shehu-Xhilaga M, Marquet R, Mak J (2004) Dimerization of retroviral RNA genomes: an inseparable pair. Nat Rev Microbiol 2:461-472 Pathak VK, Temin HM (1990) Broad spectrum of in vivo forward mutations, hypermutations, and mutational hotspots in a retroviral shuttle vector after a single replication cycle: substitutions, frameshifts, and hypermutations. Proc Natl Acad Sci USA 87:60196023 Perret V, Garcia A, Puglisi J, Grosjean H, Ebel JP, Florentz C, Giege R (1990) Conformation in solution of yeast tRNA(Asp) transcripts deprived of modified nucleotides. Biochimie 72:735-743 Priet S, Navarro JM, Gros N, Querat G, Sire J (2003a) Differential incorporation of uracil DNA glycosylase UNG2 into HIV-1, HIV-2, and SIV(MAC) viral particles. Virology 307:283-289 Priet S, Navarro JM, Gros N, Querat G, Sire J (2003b) Functional role of HIV-1 virionassociated uracil DNA glycosylase 2 in the correction of G:U mispairs to G:C pairs. J Biol Chem 278:4566-4571 Raba M, Limburg K, Burghagen M, Katze JR, Simsek M, Heckman JE, Rajbhandary UL, Gross HJ (1979) Nucleotide sequence of three isoaccepting lysine tRNAs from rabbit liver and SV40-transformed mouse fibroblasts. Eur J Biochem 97:305-318 Rein A, Henderson LE, Levin JG (1998) Nucleic-acid-chaperone activity of retroviral nucleocapsid proteins: significance for viral replication. 
Trends Biochem Sci 23:297-301 Renda MJ, Rosenblatt JD, Klimatcheva E, Demeter LM, Bambara RA, Planelles V (2001) Mutation of the methylated tRNA(Lys)(3) residue A58 disrupts reverse transcription and inhibits replication of human immunodeficiency virus type 1. J Virol 75:96719678 Rigourd M, Goldschmidt V, Brule F, Morrow CD, Ehresmann B, Ehresmann C, Marquet R (2003) Structure-function relationships of the initiation complex of HIV-1 reverse transcription: the case of mutant viruses using tRNA(His) as primer. Nucleic Acids Res 31:5764-5775 Rosler C, Kock J, Malim MH, Blum HE, von Weizsacker F (2004) Comment on "Inhibition of hepatitis B virus replication by APOBEC3G". Science 305:1403; author reply 1403 Sallafranque-Andreola ML, Robert D, Barr PJ, Fournier M, Litvak S, Sarih-Cottin L, Tarrago-Litvak L (1989) Human immunodeficiency virus reverse transcriptase expressed in transformed yeast cells. Biochemical properties and interactions with bovine tRNALys. Eur J Biochem 184:367-374 Sampson JR, Uhlenbeck OC (1988) Biochemical and physical characterization of an unmodified yeast phenylalanine transfer RNA transcribed in vitro. Proc Natl Acad Sci USA 85:1033-1037 Sawyer SL, Emerman M, Malik HS (2004) Ancient adaptive evolution of the primate antiviral DNA-editing enzyme APOBEC3G. PLoS Biol 2:E275

Transfer RNA modifications and DNA editing in HIV-1 reverse transcription 27 Schrofelbauer B, Chen D, Landau NR (2004) A single amino acid of APOBEC3G controls its species-specific interaction with virion infectivity factor (Vif). Proc Natl Acad Sci USA 101:3927-3932 Sheehy AM, Gaddis NC, Choi JD, Malim MH (2002) Isolation of a human gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 418:646-650 Sheehy AM, Gaddis NC, Malim MH (2003) The antiretroviral enzyme APOBEC3G is degraded by the proteasome in response to HIV-1 Vif. Nat Med 9:1404-1407 Shindo K, Takaori-Kondo A, Kobayashi M, Abudu A, Fukunaga K, Uchiyama T (2003) The enzymatic activity of CEM15/Apobec-3G is essential for the regulation of the infectivity of HIV-1 virion but not a sole determinant of its antiviral activity. J Biol Chem 278:44412-44416 Simon JH, Gaddis NC, Fouchier RA, Malim MH (1998a) Evidence for a newly discovered cellular anti-HIV-1 phenotype. Nat Med 4:1397-1400 Simon JH, Miller DL, Fouchier RA, Soares MA, Peden KW, Malim MH (1998b) The regulation of primate immunodeficiency virus infectivity by Vif is cell species restricted: a role for Vif in determining virus host range and cross-species transmission. EMBO J 17:1259-1267 Simon JH, Southerling TE, Peterson JC, Meyer BE, Malim MH (1995) Complementation of vif-defective human immunodeficiency virus type 1 by primate, but not nonprimate, lentivirus vif genes. J Virol 69:4166-4172 Stopak K, de Noronha C, Yonemoto W, Greene WC (2003) HIV-1 Vif blocks the antiviral activity of APOBEC3G by impairing both its translation and intracellular stability. Mol Cell 12:591-601 Strebel K, Daugherty D, Clouse K, Cohen D, Folks T, Martin MA (1987) The HIV 'A' (sor) gene product is essential for virus infectivity. Nature 328:728-730 Sundaram M, Durant PC, Davis DR (2000) Hypermodified nucleosides in the anticodon of tRNALys stabilize a canonical U-turn structure. 
Biochemistry 39:12575-12584 Suspene R, Sommer P, Henry M, Ferris S, Guetard D, Pochet S, Chester A, Navaratnam N, Wain-Hobson S, Vartanian JP (2004) APOBEC3G is a single-stranded DNA cytidine deaminase and functions independently of HIV reverse transcriptase. Nucleic Acids Res 32:2421-2429 Svarovskaia ES, Xu H, Mbisa JL, Barr R, Gorelick RJ, Ono A, Freed EO, Hu WS, Pathak VK (2004) Human apolipoprotein B mRNA-editing enzyme-catalytic polypeptide-like 3G (APOBEC3G) is incorporated into HIV-1 virions through interactions with viral and nonviral RNAs. J Biol Chem 279:35822-35828 Tisne C, Rigourd M, Marquet R, Ehresmann C, Dardel F (2000) NMR and biochemical characterization of recombinant human tRNA(Lys)3 expressed in Escherichia coli: identification of posttranscriptional nucleotide modifications required for efficient initiation of HIV-1 reverse transcription. RNA 6:1403-1412 Tisne C, Roques BP, Dardel F (2004) The annealing mechanism of HIV-1 reverse transcription primer onto the viral genome. J Biol Chem 279:3588-3595 Turelli P, Mangeat B, Jost S, Vianin S, Trono D (2004a) Inhibition of hepatitis B virus replication by APOBEC3G. Science 303:1829 Turelli P, Vianin S, Trono D (2004b) The innate antiretroviral factor APOBEC3G does not affect human LINE-1 retrotransposition in a cell culture assay. J Biol Chem Vartanian JP, Henry M, Wain-Hobson S (2002) Sustained G-->A hypermutation during reverse transcription of an entire human immunodeficiency virus type 1 strain Vau group O genome. J Gen Virol 83:801-805

28 Roland Marquet and Frédéric Dardel Vartanian JP, Meyerhans A, Sala M, Wain-Hobson S (1994) G-->A hypermutation of the human immunodeficiency virus type 1 genome: evidence for dCTP pool imbalance during reverse transcription. Proc Natl Acad Sci USA 91:3092-3096 von Schwedler U, Song J, Aiken C, Trono D (1993) Vif is crucial for human immunodeficiency virus type 1 proviral DNA synthesis in infected cells. J Virol 67:4945-4955 Wain-Hobson S, Sonigo P, Guyader M, Gazit A, Henry M (1995) Erratic G-->A hypermutation within a complete caprine arthritis-encephalitis virus (CAEV) provirus. Virology 209:297-303 Wakefield JK, Kang SM, Morrow CD (1996) Construction of a type 1 human immunodeficiency virus that maintains a primer binding site complementary to tRNA(His). J Virol 70:966-975 Wakefield JK, Wolf AG, Morrow CD (1995) Human immunodeficiency virus type 1 can use different tRNAs as primers for reverse transcription but selectively maintains a primer binding site complementary to tRNA(3Lys). J Virol 69:6021-6029 Wei X, Gotte M, Wainberg MA (2000) Human immunodeficiency virus type-1 reverse transcription can be inhibited in vitro by oligonucleotides that target both natural and synthetic tRNA primers. Nucleic Acids Res 28:3065-3074 Weiss S, Konig B, Muller HJ, Seidel H, Goody RS (1992) Synthetic human tRNA(UUULys3) and natural bovine tRNA(UUULys3) interact with HIV-1 reverse transcriptase and serve as specific primers for retroviral cDNA synthesis. Gene 111:183-197 Weissenbach J, Grosjean H (1981) Effect of threonylcarbamoyl modification (t6A) in yeast tRNA Arg III on codon-anticodon and anticodon-anticodon interactions. A thermodynamic and kinetic evaluation. Eur J Biochem 116:207-213 Wiegand HL, Doehle BP, Bogerd HP, Cullen BR (2004) A second human antiretroviral factor, APOBEC3F, is suppressed by the HIV-1 and HIV-2 Vif proteins. 
EMBO J 23:2451-2458 Willetts KE, Rey F, Agostini I, Navarro JM, Baudat Y, Vigne R, Sire J (1999) DNA repair enzyme uracil DNA glycosylase is specifically incorporated into human immunodeficiency virus type 1 viral particles through a Vpr-independent mechanism. J Virol 73:1682-1688 Wu T, Guo J, Bess J, Henderson LE, Levin JG (1999) Molecular requirements for human immunodeficiency virus type 1 plus-strand transfer: analysis in reconstituted and endogenous reverse transcription systems. J Virol 73:4794-4805 Xu H, Svarovskaia ES, Barr R, Zhang Y, Khan MA, Strebel K, Pathak VK (2004) A single amino acid substitution in human APOBEC3G antiretroviral enzyme confers resistance to HIV-1 virion infectivity factor-induced depletion. Proc Natl Acad Sci USA 101:5652-5657 Yu Q, Konig R, Pillai S, Chiles K, Kearney M, Palmer S, Richman D, Coffin JM, Landau NR (2004) Single-strand specificity of APOBEC3G accounts for minus-strand deamination of the HIV genome. Nat Struct Mol Biol 11:435-442 Yu X, Yu Y, Liu B, Luo K, Kong W, Mao P, Yu XF (2003) Induction of APOBEC3G ubiquitination and degradation by an HIV-1 Vif-Cul5-SCF complex. Science 302:1056-1060 Zhang H, Yang B, Pomerantz RJ, Zhang C, Arunachalam SC, Gao L (2003) The cytidine deaminase CEM15 induces hypermutation in newly synthesized HIV-1 DNA. Nature 424:94-98

Transfer RNA modifications and DNA editing in HIV-1 reverse transcription 29 Zhang Z, Kang SM, LeBlanc A, Hajduk SL, Morrow CD (1996) Nucleotide sequences within the U5 region of the viral RNA genome are the major determinants for an human immunodeficiency virus type 1 to maintain a primer binding site complementary to tRNA(His). Virology 226:306-317 Zheng YH, Irwin D, Kurosu T, Tokunaga K, Sata T, Peterlin BM (2004) Human APOBEC3F is another host factor that blocks human immunodeficiency virus type 1 replication. J Virol 78:6073-6076

Dardel, Frédéric
Laboratoire de Cristallographie et Résonance Magnétique Nucléaire Biologiques, UMR 8015 CNRS, Faculté de Pharmacie, Université René Descartes-Paris 5, 4 avenue de l'Observatoire, 75006 Paris, France

Marquet, Roland
Unité Propre de Recherche 9002 du CNRS conventionnée à l'Université Louis Pasteur, IBMC, 15 rue René Descartes, 67084 Strasbourg cedex, France, [email protected]

Um34 in selenocysteine tRNA is required for the expression of stress-related selenoproteins in mammals

Bradley A. Carlson, Xue-Ming Xu, Vadim N. Gladyshev, and Dolph L. Hatfield

Abstract Selenium is an essential micronutrient in the diet of mammals and has many health benefits. Selenium-containing proteins are responsible for most, if not all, of these benefits. This element is incorporated into protein as selenocysteine (Sec), the 21st amino acid in the genetic code. There are two species of Sec tRNA in mammalian cells that differ by a single methyl group on the 2’-O position of the ribosyl moiety at position 34 (Um34). The relationship between this modification and selenoprotein synthesis was examined in mice in which the wild type Sec tRNA gene was replaced with a mutant Sec tRNA transgene incapable of forming Um34. This mouse line did not express several stress-related selenoproteins, whereas the levels of several selenoproteins thought to serve housekeeping functions were normal. This novel form of protein regulation occurred at the translational level. The Um34 modification in Sec tRNA, therefore, plays a crucial role in regulating the expression of a subset of mammalian selenoproteins and is a requisite for the synthesis of several stress-related selenoproteins.

1 Introduction

Selenium is a vital component in the diet of humans and other mammals and has numerous health benefits. It has been reported to decrease the incidence of certain forms of cancer and to alleviate heart disease and other cardiovascular and muscle anomalies (Hatfield 2001). Furthermore, selenium has been observed to inhibit viral expression, delay the aging process, and slow the progression of AIDS in HIV-positive patients, and it has roles in mammalian development, male reproduction and immune function (Hatfield 2001). The underlying mechanisms by which selenium promotes these benefits are just beginning to be understood. The available evidence strongly indicates that selenoproteins are the responsible agents (Diwadkar-Navsariwala and Diamond 2004). There are 25 selenoprotein genes in the human genome and 24 in the genomes of rodents (Kryukov et al. 2003). The functions of fewer than half of the selenoprotein gene products have been characterized.

Topics in Current Genetics, Vol. 12 H. Grosjean (Ed.): Fine-Tuning of RNA Functions by Modification and Editing DOI 10.1007/b106652 / Published online: 20 January 2005 © Springer-Verlag Berlin Heidelberg 2005


Selenium is incorporated into protein as selenocysteine (Sec), the 21st amino acid in the genetic code (Hatfield and Gladyshev 2002). The codeword for Sec is UGA, and Sec is biosynthesized on its tRNA (designated tRNA[Ser]Sec) following aminoacylation of the tRNA with serine by seryl-tRNA synthetase. UGA can serve as a Sec codon, rather than in its usual role as a termination codon, because of a stem-loop structure in the 3’-untranslated region of eukaryotic selenoprotein mRNAs, designated the Sec insertion sequence or SECIS element (Low and Berry 1996). The machinery that inserts Sec into protein includes several additional specific factors, such as an elongation factor and a SECIS binding protein (reviewed in Driscoll and Copeland 2003). Interestingly, none of the known factors involved in the insertion of Sec into protein appears to have a regulatory role in translation of selenoprotein mRNAs. However, as described in the present review, base modification in Sec tRNA[Ser]Sec (for review, see Hatfield and Gladyshev 2002) plays a key role in determining which selenoprotein mRNAs are translated.
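The dual meaning of UGA described in this paragraph can be illustrated with a deliberately simplified sketch. It assumes that an in-frame UGA is read as Sec (one-letter symbol U) whenever the mRNA carries a SECIS element and as a stop codon otherwise; real recoding also depends on the additional factors mentioned in the text and is not all-or-none. The function name `translate`, the flag `has_secis` and the example reading frame are hypothetical and serve only as illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf, has_secis):
    """Translate an RNA open reading frame codon by codon.

    Toy assumption: with a SECIS element present, every in-frame UGA is
    decoded as selenocysteine ("U"); without it, UGA terminates translation.
    """
    protein = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon == "UGA" and has_secis:
            protein.append("U")  # Sec inserted instead of terminating
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # UAA, UAG, or UGA without SECIS
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical reading frame: AUG GCU UGA GGC UAA
example_orf = "AUGGCUUGAGGCUAA"
with_secis = translate(example_orf, has_secis=True)
without_secis = translate(example_orf, has_secis=False)
```

In this sketch the same reading frame gives a longer, Sec-containing product when the SECIS flag is set and a truncated product when it is not, which is the behaviour the paragraph attributes to the SECIS element.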

2 Sec tRNA[Ser]Sec

Sec tRNA[Ser]Sec has many unique characteristics: it is the longest eukaryotic tRNA sequenced to date, and it is highly undermodified compared to other tRNAs (Hatfield and Gladyshev 2002). It has only four modified bases, and the biosynthesis of these base modifications has been characterized in Xenopus oocytes (Choi et al. 1994; Sturchler et al. 1994). The Sec tRNA[Ser]Sec population in mammals consists of two isoforms that differ from each other by a single methyl modification on the ribosyl moiety at position 34 (Um34; Fig. 1). Addition of the methyl group is a highly specialized last step in maturation and determines both the structure and function of tRNA[Ser]Sec (reviewed in Hatfield and Gladyshev 2002). This methylation is dependent on the prior synthesis of four other modified bases and on an intact tertiary structure, whereas synthesis of the other modified bases is less stringently connected to primary and tertiary structure (Kim et al. 2000). Furthermore, methylation of U34 is enhanced by enriched selenium levels (reviewed in Hatfield and Gladyshev 2002), and its presence dramatically affects secondary and tertiary structure (Diamond et al. 1993). In addition, as shown in Figure 1, the occurrence of the Um34 isoform correlates with the expression of several selenoproteins (see also Chittum et al. 1997; Moustafa et al. 2001; and below). Recently, the selenium-induced Sec tRNA[Ser]SecmcmUm isoform has been reported to have a specialized role in selenoprotein biosynthesis in that it is likely the major isoacceptor used in expressing this protein class (Jameson and Diamond 2004). One of our major goals, therefore, has been to better understand the role of this methyl group in selenoprotein expression. Changing A to G at position 37 results in a tRNA[Ser]Sec that lacks both isopentenyladenosine (i6A) at this position (i6A37) and Um34 (Kim et al. 2000).
This observation provided an opportunity to generate a mutant tRNA[Ser]Sec without this highly specialized methyl group and examine its role in selenoprotein synthesis.


[Figure 1: cloverleaf secondary structures of the two Sec tRNA[Ser]Sec isoforms. Left: mcm5U isoform (2’-OH at position 34), labeled TR1, TR3. Right: mcm5Um isoform (2’-OCH3 at position 34), labeled GPx1, GPx3, SelR, SelT. Image not reproduced.]

Fig. 1. Secondary structures of the two Sec tRNA[Ser]Sec isoforms and the selenoproteins whose synthesis they support. Both isoforms are 90 nucleotides in length and contain four modified nucleosides as follows: at position 34, methylcarboxylmethyl-5’-uridine (mcm5U; left panel) or methylcarboxylmethyl-5’-uridine-2’-O-methylribose (mcm5Um; right panel); at position 37, isopentenyladenosine (i6A); at position 55, pseudouridine (Ψ); and at position 58, 1-methyladenosine (m1A). Structures of mcm5U and mcm5Um are shown within the circles immediately to the left of position 34, wherein the 2’-OH and 2’-OCH3 groups are enlarged and bolded. Selenoproteins specifically expressed by each isoform from selenoprotein mRNA encoding a UGA codon (drawn in the 3’ to 5’ direction) are shown at the bottom of the figure.

3 Generation of mouse models

Selenoproteins are the only known class of proteins that are dependent on the presence of a single tRNA, tRNA[Ser]Sec, for their expression. This unique characteristic of selenoprotein biosynthesis provides us with a novel means of perturbing their expression. By altering tRNA[Ser]Sec levels and making tRNA[Ser]Sec mutants, we have generated models for elucidating the cellular roles of selenoproteins as well as their roles in health (Moustafa et al. 2003). These models include a transgenic mouse line carrying either a wild type tRNA[Ser]Sec transgene (designated trspt) or a mutant tRNA[Ser]Sec transgene (designated trspti6A-) that lacks i6A37 and Um34 (Moustafa et al. 2001). In addition, a very useful model for elucidating the roles of selenoproteins in health was generated by preparing a conditional knockout mouse line wherein trsp is flanked by loxP sites (designated trspfl) and its targeted removal is dependent on loxP-Cre technology (Kumaraswamy et al. 2003). This mouse line has permitted us to selectively remove trsp in different tissues and examine in more detail the role of selenoproteins in development and health. Finally, a standard knockout mouse line carrying a deletion of trsp (designated ∆trsp) has been prepared (Bösl et al. 1997; Kumaraswamy et al. 2003). Although this mouse line is embryonic lethal, it has provided a means of rescuing selenoprotein expression with wild type or mutant Sec tRNA[Ser]Sec transgenes.

Table 1. Rescue of selenoprotein expression in Sec tRNA transgenic-knockout mice.1

Selenoprotein2   Tissue3      Western4
                              trsp    trspt   trspti6A-
GPx1             Liver        +++     +++
                 Kidney       +++     +++
GPx2             Intestine    +++     ++++
SelR             Liver        +++     +++
                 Kidney       +++     +++
SelT             Liver        +++     +++
TR1              Liver        +++     +++     +++
                 Kidney       +++     +++     +++
TR3              Liver        +++     +++     +++
                 Kidney       +++     +++     +++
                 Brain        +++     +++     +++

[The source also shows a single "+" entry in the trspti6A- column; the row it belongs to is not recoverable from the extracted layout.]

1 Mouse lines are trspfl (wild type, homozygous for floxed trspfl [Kumaraswamy et al. 2003]), trspt (homozygous for wild type trspt transgene [Moustafa et al. 2001] and homozygous ∆trsp knockout [Kumaraswamy et al. 2003]) and trspti6A- (homozygous for mutant trspti6A- transgene [Moustafa et al. 2001] and homozygous ∆trsp knockout [Kumaraswamy et al. 2003]). Labeling of mice with 75Se and analysis of the resulting labeled selenoproteins from various tissues (Moustafa et al. 2001; Kumaraswamy et al. 2003; Carlson et al. 2004a and 2004b) demonstrated that several selenoproteins were fully (thioredoxin reductase 1 and 3) or partially (glutathione peroxidase 4, selenoprotein 15 and selenoprotein P) rescued, while others (GPx1 and GPx3) were poorly rescued, as discussed in the text.
2 Selenoproteins are: GPx1 and GPx2, glutathione peroxidase 1 and 2; SelR, selenoprotein R; SelT, selenoprotein T; and TR1 and TR3, thioredoxin reductase 1 and 3 (for further details on these selenoproteins, see Kryukov et al. 2003).
3 Extracts were made from the tissues shown and prepared for western blotting, and western analysis was carried out as described in Moustafa et al. (2001) and Kumaraswamy et al. (2003).
4 Antibodies against GPx1 were obtained from Qichang Shen and those against GPx2 from Regina Brigelius-Flohé; those against SelR, SelT, TR1, and TR3 were from our laboratories.


[Figure 2: mating scheme. Two parents of genotype trsp/∆trsp; trspti6A-/trspti6A- are crossed (X). Offspring genotypes shown: trsp/trsp, trsp/∆trsp (twice) and ∆trsp/∆trsp, each with trspti6A-/trspti6A-. Image not reproduced.]

Fig. 2. Scheme of mouse matings for rescuing selenoprotein expression. Mice that were heterozygous for the wild type (trsp) and knockout (∆trsp) tRNA[Ser]Sec genes and homozygous for mutant i6A37 transgenes (trspti6A-) were obtained as described (Kumaraswamy et al. 2003) and mated as shown yielding the rescued mice that were homozygous for ∆trsp and trspti6A-. Mice that were rescued with the wild type trspt transgene were obtained in the same manner as shown in the figure for the mutant trspti6A- transgene. These mice were analyzed for selenoprotein expression by 75Se-labeling and western blotting and the expression data are given in Table 1.
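The cross in Fig. 2 can be worked through as a simple one-locus calculation. The sketch below assumes ordinary Mendelian segregation at the trsp locus and equal viability of all genotypes, neither of which is stated in the chapter; the transgene is ignored because both parents are homozygous for it, so every offspring inherits it. The function `cross` and the allele labels are hypothetical names used only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype fractions at a single locus.

    Each parent is a pair of alleles; every gamete combination is assumed
    to be equally likely (simple Mendelian segregation).
    """
    counts = Counter(
        tuple(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


WT = "trsp"       # wild type allele
KO = "del_trsp"   # stands in for the knockout allele written as ∆trsp in the text

# Both parents are heterozygous at the trsp locus, as in Fig. 2.
offspring = cross((WT, KO), (WT, KO))
rescued_fraction = offspring[(KO, KO)]  # knockout homozygotes, dependent on the transgene
```

Under these assumptions one quarter of the offspring are expected to be homozygous for the knockout allele, which is the class that depends on the transgene for survival.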

3.1 Selective rescue of selenoprotein expression

By mating mice that are heterozygous for the trsp knockout (∆trsp) with transgenic mice that are homozygous for the trspt or trspti6A- transgene, we have been able to generate a mouse line that is homozygous for either wild type trspt or trspti6A- (Carlson et al. 2004b). Matings among the latter mouse line yielded mice that were dependent on the transgenes for survival (Fig. 2). Transgenic-knockout mice in which selenoprotein expression has been rescued with trspt synthesize selenoproteins in normal amounts (see Table 1). However, transgenic-knockout mice in which selenoprotein expression has been rescued with trspti6A- synthesize many, but not all, selenoproteins (see legend 1 to Table 1 and the last column on the right in the table). Interestingly, GPx1, GPx2, GPx3, SelR, and SelT are either not rescued or are poorly rescued (Table 1 and legend). The known functions of these selenoproteins suggest that they are involved with stress-related phenomena (Kryukov et al. 2003). In addition, administering 75Se to the transgenic-knockout mice and analyzing the resulting labeled selenoprotein population (see Carlson et al. 2004b and references therein) in various tissues (e.g. liver, kidney, heart, testis, intestine, plasma, and spleen) confirmed the absence of the non-rescued proteins identified in Table 1 (see also table legend and Carlson et al. 2004b). Furthermore, northern analysis showed that the mRNAs of the non-rescued selenoproteins were present at levels sufficient for their expression, but the corresponding selenoproteins were not expressed. Thus, the observed phenomenon appears to be a defect at the translation step.

4 Discussion and concluding remarks

The results in Table 1 demonstrate that base modification(s) in the anticodon loop of tRNA[Ser]Sec is (are) involved in selenoprotein expression. The strain of transgenic-knockout mice wherein the null trsp gene was replaced with trspti6A- shows that several selenoproteins were poorly rescued by the mutant transgene, whereas several additional selenoproteins were expressed in normal amounts. Since the mutant isoform of tRNA[Ser]Sec lacks both i6A37 and Um34, questions may be asked as to which modification, or whether an interplay between both modifications, influence(s) selenoprotein synthesis in the manner observed in this study. Clearly, the mutant isoform efficiently supports the synthesis of housekeeping selenoproteins, such as TR1 and TR3 (see Table 1 and Carlson et al. 2004b), demonstrating that this isoform is used effectively in protein synthesis. In fact, it supports the synthesis of those selenoproteins that the non-Um34 isoform, tRNA[Ser]SecmcmU, synthesizes in mammalian cells and tissues (Chittum et al. 1997; Moustafa et al. 2001; Carlson et al. 2004b). Relative to the Sec tRNA[Ser]Sec population normally found in mammalian cells, the critical modification that is missing is Um34. Not only does the absence of Um34 correlate with a decrease in the expression of several stress-related selenoproteins, but the synthesis of Um34 on tRNA[Ser]Sec is also a selenium-dependent reaction. Selenium-deficient mice have reduced amounts of this methylated isoform, and in addition have reduced amounts of several of the selenoproteins identified herein (reviewed in Hatfield and Gladyshev 2002). Therefore, the phenotypes observed in selenium-deficient mice and in mice carrying trspti6A- correlate strongly. Furthermore, as noted in section 2, the presence of this methyl group in tRNA[Ser]Sec is a highly specialized event in tRNA[Ser]Sec structure and function.
Finally, we have identified a factor that influences Um34 formation (unpublished results). Knockdown of this factor in mammalian cells using RNAi technology results in reduced levels of GPx1, as is observed in the transgenic-knockout mice described herein (see also Carlson et al. 2004b). Overall, the evidence strongly supports a major role of Um34 in the synthesis of selenoproteins that are involved in stress-related phenomena. Furthermore, the data suggest that the observed loss in selenoprotein expression is unlikely to be due to the absence of i6A or to an interplay between i6A and Um34. An examination of many different parameters of selenoprotein mRNAs (summarized in Kryukov et al. 2003), such as nucleotide context of the UGA Sec codon, SECIS element class and location of the UGA Sec codon within the open reading frame, has not revealed any clear pattern that would explain why Um34 may be responsible for the expression of only certain selenoproteins. The precise mechanism by which Um34 governs the synthesis of a subset of selenoproteins must await further investigation. It should be noted, however, that this study provides the first example of the translation of several proteins being dependent on the recoding of a nonsense codeword involving Um34 and provides a novel role of tRNA in protein expression (see also Carlson et al. 2004b).

References
Bösl MR, Takaku K, Oshima M, Nishimura S, Taketo MM (1997) Early embryonic lethality caused by targeted disruption of the mouse selenocysteine tRNA gene (Trsp). Proc Natl Acad Sci USA 94:5531-5534
Carlson BA, Novoselov SV, Kumaraswamy E, Lee BJ, Anver MR, Gladyshev VN, Hatfield DL (2004a) Specific excision of the selenocysteine tRNA[Ser]Sec (trsp) gene in mouse liver demonstrates an essential role of selenoproteins in liver function. J Biol Chem 279:8011-8017
Carlson BA, Xu XM, Gladyshev VN, Hatfield DL (2004b) Selective rescue of selenoprotein expression in mice lacking a highly specialized methyl group in selenocysteine tRNA. J Biol Chem (in press)
Chittum HS, Hill KE, Carlson BA, Lee BJ, Burk RF, Hatfield DL (1997) Replenishment of selenium deficient rats with selenium results in redistribution of the selenocysteine tRNA population in a tissue specific manner. Biochim Biophys Acta 1359:25-34
Choi IS, Diamond AM, Crain PF, Kolker JD, McCloskey JA, Hatfield DL (1994) Reconstitution of the biosynthetic pathway of selenocysteine tRNAs in Xenopus oocytes. Biochem 33:601-605
Diamond AM, Choi IS, Crain PF, Hashizume T, Pomerantz SC, Cruz R, Steer CJ, Hill KE, Burk RF, McCloskey JA, Hatfield DL (1993) Dietary selenium affects methylation of the wobble nucleoside in the anticodon of selenocysteine tRNA[Ser]Sec. J Biol Chem 268:14215-14223
Diwadkar-Navsariwala V, Diamond AM (2004) The link between selenium and chemoprevention: a case for selenoproteins. J Nutr 134:2899-2902
Driscoll DM, Copeland PR (2003) Mechanism and regulation of selenoprotein synthesis. Annu Rev Nutr 23:17-40
Hatfield DL (2001) Selenium: Its Molecular Biology and Role in Human Health. Kluwer Academic Publishers, Norwell, MA
Hatfield DL, Gladyshev VN (2002) How selenium has altered our understanding of the genetic code. Mol Cell Biol 22:3565-3576
Jameson RR, Diamond AM (2004) A regulatory role for Sec tRNA[Ser]Sec in selenoprotein synthesis. RNA 10:1142-1152
Kim LK, Matsufuji T, Matsufuji S, Carlson BA, Kim SS, Hatfield DL, Lee BJ (2000) Methylation of the ribosyl moiety at position 34 of selenocysteine tRNA[Ser]Sec is governed by both primary and tertiary structure. RNA 6:1306-1315
Kryukov GV, Castellano S, Novoselov SV, Lobanov AV, Zehtab O, Guigo R, Gladyshev VN (2003) Characterization of mammalian selenoproteomes. Science 300:1439-1443

Kumaraswamy E, Carlson BA, Morgan F, Miyoshi K, Robinson GW, Su D, Wang S, Southon E, Tessarollo L, Lee BJ, Gladyshev VN, Hennighausen L, Hatfield DL (2003) Selective removal of the selenocysteine tRNA[Ser]Sec gene (Trsp) in mouse mammary epithelium. Mol Cell Biol 23:1477-1488
Low SC, Berry MJ (1996) Knowing when not to stop: selenocysteine incorporation in eukaryotes. Trends Biochem Sci 21:203-208
Moustafa ME, Carlson BA, El-Saadani MA, Kryukov GV, Sun QA, Harney JW, Hill KE, Combs GF, Feigenbaum L, Mansur DB, Burk RF, Berry MJ, Diamond AM, Lee BJ, Gladyshev VN, Hatfield DL (2001) Selective inhibition of selenocysteine tRNA maturation and selenoprotein synthesis in transgenic mice expressing isopentenyladenosine-deficient selenocysteine tRNA. Mol Cell Biol 21:3840-3852
Moustafa ME, Kumaraswamy E, Zhong N, Rao M, Carlson BA, Hatfield DL (2003) Models for assessing the role of selenoproteins in health. J Nutr 133:2494S-2496S

Carlson, Bradley A.
Molecular Biology of Selenium Section, Laboratory of Cancer Prevention, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA

Gladyshev, Vadim N.
Department of Biochemistry, University of Nebraska, Lincoln, NE 68588 USA

Hatfield, Dolph L.
Molecular Biology of Selenium Section, Laboratory of Cancer Prevention, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA
[email protected]

Xu, Xue-Ming
Molecular Biology of Selenium Section, Laboratory of Cancer Prevention, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA