Progress in Nucleic Acid Research and Molecular Biology, Volume 52


Editors: Waldo E. Cohn | Kivie Moldave


PROGRESS IN

Nucleic Acid Research and Molecular Biology edited by

WALDO E. COHN

KIVIE MOLDAVE

Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee

Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California

Volume 52


ACADEMIC PRESS San Diego New York Boston London Sydney Tokyo Toronto

This book is printed on acid-free paper.

Copyright © 1996 by ACADEMIC PRESS, INC.

All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc.

A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495

United Kingdom Edition published by Academic Press Limited, 24-28 Oval Road, London NW1 7DX

International Standard Serial Number: 0079-6603

International Standard Book Number: 0-12-540052-7

PRINTED IN THE UNITED STATES OF AMERICA

96 97 98 99 00 01 BB 9 8 7 6 5 4 3 2 1

Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

SINGLE-LETTER CODE RECOMMENDATIONSᵃ (5)

Symbol    Meaning                  Origin of symbol
G         G                        Guanosine
A         A                        Adenosine
T(U)      T(U)                     (ribo)Thymidine (Uridine)
C         C                        Cytidine
R         G or A                   puRine
Y         T(U) or C                pYrimidine
M         A or C                   aMino
K         G or T(U)                Keto
S         G or C                   Strong interaction (3 H-bonds)
Wᵇ        A or T(U)                Weak interaction (2 H-bonds)
H         A or C or T(U)           not G; H follows G in the alphabet
B         G or T(U) or C           not A; B follows A
V         G or C or A              not T (not U); V follows U
Dᶜ        G or A or T(U)           not C; D follows C
N         G or A or T(U) or C      aNy nucleoside (i.e., unspecified)
Q         Q                        Queuosine (nucleoside of queuine)

ᵃ Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
ᵇ W has been used for wyosine, the nucleoside of "base Y" (wye).
ᶜ D has been used for dihydrouridine (hU or H₂Urd).
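The degenerate symbols in the table lend themselves to a simple lookup. The following is a minimal sketch, not part of the Recommendations: the names `ONE_LETTER_CODE` and `matches` are hypothetical, U is treated as equivalent to T as the table's T(U) notation suggests, and Q is omitted because it denotes a modified nucleoside rather than a set of the four standard ones.

```python
# One-letter codes from the table above; each value lists the nucleosides
# a symbol can stand for. U is normalised to T before comparison, following
# the table's T(U) notation. Q (queuosine) is deliberately left out.
ONE_LETTER_CODE = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT",
    "N": "GATC",
}


def _normalise(seq: str) -> str:
    """Upper-case the sequence and write U as T."""
    return seq.upper().replace("U", "T")


def matches(pattern: str, sequence: str) -> bool:
    """Return True if every base of `sequence` is allowed by `pattern`."""
    pattern, sequence = _normalise(pattern), _normalise(sequence)
    if len(pattern) != len(sequence):
        return False
    return all(
        base in ONE_LETTER_CODE.get(code, "")
        for code, base in zip(pattern, sequence)
    )
```

For instance, `matches("RY", "GU")` holds because G is a purine and U a pyrimidine, whereas `matches("H", "G")` does not, since H means "not G".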

Enzymes

In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).


REFERENCES

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere. General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.
3. "Handbook of Biochemistry" (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. "Enzyme Nomenclature" [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).

Abbreviations of Journal Titles

Journals                                  Abbreviations used
Annu. Rev. Biochem.                       ARB
Annu. Rev. Genet.                         ARGen
Arch. Biochem. Biophys.                   ABB
Biochem. Biophys. Res. Commun.            BBRC
Biochemistry                              Bchem
Biochem. J.                               BJ
Biochim. Biophys. Acta                    BBA
Cold Spring Harbor                        CSH
Cold Spring Harbor Lab                    CSHLab
Cold Spring Harbor Symp. Quant. Biol.     CSHSQB
Eur. J. Biochem.                          EJB
Fed. Proc.                                FP
Hoppe-Seyler's Z. Physiol. Chem.          ZpChem
J. Amer. Chem. Soc.                       JACS
J. Bacteriol.                             J. Bact.
J. Biol. Chem.                            JBC
J. Chem. Soc.                             JCS
J. Mol. Biol.                             JMB
J. Nat. Cancer Inst.                      JNCI
Mol. Cell. Biol.                          MCBiol
Mol. Cell. Biochem.                       MCBchem
Mol. Gen. Genet.                          MGG
Nature, New Biology                       Nature NB
Nucleic Acid Research                     NARes
Proc. Natl. Acad. Sci. U.S.A.             PNAS
Proc. Soc. Exp. Biol. Med.                PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.        This Series

Structure, Reactivity, and Biology of Double-Stranded RNA¹

ALLEN W. NICHOLSON

Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202

I. Biological Origins of dsRNA
II. Experimental Criteria for dsRNA
III. Structure and Dynamics of dsRNA
IV. Protein Recognition of dsRNA
V. Chemical Stability of dsRNA
VI. Enzymatic Cleavage of dsRNA
    A. Ribonuclease III
    B. Cobra Venom Ribonuclease (RNase V1)
    C. dsRNase Activities Mechanistically Related to Pancreatic RNase
VII. dsRNA Function in Prokaryotes
    A. Gene Regulation by Ribonuclease III
    B. dsRNA and Antisense Regulation
VIII. dsRNA Function in Eukaryotes
    A. dsRNA and hnRNA
    B. dsRNase Activities
    C. Other dsRNA-specific Activities
IX. dsRNA and the Interferon System
    A. The dsRNA-activated Protein Kinase
    B. The dsRNA-activated 2'-5'A Synthetase
    C. dsRNA and Mammalian Cell Signal Transduction
X. Cellular and Physiological Effects of dsRNA, and Therapeutic Applications
XI. Conclusions and Prospects
References
Note Added in Proof

¹ Abbreviations: AFM, atomic force microscopy; Da, dalton; ds, double-stranded; dsRBD, double-stranded RNA-binding domain; hnRNA, heterogeneous nuclear RNA; hnRNP, heterogeneous nuclear ribonucleoprotein; HIV, human immunodeficiency virus; IFN, interferon; IL, interleukin; M-MuLV, Moloney murine leukemia virus; RSV, Rous sarcoma virus; RTase, reverse transcriptase; SD, Shine-Dalgarno; snRNA, small nuclear RNA; ss, single-stranded; TIR, translation initiation region; ts, triple-stranded; 5'-UTR and 3'-UTR, 5' and 3' untranslated regions, respectively; UV, ultraviolet.


The RNA double helix is a ubiquitous structural motif in living organisms. Double-stranded (ds) RNA² is created by a number of biosynthetic pathways, and is subsequently degraded, denatured, or specifically modified by enzymatic activities. It also serves as a stable repository of genetic information for many viruses. The diverse functional roles of dsRNA have spurred intensive studies on the biochemical processes that involve dsRNA. dsRNA is also being examined as an agent that changes gene expression patterns and alters cell physiology, as well as a potential therapeutic agent in fighting disease.

In addition to providing answers to intriguing biological phenomena, ongoing studies on dsRNA have prompted new questions. How do the physical properties of the RNA double helix establish biological function? How is dsRNA specifically recognized by proteins? What are the pathways of dsRNA formation and breakdown in vivo? How does dsRNA participate in signal transduction pathways? I intend to address these questions, and to frame new ones prompted by recent findings. I focus on the structure and physicochemical properties of dsRNA; on the enzymes that degrade, modify, or otherwise modulate dsRNA structure and function; and on protein motifs that specifically recognize dsRNA. The metabolism and regulatory functions of dsRNA in the prokaryotic cell are discussed, as are the functions of dsRNA and dsRNA-specific enzymes in eukaryotic cells. Finally, the mammalian cellular and physiological response to dsRNA and the prospects of dsRNA as a therapeutic agent are considered.

Due in part to space limitations, this review does not examine the role of dsRNA as a structural component of macromolecular complexes, nor (except for antisense RNA) does it discuss the myriad short, transiently formed dsRNA segments that are essential features of many biological processes (for example, the base-pairing of the prokaryotic mRNA translation initiation region with the 3' end of 16 S rRNA, or between eukaryotic U1 snRNA and the 5' splice site of group II introns). I also do not discuss the structures, genetic organization, and replication strategies of viruses with dsRNA genomes, nor summarize the extensive studies on dsRNA isolated from virus-infected plants. Specific aspects of the structure and biological properties of dsRNA have been examined in several previous reviews (1-4).

² The term double-stranded (ds) RNA refers to the antiparallel right-handed double helix, in which the two Watson-Crick base-pairs (G-C and A-U) are predominantly, if not exclusively, present.

I. Biological Origins of dsRNA

Double-stranded RNA appears in many biological processes. Many viruses have dsRNA chromosomes, which on infection express their encoded genes, undergo amplification, and are subsequently encapsidated and transmitted to other cells. Following single-stranded (ss) RNA virus infection, dsRNA is generated as a probable by-product of replication. dsRNA can also arise from the symmetrical transcription of viral DNA, followed by RNA-RNA annealing. There is no strong evidence in the latter two instances that dsRNA production is essential to the viral infection strategy; in fact, intracellular viral dsRNA in nonsequestered form can provoke the interferon-mediated antiviral response (see Section IX).

Cells produce dsRNA in the normal course of gene expression. dsRNA structures can occur within primary transcripts, which either persist in the mature species or are removed by RNA processing. Intramolecular dsRNA elements are present within local hairpin structures, or created through long-distance base-pairing. The latter situation is seen in the primary ribosomal RNA transcript of Escherichia coli, where complementary sequences thousands of nucleotides apart engage to form specific processing sites for RNase III (Section VII,A). dsRNA structures can also arise through base-pairing between independent transcripts, such as the binding of antisense RNAs to their targets (Section VII,B).

II. Experimental Criteria for dsRNA

A number of experimental protocols can distinguish dsRNA from less structured species (5, 6). Several physicochemical methods are informative, provided sufficient material is available. The base composition of a dsRNA preparation should exhibit equivalent amounts of A and U, and of G and C, which reflects the presence of Watson-Crick base-pairs. dsRNA also exhibits a distinct temperature-dependent UV absorbance profile, wherein a sharp hyperchromism at the wavelength of peak absorbance occurs over a narrow temperature range. The transition reflects the highly cooperative melting of the double helix to yield separated single strands (7). The midpoint for the dsRNA → ssRNA transition is characterized by a temperature value (Tm) that is sensitive to the salt concentration. In contrast, the absorbance-versus-temperature profile of less structured RNAs exhibits a significantly lower hyperchromicity and cooperativity.

Chromatographic fractionation can be used to separate and purify dsRNA, or RNA species that contain double-stranded regions. dsRNA preferentially binds to cellulose CF-11 in ethanol-containing buffers, such that ssRNA is eluted first as the ethanol concentration is lowered (8). The exact nature of the interaction of dsRNA with the cellulose matrix is not understood, but it may involve hydrogen bonds between the hydroxyl groups in dsRNA and cellulose. RNA purification procedures that include a cellulose CF-11 step can remove trace amounts of dsRNA from RNA preparations (9, 10).
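The base-composition criterion described above can be illustrated with a short sketch. It assumes the sequence of the whole preparation (both strands pooled) is known, which is an idealization; the function names and the tolerance value are hypothetical. Note that A = U and G = C is a necessary condition for a fully base-paired duplex, not a sufficient one.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the mole fraction of A, C, G and U in a pooled RNA sequence."""
    counts = Counter(seq.upper().replace("T", "U"))
    total = sum(counts[b] for b in "ACGU")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or U residues")
    return {b: counts[b] / total for b in "ACGU"}


def consistent_with_duplex(seq: str, tolerance: float = 0.02) -> bool:
    """Check the criterion A = U and G = C within a chosen tolerance.

    `tolerance` is an illustrative value, not one taken from the text.
    """
    comp = base_composition(seq)
    return (
        abs(comp["A"] - comp["U"]) <= tolerance
        and abs(comp["G"] - comp["C"]) <= tolerance
    )
```

Pooling a strand with its complement, for example `consistent_with_duplex("GGACUU" + "AAGUCC")`, satisfies the criterion, whereas a purine-rich single strand such as `"GGGGAAAC"` does not.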


Enzymatic analysis of dsRNA is relatively rapid, and uses much smaller amounts of material, usually in radiolabeled form. A well-known enzymatic test is the resistance of dsRNA to degradation by pancreatic ribonuclease (RNase A) in high (>0.15 M) salt, and a corresponding sensitivity in low salt (5). The molecular basis for the differential reactivity is discussed in Section VI,C. Another enzymatic test uses E. coli RNase III, which degrades dsRNA species that are ≥20 bp, but does not cleave ssRNA, or dsRNA containing a significant amount of mismatches or other structural irregularities (6) (Section VI,A). Cobra venom ribonuclease (RNase V1) cleaves dsRNA endonucleolytically, although helical ssRNA is also a substrate (11, 12) (Section VI,B). A sensitive biological test is provided by the ability of dsRNA (≥80 bp) to inhibit protein synthesis in reticulocyte lysates (6), due to the activation of the endogenous dsRNA-dependent protein kinase, whose action blocks an essential step in translation initiation (Section IX,A).

Establishing the existence of dsRNA species in vivo has been more problematic, and careful consideration must be given to the experimental protocol. For example, phenol extraction can promote dsRNA formation (13). Gentle fractionation procedures that omit phenol may afford an RNA preparation that retains much of its original secondary structure, and is largely devoid of artifactually generated dsRNA. The ssRNA component of an RNA preparation can be removed by RNase A digestion in high salt, and CF-11 cellulose chromatography can purify the dsRNA fraction. RNA fingerprinting or nucleotide sequence analysis would then be required to determine the complexity of the dsRNA preparation. Polyclonal antibodies have been used to detect dsRNA in cells and biological preparations (14). dsRNA-specific monoclonal antibodies that are largely insensitive to base-pair sequence have also been developed (15).

Photoreactive reagents such as psoralens, which form intermolecular crosslinks within a double helix, can detect and "freeze" dsRNA structures in vivo (16). However, these approaches are not expected to be successful in detecting dsRNAs that have a transient existence, and that therefore have a low steady-state concentration in vivo. If a mutational approach is feasible, nucleotide sequence changes expected to disrupt predicted base-pairs (and secondary mutations that compensate for the initial disruption) can be used to verify dsRNA structures otherwise inaccessible to other types of analysis.

dsRNA molecules can be directly visualized by electron microscopy or by atomic force microscopy (AFM). AFM involves measuring the local contact forces between the scanning probe and the biological sample, which is stably fixed to a flat surface (17). AFM can provide images of dsRNA of a quality comparable to that obtained by electron microscopy, and can allow accurate length measurements of dsRNA without prior staining, shadowing, or other modifications (Fig. 1).


FIG. 1. Atomic force microscopy (AFM) image of purified dsRNA from reovirus. The scale is given in the lower right corner. Reprinted by permission of Oxford University Press from Ref. 275.

III. Structure and Dynamics of dsRNA

As with any macromolecule and its attendant physical complexity, the function of dsRNA is best understood through knowledge of its structure. By definition, the secondary structure of an RNA is its ensemble of base-paired elements. The secondary structure provides the framework for additional RNA folding, creating tertiary interactions that establish and stabilize the three-dimensional shape (for recent reviews, see Refs. 4 and 18).

Regarding dsRNA as a canonical double helix is sufficient for many first-order analyses. Nevertheless, is dsRNA capable of displaying a range of conformations? This question has been prompted in part by the large body of evidence that DNA double helices exhibit pronounced conformational plasticity. The polymorphism of DNA is manifested within the structural context of antiparallel, complementary strands, and is influenced by specific base-pair sequence and physical environment (7). Many of the original investigations of dsRNA structure detected no pronounced conformational diversity, which prompted the conclusion that the RNA double helix is structurally conservative (1, 7, 19). However, these studies were limited by low resolution, and recent investigations are now revealing a significant degree of polymorphism.

A. Structure of dsRNA at the Atomic Level

The first structural information on dsRNA came from X-ray diffraction analyses of synthetic or naturally occurring dsRNA fibers (reviewed in Refs. 1 and 7). These studies confirmed the prediction that dsRNA consists of two antiparallel strands engaged in a right-handed double helix. In contrast to the various families of double-helical DNAs, dsRNA displays the A-helix motif, which exhibits an 11-fold helical pitch (Fig. 2). Raising the salt concentration in the fiber preparations causes a minor structural change to the A' double helix, which has a 12-fold helical symmetry. Because the noncrystalline nature of the RNA fibers limited the resolution to approximately 3 Å, no detailed information at the atomic level could be obtained.

X-Ray diffraction analyses of crystals of two self-complementary dinucleoside phosphates, ApU and GpC, provided the first high-resolution structural information on the RNA double helix (20, 21). The structures were refined to 0.8 Å resolution and displayed Watson-Crick base-pairing, with the ribose sugar in the C3'-endo conformation and the nucleobase torsion angles in the anti range. Extrapolation of the structures to infinite length yielded a right-handed double helix with 11-fold symmetry, in good agreement with the fiber diffraction studies. Because the dinucleoside phosphates are heavily hydrated, the crystal structures are defined by local interactions (e.g., sugar-phosphate backbone constraints), rather than by crystal packing forces. Two classes of sodium ion binding sites were observed: one site is positioned between adjacent phosphate groups and the other is close to the O2 atom of the uracil residue in the minor groove. The latter interaction provided the first example of specific ligand binding to the dsRNA minor groove.

X-Ray diffraction analysis of tRNA crystals also provided information on short double-helical structures within the context of a more complex tertiary structure (22). A statistical analysis did not find a correlation between the type of base-pair and local structural parameters of double-helical regions in tRNA (19), suggesting that specific base-pairs have a minor influence on dsRNA conformation.

The A-form double helix is distinguished in several ways from the other helix families (1, 4, 7). The two antiparallel strands wrap around the helix axis in a ribbonlike manner, and the base-pairs are tilted away from the axis. The base-pairs also exhibit a forward displacement from the helix axis, creating a hollow cylindrical core with a van der Waals diameter of approximately 3.5 Å (Fig. 2). The combination of base-pair tilt and forward displacement allows interstrand as well as intrastrand base stacking and creates a narrow, deep major groove and a shallow minor groove. The ribose conformation is C3'-endo, which reflects the necessity of accommodating the bulky 2'-hydroxyl group. The C3'-endo conformation causes the A-helix to be underwound with respect to the B-helix, and shortens the intrastrand phosphate-phosphate distance to 5.9 Å. dsRNA is therefore more compact than DNA, with a helical rise of 2.74 Å, compared to 3.4 Å for DNA, and exhibits a higher molecular mass per length (241 Da/Å) compared to DNA (195 Da/Å). The compact nature of dsRNA has a major influence on its gel electrophoretic mobility (Section III,B).

FIG. 2. Structure of the A-form RNA double helix. In B, the double helix is tilted by 32° with respect to the helix in A, in order to show more clearly the major (M) and minor (m) grooves. In C, the helix is rotated by 90°, and displays the central channel and extensive base-pair stacking. Reprinted with permission from Ref. 7.

The C3'-endo ribose conformation places the 2'-hydroxyl groups at the edge of the minor groove and within hydrogen bonding distance of the O4' oxygen of the 3' neighboring nucleotide. This network of hydrogen bonds may give additional stability to the A-helix. A computer-assisted analysis of the solvent-accessible surface of the RNA double helix gave further support to the exposed nature of the minor groove (23). A molecular modeling study of the A-form RNA double helix emphasized the depth and narrowness of the major groove and the shallow, exposed nature of the minor groove (24). With its border of 2'-hydroxyl groups, and accessible bases, the minor groove provides a richly interactive molecular surface that can confer specificity and binding energy for proteins, other nucleic acids, and small ligands.

The development of efficient methods to synthesize RNA chemically and enzymatically has allowed determination of the crystal and solution structures of dsRNAs of specific sequence and larger size.
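The helical parameters quoted in this section (a rise of 2.74 Å per base-pair and 241 Da/Å for dsRNA, against 3.4 Å and 195 Da/Å for DNA) can be combined in a short worked example. This is a minimal sketch with hypothetical function names; it simply multiplies the quoted values and assumes an ideal, straight helix.

```python
# Helical parameters quoted in the text (rise in Å per base-pair, mass in Da per Å).
RISE_DSRNA = 2.74
RISE_DNA = 3.4
MASS_PER_LENGTH_DSRNA = 241.0
MASS_PER_LENGTH_DNA = 195.0


def contour_length(n_bp: int, rise: float) -> float:
    """Contour length in Å of an ideal helix with n_bp base-pairs."""
    return n_bp * rise


def mass_per_base_pair(mass_per_length: float, rise: float) -> float:
    """Mass per base-pair in Da, from mass per unit length and helical rise."""
    return mass_per_length * rise


if __name__ == "__main__":
    for name, rise, mpl in (
        ("dsRNA", RISE_DSRNA, MASS_PER_LENGTH_DSRNA),
        ("DNA", RISE_DNA, MASS_PER_LENGTH_DNA),
    ):
        print(name, contour_length(1000, rise), mass_per_base_pair(mpl, rise))
```

Under these assumptions a 1000-bp dsRNA is about 2740 Å long, against about 3400 Å for DNA of the same size, while the mass per base-pair comes out nearly the same for both (roughly 660 Da). The higher mass per unit length of dsRNA therefore follows from its shorter rise rather than from heavier base-pairs.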
X-Ray diffraction analysis (2.25 Å resolution) of the self-complementary oligoribonucleotide U(UA)₆A revealed novel structural features, and provided an important model with which to understand how the RNA double helix engages in specific intermolecular interactions (25, 26). The [U(UA)₆A]₂ structure displays the overall features of the A-form double helix, but also exhibits local discontinuities (Fig. 3). The double helix is kinked at two specific sites, which define a central and two flanking helical domains. The central domain displays the structural features of the canonical A-form helix, whereas the terminal domains show a significant deviation. The angles defined by the helix axes of adjacent domains are 13° and 11°. The two kinks are not coplanar, and create a torsion angle of 70° between the helix axes of the terminal domains. Because the highly hydrated nature of the unit cell effectively minimizes crystal packing forces, it was argued that the kinks are an inherent feature of the [U(UA)₆A]₂ duplex (25, 26). Both intramolecular and intermolecular hydrogen bonds are observed, all of which involve 2'-hydroxyl groups. The intramolecular interactions include 2'-hydroxyl group bonding, via a bridging water molecule, to either the 3' neighboring ribose O4' oxygen, with an average distance of 3.3 Å, or to the minor groove-localized O2 or N3 atom of the adjacent base. These hydrogen bonds may stabilize the sugar C3'-endo conformation. The intermolecular hydrogen bonds are either direct or water mediated. One intermolecular interaction involves the 2'-hydroxyl group of the terminal ribose and the O2 atom of uracil in the minor groove of the neighboring duplex.

FIG. 3. Crystal structure of the [U(UA)₆A]₂ duplex, displayed in stereo view. The vertical lines indicate the three axes (see text for additional discussion). In A, the minor groove is emphasized, whereas in B, the major groove is displayed. Reprinted with permission from Ref. 26.

A crystallographic study of an irregular dsRNA revealed the ability of the RNA double helix to accommodate noncanonical base-pairs and provided additional insight into the role of 2'-hydroxyl groups in mediating intermolecular interactions. The ribo dodecamer, GGACUUCGGUCC, which exists as a monomeric hairpin in solution, crystallizes as a duplex containing two copies each of the noncanonical G·U and U·C base-pairs (27). The four base-pairs, which are adjacent and in the center of the duplex, are apparently stabilized through additional hydrogen bonds involving water molecules in the major and minor grooves. The dsRNA crystallizes as a pseudoinfinite helix, in which the unit duplexes are linked by four direct hydrogen bonds. Additional interactions between adjacent duplexes involve several water-mediated hydrogen bonds. Thus, the intermolecular interactions in the crystal lattice are established through hydrogen bonds involving 2'-hydroxyl groups, similar to what is seen in the [U(UA)₆A]₂ structure.

High-resolution nuclear magnetic resonance (NMR) analyses have provided information on the structure of dsRNA in solution and have provided support for the occurrence of significant sequence-dependent differences in local structure. A proton NMR study of the self-complementary hexamer, GCAUGC, was assisted by restrained dynamic molecular structure refinement to reveal an A-form double-helical geometry (28). The dsRNA exhibits local variations in structural parameters, including helix twist, as well as base-pair roll, slide, and propeller twist.
There is extensive intramolecular base stacking, involving R-Y steps, as well as interstrand stacking of the purine rings. The extensive base-pair stacking provides a significant stabilizing force. The dsRNA is bent by approximately 20°, which is less than the bending of the corresponding DNA duplex (approximately 50°) (28). NMR analysis of two self-complementary RNA dodecamers, CGCGAAUUCGCG and CGCGUAUACGCG, also revealed the canonical A-form helix and a significant amount of interstrand base overlap, but in addition uncovered several sequence-dependent variations in the roll angle between adjacent bases (29).

There is now a clearly defined example of an RNA double helix that exists outside the A/A' family. Incubation of poly(C-G) at 45°C in high salt causes a conformational transition from a right-handed to a left-handed Z-helix (30). The A → Z conversion is highly cooperative and the transition temperature increases with decreasing salt concentration (31). NMR, circular dichroic, and Raman spectroscopic analyses support the assignment of a left-handed structure (32, 33). Bromination of guanine C8 stabilizes the Z-RNA structure, which can be recognized by antibodies raised to Z-DNA (32). Methyl substitution at cytosine C5 destabilizes Z-RNA, in contrast to its stabilizing effect on Z-DNA (31). A glimpse of the structure of Z-RNA at the atomic level is provided by an X-ray analysis of the self-complementary, hexameric DNA-RNA copolymer d(CG)r(CG)d(CG), which crystallizes in the duplex Z-form (34). The cytosine 2'-hydroxyl groups within the two central r(C·G) base-pairs engage in intramolecular hydrogen bonds with the N2 atom of the 5' neighboring guanine residue, apparently stabilizing the purine syn conformation. Immunocytochemical experiments provided evidence for the existence of Z-RNA in the eukaryotic cell cytoplasm (35); however, it remains to be shown whether the Z-RNA has a function in vivo.
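The pairing pattern of the oligoribonucleotides discussed in this section can be checked mechanically. The sketch below aligns a strand antiparallel with a second copy of itself and reports which positions are not Watson-Crick pairs; the function names are hypothetical, and the model considers sequence only, not the actual crystal or solution geometry.

```python
# Canonical Watson-Crick pairs in RNA.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def self_duplex_pairs(seq: str) -> list:
    """Pair residue i of one strand with residue (n + 1 - i) of an identical,
    antiparallel strand, as in a duplex formed by two copies of `seq`."""
    seq = seq.upper()
    return [(seq[i], seq[-1 - i]) for i in range(len(seq))]


def noncanonical_positions(seq: str) -> list:
    """Return the 1-based positions whose partner is not a Watson-Crick match."""
    return [
        i + 1
        for i, pair in enumerate(self_duplex_pairs(seq))
        if pair not in WATSON_CRICK
    ]


if __name__ == "__main__":
    for oligo in ("GCAUGC", "CGCGAAUUCGCG", "CGCGUAUACGCG", "GGACUUCGGUCC"):
        print(oligo, noncanonical_positions(oligo))
```

The hexamer and the two NMR dodecamers return an empty list, consistent with their description as self-complementary. GGACUUCGGUCC returns positions 5 to 8, the four adjacent central positions at which the text reports two G·U and two U·C pairs.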

B. Molecular Properties of dsRNA

The macroscopic behavior of dsRNA derives from its microscopic features. A number of studies have shown that dsRNA is relatively inflexible, compared to DNA. The greater stiffness reflects the conformational rigidity of the ribose ring, which is imposed by the 2'-hydroxyl group. However, there has been some disagreement about the magnitude of the inflexibility. Analysis of the sedimentation coefficients of dsRNAs of defined size yields a persistence length (P value)³ of 1125 (±100) Å (36), compared with the P value of 500-600 Å for DNA. The dsRNA sedimentation coefficients were determined in high salt buffer, so that the P values would reflect the internal structural features of dsRNA, such as base stacking and hydrogen bonding, with minimal contribution from phosphate-phosphate repulsive electrostatic forces. The hydrodynamic behavior provides a qualitative description of the dsRNA molecule as an elastic cylinder, having a hydrated diameter of 30 Å. Gel electrophoresis has been applied to determine a P value for dsRNA of 1050 Å, which is approximately twice the value determined for DNA (37). In contrast, transient electric birefringence measurements (38) yielded a dsRNA persistence length of 500-700 Å, only slightly greater than that of DNA. The authors of the latter study remarked that the previously determined dsRNA P value (37) may have been an overestimate, due to the use of an s°20,w value (36) that was itself too large. The relatively constrained flexibility of dsRNA was demonstrated by hydrodynamic analysis as well as gel electrophoresis, which also showed that phased adenine tracts, known to induce DNA bending, do not bend dsRNA (38). An electron-microscope study also demonstrated the inability of adenine tracts to induce dsRNA bending (39). However, these studies do not rule out the possibility of intrinsic dsRNA bending by other base sequence elements.

³ The persistence length is defined as the tangential distance over which a double helix maintains its direction before a significant change occurs, caused by external or internal bending forces (36).

The flexibility of dsRNA can be increased by introducing local structural discontinuities. Viroids, for example, are highly base-paired, circular RNA pathogens of plants and contain many internal loops, bulges, and base-pair mismatches. The P value of a specific viroid species is about 300 Å, compared to >1000 Å for dsRNA (40). Site-specific bulge loops can kink dsRNA (41, 42, 42a). The kinking originates from a specific structural discontinuity at the bulge site, and analyses of RNA duplexes with two or more site-specific bulges show that kinks can exhibit phasing (41, 42). The kink-dependent phasing provided an alternative method of determining the helical pitch of dsRNA, measured as 11.3 bp in one study (42), whereas another analysis yielded 11.8 bp (41). These values are in accord with the A- and A'-form double-helical parameters, but because they are average measures, any local variation in pitch could not be discerned. A recent study on dsRNA with bulge loops has tentatively revealed a natural curvature for the RNA double helix (approximately 30-40° over 80 bp) (43). The curvature was proposed in order to account for the measured helical repeat value of 10.2 bp, which is significantly smaller than observed in the other investigations. The persistent curvature of dsRNA is also seen in X-ray diffraction studies (Section III,A).

Double-stranded RNA electrophoreses more slowly than the corresponding DNA (38, 42, 44, 44a).
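To give a feel for what the competing persistence-length estimates imply, the following sketch uses the standard wormlike-chain (Kratky-Porod) expression for the mean-square end-to-end distance, ⟨R²⟩ = 2PL − 2P²(1 − e^(−L/P)). This model is an assumption brought in for illustration; the text quotes P values but does not state which chain model the cited studies used, and the function name is hypothetical.

```python
import math


def rms_end_to_end(contour_length: float, persistence_length: float) -> float:
    """Root-mean-square end-to-end distance of a wormlike chain.

    Uses <R^2> = 2PL - 2P^2 (1 - exp(-L/P)); both arguments and the result
    are in the same length unit (Å here).
    """
    L, P = contour_length, persistence_length
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    mean_square = 2.0 * P * L - 2.0 * P * P * (-math.expm1(-L / P))
    return math.sqrt(mean_square)


if __name__ == "__main__":
    length = 1000 * 2.74  # a 1000-bp dsRNA at 2.74 Å per base-pair
    for p_value in (600.0, 1125.0):  # estimates quoted in the text, in Å
        print(p_value, round(rms_end_to_end(length, p_value), 1))
```

In this model a chain much shorter than P behaves as a rigid rod (the distance approaches L), while a chain much longer than P approaches a random coil with distance √(2PL). For a fixed contour length, the larger P value gives the more extended molecule.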
The slower mobility may arise from the greater amount of counterion condensation with dsRNA, compared to DNA. The smaller axial distance between phosphates (the A-form helix has a 1.4 Å rise per phosphate, whereas the B-form has a 1.7 Å rise per phosphate) results in less residual negative charge density following counterion condensation. The smaller charge-to-mass ratio causes a reduced gel electrophoretic mobility (38). Introducing internal loops and bulges significantly reduces gel electrophoretic mobility (44a). The differential dependence of the gel mobilities of regular and irregular dsRNA species on gel concentration and percent crosslinking results from specific but poorly understood interactions between RNA and the gel matrix (44a). The formation of triple-stranded RNA (tsRNA) represents an important mode of interaction of a nucleic acid chain with dsRNA, and triple-helix structures are observed in biological RNAs (4, 45). Is the RNA triple helix similar to dsRNA?
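The counterion-condensation argument above can be illustrated with a short calculation. The sketch assumes Manning's limiting-law treatment, which the text does not name: the charge fraction left on each phosphate after condensation is 1/(Zξ), where ξ is the Bjerrum length divided by the axial spacing between phosphates. The Bjerrum length of 7.1 Å (water near 25 °C) and the function name are assumptions made for illustration.

```python
BJERRUM_LENGTH_A = 7.1  # assumed value for water near 25 C, in angstroms


def residual_charge_fraction(axial_spacing_A: float, counterion_valence: int = 1) -> float:
    """Fraction of each phosphate charge left uncompensated (Manning limiting law).

    xi = l_B / b is the dimensionless charge-density parameter; condensation
    is predicted only when valence * xi exceeds 1.
    """
    if axial_spacing_A <= 0:
        raise ValueError("axial spacing must be positive")
    xi = BJERRUM_LENGTH_A / axial_spacing_A
    if counterion_valence * xi <= 1.0:
        return 1.0  # charge density too low for condensation
    return 1.0 / (counterion_valence * xi)


if __name__ == "__main__":
    for label, spacing in (("A-form, 1.4 A per phosphate", 1.4),
                           ("B-form, 1.7 A per phosphate", 1.7)):
        print(label, round(residual_charge_fraction(spacing), 3))
```

Under these assumptions the A-form helix keeps a smaller residual charge per phosphate than the B-form helix, which is the direction of the effect invoked to explain the slower mobility of dsRNA.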

DOUBLE-STRANDED RNA


The tsRNA structure can readily be formed, and the physical properties of short tsRNAs of defined sequence have been analyzed (46). The all-RNA triple helix is thermodynamically the most stable species, compared to the corresponding all-DNA form, or the several hybrid RNA-DNA species (46). Vibrational circular dichroism shows that under defined salt and polynucleotide concentrations, the thermal denaturation of poly(rA)·poly(rU) involves formation of an intermediate triple-stranded species, poly(rA)·poly(rU)·poly(rU) (47). Alternatively, poly(rA)·poly(rU) can undergo isothermal disproportionation in high salt, forming poly(rU)·poly(rA)·poly(rU), where the adenine residues simultaneously engage in Watson-Crick and non-Watson-Crick base-pairing with the two poly(rU) strands (7). The second poly(rU) strand fits into the major groove of the poly(rA)·poly(rU) double helix, running parallel to the poly(rA) strand. To accommodate the third strand, the duplex base-pair tilt is decreased, creating a greater axial rise per base-pair, which widens the major groove (7). One may anticipate the existence of Hoogsteen hydrogen bonds as a stabilizing force for tsRNA, such as seen in other situations (e.g., see Ref. 48).

IV. Protein Recognition of dsRNA

The biological activity of dsRNA is manifested through its specific interactions with other nucleic acids, small molecules, and proteins. A growing body of experimental evidence shows that dsRNA associates with other nucleic acid chains through (i) 2'-hydroxyl-group-mediated hydrogen bonding, (ii) intermolecular coordination of phosphates by divalent metal ion bridges, or (iii) base-base interactions within the major groove (for reviews see Refs. 4 and 7). Regarding the binding of small molecules, numerous investigations provide a detailed picture of the intercalative binding of planar dye molecules to the double helix (7). Other modes of small-molecule binding can be anticipated; these would involve hydrogen bonding to ribose 2'-hydroxyl groups as well as ionic bonds with phosphate oxygens.

The specific binding of protein to dsRNA is not well understood, but recent studies provide insight into this important interaction. In principle, the twofold symmetry of dsRNA provides a surface appropriate for recognition by a twofold symmetric protein (e.g., a homodimer). However, asymmetric binding modes are also possible: recognition of one or both strands of the duplex could be accomplished by a single polypeptide. Sequence-independent protein binding would occur through recognition of the general features of the A-form helix, including the regular array of phosphate oxygens and 2'-hydroxyl groups, and the relatively nonpolar minor groove surface. The ordered spine of water molecules in the minor groove may also participate in hydrogen bonds with bound protein (7). The 2'-hydroxyl groups would also serve to distinguish dsRNA from A-form DNA, or an RNA-DNA hybrid.

[Figure 4 graphics not reproduced. Panels A and B: aligned dsRBD sequences with the consensus; search motif GxGxSKKxAKxxAAxxALxxL. Panel C: domain maps of human DAI (551 aa) and mouse TIK (518 aa), vaccinia E3L (190 aa), human TAR binding protein (345 aa), Xenopus rbpa (299 aa), Drosophila Staufen (1026 aa), E. coli RNase III (226 aa), human son-a (1523 aa), S. pombe pac1 (363 aa), and porcine rotavirus ns34 (403 aa).]

FIG. 4. (A) The double-stranded RNA binding domain (dsRBD) motif. In (A), proteins that exhibit the full-length dsRBD are listed. "Ecrnac" is the sequence from RNase III. The conserved residues are highlighted, and the consensus sequence is provided at the bottom. In (B), proteins are listed that contain mainly the C-terminal portion of the dsRBD. (C) The location of the dsRBD in nine proteins. The larger, darker boxes indicate the occurrence of the full-length dsRBD; the shorter, lighter boxes indicate the presence of the shorter dsRBD motif. Reprinted with permission from Ref. 52.

Several problems are posed in principle by the sequence-specific recognition of dsRNA by protein. The major groove formally provides unambiguous sequence information, because each of the four base-pair arrangements presents a unique array of hydrogen bond acceptor and donor groups (49). However, the A-helix structure renders these groups sterically inaccessible, due to the narrowness and depth of the major groove. In the absence of any conformational change that would widen the major groove, the sequence-specific binding of protein would be expected instead to depend on information provided by base-pair groups in the minor groove. However, only the AU·UA base-pair set can be distinguished from the GC·CG base-pair set in a recognition mechanism involving only minor groove-directed hydrogen bonds (49). The protein must therefore depend on additional interactions in order to read unambiguously the base-pair sequence. The major groove may nevertheless enable sequence-specific recognition. Because the degree of base-pair tilt establishes the helix rise value, it can dictate whether the major groove remains narrow and only accessible to water or metal ions, or whether it can widen to accommodate a protein structure (or another


nucleic-acid strand). In this regard, internal loops or bulge-loops can promote partial unwinding of adjacent double-helical regions, allowing specific protein-dsRNA contacts in the major groove (50). Are there specific protein motifs that recognize dsRNA? An early molecular modeling study revealed a natural structural complementarity of the antiparallel β-sheet with the RNA double helix (51). The protein secondary structure motif displays a right-handed double-helical shape, which affords a precise interaction with dsRNA. Specific protein-RNA contacts can be established in the minor groove involving hydrogen bonds between the 2'-hydroxyl groups and peptide-bond carbonyl oxygens. Although not further explored, it was also suggested that sequence-specific recognition could be accomplished through the interaction of amino-acid side-chains with the base-pair groups exposed in the minor groove (51). Whether this protein motif is used in dsRNA recognition is not known. Searches of protein sequence databanks have uncovered a motif that specifically recognizes dsRNA. Sequence comparison of proteins that bind dsRNA exposed an approximately 65- to 70-amino-acid sequence that contains about 36 conserved amino acids (Fig. 4) (52, 53). The consensus element, termed the dsRNA-binding domain (dsRBD), is present in E. coli ribonuclease III and the mammalian dsRNA-activated protein kinase (Sections VI,A and IX,A). In vitro assays demonstrated that the dsRBD can directly bind dsRNA (52). The extended length of the motif suggests that structure as well as sequence is important for dsRNA recognition by the dsRBD. There is no current evidence that a dsRBD exhibits sequence specificity in binding, but it is possible that specific nonconserved amino acids either within or outside the domain could confer such an ability. The zinc finger provides a motif wherein specific amino acids, adjacent to conserved amino acids within a local structure, can confer sequence specificity (54).
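As an illustration of how a databank sequence might be screened for the short dsRBD search motif shown in Fig. 4 (GxGxSKKxAKxxAAxxALxxL), the sketch below slides the motif along a protein sequence and counts mismatches at the fixed positions. It assumes that "x" stands for any residue, and it is not a description of the procedure used in Refs. 52 and 53. The function name, the mismatch tolerance, and the test peptide are hypothetical.

```python
SEARCH_MOTIF = "GxGxSKKxAKxxAAxxALxxL"  # from Fig. 4; 'x' assumed to mean any residue


def motif_hits(sequence: str, motif: str = SEARCH_MOTIF, max_mismatches: int = 0):
    """Return (start, window, mismatches) for every window within tolerance.

    Only the fixed (non-'x') motif positions are compared with the sequence.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = sum(
            1 for m, s in zip(motif, window) if m != "x" and m != s
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide built to contain one exact match; not a real protein.
    toy = "MT" + "GAGRSKKEAKLRAAEEALKEL" + "QW"
    print(motif_hits(toy))
    # One fixed position altered: found only when a mismatch is tolerated.
    mutated = toy[:7] + "A" + toy[8:]
    print(motif_hits(mutated), motif_hits(mutated, max_mismatches=1))
```

Tolerating a small number of mismatches reflects the observation that the domain is only loosely conserved; the appropriate threshold would have to be chosen empirically.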
Direct information on protein recognition of dsRNA has been provided by the X-ray structural analysis of glutaminyl-tRNA bound to its cognate synthetase (55). The minor groove of the tRNA acceptor helix engages in several specific contacts with two β-turn motifs. A proline residue (Pro-181), present in a β-turn that separates two β-strands, engages in a hydrogen bond through its peptide carbonyl oxygen with the purine exocyclic amine group in the G2·C71 base-pair. The peptide bond of isoleucine 183 is hydrogen-bonded to a "buried" water molecule that is itself hydrogen-bonded to both the keto oxygen of C71 and the G2 exocyclic amine. The buried water molecule is also hydrogen-bonded to a carboxylate oxygen of aspartic acid 235, which is within a second β-turn. Asp-235 also engages in a hydrogen bond with the G3 exocyclic amine. In summary, three protein side-chains and a water molecule engage in a complex, highly specific hydrogen-bond pattern with nucleic-acid base groups in the minor groove. Both direct and


water-mediated hydrogen bonds are present. The intricacy of this interaction may hint at the general complexity of protein-dsRNA contacts.

V. Chemical Stability of dsRNA

RNA chains break down in solution under conditions wherein DNA is stable. The 2'-hydroxyl group acts as an internal nucleophile, attacking the vicinal phosphodiester linkage and displacing the 5' oxygen of the neighboring 3' nucleotide. The breakdown proceeds by an in-line mechanism, wherein the nucleophilic 2'-oxygen and 5'-oxygen leaving group occupy apical positions within the trigonal bipyramidal phosphorane intermediate (56). An RNA strand in a helical conformation, whether single-stranded or engaged in a double helix, is more resistant to this reaction than the corresponding random coil. It was noted that a right-handed, antiparallel double helix is particularly well suited toward protecting the 3'-5' internucleotide linkage from 2'-hydroxyl attack (57). The RNA double helix imposes a significant structural constraint in that the attacking and leaving groups cannot simultaneously occupy the required apical positions. The reaction is therefore inhibited, due to a disfavored stereochemical arrangement. However, if the RNA double helix undergoes localized unwinding and strand separation, the stereochemical barrier would be lost, and chain scission would readily proceed. Disruption of the double helix may be important in the degradation of dsRNA by ribonucleases related to the pancreatic RNase family (Section VI,C), whose catalytic mechanism requires the 2'-hydroxyl group. It is predicted that a 2'-5' phosphodiester linkage within a right-handed double helix should undergo more facile cleavage, because the attacking 3'-oxygen is in line with the 5'-oxygen leaving group.
An experimental study using model oligonucleotides revealed an approximately 900-fold relative stability of the 3'-5' linkage over the 2'-5' linkage within the RNA double helix (58). The hydrolytic lability of the 2'-5' linkage is consistent with its facile formation when 3'-activated oligonucleotides are nonenzymatically polymerized on a complementary oligonucleotide template (59). The preferential formation of the 2'-5' linkage is predicted, because the pathway is formally the reverse of 3'-oxygen attack on the 2'-5' linkage. It also was noted that the use of 5'-activated nucleosides would favor nonenzymatic formation of the 3'-5' linkage over the 2'-5' linkage (57). The hydrolytic stability of dsRNA has so far been considered from the standpoint of its relative resistance to 2'-hydroxyl group-mediated chain cleavage. dsRNA breakdown could occur instead by a hydrolytic mechanism, where the nucleophile is an activated water molecule. Depending on the


identity of the leaving group (3' or 5' oxygen), hydrolysis would create RNA products with 5' phosphate or 3' phosphate termini, respectively. In either case, the double-helical structure does not have to be disrupted to permit the requisite stereochemistry for an in-line SN2(P) mechanism. However, as the uncatalyzed hydrolysis of dsRNA is very slow, enzymatic assistance is required to provide the necessary rate.
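The approximately 900-fold relative stability of the 3'-5' linkage quoted in this section can be restated as a difference in activation free energy. The sketch below uses the transition-state relation ΔΔG‡ = RT ln(k1/k2), which is an assumption made here for illustration; the temperature (25 °C) is also assumed, since the text does not state the conditions of the measurement. At that temperature a 900-fold rate ratio corresponds to roughly 4 kcal/mol.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def barrier_difference_kcal(rate_ratio: float, temperature_K: float = 298.15) -> float:
    """Activation free-energy difference implied by a ratio of rate constants.

    Uses ddG = R * T * ln(ratio); assumes both reactions share the same
    pre-exponential factor (simple transition-state picture).
    """
    if rate_ratio <= 0:
        raise ValueError("rate ratio must be positive")
    return GAS_CONSTANT_KCAL * temperature_K * math.log(rate_ratio)


if __name__ == "__main__":
    print(round(barrier_difference_kcal(900.0), 2), "kcal/mol at 25 C")
```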

VI. Enzymatic Cleavage of dsRNA

Intracellular dsRNA species must turn over to avoid excessive accumulation, and to provide precursors for new RNA. There are several pathways by which dsRNA may be degraded (see also Ref. 60): (i) an enzyme could carry out a coordinated and nonspecific double cleavage of the double helix; (ii) an enzyme could bind directly and introduce random nicks in either strand, ultimately providing small, unstable dsRNA fragments; (iii) a ssRNA-specific enzyme could bind reversibly to locally melted regions, then cleave the single-stranded segments; (iv) an exonuclease (3'→5' or 5'→3') could attack the ends of the duplex and degrade each strand; (v) an RNA helicase could convert dsRNA to single-stranded species, which then would be degraded by ssRNA-specific exo- or endonucleases; and (vi) the dsRNA can be enzymatically modified (Section VIII,C), thereby weakening or destroying duplex structure, or providing a recognition signal for specific exo- or endonucleases. This section analyzes several enzymatic activities that degrade dsRNA directly, and compares and contrasts their mechanisms.

A. Ribonuclease III

Ribonuclease III was the first dsRNA-specific endoribonuclease to be discovered, and it has received continuous attention since its original characterization as a potent activity in E. coli cell-free extracts. RNase III was later identified as a prominent member of a group of enzymes involved in RNA maturation and decay (for a recent comprehensive review, see Ref. 61). RNase III exhibits a homodimeric structure and requires a divalent metal ion, preferably Mg2+, as an essential cofactor for its phosphodiesterase activity. Exhaustive digestion of synthetic dsRNAs yields double-stranded species, ranging in size from approximately 12 to 15 bp. RNase III creates 5'-phosphate, 3'-hydroxyl product termini, which exhibit a two-nucleotide 3' overhang. RNase III-catalyzed hydrolysis of dsRNA apparently proceeds through coordinated (but probably not concerted) double cleavage. Many of the natural RNase III substrates, also termed processing signals, exhibit specific deviations from regular dsRNA structure at or near the cleavage site. These irregularities can determine the specific pattern of processing (Section VII,A).


The structural gene for RNase III (rnc) lies between 55 and 56 minutes on the E. coli chromosome, and has been cloned and sequenced. The rnc polypeptide contains 226 amino acids and has a molecular mass of 25.6 kDa (62). Mutations in the rnc gene that exert specific effects on RNase III activity have been identified. The rnc70 mutation changes glutamic acid at position 117 to lysine, and blocks cleavage without inhibiting binding (61) (H. Li and A. W. Nicholson, unpublished). Changing the same residue to an alanine has essentially the same effect (H. Li and A. W. Nicholson, unpublished). Further evidence for carboxyl group involvement in the catalytic mechanism is provided by the observation that treatment of RNase III with a water-soluble carbodiimide abolishes cleavage, but does not affect substrate binding (H. Li and A. W. Nicholson, unpublished). The rnc97 mutation changes glycine at position 97 to a glutamic acid, and inhibits processing activity in vivo (63). The rnc97 mutation may weaken divalent-metal-ion binding, because elevated Mg2+ concentrations rescue processing activity in vitro (63). The rnc105 mutation also inhibits processing activity in vivo, and represents a glycine-to-serine change at position 44 (62). This residue occurs within a 10-amino-acid segment (NERLEFLGDS) that is also present within the yeast dsRNase, Pac1, and the RNase III-like enzyme of Coxiella burnetii (64, 64a). The role of this conserved sequence in RNase III function has not been defined. The C-terminal third of the rnc polypeptide contains a consensus dsRBD motif (52, 61). The rev3 mutation changes alanine to a valine at position 211 (62), which corresponds to a conserved residue within the dsRBD. The rev3 mutation does not noticeably affect RNase III processing in vivo, although it suppresses a specific mutation in ribosomal protein S12 that causes a cold-sensitive defect in 30-S ribosomal subunit assembly and/or function (65).
The catalytic mechanism of RNase III is not known, but recent studies provide a framework for a description. RNase III is a low-abundance protein, but it is easily overexpressed and purified (66-68). RNase III processing obeys Michaelis-Menten kinetics, and its in vitro catalytic efficiency is comparable to that of other nucleic-acid processing enzymes, including E. coli RNase P, E. coli RNase H, and restriction endonuclease EcoRI (68). Given the requirement for a divalent metal ion and the apparent involvement of at least one carboxyl group in the chemical step, it is possible that RNase III utilizes the "two-metal-ion" mechanism (e.g., see Refs. 69 and 70). However, other mechanisms are equally likely. Because the 2'-hydroxyl group adjacent to the scissile bond is not required for cleavage (71), the unreactivity of DNA or RNA-DNA hybrids does not reflect the specific absence of this group at the scissile bond. Biological processing substrates of RNase III undergo precise enzymatic cleavage. A necessary but not sufficient requirement for reactivity is the


presence of approximately 20 bp of dsRNA (i.e., two turns of the A-form double helix), within which occur(s) the cleavage site(s). To rationalize the cleavage specificity, one model proposed that RNase III acts as a "molecular ruler," whereby the scissile bond is selected by its distance from one end of the dsRNA element (72, 73). However, mutational analysis of a T7 phage processing signal showed that the length of the dsRNA element does not dictate cleavage site choice, although it does determine overall reactivity (74). Other structural features can determine the reactivity pattern. For example, asymmetric internal loops can enforce single cleavage, whereas altering the internal loop to fully Watson-Crick base-paired form restores double cleavage (74) (Fig. 5). It was proposed that the internal loop folds into the major grooves of the adjacent double helices, forming a "dsRNA-mimicry" structure, which allows only single cleavage (75). This model is not supported by mutational analysis and NMR studies of a representative substrate (74, 76). Internal loops in RNase III processing signals (and other RNAs) instead exhibit a more formal helical shape, which is most likely stabilized by non-Watson-Crick base-pairing interactions (76).

[Figure 5 graphics not reproduced: secondary-structure drawings, panels A-C.]

FIG. 5. Structure of the bacteriophage T7 R1.1 RNase III processing signal (B), which undergoes single enzymatic cleavage in the internal loop. Also shown are two R1.1 variants that exhibit fully Watson-Crick base-paired internal loops, and that undergo coordinate double cleavage.

[Figure 6 graphics not reproduced: schematic duplexes with cleavage sites marked; labels include "RNase III" and "11 bp" in panel A, and "DrdI" and "6 bp" in panel B.]

FIG. 6. (A) The consensus model for an RNase III processing signal (see also Refs. 61 and 73). The overall length is approximately 22 bp, or two turns of the RNA double helix. The nucleotides in outlined form represent the conserved base-pairs; the N,N' pairs represent any base-pair combination; the W,W' pairs indicate U·A or A·U base-pairs, whereas the N,n pairs indicate that Watson-Crick base-pairing is not a strict requirement. "(NNNN)x" and "(nnnn)y" are used to indicate that the two opposed segments are not necessarily equal in length, nor necessarily complementary. For example, in the R1.1 processing signal, x = 5 and y = 4 (see Fig. 5). (B) The recognition sequence for restriction endonuclease DrdI. Note the similar pattern of cleavage and placement of the conserved base-pairs, which in this case spans one turn of the B-DNA helix.

The participation of base-pair sequence in establishing RNase III processing signal reactivity has been controversial. RNase III is not a base-specific enzyme, because the nucleotides that immediately flank the scissile bond are not conserved. A number of substrates exhibit a short, conserved base-pair sequence element (CUU·GAA) proximal to the cleavage site. However, base-pair substitutions within this element do not block accurate cleavage of a T7 phage substrate (77). It was therefore proposed that the processing signal identity elements (whether or not specific base-pairs are involved) are spatially dispersed and degenerate in nature (77). There now is evidence for base-pair sequence involvement in processing substrate reactivity. Alignment of the sequences of RNase III substrates with respect to their cleavage sites revealed a more extensive, albeit loosely conserved base-pair consensus motif (73) (Fig. 6A). The consensus base-pair set spans approximately two turns of the double helix, and exhibits a hyphenated dyadic symmetry centered about the cleavage sites. A single turn of the double helix would therefore contain one copy of the consensus base-pair set. The variability in


base-pair sequence establishes the degenerate character of the identity elements. Preliminary studies indicate that base-pair substitutions within the conserved sequence set can inhibit cleavage by weakening enzyme binding (K. Zhang and A. W. Nicholson, unpublished). The studies summarized above provide a preliminary structure-function model of RNase III, and a qualitative description of the processing pathway. RNase III contains substrate-binding, catalytic, and subunit dimerization domains. The substrate-binding and catalytic domains are physically and functionally separable (Fig. 7). The C-terminal third of the rnc polypeptide, containing the dsRBD, is involved in substrate binding. Preliminary results indicate that the isolated dsRBD of RNase III can bind substrate, but cannot catalyze cleavage (A. Amarasinghe and A. W. Nicholson, unpublished). The location of point mutations that abolish cleavage suggests that the catalytic domain is contained within the N-terminal two-thirds of the enzyme. The separability of substrate-binding and catalytic domains also implies that recognition is not necessarily coupled to catalysis, and that under certain circumstances, RNase III may act as a dsRNA-binding protein. There is preliminary evidence for such an alternative function of RNase III in which specific RNA structures allow RNase III binding, but block cleavage (61, 78). The twofold symmetries of RNase III and dsRNA imply that processing can occur within a symmetrical enzyme-substrate complex. The model proposes that the dsRBD of each subunit binds a substrate half-site (one turn of dsRNA), which contains a single consensus base-pair set (see Fig. 6A). Substrate binding is accompanied by a change in the enzyme-substrate complex, such that the catalytic site (one per subunit) is positioned next to one of the two scissile bonds (Fig. 8). The chemical step then occurs, followed by product release.
The involvement of two catalytic sites in the processing reaction means that each strand is cleaved independently. Thus, a substrate half-site may be

[Figure 7 graphics not reproduced: linear map of the 226-residue polypeptide showing the Catalytic Domain and the dsRNA-Binding Domain; legible residue labels include 1, 97, 117, 152, 211, and 226.]

FIG. 7. The primary structure of RNase III polypeptide, indicating the dsRBD (shaded area) and catalytic domain. The black bars indicate sequence identity with the yeast Pac1 nuclease. The sites of specific mutations in RNase III are indicated, and the exact positions (amino-acid number) are given below the diagram. This model predicts that each subunit of the RNase III dimer has a separate substrate binding site and catalytic center (see Section VI,A for further discussion).


[Figure 8 graphics not reproduced.]

FIG. 8. The RNase III processing reaction, indicating that double cleavage of dsRNA is a coordinated but not necessarily a concerted reaction.

sufficient to confer substrate reactivity, if the corresponding scissile bond is appropriately positioned in the active site of the bound subunit. This model can rationalize the influence of substrate structure on reactivity. Disruption of secondary structure immediately surrounding the cleavage site (for example, by the presence of an asymmetric internal loop) abolishes the local twofold symmetry in the enzyme-substrate complex. This would allow the placement of only one of the two scissile phosphodiesters in an active site, resulting in single-strand cleavage. Are there other nucleic-acid-processing enzymes whose mechanisms are relevant to consider in thinking about the RNase III processing reaction? It has been useful to regard RNase III in light of what is known of the DNA restriction endonucleases (see also Ref. 61). Restriction enzymes can cleave at noncanonical sites (i.e., exhibit "star" activity) in low-salt buffers, in the presence of organic cosolvents, or in the presence of divalent metal ions other than Mg2+ (e.g., Mn2+ or Co2+) (79). The noncanonical sites are usually degenerate forms of the recognition sequence. RNase III exhibits star-cleavage activity under comparable conditions, in which secondary sites are cleaved in addition to the primary processing sites (68, 80, 81). Secondary cleavage sites are not normally used in vivo; they usually contain a


smaller dsRNA element, and often exhibit base-base mismatches or other deviations from regular dsRNA. Restriction enzymes show a diversity of primary structure, and it has been argued that the type of recognition site (e.g., the occurrence of hyphenated symmetry) and cleavage pattern (e.g., 5' or 3' overhang of one, two, three, or four nucleotides) dictates the relative placement and structures of the substrate-binding and catalytic sites (79). Therefore, assuming an involvement of base-pair sequence in RNase III substrate recognition, a formal relative of RNase III would be the restriction endonuclease DrdI. This enzyme recognizes the hyphenated sequence, GACNNNN/NNGTC, and cleaves to provide product ends with two-nucleotide 3' overhangs (Fig. 6B). It may be informative to compare and contrast the structures and mechanisms of RNase III and DrdI, with due attention given to the fundamental structural differences between the respective substrates.
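The cleavage geometry shared by DrdI and RNase III, a staggered double cut that leaves two-nucleotide 3' overhangs on both products, can be sketched in a few lines. The example below works on a DrdI-type site written in the DNA alphabet; the bases filling the N positions are arbitrary, and the function names and cut-coordinate convention are hypothetical. It illustrates only the geometry of the cut, not how either enzyme selects its site.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def staggered_cut(top: str, top_cut: int, overhang_3prime: int = 2):
    """Cut both strands of a fully paired duplex, leaving 3' overhangs.

    top      -- top strand, 5'->3'
    top_cut  -- number of nucleotides to the left of the top-strand cut
    Returns ((left_top, left_bottom), (right_top, right_bottom)); every
    strand is written 5'->3'.
    """
    n = len(top)
    bottom = reverse_complement(top)
    # In top-strand columns, the bottom-strand cut lies `overhang_3prime`
    # columns to the left of the top-strand cut.
    bottom_cut_col = top_cut - overhang_3prime
    if not (0 < bottom_cut_col < top_cut < n):
        raise ValueError("cut positions fall outside the duplex")
    # The bottom strand, read 5'->3', starts at the right-hand end of the duplex.
    bottom_cut = n - bottom_cut_col
    left = (top[:top_cut], bottom[bottom_cut:])
    right = (top[top_cut:], bottom[:bottom_cut])
    return left, right


if __name__ == "__main__":
    site = "GACTTAGCAGTC"  # GACNNNN/NNGTC with arbitrary bases at the N positions
    left, right = staggered_cut(site, top_cut=7)
    print("left product :", left)
    print("right product:", right)
```

In each product one strand is two nucleotides longer than its partner, and the extra nucleotides sit at that strand's 3' end. For an RNA substrate the same bookkeeping applies with U in place of T.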

B. Cobra Venom Ribonuclease (RNase V1)

RNase V1 is one of several nuclease activities present in the venom of the central Asian cobra, Naja naja oxiana (11, 82). RNase V1 preferentially degrades dsRNA, but also cleaves helical ssRNA, whereas DNA is not a substrate (12). The physical properties of RNase V1 are unknown, because the enzyme has not been purified to homogeneity. Studies using partially purified enzyme demonstrated that RNase V1 is a phosphodiesterase that requires Mg2+, creates 5'-phosphate termini, and is inhibited at salt concentrations above 100 mM (11, 12). Specific nucleotide sequences are not important for recognition (83). The minimum size for an RNase V1 substrate is approximately four to six nucleotides, which corresponds to the number of ionic contacts established on enzyme binding (12). To reconcile the ability of RNase V1 to cleave dsRNA as well as helical ssRNA, it was proposed that the enzyme recognizes the helical sugar-phosphate backbone (12). RNase V1 has been used to map helical or double-helical regions in RNA. Careful interpretation of RNase V1 structure mapping results is required because studies on tRNA reveal that RNA regions not engaged in a canonical double helix are sensitive to RNase V1, and that double-stranded regions are not uniformly reactive (11, 83). It is clear that the interaction of RNase V1 with its substrates depends on additional parameters that as yet are not well understood.

C. dsRNase Activities Mechanistically Related to Pancreatic RNase

As discussed in Section II, a key diagnostic feature of the RNA double helix is its resistance to RNase A in high salt, and a corresponding sensitivity in low salt. How can a ssRNA-specific nuclease degrade dsRNA? It was proposed that low salt increases interstrand coulombic repulsion between phosphate oxygens, such that the dsRNA is denatured to single-stranded form. However, the RNA double helix is stable under these conditions. A series of investigations analyzed the degradation of dsRNA by RNases mechanistically related to RNase A (i.e., the cyclizing-decyclizing phosphotransferases), including bovine seminal plasma ribonuclease (RNase BS-1) (2, 84, 85). It was initially proposed that the homodimeric structure of RNase BS-1 confers efficient recognition and cleavage of dsRNA, wherein each subunit cleaves one of the two RNA strands. In support of this hypothesis, it was shown that artificially dimerized RNase A can degrade dsRNA under conditions where the monomeric form is inactive (86). However, it was subsequently shown that the monomeric form of RNase BS-1, obtained through reduction/alkylation, exhibits a dsRNase activity comparable to that of the native dimer (87). Examination of the primary structures and dsRNase activities of a number of RNase A-related ribonucleases revealed a correlation between polypeptide basicity and dsRNA cleavage ability. Specifically, the more basic ribonucleases possess a more efficient dsRNase activity. Moreover, the dsRNase activity of RNase A is greatly enhanced by the covalent linkage of spermine residues (84). Studies on RNase A binding to double-helical DNA (which permits measurement of enzyme binding without cleavage) demonstrated that RNase A binds and stabilizes local single-stranded regions. RNase A and its relatives can therefore be regarded as nucleic-acid-melting proteins which can bind dsRNA by taking advantage of the dynamic "breathing" of the double helix. Binding to the ssRNA regions would be followed by cleavage. The low-salt enhancement of dsRNA cleavage by RNase A and its relatives would derive from an increased dsRNA breathing rate, due to increased internal electrostatic repulsion.
The two-step mechanism for dsRNA degradation by RNase A and RNase BS-1 is also consistent with the stereoelectronic restraints on the cyclization/cleavage pathway. Attack of a phosphodiester linkage by the adjacent 2'-oxygen would ordinarily be disallowed within the context of the double helix (Section V), but would proceed when a single-stranded segment is produced on enzyme binding. A study of the dsRNA-binding properties of catalytically inactive mutants of RNase A or BS-1 could determine how enzyme binding participates in helix destabilization, how the salt concentration influences the binding and cleavage of dsRNA, and how specific posttranslational modification (84) may stimulate the dsRNase activity of otherwise ssRNA-specific activities. In contrast to RNase A and its relatives, such phosphodiesterases as RNase III would not necessarily require a single-stranded segment as substrate. Because these enzymes employ an activated water molecule as the nucleophile, the phosphodiester linkage can be cleaved through an in-line mechanism, which would be stereoelectronically allowed within a double-helical structure.


ALLEN W. NICHOLSON

VII. dsRNA Function in Prokaryotes

A. Gene Regulation by Ribonuclease III

Insight into the role of RNase III in E. coli RNA metabolism was provided by the isolation of the rnc105 mutation, which abolishes RNase III processing in vivo (88). The 30-S RNA species that accumulates in rnc105 mutant strains represents the primary transcription product of the rRNA operons. RNase III processing of the primary transcript creates the immediate precursors to the 16-S and 23-S rRNAs (61). The viability of RNase III⁻ strains indicates that other processing activities provide alternate rRNA maturation pathways (89). A number of cellular mRNAs also are processed by RNase III (Table I). Although the list of RNase III targets is undoubtedly incomplete, their encoded functions indicate that RNase III regulates expression of components involved in the flow of genetic information (i.e., the synthesis, maturation, function, and decay of RNA).

In addition to its role in the metabolism of specific cellular RNAs, RNase III processes transcripts expressed by a wide range of phage and accessory genetic elements. RNase III cleaves RNAs encoded by phage T7 and its relatives, as well as transcripts of phages T4 and lambda (61). Plasmids and transposons express RNAs that contain RNase III processing signals, and antisense RNA binding to their targets provides RNase III substrates (Section VII,B).⁴

RNase III processing can control gene expression by altering mRNA translational activity. The translation of most prokaryotic mRNAs depends on the accessibility of the mRNA Shine-Dalgarno (SD) sequence to the complementary (anti-SD) sequence at the 3’ end of the 16-S rRNA. RNase III processing within the 5’ untranslated region (5’-UTR) of an mRNA can enhance translation by disengaging the SD sequence from secondary structure, promoting 30-S subunit binding. For example, RNase III cleavage within the 5’-UTR of the T7 polycistronic early transcript creates the mature 0.3 gene mRNA, and also stimulates the production of the 0.3 protein (90). RNase III cleavage within the 3’-UTR enhances translation of the T7 1.1/1.2 mRNA, apparently by disrupting a long-range RNA-RNA interaction (91). A

⁴ RNase III processing signals are relatively abundant in coliphages and accessory genetic elements. It was speculated that RNase III may protect the cell against infection by RNA phage (as well as other phage) by attacking dsRNA replicative intermediates or viral mRNAs (72). An original antiviral function of RNase III may have been subsequently subverted by phage and extrachromosomal elements to their advantage (61). To speculate further, RNase III may represent a modern version of a primitive cellular activity that restricted genetic exchange at the RNA level. Such an activity would have been potentially toxic to the cell, given the ubiquity of dsRNA structures, and would need to have been tightly regulated, or cellular dsRNAs subtly altered to avoid cleavage.


DOUBLE-STRANDED RNA

TABLE I
Escherichia coli RIBONUCLEASE III PROCESSING SIGNALSᵃ

| Operon | Encoded functions | No. of sites | Processing signal function |
|---|---|---|---|
| rrnA-H | 16-S, 23-S, 5-S rRNA; tRNAs | 2 | Maturation of rRNAs; tRNA |
| rnc-era-recO | RNase III, Era, RecO proteins | 1 | Initiation of mRNA decay |
| rpsO-pnp | r-Protein S15, PNPase | 1 | Initiation of mRNA decay |
| metY-nusA-infB | tRNA(Met); NusA protein, IF2 | 1 | Initiation of mRNA decay; tRNA maturation |
| rplK,A,J,L-rpoB,C | r-Proteins L1, L7/L12, L10, L11; β, β’ RNA polymerase subunits | 1 | Modulation of mRNA expression (?) |
| secE-nusG | SecE, NusG proteins | 1 | Modulation of mRNA expression (?) |

ᵃ See Section VII,A and Ref. 61 for further discussion of the structures, reactivities, and functions of the listed RNase III processing signals.

recent report describes an RNase III processing signal within an mRNA coding sequence (92), whose cleavage down-regulates expression of the encoded protein. RNase III may also control translation by binding to a specific site without concomitant cleavage (78). The binding event may induce an mRNA conformational change that enhances translation initiation.

RNase III processing can also control gene expression by altering mRNA stability. Cleavage within mRNA 3’-UTRs can provide a 3’ hairpin that blocks the action of 3’ → 5’ exonucleases, such as polynucleotide phosphorylase (61). The in vivo stability of the T7 phage early mRNAs is established in part by 3’ hairpins, created by RNase III processing (93). Alternatively, cleavage within a 3’-UTR can remove an RNA hairpin or other secondary structure, thereby accelerating mRNA decay. For example, RNase III cleavage of the phage lambda sib regulatory element removes an RNA hairpin, thereby promoting 3’ → 5’ exonucleolytic digestion into the upstream integrase coding region, suppressing protein production (61, 94). RNase III cleavage within a 5’-UTR can also initiate RNA turnover. In this instance, RNase III processing can facilitate subsequent cleavage by degradative endonucleases, such as RNase E (95). This mechanism is involved in the autoregulated production of RNase III (96), and the negative control of polynucleotide phosphorylase (PNPase) (97). With regard to the latter event, RNase III⁻ strains exhibit altered RNA metabolism (98); this may result in part from the elevated levels of PNPase, which would accelerate the degradation of PNPase-sensitive mRNAs.


RNase III activity can be controlled through covalent modification. RNase III is phosphorylated on serine in the T7-infected cell by a phage-encoded protein kinase, which enhances processing activity (99). Because T7 infection shuts off host protein synthesis, the T7-directed phosphorylation may allow the limited amounts of RNase III to process efficiently the large quantities of the T7 mRNAs, many of which have RNase III cleavage sites (93). The phosphorylation may confer an additional degree of stability to the T7 messages by enhancing PNPase mRNA cleavage, thereby suppressing PNPase production (97), which may be involved in T7 mRNA degradation. It is not known whether RNase III is a target for a cell-encoded protein kinase, but some form of regulation is feasible to consider, as RNase III can bind ATP (66), and may interact with other proteins (e.g., see Refs. 67 and 100).
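The SD/anti-SD pairing discussed in this section can be illustrated with a minimal sketch. It assumes that the anti-SD region is the commonly cited 3’-terminal segment of E. coli 16-S rRNA (5’-CACCUCCUUA-3’), a sequence that is not given in the text; the function names and the example mRNA are hypothetical. The sketch only searches for perfect Watson-Crick complementarity and ignores the secondary-structure effects that RNase III cleavage is proposed to relieve.

```python
# Assumption: commonly cited 3'-terminal segment of E. coli 16-S rRNA (5'->3').
ANTI_SD = "CACCUCCUUA"

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence (5'->3')."""
    return "".join(PAIR[base] for base in reversed(rna))


def longest_sd_match(mrna, min_len=4):
    """Find the longest mRNA stretch perfectly complementary to part of ANTI_SD.

    Returns (start_index, segment), or (-1, "") if no stretch of at least
    min_len nucleotides is found.
    """
    ideal_sd = reverse_complement(ANTI_SD)  # sequence that pairs with the anti-SD
    best = (-1, "")
    for i in range(len(mrna)):
        for j in range(i + min_len, len(mrna) + 1):
            segment = mrna[i:j]
            if segment in ideal_sd and len(segment) > len(best[1]):
                best = (i, segment)
    return best


if __name__ == "__main__":
    # Hypothetical 5' leader ending in an AUG start codon.
    leader = "GGCAUAAGGAGGUAACAUAUG"
    print(longest_sd_match(leader))
```

A real analysis would also need to ask whether the matched stretch is single-stranded in the folded mRNA, which is the property that processing within the 5’-UTR is suggested to change.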

B. dsRNA and Antisense Regulation

Antisense RNAs bind to complementary sequences in target RNAs, forming specific RNA·RNA duplex structures, which can alter target function. Antisense RNAs can be generated through transcription of all or part of the target gene complementary strand, or expressed from an unlinked locus. Extensive studies on natural antisense RNAs have been spurred by the inherently interesting properties and mechanisms of action of these regulatory molecules, and by the development of antisense technology for the directed control of gene expression (for recent comprehensive reviews, see Refs. 101-103). Prokaryotic antisense RNAs act primarily as negative regulatory elements. For example, antisense RNA binding may directly sequester an mRNA translation initiation region, or inhibit target RNA function through an allosteric mechanism. Prokaryotic antisense RNA action does not necessarily require full-length duplex formation, and moreover, although the dsRNA product is formally an RNase III substrate, enzymatic degradation is often not necessary for regulation. This section reviews several natural antisense RNA-mediated regulatory mechanisms, and the role of dsRNA in antisense action.

1. ColE1 PLASMID REPLICATION CONTROL

Initiation of replication of plasmid ColE1 requires RNA primer formation, which is negatively controlled by an antisense RNA (104). The 3’ end of the RNA primer for leading strand DNA synthesis is created through site-specific RNase H cleavage of the precursor transcript, RNA II. Cleavage is inhibited by RNA I, a plasmid-encoded antisense transcript of 108 nt. Specifically, RNA I base-pairs with RNA II within a specific segment upstream of the RNase H cleavage site. Duplex formation causes a conformational change in RNA II, which suppresses stable formation of the RNA·DNA duplex target for RNase H. The RNA I·RNA II duplex is ultimately degraded by RNase III, but this event is not required for negative regulation (104).


Extensive investigations provide a detailed description of the specific structural features in RNA II and RNA I that promote duplex formation and a pathway for RNA I action (104). RNA I and RNA II initially engage in a “kissing” interaction, in which reversible base-pairing occurs between complementary hairpin-loop nucleotides in each RNA. The kissing reaction is the rate-limiting step for the association of the two RNAs, and mutations in the loops that abolish complementarity suppress negative regulation. A ColE1 plasmid-encoded protein (Rom) enhances negative control by stabilizing the kissing complex (105). Formation of a stable dsRNA complex involves pairing of the single-stranded 5’ end of RNA I with the complementary sequence in RNA II. The creation of a nucleation center for dsRNA formation at a location separate from the kissing site avoids the topological barrier to double-helix formation involving two closed, complementary loops (104).

2. R1 PLASMID REPLICATION CONTROL

The replication of plasmid R1 depends on the synthesis of the plasmid-encoded protein RepA, which participates in the initiation step (106). RepA protein production is negatively controlled at the translational level by the plasmid-encoded CopA RNA: an approximately 90-nt, constitutively synthesized antisense transcript (107). The steady-state levels of CopA RNA directly reflect plasmid copy-number, because CopA RNA has a short metabolic

[Figure 9 image not reproduced. Recoverable labels: loop I, loop II, binding of CopA, no binding, middle region, tail of CopA.]

FIG. 9. Mechanism of CopA antisense RNA action. (A) The secondary structure of CopA RNA, indicating hairpin loops I and II. (B) The overall mechanism for CopA interaction with its target, leading to repression of RepA protein production (see Section VII,B for further discussion). Reprinted by permission of Oxford University Press from Ref. 135.


half-life. CopA RNA exhibits two hairpins, the loop nucleotides of which are available for binding to complementary sequences within the repA mRNA 5’ leader region (termed CopT) (Fig. 9). A stable kissing interaction between complementary loops in the CopA RNA and the CopT sequence is followed by dsRNA formation at a site separate from the kissing loops. The binding of CopA RNA to CopT sequesters a short upstream reading frame, tap, preventing its translation and therefore also that of repA, which is translationally coupled to tap (108, 109) (Fig. 9). The kissing interaction alone may be sufficient for inhibition of repA translation (110), and although the CopA·CopT duplex represents a target for RNase III, the absence of RNase III has only a minor effect on the translational activity and metabolic stability of repA mRNA (110, 111).
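The dependence of the kissing interaction on loop complementarity, described above for RNA I/RNA II and CopA/CopT, can be sketched as a simple sequence check. This is an illustration only: the loop sequences below are hypothetical, not those of the natural RNAs, and the sketch counts only contiguous Watson-Crick pairs in antiparallel orientation, ignoring loop geometry and any protein cofactor such as Rom.

```python
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence (5'->3')."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_common_substring(a, b):
    """Return the longest contiguous substring of a that also occurs in b."""
    best = ""
    for i in range(len(a)):
        for j in range(i + 1, len(a) + 1):
            if j - i > len(best) and a[i:j] in b:
                best = a[i:j]
    return best


def kissing_pairs(loop_a, loop_b):
    """Longest run of contiguous antiparallel base pairs between two loops."""
    return len(longest_common_substring(loop_a, reverse_complement(loop_b)))


if __name__ == "__main__":
    wild_type = kissing_pairs("GGCGAAC", "GUUCGCC")  # fully complementary loops
    mutant = kissing_pairs("GGCGAAC", "GUUAGCC")     # one substitution in loop B
    print(wild_type, mutant)
```

In this toy model a single substitution in the middle of a seven-nucleotide loop splits the pairing into two short runs, in line with the observation that loop mutations abolishing complementarity suppress negative regulation.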

3. REGULATION OF PLASMID KILLER-GENE EXPRESSION

Several plasmids are maintained through expression of killer genes, whose products destroy plasmid-free segregants. Analysis of the R1 plasmid hok/sok system provides insight into the mechanism of action of plasmid killer genes, and the regulation of their expression by antisense RNA. The R1 plasmid hok mRNA encodes the Hok (host-killing) protein, which causes cell death by damaging the cytoplasmic membrane (112, 113). An antisense transcript, termed Sok (suppressor-of-killing) RNA, down-regulates Hok protein expression. Specifically, Sok RNA (67 nt) is complementary to the translation-initiation region (TIR) of the mok (modulator-of-killing) gene, which overlaps the hok coding sequence in a separate reading frame. Sok RNA binding creates a duplex that sequesters the mok TIR, and suppresses Hok protein production, because the hok and mok cistrons are translationally coupled (114). Sok RNA binding also accelerates the decay of the RNA, presumably through the action of RNase III (115). Sok RNA binding to Hok mRNA does not proceed through the interaction of complementary loops, but involves a single-stranded region at the 5’ end of Sok RNA (115).

The killing of cells that lack the R1 plasmid depends on (i) the persistence of the sok and hok/mok RNAs in the segregants and (ii) differential RNA decay rates. In the absence of continued transcription in plasmid-free cells, the more rapid decay of sok RNA allows translation of hok mRNA and production of the toxic Hok protein. An important additional facet of this mechanism is that the Hok mRNA must undergo enzymatic cleavage within the 3’-UTR in order to become active translationally (115). Cleavage allows translation by apparently disrupting a long-range RNA·RNA interaction between the mok TIR and the 3’-UTR. The RNA processing activity has not been identified. The 3’-UTR sequence therefore provides an important negative regulatory element, not only in preventing the inappropriate synthesis of Hok protein in plasmid-containing cells, but in preventing premature Hok mRNA degradation resulting from Sok RNA binding and RNase III attack.
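The role of differential decay rates in post-segregational killing can be made concrete with a minimal sketch. It assumes, beyond what the text states, that both RNAs decay with simple first-order kinetics once transcription stops and that translation of hok mRNA becomes possible when the Sok/hok ratio falls below some threshold. All names, half-lives, starting amounts, and the threshold are hypothetical placeholders, not measured values.

```python
import math


def remaining(initial, half_life, t):
    """Amount of RNA left after time t, assuming first-order decay."""
    return initial * 0.5 ** (t / half_life)


def time_to_ratio(sok0, hok0, sok_half_life, hok_half_life, ratio):
    """Time after plasmid loss at which the Sok/hok ratio falls to `ratio`.

    Assumes no new transcription and requires Sok RNA to decay faster
    than hok mRNA; returns 0.0 if the ratio is already at or below target.
    """
    if sok_half_life >= hok_half_life:
        raise ValueError("Sok RNA must decay faster than hok mRNA")
    start_ratio = sok0 / hok0
    if start_ratio <= ratio:
        return 0.0
    # The ratio decays exponentially with the difference of the rate constants.
    k_diff = math.log(2) * (1.0 / sok_half_life - 1.0 / hok_half_life)
    return math.log(start_ratio / ratio) / k_diff


if __name__ == "__main__":
    # Hypothetical values: tenfold Sok excess, half-lives of 1 and 20 time units.
    t = time_to_ratio(10.0, 1.0, 1.0, 20.0, 1.0)
    print(t, remaining(10.0, 1.0, t), remaining(1.0, 20.0, t))
```

The sketch leaves out the 3’-UTR processing step that the text describes as necessary before hok mRNA becomes translationally active.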

4. CONTROL OF IS10 TRANSPOSASE EXPRESSION

Tn10 transposon movement is negatively regulated at the translational level by a 70-nt antisense transcript, termed RNA-OUT (116). RNA-OUT is complementary to the 5’ end portion of the transposase mRNA (RNA-IN). RNA-OUT binding to RNA-IN creates an approximately 35-bp duplex, which blocks translation by directly sequestering the TIR of the transposase cistron (117). The dsRNA segment is a substrate for RNase III, although RNase III is not required for negative regulation (118). RNA duplex formation is initiated by a kissing interaction involving the hairpin loop of RNA-OUT and the complementary sequence in RNA-IN. The secondary structure and mechanism of action of RNA-OUT are similar to those of several other plasmid antisense RNAs, the notable exception being that the kissing loop also serves as the nucleation site for full-length duplex formation (119). Tn10 transposition exhibits multicopy inhibition, wherein transposition frequency decreases with increasing Tn10 copy number. Effective multicopy inhibition is due to the metabolic stability of RNA-OUT, whose hairpin structure confers resistance to exo- and endoribonucleases (120).
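The link between antisense RNA stability and multicopy inhibition can be illustrated with a simple kinetic sketch. It rests on assumptions not made in the text: that RNA-OUT is synthesized at a constant rate $k_s$ per element copy, that it decays with first-order rate constant $k_d = \ln 2 / t_{1/2}$, and that $N$ denotes the copy number. All three symbols are introduced here for illustration.

```latex
\frac{d[\mathrm{R}]}{dt} = k_{s} N - k_{d}\,[\mathrm{R}],
\qquad
[\mathrm{R}]_{ss} = \frac{k_{s} N}{k_{d}} = \frac{k_{s} N \, t_{1/2}}{\ln 2}
```

Under these assumptions the steady-state antisense level scales with both copy number and half-life, which is consistent with the proposal that a stable RNA-OUT supports effective multicopy inhibition.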

5. CONTROL OF LYSOGENY IN BACTERIOPHAGE LAMBDA

Phage lambda expresses a 77-nt transcript (OOP RNA) that is complementary to a 55-nt segment containing the 3’ end of the lambda cII gene and the adjoining 22 nucleotides in the cII-O gene intercistronic region. Overexpression of OOP RNA from a plasmid reduces cII gene expression to approximately +m, through destabilization of the cII coding sequence (121). OOP RNA binding to its target allows RNase III cleavage within the cII-O intercistronic region, and the new 3’ end provides an initiation site for 3’ → 5’ exonucleolytic digestion into the cII coding sequence (73). This mechanism is similar to the sib-dependent retroregulation of lambda int mRNA expression (Section VII,A). The precise pathway of RNA·RNA duplex formation is unknown, because the secondary structures of OOP RNA and the target cII-O sequence have not been determined. OOP RNA is not involved in the lysis/lysogeny decision following infection (122). However, OOP RNA production following prophage induction antagonizes cII expression, thereby down-regulating cI repressor synthesis. The suppressed CI levels serve to enforce the lytic pathway (122). The specific involvement of OOP RNA in prophage induction is consistent with the dependence of OOP promoter activity on the LexA repressor (122).


6. HIGHLIGHTS OF OTHER ANTISENSE RNA-DEPENDENT REGULATORY MECHANISMS

Fertility (F) plasmid conjugation requires expression of the plasmid tra (transfer) operon, which is controlled by the transcriptional activator protein, TraJ. TraJ production is negatively regulated by the product of the plasmid finP (fertility inhibition) gene, a 78-nt antisense RNA that is complementary to the 5’ leader of the TraJ mRNA (123). Binding of FinP RNA occludes the TIR of the TraJ mRNA, repressing TraJ synthesis. The dsRNA segment formally provides a substrate for RNase III, but it is not known whether repression requires cleavage. This mechanism is formally similar to the antisense regulation of IS10 transposase expression (see Section VII,B,4). FinP RNA is stabilized by the finO gene product, a protein that also enhances the binding of FinP RNA to TraJ mRNA (124).

The c4 repressors of bacteriophages P1 and P7 are antisense RNAs of approximately 77 nt that regulate expression of the phage ant (antirepressor) gene (125). Upstream of and overlapping ant is an open reading frame, icd (formerly o f l ), which is required for ant expression. c4 RNA binding to its complementary target sequence represses icd translation, which in turn represses ant expression through inducing early transcription termination (126). The c4 RNA is cotranscribed with icd and ant, and at least one processing event is required for the maturation of c4 antisense RNA (125).

The E. coli FtsZ protein is involved in the septation step of cell division. The FtsZ protein levels are controlled by a variety of factors. A 53-nt RNA (DicF RNA), encoded by the dicF gene of a defective prophage, acts as a negative regulator of FtsZ protein production (127, 128). DicF RNA is complementary to a segment of the ftsZ mRNA containing the TIR (128). Preliminary experimental evidence indicates that DicF RNA inhibits FtsZ protein production by blocking 30-S subunit recognition of the ftsZ TIR (127, 128).

An E. coli cell-encoded antisense RNA, MicF RNA, has been implicated in regulating the expression of the outer membrane protein, OmpF. MicF RNA is transcribed from an unlinked locus, and is complementary to the 5’ end of OmpF mRNA (129, 130). The MicF-dependent reduction in OmpF protein production precedes the drop in steady-state levels of OmpF mRNA, indicating that repression occurs through translation inhibition rather than by mRNA destabilization (130). There also is evidence for specific protein binding to the antisense RNA, suggesting that MicF RNA functions as an RNA-protein complex (131). Perhaps the protein stabilizes MicF RNA in a manner similar to the stabilization of FinP RNA by FinO protein.

7. ANTISENSE RNA DESIGN STRATEGIES

An important experimental objective is to achieve targeted control of gene expression. “Designer” antisense RNAs can provide such control at the post-transcriptional level, and are particularly well-suited to negatively regulating the expression of genes essential or otherwise inaccessible to other forms of control. It was originally speculated that antisense RNAs with optimal activity would be relatively unstructured and specific for a comparably unstructured, functionally essential region in the target. Several studies examined the efficacy of artificial antisense RNAs, expressed from “reversed” copies of the target genes (132-134). Reversed gene expression was shown to inhibit target mRNA expression, and optimal inhibition was observed when the antisense transcript was complementary to the TIR of the target mRNA (132, 134). However, the requirement for relatively large amounts of the antisense RNA indicated an inherent inefficiency of action. Placing a TIR at the 5’ end of the reversed gene transcript increased the effectiveness of inhibition (134). The TIR may promote ribosome binding, which would block RNA degradation that initiates at the 5’ end. Incorporation of a transcriptional terminator structure at the 3’ end of the reversed RNA also increased the inhibition, and it was hypothesized that the terminator permits a higher rate of antisense RNA synthesis (134). Alternatively, the terminator hairpin may act as a 3’-end stabilizer, protecting the antisense RNA from 3’ → 5’ exonucleolytic decay.

In contrast to “reversed gene” transcripts, natural antisense RNAs reflect sophisticated design principles. As evidenced by the examples described above, these RNAs are typically small (50-110 nt), with a high degree of secondary structure and specific noncanonical elements that afford protection against degradation (134a). The loop structures appear to provide optimal recognition of the target RNA, and bases within the stem can influence the antisense interaction (135, 135a). The precise nature of the kissing interaction between loops must be carefully considered for proper function. Recognition loops typically contain five to seven nucleotides, and loops with fewer or more nucleotides usually exhibit a decreased rate of stable complex formation. However, antisense and target RNAs that contain significantly larger loops can interact productively, wherein duplex formation directly propagates from the site of initial binding (136). Finally, the ability of a small antisense RNA to hybridize to a model RNA hairpin is sensitive to the exact placement of the target sequence within the hairpin loop, and dependent on specific structural features of the stem (137).
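The design features summarized above can be collected into a small checklist, sketched here as a hypothetical helper. The numerical ranges (50-110 nt overall length, five to seven nucleotides per recognition loop) and the preference for TIR complementarity are taken from the text; the class, field, and function names are illustrative assumptions. Because larger loops can still interact productively (136), the function returns warnings rather than rejecting a design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AntisenseDesign:
    """Hypothetical summary of a candidate antisense RNA."""
    length_nt: int
    loop_sizes: List[int] = field(default_factory=list)  # nt per recognition loop
    targets_tir: bool = True


def design_warnings(design):
    """Return a list of departures from the features of natural antisense RNAs."""
    warnings = []
    if not 50 <= design.length_nt <= 110:
        warnings.append("length outside the 50-110 nt range of natural antisense RNAs")
    for index, size in enumerate(design.loop_sizes, start=1):
        if not 5 <= size <= 7:
            warnings.append(
                f"loop {index} has {size} nt; recognition loops typically have 5-7 nt"
            )
    if not design.targets_tir:
        warnings.append("antisense RNA is not complementary to the target TIR")
    return warnings


if __name__ == "__main__":
    candidate = AntisenseDesign(length_nt=300, loop_sizes=[3], targets_tir=False)
    for message in design_warnings(candidate):
        print(message)
```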

8. dsRNA AND RIBOZYME FUNCTION IN PROKARYOTES

Ribozyme-catalyzed cleavage of RNA incorporates the essential features of antisense RNA action, in that trans-acting ribozymes recognize their target through complementary base-pairing. Because ribozymes act catalytically rather than stoichiometrically, a higher efficiency of action may be realized. Targeted cleavage of bacterial RNA by ribozymes in vivo has not been extensively investigated, but a preliminary report suggested that a ribozyme was ineffective in E. coli (138). An explanation for the observed inefficiency was that the coupled synthesis and translation of bacterial mRNA reduces the accessibility of the target sites (138, 139). A recent study demonstrates that a ribozyme can function with reasonable efficiency in the bacterial cell (140). A plasmid-encoded ribozyme was targeted to a site within the coding region of the A2 gene of the RNA coliphage, SP. Expression of the ribozyme suppressed phage growth. The inhibition presumably occurred through site-specific cleavage, because a catalytically inactive version of the ribozyme only weakly inhibited phage growth. The rapid in vivo turnover of the RNA prevented direct confirmation of cleavage at the predicted site. The corresponding antisense RNA was also able to inhibit SP phage infection, which may have been due to formation of the RNA·RNA duplex, followed by degradation by RNase III (140).

VIII. dsRNA Function in Eukaryotes

A. dsRNA and hnRNA

A significant fraction (approximately 5%) of the sequences in mammalian cellular heterogeneous nuclear RNA (hnRNA) can be isolated in double-stranded form (141). The dsRNA component has been identified by (i) resistance to RNase A in high salt, (ii) chromatographic behavior on CF-11 cellulose, and (iii) sensitivity to RNase III (141-143). Analysis of HeLa cell hnRNA revealed that the dsRNA component occurs, on average, every 2000-2500 nt, and is derived from the Alu family of repetitive sequence elements, of which there are approximately 300,000 copies per haploid genome (143-145). The size of the dsRNA ranges up to approximately 300 base-pairs (143, 146). Transcription of the Alu inverted repeat sequences would allow formation of intramolecular hairpin (“snap-back”) structures, as well as intermolecular duplexes. The latter process explains the tendency of hnRNA to aggregate, which can be reversed by brief heat treatment.

A portion of the dsRNA fraction of mammalian hnRNA is resistant to RNase III (143). The resistance may be due to the natural sequence heterogeneity of the Alu sequence family, which would provide mismatched intermolecular duplexes not recognized by RNase III (143). Alternatively (or in addition), the cleavage resistance may reflect the action of the dsRNA adenosine deaminase (Section VIII,C), which converts A·U to I·U base-pairs. This assumes that the dsRNA elements are present in vivo, and that I·U base-pairs can block RNase III action. It may be informative to determine the inosine content of hnRNA-derived dsRNA and whether hnRNA-specific dsRNA is a substrate for the dsRNA adenosine deaminase.

The presence of dsRNA in purified hnRNA could have a trivial explanation, in that the dsRNA is a product of phenol extraction during isolation. Phenol accelerates nucleic-acid-reassociation reactions (13). However, there are several lines of evidence for the occurrence of dsRNA within the eukaryotic cell nucleus. One study isolated hnRNP (hnRNA associated with specific nuclear proteins) by a gentle extraction procedure that omitted phenol, and applied differential nuclease sensitivity to demonstrate the presence of dsRNA within the hnRNP preparation (146). A subsequent study obtained cross-linking of dsRNA regions in vivo, using a photoreactive psoralen derivative that could be taken up by the cell (147). These and related studies (16) provide strong evidence that dsRNA is an intrinsic component of hnRNP, and is relatively accessible to nuclease digestion and photocross-linking. The presence of dsRNA in vivo also has been supported by immunocytochemical studies. Immunofluorescent staining by dsRNA-specific antibodies was observed in the nucleus of Vero cells and mosquito cells (14). There was no detectable immunofluorescence of the nucleolus or the cytoplasm of these cells. However, it should be noted that under the same conditions, other cell lines, which included HeLa, KB, BHK, and CEF cells, did not provide a detectable reaction (14).

The functional role of the dsRNA component of hnRNA is not known, but its nuclear localization has focused attention on several possibilities. The dsRNA component may provide a structure that organizes hnRNP and promotes specific interactions with the nuclear matrix, including those that facilitate nuclear-cytoplasmic transport.
Because dsRNA-binding proteins are implicated in developmental programs (52), it is possible that dsRNA elements in specific mRNA precursors, not necessarily related to the Alu sequences, provide protein binding sites or signals for trafficking, storage, and controlled expression. Alternatively, the nuclear dsRNA component may lack a specific function and be targeted for degradation by dsRNA-specific nucleases, the dsRNA adenosine deaminase, or the (2’-5’)A polymerase/RNase L system. Depending on their location within hnRNA, dsRNA elements may be removed along with introns, or by cleavage of 3’ trailer sequences. Normal cell function may require compartmentalization or masking of dsRNA. For example, given the lengths of the Alu-related dsRNA sequences (up to 300 bp), the inappropriate presence in the cytoplasm of these sequences could activate the dsRNA-dependent protein kinase and inhibit translation, as well as trigger interferon gene expression. It is also possible that during certain cellular events (e.g., nuclear envelope breakdown or altered RNA processing) nuclear-localized dsRNA may enter the cytoplasm and trigger specific changes in cell physiology.


B. dsRNase Activities

The discovery of E. coli RNase III and identification of its role in rRNA maturation prompted the search for a similar activity involved in eukaryotic rRNA processing. There is now good evidence for the existence of one or more dsRNase activities in mammalian cells (summarized in Table II), but there are scant data on their functional roles. Biochemical analyses provide limited information, because the activities have been difficult to purify. A cautionary note is provided by the observation that mycoplasmas, ubiquitous contaminants of mammalian cell lines, are a source of a dsRNase (148). A yeast dsRNase is here described first, because the enzyme bears a number of similarities to RNase III, and because there is some information available on its cellular role.

1. RNase III-RELATED ACTIVITIES IN YEAST

A dsRNA-specific nuclease in the yeast Saccharomyces cerevisiae was first detected using an in situ gel electrophoretic enzyme assay. The dsRNase activity degrades poly(rG)·poly(rC), and is associated with a 26-kDa polypeptide (149). Using a different approach, another study described an S. cerevisiae dsRNase of 27 kDa (150). This dsRNase required reducing agents for full activity, and was stimulated by KCl. Cell-growth experiments indicated that the dsRNase activity levels are higher in cells deprived of nutrients, and it was suggested that under these conditions the increased activity may enhance RNA turnover and ribonucleotide reutilization (150).

The Schizosaccharomyces pombe pac1 gene encodes a 41-kDa polypeptide that degrades dsRNA in vitro (64). The C-terminal portion of the Pac1 enzyme has a 25% amino-acid similarity with the complete primary structure of E. coli RNase III (64, 151). However, antibodies to RNase III do not react with the Pac1 enzyme, and neither the pac1 nor the rnc gene exhibits measurable activity in reciprocal complementation experiments (64). The role of Pac1 enzyme in RNA metabolism has been partially defined. The pac1 gene product (Pac1) is essential for vegetative growth (64), and overexpression of the enzyme inhibits entry into meiosis. It is possible that the enzyme suppresses meiotic gene expression during vegetative growth, and must be down-regulated to allow entry into meiosis (64). Alternatively, Pac1 may be required for the maturation of meiosis-specific transcripts. The Pat1 protein kinase may regulate Pac1 enzyme activity. Because pat1 mutants exhibit uncontrolled meiosis, the Pat1 enzyme inhibits meiosis, and must be suppressed (probably by the mei3 gene product) to allow the cell to enter meiosis. Because overexpression of Pac1 enzyme permits normal vegetative growth and sexual development of a pat1 ts mutant at the nonpermissive temperature (64), one scenario is that the Pac1 enzyme activity is stimulated during the vegetative phase by Pat1-dependent phosphorylation, thereby enforcing the suppression of meiotic gene functions. Entry into meiosis would be accompanied by dephosphorylation and down-regulation of Pac1 activity. Further evidence for Pac1 enzyme involvement in RNA metabolism is suggested by a report that overexpression of Pac1 suppresses a defect in small nuclear RNA (snRNA) maturation, caused by the temperature-sensitive snm1 mutation (152).

TABLE II
MAMMALIAN CELL AND VIRUS-ASSOCIATED dsRNA-CLEAVING ACTIVITIESᵃ

| Nameᵇ | Sourceᶜ | Sizeᵈ | Salt optima; other requirementsᵉ | Other features | Ref. |
|---|---|---|---|---|---|
| FV3 dsRNase | FV3 virions; cytoplasm of FV3-infected BHK cells | ND | Requires Mg2+ (~5 mM) | | 165, 166 |
| RSV dsRNase | RSV virions | ND | Requires Mg2+ | | 167 |
| - | Cytoplasm of Krebs II ascites cells | ND | ND | | 155 |
| RNase D | Cytoplasm of Krebs II ascites cells | 50-150 kDa | ND | | 154 |
| RNase DS | dsRNA-treated chick embryo cells | 34.5 kDa | 0.05-1.4 mM Mg2+; 0.3-30 mM M+ | Associated ssRNase | 168 |
| RNase DII | Chick embryo cell extracts, or nucleolar fraction | 43-70 kDa (several species) | 0.5-1 mM Mg2+; 75-100 mM M+ | Associated ssRNase | 158 |
| - | Cytoplasm of mouse embryo cells | 65 kDa | 2-5 mM Mg2+; 25-50 mM M+ | Associated ssRNase | 160, 161 |
| RNase D | HeLa cell hnRNP | ND | ND | - | 157 |
| - | Calf thymus (whole cell and nuclei) | 60 kDa | 2-4 mM Mg2+; <20 mM M+ | Associated ssRNase | 164 |
| PCI Nuclease | HeLa cell nuclei | 55 kDa | Requires Mg2+ or Mn2+ | Also digests ssRNA, ssDNA, dsDNA | 162, 163 |
| PCII Nuclease | HeLa cell nuclei | 20 kDa | Requires Mg2+ or Mn2+ | Associated ssRNase; exonucleolytic action | 162, 163 |

ᵃ See text for further discussion of the listed activities.
ᵇ FV3, Frog virus 3; RSV, Rous sarcoma virus. No name listed means that no name was given to the dsRNase activity.
ᶜ BHK, Baby hamster kidney.
ᵈ ND, Not determined.
ᵉ M+ refers to a monovalent metal ion.

2. dsRNA-SPECIFIC RNase ACTIVITIES IN MAMMALIAN CELLS

The first reported mammalian dsRNase was an activity present in animal serum (153). Sucrose gradient centrifugation analysis provided a preliminary molecular mass estimate of 45-55 kDa. Subsequent studies detected dsRNase activities in mammalian cell-free extracts. A dsRNase activity in the cytosol of Krebs II ascites cells, termed RNase D (D for duplex), could cleave poly(rG)·poly(rC) and phage Qβ dsRNA (154). A requirement of RNase D for divalent metal ion was indicated by sensitivity to EDTA. DEAE-cellulose chromatography of RNase-D-containing fractions afforded a large increase in specific activity, which may reflect the removal of an endogenous inhibitor, perhaps RNA. An earlier study characterized an EDTA-sensitive dsRNase activity associated with the ribosomal fraction of Krebs II ascites cell extracts (155). It is possible that the two studies described the same activity. A dsRNase was detected in both the nuclear and cytoplasmic membrane fractions of human KB cells (156), but was not further characterized.

Examination of specific HeLa subcellular fractions uncovered a dsRNase activity associated with hnRNP (157). Cleavage of poly(rA-rU) by this activity was inhibited by ethidium bromide or excess dsRNA, but not by ssRNA or an RNA-DNA hybrid. Exhaustive digestion of the substrate does not release mononucleotides, ruling out the involvement of an exoribonuclease. The presence of an endogenous RNA inhibitor was indicated by the necessity of DEAE-cellulose chromatography, or treatment with RNase A and RNase T1, in order to attain maximal activity.

Two activities, termed nucleases DI and DII, were isolated from 9-day-old chick embryos (60). Nuclease DI exhibits a molecular mass of approximately 60 kDa, and endonucleolytically cleaves dsRNA as well as ssRNA, yielding 5'-phosphate, 3'-hydroxyl termini. Nuclease DI requires Mn2+ for activity and is inhibited by high salt. Nuclease DII has a molecular mass of 38-40 kDa and also endonucleolytically cleaves dsRNA as well as ssRNA with 5' polarity. Nuclease DII requires a divalent metal ion (Mg2+ or Mn2+), and is optimally active in the presence of monovalent metal ions. Subcellular fractionation experiments suggest that nuclease DII is localized in the nucleus and that nuclease DI is largely cytoplasmic (60).


The nucleolar fraction of chick embryos and mouse ascites cells contains a dsRNase activity whose chromatographic properties and enzymatic behavior are similar to those of nuclease DII (158). A significant amount of this activity is also present in the fraction that includes ribosomal particles. A role of the nucleolar dsRNase in rRNA metabolism was suggested by its ability to catalyze limited, site-specific cleavage of the mammalian 45-S rRNA precursor (158). However, the low activity levels, the presence of contaminating RNases, and the ambiguity of subcellular fractionation procedures complicated the further analysis of the nucleolar dsRNase, as well as nucleases DI and DII. There is no direct evidence for an RNase-III-like activity in mammalian rRNA maturation, but complementary base-pairing of small nucleolar RNAs with rRNA precursors leaves open the possibility of a duplex RNA-specific cleavage activity (159).

An activity that site-specifically cleaves Moloney murine leukemia viral mRNA was detected in the microsomal fraction of mouse embryo cells (160). The same (or similar) activity could cleave poly(rA-rU) but not ssRNA. The dsRNase activity could be solubilized in high salt, and was further purified by ammonium sulfate fractionation and DEAE-cellulose chromatography (161). The most highly purified fraction had a G-200 gel filtration behavior similar to that of RNase D of Krebs ascites cells, but retained small amounts of ssRNase and RNase H activities. The nuclease could cleave the 45-S rRNA precursor in a selective manner, and digestion of poly(rA-rU) was inhibited by excess dsRNA, ethidium, or EDTA (161).

Studies on two dsRNA-specific activities from HeLa cells (162, 163) showed that dsRNA degradation can occur by different mechanisms. The two activities (nucleases PCI and PCII) were partially purified by phosphocellulose chromatography. PCI nuclease endonucleolytically digests reovirus dsRNA to mononucleotides or short oligonucleotides. It also endonucleolytically cleaves dsDNA and RNA-DNA hybrids, as well as ssRNA and ssDNA (163). Glycerol gradient centrifugation indicated a molecular mass for PCI nuclease of approximately 55 kDa. PCII nuclease has a greater selectivity for dsRNA, but also degrades ssRNA. It apparently acts in an exonucleolytic fashion, as the only products released from reovirus dsRNA digestion are 5'-ribonucleoside monophosphates (163). PCII nuclease has an apparent molecular mass of approximately 20 kDa, and can functionally substitute in vitro for E. coli RNase II, a 3' → 5' exoribonuclease (162).

A dsRNase activity in calf thymus tissue has been partially purified and characterized (164). The most highly purified preparation was free of RNase H and DNase activities, yet retained a small amount of ssRNase. The activity endonucleolytically cleaves dsRNA with 5' polarity, yielding dinucleotide to nonanucleotide products. Glycerol gradient centrifugation provided a molecular mass of approximately 60 kDa. The activity was inhibited by EDTA, N-ethylmaleimide, or ethidium bromide. None of these reagents inhibited the associated ssRNase activity.

3. VIRAL INFECTION-ASSOCIATED dsRNase ACTIVITIES

There have been several reports of virus-associated dsRNase activities. Two groups described a dsRNase activity in purified frog virus 3 (FV3) (165, 166). The FV3 dsRNase endonucleolytically degrades reovirus RNA, double-stranded poliovirus RNA, or poly(I)·poly(C) to a size corresponding to an approximately 6-S sedimentation value (166). Subfractionation experiments revealed that the dsRNase activity is contained within the viral capsid, but is not tightly associated with the viral core. Induction of the dsRNase occurs approximately 3 hours post-infection and is suppressed by prior administration of cytosine arabinoside (166). The FV3 dsRNase requires Mg2+ and is not detectable in extracts of uninfected HeLa, L, or BHK cells (165). It was not determined whether the dsRNase is virus encoded or cell encoded.

An activity in purified Rous sarcoma virus (RSV) or avian leukosis virus particles can degrade poly(A)·poly(U) (167). The dsRNase activity requires prior detergent disruption of the virions, and gel filtration provided a molecular mass of approximately 13 kDa. A similar activity is normally present at low levels in chick embryo fibroblasts and is stimulated by RSV infection (167). It was proposed that a cellular dsRNase is induced by RSV infection and is encapsidated into progeny viral particles (167).

An inducible dsRNase is associated with the interferon response. Treatment of vertebrate (avian) cells with dsRNA is followed by the appearance of a secreted dsRNase (168). The nuclease activity, termed RNase DS (DS for double-stranded), is coinduced with interferon (IFN). RNase DS is also produced following treatment with other IFN inducers, such as defective vesicular stomatitis virus or UV-irradiated avian reovirus. Induction of RNase DS is blocked by actinomycin D or cycloheximide, coadministered with either dsRNA or the infecting virus (168). A lower level of an IFN-inducible ssRNase activity copurifies with RNase DS on poly(I)·poly(C)-agarose. It is not known whether the ssRNase is intrinsic to RNase DS, or represents another activity. RNase DS requires divalent (Mg2+, Mn2+) and monovalent (Na+) ions. Gel filtration analysis provided a molecular mass of 34.5 kDa. It is not known whether RNase DS is related to any of the cellular dsRNase activities. However, its molecular mass is similar to that of nuclease DII (see Table II), and the time course for appearance of RNase DS is similar to that seen with the FV3-infection-associated dsRNase (166).


4. A RETROVIRAL REVERSE TRANSCRIPTASE-ASSOCIATED dsRNase

A dsRNase activity has recently been described that is associated with the reverse transcriptase (RTase) of human immunodeficiency virus type 1 (HIV-1), as expressed in E. coli (169). The HIV-1 RTase is a p66/p51 heterodimer, in which the p51 subunit is a truncated version of the p66 subunit. The nuclease activity, termed RNase D, specifically cleaves HIV-1 RNA at two sites within an 18-bp duplex region formed by binding of the reverse transcriptase primer (tRNALys) to the primer binding site (PBS). It was proposed that the endonucleolytic cleavages may facilitate template switching or reverse transcription of the tRNALys sequence (169). An in situ gel electrophoretic enzyme assay showed that the RNase D activity copurifies with the p66 subunit (170). Association of RNase D with the carboxyl terminal domain of p66, which contains the RNase H active site, is indicated by the observation that a p66 point mutation (E478Q) that suppresses RNase H activity also suppresses RNase D activity (170). Evidence (cited as unpublished data) in a separate study (171) also indicates that the dsRNase activity resides in the p66 but not the p51 subunit. These studies provide strong evidence that RNase D represents a fourth specific enzymatic activity of HIV-1 RTase, which is associated with the RNase H domain.

The occurrence of an HIV-1 RTase-associated dsRNase has been called into question by Hostomsky and co-workers (172), who showed that the RNase D activity of the bacterially expressed RTase exhibits the same substrate cleavage specificity as E. coli RNase III. The RNase D activity cleaved an RNase III substrate of phage T7 with the same specificity as RNase III, and purified E. coli RNase III cleaved the tRNALys-PBS duplex with the same specificity as RNase D. However, at least 100 times more RTase protein was needed to provide a cleavage activity comparable to that of RNase III, and the RTase prepared from an RNase III-deficient E. coli strain failed to cleave either substrate. It was also shown that the RNase-III-like activity associates with the p51 subunit, which lacks the RNase H domain. Finally, mutations that suppressed RNase H activity did not also destroy the dsRNase activity (172). It was concluded (172) that RNase III fortuitously copurifies with HIV-1 RTase when expressed in E. coli, and that RNase D and RNase III are one and the same.

These conflicting results may be resolved as follows. The specific cleavages in the HIV-1 PBS (169) may indeed have been due to contaminating RNase III, which coincidentally associates with the p51 subunit (172). This accidental affinity is supported by the observation that a mock RTase fraction, purified from a nonexpressing E. coli strain, does not exhibit RNase D activity. On the other hand, when proper steps are taken during purification to avoid contamination, the bacterially expressed RTase can still cleave dsRNA (170, 171). The strongest evidence for an intrinsic dsRNase activity associated with the p66 carboxyl terminal domain is the existence of specific point mutants that selectively inhibit the RNase D or RNase H activities (170), and the application of in situ gel electrophoretic enzyme assays. However, it has yet to be demonstrated whether the dsRNase activity can specifically cleave the tRNALys-PBS duplex, and if so, whether this event is important for the retroviral infection strategy. An experimental approach to these questions would be to examine the functional consequences of specific mutations in the RNase D domain on expression of the provirus (171).

Other retroviral reverse transcriptases may possess a dsRNase activity. An in situ gel electrophoretic enzyme assay employing synthetic dsRNA substrates revealed a dsRNase activity associated with the RTase of Moloney murine leukemia virus (M-MuLV) (170, 171). As with the HIV RNase D, the M-MuLV dsRNase activity is distinct from the RNase H activity. Thus, two M-MuLV RTase mutants that exhibit alterations in the carboxyl terminal RNase H domain inhibited dsRNase activity without affecting RNase H activity, whereas a third mutant in the same domain inhibited RNase H activity without affecting the dsRNase (171). The M-MuLV dsRNase activity appears to have a requirement for an intact reverse transcriptase structure, because the isolated RNase H domain exhibits RNase H but not dsRNase activity (171).

C. Other dsRNA-specific Activities

1. THE dsRNA ADENINE DEAMINASE

Two groups described an activity in Xenopus oocytes that unwinds dsRNA (173, 174). The activity is present also in mammalian and insect cells, and is nuclear localized (175, 176). Because the unwinding is caused by a covalent modification of adenine residues, the activity was initially termed the dsRNA unwinding/modifying enzyme (177). The enzyme catalyzes the hydrolytic deamination of adenines specifically engaged in a dsRNA structure (178). The product of deamination is inosine, which creates the less stable I·U base-pair. Adenine-containing mononucleotides, as well as adenine residues in ssRNA or DNA, are unreactive (177, 179). The enzyme has been named dsRNA adenine deaminase (180) and nicknamed dsRAD (181) or DRADA (182). For the purpose of this discussion, dsRAD will be used.

At least 15-20 bp of duplex structure is required for dsRNA binding to the dsRAD, but efficient modification requires at least 100 bp (183). The lower size limit for effective enzyme binding may reflect the minimum size for a stable duplex in solution (180), whereas the relatively long length required for efficient deamination may reflect the need for more than one enzyme to be bound in order to carry out deamination (183). In this regard, the dsRAD may exhibit a similarity with the dsRNA-dependent protein kinase, which is proposed to undergo autophosphorylation and activation when two monomers bind to the same dsRNA (Section IX,A).

The dsRAD of Xenopus has been purified to near homogeneity; it is a single polypeptide of molecular mass ~120 kDa (by SDS-PAGE) or ~90 kDa (by HPLC) (181). Gel filtration chromatography provides a size estimate of approximately 210 kDa, suggesting a dimeric form, or an enzyme with an associated factor(s) (181). The dsRAD was also isolated from bovine nuclear extracts as a mixture of three forms of 93, 88, and 83 kDa (182). The two smaller forms probably are proteolytically derived from the 93-kDa species, and gel filtration and glycerol gradient centrifugation indicate an approximately 100-kDa form as the active species (182). The dsRAD from calf thymus tissue has been purified and exhibits a molecular mass of 116 kDa (183a). The gel filtration and sedimentation behavior indicates that it exists as a monomer in solution (183a). The dsRAD does not require ATP or divalent metal ion for catalytic activity (180). The dsRAD tightly binds dsRNA (Kd < 1 nM), and it was suggested that the high affinity may be necessary for efficient recognition of scarce RNA species by an enzyme of comparable low abundance (182).

The cloning and sequence analysis of a cDNA encoding a human dsRAD has provided insight into the structure-function relationships of the enzyme (183b). The cDNA contained a single open reading frame of 1226 amino acids, with a predicted size (136 kDa) consistent with its physical properties. The expression of the cDNA in insect cells from a recombinant baculovirus vector showed that the protein exhibits dsRAD activity (183b).
The occurrence of a nuclear localization signal is consistent with the predominance of the dsRAD in the nucleus. The dsRAD has three copies of the dsRBD motif, which confer specificity as well as high affinity for dsRNA. Two tripeptide motifs, HAE and PCG, which are involved in the catalytic mechanisms of several other deaminases, are present near the C terminus (183b).

The intriguing biochemical activity of the dsRAD has prompted detailed speculation on its biological roles (reviewed in Refs. 180 and 184). It has been proposed that the enzyme participates in gene regulatory pathways involving the turnover of sense-antisense RNA complexes (173, 177, 185). The dsRAD-dependent destabilization of dsRNA may render it susceptible to further enzymatic degradation (185). The dsRAD may act as an RNA editor, changing the coding capacity of mRNA by selective conversion of A residues to I. In this regard, I is formally equivalent to a G in the genetic code base-pairing scheme (184). There is growing evidence that dsRAD-dependent RNA editing is responsible for the biased hypermutation observed in the measles virus genome (186, 187), and in specific editing of the mRNAs for the mammalian glutamate receptor subunits (188, 188a, 188b).

The Xenopus dsRAD exhibits selectivity toward adenine targets in vitro: using model dsRNA substrates, the 5' neighbor nucleotide preference is A, U > C > G, with no obvious 3' neighbor preference (189). Short dsRNAs showed high site selectivity, whereas longer substrates were promiscuously deaminated. The placement of specific adenine residues relative to the duplex termini strongly influenced their ability to be deaminated (189). Thus, the size and sequence of the duplex substrate may be sufficient to confer the requisite editing specificity. Additional factors may regulate dsRAD activity (180). A cytoplasmic protein or protein complex can bind dsRNA and block the action of dsRAD (189a). Also, depending on the specific developmental stage, the enzyme can be either nuclear or cytoplasmically localized (176).

The dsRAD may be involved in the cellular antiviral response and in cell development. The dsRAD mRNA is expressed in every human tissue tested and is especially prevalent in brain tissue (183b). Specific viral infection or dsRNA treatment causes a decrease in dsRAD activity (190), and it was proposed that the down-regulation of dsRAD may increase the cytoplasmic dsRNA levels, thereby enhancing the antiviral interferon response. Another study implicated the dsRAD in triggering the differentiation of pluripotent embryonal carcinoma cells through an autocrine signaling mechanism (191). Specifically, a programmed decrease in dsRAD activity would cause a corresponding rise in the cytoplasmic dsRNA levels. The cytoplasmic dsRNA would autoinduce interferon production, and force the cells to exit the proliferative state and terminally differentiate (191).
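The statement above that inosine is read as guanosine can be made concrete with a small sketch. The code below is an illustration only: the function names and the nine-nucleotide test sequence are hypothetical and are not taken from the studies cited, and the only biological rule it encodes is the one given in the text (I is decoded as G), applied to the standard genetic code.

```python
# Standard genetic code, laid out in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def deaminate(mrna, position):
    """Return a copy of mrna with the adenosine at a 0-based position changed to inosine (I)."""
    if mrna[position] != "A":
        raise ValueError("only adenosine can be deaminated to inosine")
    return mrna[:position] + "I" + mrna[position + 1:]


def translate(mrna):
    """Translate an mRNA string in frame 0, decoding inosine (I) as guanosine (G)."""
    decoded = mrna.upper().replace("I", "G")
    # Any trailing incomplete codon is ignored.
    return "".join(CODON_TABLE[decoded[i:i + 3]] for i in range(0, len(decoded) - 2, 3))


if __name__ == "__main__":
    unedited = "AUGCAGUAA"              # hypothetical sequence: Met-Gln-stop
    edited = deaminate(unedited, 4)     # CAG becomes CIG, which is read as CGG
    print(unedited, translate(unedited))
    print(edited, translate(edited))
```

In this hypothetical case a single A-to-I change converts a glutamine codon into one decoded as arginine, which is the sense in which deamination can alter the coding capacity of an mRNA.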

2. dsRNA UNWINDING AND ANNEALING ACTIVITIES

The existence of proteins that catalyze unwinding of dsRNA (RNA helicases) or, conversely, facilitate dsRNA formation (RNA annealing proteins) implies biological processes that involve the directed denaturation or formation of dsRNA. RNA helicase activities are ubiquitous, and use the free energy provided by nucleoside triphosphate hydrolysis to catalyze the unwinding and separation of RNA strands engaged in a duplex structure (e.g., see Refs. 192-202).

Several prokaryotic RNA helicases have been identified that appear to be involved in the assembly and function of the translational apparatus and in mRNA utilization. The DbpA protein, encoded by the dbpA gene (193), hydrolyzes ATP specifically in response to binding 23-S rRNA, and may manipulate a 23-S rRNA structure during 50-S subunit assembly (198). The product of the srmB gene suppresses a temperature-sensitive defect in ribosomal protein L24, which inhibits proper ribosome assembly at the nonpermissive temperature (203). The deaD gene [also identified as the mssB gene (201)] was first identified as a multicopy suppressor of a temperature-sensitive mutation in ribosomal protein S2 (195). It has been speculated that the DeaD protein may alter mRNA structure during translation and/or participate in 30-S subunit assembly (195), although other functional roles are possible (201). Genetic evidence indicates that the DeaD and SrmB proteins do not share a common role (195). The transcriptional terminator protein Rho can be regarded as a helicase, because its action is directed toward unwinding RNA·DNA hybrid structures at Rho-dependent terminator sites (204).

Eukaryotic RNA helicases have been implicated in manipulating mRNA structure during translation initiation (192) or pre-mRNA structure during nuclear splicing reactions (199). The ATP requirement for spliceosome-catalyzed pre-mRNA splicing in part reflects the action of specific helicases that mediate interactions between snRNP particles (194). Several helicase activities have been purified from nuclear extracts of HeLa cells. One activity, termed RNA Helicase A, unwinds dsRNA with a 3' → 5' directionality (200), whereas the other enzyme (RNA Helicase II) exhibits a 5' → 3' directionality (205). Both enzymes catalyze multiple rounds of duplex unwinding. RNA helicase A is active in monomeric form and is closely related to the protein encoded by the Drosophila gene maleless (196). The exact roles of these mammalian nuclear-localized helicases remain to be demonstrated. It was recently shown that the monomeric RNA helicases contain two copies of the dsRBD (53), and it was proposed that for the monomeric helicases, two dsRNA-binding domains are necessary to generate the unwinding force and movement along the double helix (53).

A protein present in the mammalian cytoplasm and nucleus, termed La, can bind and unwind dsRNA by a mechanism that may not require NTP hydrolysis (202). Proposed roles of La protein include facilitation of translation by mRNA secondary structure melting, nuclear-cytoplasmic transport of mRNA, transcription termination by RNA polymerase III, and global regulation of translation by controlling the accessibility of dsRNA to the dsRNA-activated protein kinase (202). La protein may therefore be an important regulator of cell growth and development.

Specific proteins present in HeLa cell nuclei can catalyze RNA-RNA annealing (206). Several of the activities correspond to specific hnRNP proteins, and one of the proteins (hnRNP A1 protein) may be controlled by reversible phosphorylation (207). Two nonexclusive models have been proposed to describe how these species promote RNA-RNA annealing (206). In the "matchmaker" model, interaction of annealing proteins with bound RNA provides an increase in local RNA concentration, thereby facilitating duplex formation by accelerating the nucleation step. In the "chaperone" model, the annealing proteins maintain the bound RNAs in an unstructured conformation and enhance the rate of duplex formation (206). One may regard both RNA helicases and RNA annealing proteins as molecular chaperones, possessing counterpoised activities that mediate the association and dissociation of complementary RNA chains.

IX. dsRNA and the Interferon System

A. The dsRNA-activated Protein Kinase

dsRNA is a potent inhibitor of mammalian protein synthesis in vitro (208). The inhibition is mediated by a protein kinase whose activity is stimulated by dsRNA binding, and which catalyzes the phosphorylation of the α subunit of initiation factor eIF2. The phosphorylated eIF2 sequesters the guanine nucleotide exchange factor protein (GEF), inhibiting the exchange of GDP for GTP. The double-stranded RNA-activated protein kinase has been termed the DAI (double-stranded RNA-activated inhibitor), p68 kinase, P1 kinase, P1/eIF2α kinase, PK-ds, and Dsl (10). A consensus has recently been reached on the name PKR (for Protein Kinase, dsRNA-dependent) (209). PKR (and DAI; see Fig. 10) is used in this discussion. Structure-function studies on PKR have been made possible by the cloning of the PKR cDNA (210) and by expression of the protein in vitro, as well as in vivo, in heterologous systems (e.g., see Ref. 211).

dsRNA and PKR play specific roles in the interferon response: dsRNA is a by-product of viral replication (see above), and the presence of dsRNA in the cytoplasm can activate PKR, which not only inhibits cell protein synthesis but also stimulates transcription of genes whose products participate in the interferon response (see below). Attention has also been focused on the role of dsRNA and PKR in normal cell development and proliferation (see below). Because there are recent excellent reviews on the dsRNA-activated protein kinase (10, 212), this section focuses on the structure-function aspects of the enzyme and how it interacts with dsRNA. The involvement of PKR in signal transduction and gene expression is discussed in Section IX,C.

PKR is normally a low-abundance protein, but treatment of cells with interferon or dsRNA greatly increases its levels, as a result of transcription of the PKR gene. The enzyme is cytoplasmic and may be ribosome-associated. dsRNA binding to PKR triggers self-phosphorylation on multiple serine and threonine residues, which is believed to cause a protein conformational change. The autophosphorylated enzyme can phosphorylate the α subunit of eIF2 on a specific serine residue. There is evidence for at least one other target, I-NF-κB, whose phosphorylation promotes interferon gene transcription (see Section IX,C). There is in vitro evidence that PKR can autophosphorylate in an intermolecular fashion, and that the activated PKR phosphorylates its targets independent of continued dsRNA binding (213).

[Figure 10. Panel A, DAI activation: inactive, active, and inactive states with increasing [dsRNA]. Panel B, Model 1 (or Model 1a): one site for dsRNA; DAI dimer; intermolecular autophosphorylation. Panel C, Model 2: two sites for dsRNA (activation site, high affinity; inhibitory site, low affinity); DAI monomer; intramolecular autophosphorylation.]

FIG. 10. Models for dsRNA activation of the dsRNA-dependent protein kinase, DAI (PKR). (A) The observed dependence of DAI activation on dsRNA concentration, and inhibition at high dsRNA concentrations. (B) In Model 1 the binding of two DAI monomers to a single dsRNA species stimulates autophosphorylation and activation. High dsRNA concentrations would disfavor binding of two proteins on a single dsRNA molecule, and therefore inhibit activation. (C) In Model 2 binding of a monomer to dsRNA (low concentration) induces autophosphorylation, whereas at high dsRNA concentrations, a weaker binding site is also utilized, which prevents autophosphorylation. Reprinted with permission from Ref. 10.


Whether PKR undergoes intramolecular phosphorylation remains to be demonstrated. PKR regulates its expression at the translational level; perhaps there is a cis-acting element (dsRNA?) on the PKR mRNA (213).

PKR inhibition of protein synthesis in vitro requires Watson-Crick base-paired dsRNA at least 50 bp in size, whereas ssRNA, DNA, or RNA·DNA hybrids are ineffective (208). More recent studies, which used purified protein and dsRNAs of defined lengths, have further characterized the dsRNA requirements for PKR activation. First, there is no apparent base-pair sequence specificity, but base-pair length is important: dsRNA species that are shorter than approximately 30 bp interact only weakly with PKR, and fail to activate. Above 30 bp there is stronger binding, with a concomitantly increased ability to activate, until maximal effect is reached at approximately 85 bp (214). Short dsRNAs can inhibit activation by longer dsRNAs, and high concentrations of long dsRNAs also inhibit activation. These studies suggest that PKR activity is differentially responsive to the length of the bound dsRNA. In this regard, a number of viruses counteract the growth-inhibitory action of PKR by expressing specific RNAs that bind but do not activate PKR. The competitive binding of these RNAs can prevent subsequent activation of PKR by viral-infection-specific dsRNAs (10).

Based on these and other observations, two models have been proposed for the mechanism of PKR activation by dsRNA (Fig. 10) (10). The first model proposes that autophosphorylation is an intramolecular event that occurs within a binary complex of PKR and dsRNA. The suppression of phosphorylation by high dsRNA concentrations may result from dsRNA binding to a weaker, inhibitory site (Fig. 10). The second model proposes that phosphorylation is intermolecular, and occurs efficiently only when two PKR monomers are bound to the same dsRNA. This model rationalizes the more efficient activation by longer dsRNAs, because these species would provide multiple binding sites. Moreover, high dsRNA concentrations would inhibit activation, because the excess dsRNA would favor binding of (at most) one PKR to a single dsRNA, thereby preventing intermolecular phosphorylation (10). Recent biochemical and genetic data support the proposal that PKR monomers cooperatively bind dsRNA, producing an autophosphorylated, dimeric species as the active enzyme (211a).

The PKR contains an ATP-binding/phosphate transfer domain in the C-terminal region, which includes a lysine residue essential for catalytic activity (215). The N-terminal portion of the protein binds dsRNA and contains two consensus dsRBDs (motifs I and II) (52, 216-218). An in vivo analysis using specific PKR mutants showed that (i) both motifs are required for maximal PKR activity and (ii) the N-terminal-proximal dsRBD (motif I) is more important for activity than motif II (211a). In vitro experiments also demonstrated that motif I plays a greater role than motif II in dsRNA binding (217, 218, 218a). In this regard, motif I exhibits a better match with the consensus dsRBD than motif II (52). The activities in yeast of PKR variants exhibiting catalytic domain point or deletion mutations indicate that the respective catalytic domains of two monomers must specifically interact to activate the phosphotransferase mechanism (211a). It was reported that a PKR mutant lacking motifs I and II is activated in mammalian cells (218b). The evidence argued against a constitutively active mutant PKR, but supported a mechanism whereby the mutant PKR is activated by the endogenous PKR, or by another cofactor unrelated to dsRNA (218b).
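The intermolecular model described in this section (activation only when two PKR monomers occupy the same dsRNA) predicts activation at intermediate dsRNA concentrations and inhibition at high ones. The following toy calculation is a rough sketch of that reasoning, not a model from the cited work. It assumes, purely for illustration, simple hyperbolic binding with a hypothetical dissociation constant `kd`, a random (Poisson) distribution of bound monomers over dsRNA molecules, and arbitrary concentration units; the function name and parameter values are invented.

```python
import math


def active_fraction(dsrna, pkr_total=1.0, kd=1.0):
    """Toy estimate of the fraction of PKR that shares a dsRNA with at least one other monomer.

    Assumptions (illustrative only): hyperbolic binding governed by kd, and a
    Poisson distribution of bound monomers over the dsRNA molecules.
    """
    if dsrna <= 0:
        return 0.0
    bound_fraction = dsrna / (dsrna + kd)             # fraction of PKR bound to any dsRNA
    mean_per_dsrna = pkr_total * bound_fraction / dsrna
    has_partner = 1.0 - math.exp(-mean_per_dsrna)     # chance that a bound monomer has a neighbor
    return bound_fraction * has_partner


if __name__ == "__main__":
    for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"[dsRNA] = {conc:>6}: active fraction = {active_fraction(conc):.3f}")
```

Under these assumptions the computed fraction rises and then falls as the dsRNA concentration increases, because excess dsRNA spreads the monomers over separate molecules. The sketch does not address the alternative model, in which a second, low-affinity inhibitory site accounts for the same behavior.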

B. The dsRNA-activated 2'-5'A Synthetase

Interferon treatment of mammalian cells causes a 10- to 100-fold increase in the levels of a unique enzyme activity, (2'-5')oligo(A) synthetase (2-5A synthetase). The induction of 2-5A synthetase occurs at the transcriptional level, and new protein synthesis is not required for transcription (219). On binding dsRNA, the 2-5A synthetase polymerizes ATP to form the oligonucleotide species (2'-5')oligo(A) (2-5A). The 2-5A has 2'-5' phosphodiester linkages and ranges in size from 2 to 15 nt (for a review see Refs. 219 and 220). The 2-5A binds and activates the ssRNA-specific endoribonuclease RNase L (221), which can inhibit viral replication by degrading viral and cellular RNAs. The dsRNA species that activate 2-5A synthetase most likely derive from viral replication intermediates, and a recent study has demonstrated the binding of viral-specific dsRNA to the 2-5A synthetase isolated from interferon-treated, EMCV-infected cells (222).

Is there a role for 2-5A synthetase and RNase L in RNA metabolism in the uninfected cell? The two enzymes may be involved in the maturation and/or turnover of hnRNA, wherein internal dsRNA elements are cis-acting processing signals. hnRNA can activate the 2-5A synthetase in vitro (223), and there is a recent report that one of the 2-5A synthetase isoforms may participate in the mammalian nuclear pre-mRNA splicing pathway (224).

Are there specific sequence or structural features in dsRNA that are necessary for 2-5A synthetase activation? There is no apparent sequence requirement, and a low level of base-pair mismatch can be tolerated (225). The synthesis of 2-5A in cell-free extracts is maximally stimulated by dsRNA species longer than 65-80 bp, whereas dsRNAs less than 30 bp fail to activate (225). There is a good correlation between dsRNA size requirements for efficient induction of the interferon response and activation of 2-5A synthetase (225). 2-5A synthetase activation by dsRNA exhibits a sigmoidal dependence on enzyme concentration (226), suggesting that efficient activation may require assembly of multiple proteins on the same dsRNA. However, unlike the behavior of the dsRNA-dependent protein kinase, 2-5A synthetase activation is not inhibited by high dsRNA concentrations (225).
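
One way to make the sigmoidal dependence concrete is a Hill-type expression. This is only an illustrative sketch, not a model taken from the cited work (226): the symbols $v$ (rate of 2-5A synthesis), $V_{\max}$, $K$, and the exponent $n$ are assumptions introduced here, and no parameter values are implied by the data discussed above.

```latex
% Illustrative only: a phenomenological Hill-type form.
% v, V_max, K, and n are assumed symbols, not quantities reported in the text.
v([E]) = V_{\max}\,\frac{[E]^{n}}{K^{n} + [E]^{n}}, \qquad n > 1
```

For $n = 1$ the curve is hyperbolic; $n > 1$ gives the sigmoidal shape that would be expected if more than one synthetase molecule must assemble on the same dsRNA for efficient activation.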

Immunoprecipitation experiments reveal at least three forms of the 2-5A synthetase in human cells (of approximately 40/46, 67/69, and 100 kDa), whose levels and activities apparently are regulated in a cell type-specific manner (227). The enzymatic activity of each synthetase isoform exhibits a different dsRNA concentration dependence, and the 100-kDa form has the highest affinity for dsRNA. The isoforms are expressed from at least two different genes, and alternative RNA splicing and post-translational modification provide further differentiation. The 40- and 46-kDa isoforms are encoded by the same gene (on human chromosome 12), but are expressed from separate mRNAs, which are generated by alternative splicing. The two isoforms therefore are identical for the first 346 amino acids, but have different carboxyl termini. The 69-kDa form of 2-5A synthetase is expressed from a separate gene, and there is no current information on the relationship of the 100-kDa species to the other 2-5A synthetases (228).

Gel filtration chromatography shows different aggregate forms of the 2-5A synthetases. It has been proposed that complexes containing multiple copies of the synthetases can synthesize more efficiently the longer forms of 2-5A, which in turn are better activators of RNase L (228). In support of this hypothesis, the monomeric 100-kDa enzyme produces mainly the dimeric form of 2-5A, whereas the tetrameric 40/46-kDa species and the dimeric 69-kDa species preferentially synthesize the longer forms (219). The physical proximity of multiple catalytic sites may more efficiently convert the dimeric 2-5A species to longer chains, and a preformed multisubunit complex may be more easily activated by dsRNA binding. The sigmoidal dependence of synthetase activation on protein concentration may reflect this requirement.

Chemical fractionation studies indicate different subcellular locations for the 2-5A synthetase isoforms. The 100-kDa enzyme is associated with ribosomes and the rough microsomal fraction; perhaps this isoform suppresses viral protein synthesis by mediating RNase L-dependent cleavage of viral and ribosomal RNA (see Ref. 227 and references therein). The 46-kDa enzyme is also associated with ribosomes, but the more hydrophobic 40-kDa isoform, which is myristylated, preferentially associates with the plasma membrane fraction (227, 229). The 67-kDa 2-5A synthetase isoform is plasma membrane associated, but also associates with the nuclear matrix (230). The activity levels of this synthetase are stimulated by HIV-1 infection, which is followed by an increase in RNase L activity that is also nuclear matrix associated. The involvement of the nuclear matrix in mRNA synthesis, maturation, and transport further supports the general involvement of 2-5A synthetase and RNase L in the normal metabolism of nuclear RNA. Perhaps the cell in the antiviral state exhibits enhanced turnover of nuclear-localized cellular as well as viral RNA.

C. dsRNA and Mammalian Cell Signal Transduction

dsRNA induces the transcription of the interferon beta gene, as well as other genes (3). The mechanism of transmission of the dsRNA-triggered signal from the cell surface to the nucleus has been the focus of recent studies, and several important features of the pathway have been established (reviewed in Ref. 3). One question concerns the primary event: does dsRNA binding to the cell surface generate the signal directly, or is the dsRNA first internalized? The experimental evidence supports the latter process. First, microinjected poly(I).poly(C) causes rapid lysis of interferon-treated mouse LM cells, whereas treatment of the cells with poly(I).poly(C) covalently linked to Sepharose beads, which physically blocks dsRNA internalization, is without effect (231). Second, Northern analysis was applied to show that interferon induction in mouse cells is directly correlated with the intracellular uptake of poly(I).poly(C) (232). Additional evidence indicates that dsRNA internalization occurs by an energy-dependent process involving an endocytic pathway. Specifically, prior treatment of cells with an endocytosis inhibitor inhibits dsRNA uptake, and other evidence indicates an acidic intracellular compartment as an important intermediary in potentiating the biological action of dsRNA (231).

Is there a specific cell surface receptor for dsRNA? Early studies suggested specificity in the interaction of dsRNA with the mammalian cell surface (233, 234). One study employed rabbit kidney cell lines that were either sensitive or unresponsive to dsRNA to provide evidence for a specific cell surface protein that may be a component of the putative dsRNA receptor (235). Characterization of the receptor has been complicated by the existence of both specific and nonspecific binding sites for dsRNA on the cell surface, and by the observation that only a small fraction of the bound dsRNA (presumably the specific receptor-bound component) is needed to generate the signal (235).

As discussed above, two cellular proteins that bind dsRNA are PKR and 2-5A synthetase. Because these enzymes are present at low levels in the absence of interferon treatment or viral infection, they provide targets for the internalized dsRNA. Does either of these proteins participate in the dsRNA signal transduction pathway? A clue is provided by the occurrence of signal amplification in the dsRNA-mediated interferon response. Specifically, infection of cells with defective-interfering vesicular stomatitis virus (VSV) particles containing covalently cross-linked (+) and (-) strands of VSV RNA (i.e., an encapsidated, noninfectious dsRNA) showed that essentially a single molecule of dsRNA is sufficient to invoke a full interferon response (236). Signal amplification may be attained by the binding of dsRNA to PKR,
because the autophosphorylated PKR could then activate other PKR molecules through intermolecular phosphorylation. The involvement of PKR in the dsRNA signal transduction pathway is also indicated by the ability of 2-aminopurine to block interferon induction in dsRNA-treated or virus-infected cells (237, 238). 2-Aminopurine suppresses PKR activation through competitive inhibition of ATP binding (239, 240).

Recent evidence indicates that the dsRNA signal transduction pathway does not involve PKR-catalyzed phosphorylation of eIF2α. Rather, dsRNA stimulates, through PKR phosphorylation of a separate target, the binding of nuclear transcription factor NF-κB to promoter elements of the human β-interferon and other genes (241). NF-κB is a heterodimeric species, containing 50-kDa (p50) and 65-kDa (p65) subunits. Because each subunit may be one of several subtypes, NF-κB can be regarded as a family of closely related transcription factors (for a recent review, see Ref. 242). NF-κB is normally present in the cytoplasm in an inactive complex with I-κB, a protein inhibitor of NF-κB activity, which also has specific isoforms (243). The inactive, I-κB-bound form of NF-κB also contains the precursor (p105) to the active p50 subunit. It has recently been shown that I-κB is a target for phosphorylation by PKR (244) [as well as several other protein kinases, including protein kinase C (245) and Raf-1 (246)]. On phosphorylation, the I-κB/NF-κB complex dissociates, and the p105 is proteolytically processed to the p50 form, which exposes a nuclear localization signal (243). NF-κB migrates to the nucleus, where it binds to promoter-specific sequences and activates interferon gene transcription. The PKR-catalyzed phosphorylation of I-κB apparently is followed by rapid proteolytic destruction of I-κB (243, 247).

Figure 11 provides a summary of the present status of the dsRNA signal transduction pathway. The incomplete nature of this scheme is underscored by recent evidence that dsRNA mediates the action of transcription factors other than NF-κB, and that tyrosine-specific protein phosphorylation may be involved in a signal transduction pathway involving dsRNA (248, 249).

FIG. 11. Summary of the dsRNA-dependent signal transduction pathway for transcription of the IFN gene (as well as other genes). See Section IX,C for further discussion.

X. Cellular and Physiological Effects of dsRNA, and Therapeutic Applications

A. Viral Infection, dsRNA, and the Acute Phase Response

dsRNA exerts a variety of cellular and physiological effects in addition to (or as a consequence of) interferon production, which underscores the complexity of the response of the organism to dsRNA (reviewed in Refs. 3 and 250). The physiological effects of dsRNA stem from specific cellular events, which would include activation of PKR, activation of the various 2-5A synthetases, and expression of interferons. The dsRNA response may also vary significantly from cell type to cell type, adding to the complexity in interpreting the physiological effects.

It has been proposed that dsRNA plays a primary role in the physiological response to a cytolytic viral infection (250). Although a viral infection may be limited to a specific tissue, the effects of the infection can be systemic. Thus, viral dsRNA released from infected cells may be distributed by the bloodstream and affect the function of other tissues. In support of this proposal, dsRNA (either synthetic or isolated from influenza virus-infected lung tissue) can provoke in rabbits the constitutional symptoms of the acute phase response of influenza infection, which includes fever and drowsiness (250-252). Additional adverse effects of dsRNA include ocular and embryonal toxicity (in the rabbit), and suppression of hemopoietic stem cell proliferation and differentiation, along with spleen hypoplasia and thymic atrophy (in the mouse and rat) (250). dsRNA can also mimic a viral infection at the cellular level, which includes cell damage, as evidenced by vacuolation and cloudy swelling, and can induce apoptosis, as indicated by pyknosis and chromatin breakdown (250, 250a). dsRNA activation of PKR has been implicated in triggering apoptosis (250a).

B. dsRNA and Cell Proliferation

The antiproliferative ability of dsRNA was demonstrated shortly after its discovery as an interferon inducer (reviewed in Ref. 3). The ability of dsRNA
to limit neoplastic or normal cell growth might be a consequence of interferon production, but dsRNA may play a more direct role in modulating cell proliferation. Liver metastases quickly form when LSlBL ascitic tumor cells are injected into a mouse host (253). Administration of poly(rG).poly(rC) or poly(rI).poly(rC) prior to and following inoculation of the tumor cells significantly decreases the number of metastatic colonies in the liver, and prolongs survival of the mice (253). The growth of the same tumor cells in culture is not inhibited by dsRNA, suggesting that inhibition of metastatic growth is host mediated (253).

Poly(I).poly(C) suppresses the proliferation of human umbilical vein endothelial cells (254). It was noted that interleukin (IL)-1α mRNA production occurred concomitantly with the inhibition of cell division. The involvement of IL-1α in the dsRNA-mediated inhibition of endothelial cell proliferation is also indicated by the observation that an antisense oligonucleotide specific for IL-1α mRNA abrogates the dsRNA-dependent growth inhibition (254). A consequence of blocking endothelial growth would be a weakening of the lining of blood vessels, which may be responsible for the hemorrhage and edema seen following administration of dsRNA to chickens (250).

There is additional evidence that the antiproliferative effect of dsRNA is mediated by factors other than interferon. For example, interferon-specific antibodies do not relieve the growth inhibitory effects of dsRNA (3). Also, dsRNA suppression of human glioma cell growth may proceed through a pathway involving the cAMP-dependent protein kinase (255). Finally, dsRNA need not be strictly antiproliferative: poly(rI).poly(rC) can stimulate the growth of Balb/C 3T3 or human fibroblast cells (256, 257). In the study on human fibroblasts, antibody neutralization of interferon enhanced the proliferative response, indicating that interferon may self-limit cell growth and division (257). The cell type is a major determinant of the dsRNA response: a recent study demonstrates that dsRNA stimulates the growth of fibroblast MDBK cells while inhibiting the growth of epithelial HT-29 cells (258).

The sequence of a dsRNA can influence its biological activity. A 300-bp dsRNA of defined sequence did not inhibit tumor cell growth, but under the same conditions, poly(I).poly(C) did (259). Because the 300-bp defined-sequence dsRNA was capable of activating PKR and the 2-5A synthetase in vitro, the lack of a cellular response may reflect differential binding and/or internalization of the dsRNA, or an increased nuclease sensitivity (259). It would be informative to screen additional dsRNAs of defined sequences and lengths for their ability to inhibit cell proliferation.

C. Therapeutic Applications of dsRNA

The potential of dsRNA as an anticancer agent has been investigated for a number of years. However, the adverse physiological effects of dsRNA have
limited its effectiveness as a therapeutic agent, prompting different approaches. For example, administration of poly(I).poly(C), complexed to the stabilizing agents polylysine and carboxymethylcellulose, has toxic effects (3). In contrast, the mismatched dsRNA, poly(I).poly(C12U), also termed Ampligen, can provide favorable biological responses with minimal side effects (3). The reduced toxicity of mismatched dsRNA may be due to its shorter physiological half-life (3). Administration of high doses (up to 600 mg) of poly(I).poly(C12U) to healthy human volunteers had minimal side effects, but the anticipated induction of interferon was not realized (260). In another clinical trial, high doses of poly(I).poly(C12U) were given to cancer patients. The most frequent response was suppression of tumor growth, with one reported instance of complete remission (discussed in Ref. 3). A synergistic enhancement was observed when IFN-α was given in combination with the poly(I).poly(C12U). It is not yet clear whether the dsRNA acts directly on the tumor cells or indirectly through the immune system (3). A problem with the use of poly(I).poly(C12U) is the high dose requirement, necessitating the use of large volumes that must be infused into the bloodstream over a period of hours. Liposomes have been investigated as agents for delivery of poly(I).poly(C12U) and poly(I).poly(C) (261). However, poly(I).poly(C) is toxic in liposome-encapsulated form, compared to the unencapsulated form (261). For dsRNA to realize full potential as an anticancer agent, more efficient methods of delivery, increased efficiency of response, and decreased toxicity must be established.

Double-stranded RNA may provide an effective base therapy for HIV disease, because it can stimulate immune cells, inhibit HIV infection, suppress growth of opportunistic tumors, and perhaps act in a synergistic manner with other anti-HIV drugs (262). Poly(I).poly(C12U) inhibits HIV replication in vitro and enhances the ability of azidothymidine to suppress HIV replication and stabilize the T cell count (262). For example, HIV-infected human H9 T-cells shed virus approximately 3 days post-infection. However, a prior treatment with poly(I).poly(C12U) significantly delays the appearance of progeny HIV (263). Poly(A).poly(U) inhibits HIV infection of human lymphoid cells in culture and exhibits a synergistic inhibitory effect with AZT (264). The mechanism of inhibition appears to occur at the level of HIV entry (265), and other polyanions, such as heparin and dextran sulfate, also inhibit entry. There are some biochemical data indicating that the 2-5A synthetase/RNase L enzyme system is altered in HIV-infected cells. Specifically, the levels of 2-5A synthetase in HIV-infected human T cells are inversely correlated with the amount of progeny virus (230), and poly(I).poly(C12U) may sustain the increased activity levels of 2-5A synthetase in HIV-infected cells (263).

XI. Conclusions and Prospects

Almost four decades of research have provided a wealth of information on dsRNA structure and biological function. Although the molecular features of the A-form RNA double helix are now well-defined, more studies are required to understand how mismatches, bulges, internal loops, and other structures perturb dsRNA structure and stability, and modulate function. Understanding the determinants that promote triple-helix formation, or that facilitate interconversion of the A and Z helices, would be useful in predicting the occurrence and function of these structures in biological RNAs and in evaluating their potential for the directed control of gene expression.

Complexity as well as diversity of protein recognition of dsRNA is anticipated. Determining the structure of the dsRBD motif would provide direct information on one mode of dsRNA recognition. Whether the dsRBD is capable of base-pair sequence-specific binding or whether other motifs exist that also recognize dsRNA remains to be determined. Structural analyses of RNase III, PKR, and dsRAD, with or without bound substrate, would provide a basis for describing how dsRNA binding triggers an enzymatic activity (dsRNA cleavage, protein phosphorylation, and deamination, respectively).

Although not directly addressed in this review, the importance of the RNA double helix in organizing macromolecular assemblages should be noted. A striking example is provided by the recently determined structures of a plant virus and an animal virus (266, 267). In each case, a specific dsRNA element within the viral chromosome mediates capsomere-capsomere interactions, thereby stabilizing the protein coat. The structure of the animal virus capsid and the location of the dsRNA element are shown in Fig. 12.

FIG. 12. Involvement of dsRNA in capsid protein interactions in flock house virus (FHV). (A) The entire capsid is shown (T = 3). The positions of the dsRNA and the C subunit polypeptide arms are shown at the icosahedral twofold axis (indicated by the solid oval). (B) The interactions of dsRNA with the peptide arms are shown. (C) Another view of the subunit-subunit interactions near the dsRNA-binding cleft. Reprinted with permission from Nature (from Ref. 267). Copyright 1993 Macmillan Magazines Limited.

Analyses of the structures of prokaryotic antisense RNAs and the precise interactions with their targets provide insight into the dynamics of dsRNA formation and how gene expression is regulated by RNA-RNA interactions. Elucidation of the factors that influence sense-antisense binding, such as RNA secondary and tertiary structure, base-pairing dynamics, and metabolic stability, should assist in the intelligent design of efficient antisense RNAs and ribozymes. There is a growing number of examples of natural antisense RNAs in eukaryotic cells and associated viruses (e.g., see Refs. 268-271). Although little is known of their functional roles, a recent report suggests a role of antisense transcripts in the temporal regulation of translation, through interaction with mRNA 3'-UTRs (272, 272a). A wider role of antisense RNA and duplex RNA elements in the regulation of eukaryotic cellular processes is anticipated (273, 274).

The characterization of activities that catalyze dsRNA formation, denaturation, movement, degradation, or specific covalent modification is beginning to reveal an intricate choreography of dsRNA in the mammalian cell. Understanding the mechanisms of these processes and how they change during normal or abnormal cell development or during viral infection presents an experimental challenge. A description of the dsRNA-specific signal transduction mechanism is currently incomplete, but ongoing studies should fill in the gaps and perhaps interrelate it with other signaling pathways.

The successful application of dsRNA and specific analogs in fighting neoplastic and viral disease may only be realized when (i) more is learned about the cellular, physiological and immunological responses to dsRNA, (ii) more efficient dsRNA delivery methods are developed, and (iii) next-generation dsRNA analogs with improved effectiveness (perhaps when coadministered with other drugs and/or biological response modifiers) are developed. It will be important to determine the cellular factors that determine whether dsRNA acts as a growth inhibitor or stimulator, because this would have an impact on the therapeutic use of dsRNA and antisense RNA. If the past is indeed prologue, then future research on dsRNA structure, reactivity, and biology should still hold many surprises.

ACKNOWLEDGMENTS

The author thanks the members of his laboratory for sharing their results and interest in dsRNase, and also thanks his colleagues for providing material and information. Thanks also to M. T. Murray and R. H. Nicholson for critically reading the manuscript. Research in the author’s laboratory is supported by the National Institutes of Health (Grant GM41283).

REFERENCES

1. N. R. Kallenbach and H. M. Berman, Q. Rev. Biophys. 10, 137 (1977).
2. M. Libonati, A. Carsana and A. Furia, MCBchem 31, 147 (1980).
3. D. S. Haines, K. I. Strauss and D. H. Gillespie, J. Cell. Biochem. 46, 9 (1991).
4. J. R. Wyatt and I. Tinoco, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 465. CSHLab, Cold Spring Harbor, New York, 1993.
5. M. A. Billeter and C. Weissmann, Proc. Nucleic Acids Res. 1, 498 (1966).
6. H. D. Robertson and T. Hunter, JBC 250, 418 (1975).
7. W. Saenger, “Principles of Nucleic Acid Structure,” Springer-Verlag, New York, 1984.
8. R. M. Franklin, PNAS 55, 1504 (1966).
9. K. H. Mellits, T. Pe’ery, L. Manche, H. D. Robertson and M. B. Mathews, NARes 18, 5401 (1990).
10. M. B. Mathews, Sem. Virol. 4, 247 (1993).
11. R. E. Lockard and A. Kumar, NARes 9, 5125 (1981).
12. H. B. Lowman and D. E. Draper, JBC 261, 5396 (1986).
13. D. E. Kohne, S. A. Levison and M. J. Byers, Bchem 16, 5329 (1977).
14. B. D. Stollar, R. Koo and V. Stollar, Science 200, 1381 (1978).
15. J. Schonborn et al., NARes 19, 2993 (1991).
16. J. P. Calvet and T. Pederson, JMB 122, 361 (1978).
17. Y. L. Lyubchenko, A. A. Gall, L. S. Shlyakhtenko, R. E. Harrington, B. L. Jacobs, P. I. Oden and S. M. Lindsay, J. Biomol. Struct. Dynam. 10, 589 (1992).
18. M. Chastain and I. Tinoco, This Series 41, 131 (1991).
19. S. R. Holbrook, J. L. Sussman and S.-H. Kim, Science 212, 1275 (1981).
20. R. O. Day, N. C. Seeman, J. M. Rosenberg and A. Rich, PNAS 70, 849 (1973).
21. J. M. Rosenberg, N. C. Seeman, J. J. P. Kim, F. L. Suddath, H. B. Nicholas and A. Rich, Nature 243, 150 (1973).
22. S. H. Kim, F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, A. H.-J. Wang, N. C. Seeman and A. Rich, Science 185, 435 (1974).
23. C. J. Alden and S.-H. Kim, JMB 132, 411 (1979).
24. S. Corbin, R. Lavery and B. Pullman, BBA 698, 86 (1982).
25. A. C. Dock-Bregeon, B. Chevrier, A. Podjarny, D. Moras, J. S. deBear, G. R. Gough, P. T. Gilham and J. E. Johnson, Nature 335, 375 (1988).
26. A. C. Dock-Bregeon, B. Chevrier, A. Podjarny, J. Johnson, J. S. deBear, G. R. Gough, P. T. Gilham and D. Moras, JMB 209, 459 (1989).
27. S. R. Holbrook, C. Cheong, I. Tinoco and S.-H. Kim, Nature 353, 579 (1991).
28. C. S. Happ, E. Happ, M. Nilges, A. M. Gronenborn and G. M. Clore, Bchem 27, 1735 (1988).
29. S.-H. Chou, P. Flynn and B. Reid, Bchem 28, 2422 (1989).
30. K. Hall, P. Cruz and M. J. Chamberlin, ABB 236, 47 (1984).
31. H. H. Klump and T. M. Jovin, Bchem 26, 5186 (1987).
32. C. C. Hardin, D. A. Zarling, J. D. Puglisi, M. O. Trulson, P. W. Davis and I. Tinoco, Bchem 26, 5191 (1987).
33. P. W. Davis, R. W. Adamiak and I. Tinoco, Biopolymer 29, 109 (1990).
34. M. Teng, Y.-C. Liaw, G. A. van der Marel, J. H. van Boom and A. H.-J. Wang, Bchem 28, 4923 (1989).
35. D. A. Zarling, C. J. Calhoun, C. C. Hardin and A. H. Zarling, PNAS 84, 6117 (1987).
36. R. Kapahnke, W. Rappold, U. Desselberger and D. Riesner, NARes 14, 3215 (1986).
37. M. A. Livshits, O. A. Amosova and Y. L. Lyubchenko, J. Biomol. Struct. Dynam. 7, 1237 (1990).
38. F.-U. Gast and P. J. Hagerman, Bchem 30, 4268 (1991).
39. Y.-H. Wang, M. T. Howard and J. D. Griffith, Bchem 30, 5443 (1991).
40. D. Riesner, J. M. Kaper and J. W. Randles, NARes 10, 5587 (1982).
41. R. S. Tang and D. E. Draper, Bchem 29, 5232 (1990).
42. A. Bhattacharyya, A. I. H. Murchie and D. M. J. Lilley, Nature 343, 484 (1990).
42a. C. Gohlke, A. I. H. Murchie, D. M. J. Lilley and R. M. Clegg, PNAS 91, 11660 (1994).
43. R. S. Tang and D. E. Draper, NARes 22, 835 (1994).
44. S. P. Edmondson and D. M. Gray, Biopolymer 23, 2725 (1984).
44a. F.-U. Gast and H. L. Sänger, Electrophoresis 15, 1493 (1994).
45. M. Chastain and I. Tinoco, NARes 20, 315 (1992).
46. R. W. Roberts and D. M. Crothers, Science 258, 1463 (1992).
47. L. Yang and T. A. Keiderling, Biopolymer 33, 315 (1993).
48. B. J. Rao and C. M. Radding, PNAS 91, 6161 (1994).
49. N. C. Seeman, J. M. Rosenberg and A. Rich, PNAS 73, 804 (1976).
50. K. M. Weeks and D. M. Crothers, Science 261, 1574 (1993).
51. C. W. Carter and J. Kraut, PNAS 71, 283 (1974).
52. D. St Johnston, N. H. Brown, J. G. Gall and M. Jantsch, PNAS 89, 10979 (1992).
53. T. Gibson and J. D. Thompson, NARes 22, 2552 (1994).
54. R. E. Klevit, Science 253, 1367 (1991).
55. M. A. Rould, J. J. Perona, D. Söll and T. A. Steitz, Science 246, 1135 (1989).
56. F. H. Westheimer, Accts. Chem. Res. 1, 70 (1968).
57. D. A. Usher, Nature (New Biol.) 235, 207 (1972).
58. D. A. Usher and A. H. McHale, PNAS 73, 1149 (1976).
59. D. A. Usher and A. H. McHale, Science 192, 53 (1976).
60. S. H. Hall and R. J. Crouch, JBC 252, 4092 (1977).
61. D. Court, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), p. 71. Academic Press, New York, 1993.
62. H. Nashimoto and H. Uchida, MGG 201, 25 (1985).
63. Y. Davidov, A. Rahat, I. Flechner and O. Pines, J. Gen. Microbiol. 139, 717 (1993).
64. Y. Iino, A. Sugimoto and M. Yamamoto, EMBO J. 10, 221 (1991).
64a. M. Zuber, T. A. Hoover, B. S. Powell and D. L. Court, Mol. Microbiol. 14, 291 (1994).
65. H. Nashimoto, A. Miura, H. Saito and H. Uchida, MGG 199, 381 (1985).
66. S.-M. Chen, H. E. Takiff, G. C. Dubois, A. M. Barber, J. C. A. Bardwell and D. L. Court, JBC 265, 2888 (1990).
67. P. E. March and M. Gonzalez, NARes 18, 3293 (1990).
68. H. Li, B. S. Chelladurai, K. Zhang and A. W. Nicholson, NARes 21, 1919 (1993).
69. E. E. Kim and H. W. Wyckoff, JMB 218, 449 (1991).
70. T. A. Steitz and J. A. Steitz, PNAS 90, 6498 (1993).
71. A. W. Nicholson, BBA 1129, 318 (1992).
72. H. D. Robertson, Cell 30, 669 (1982).
73. L. Krinke and D. L. Wulff, NARes 18, 4809 (1990).
74. B. S. Chelladurai, H. Li, K. Zhang and A. W. Nicholson, Bchem 32, 7549 (1993).
75. H. D. Robertson and F. Barany, in “Proceedings of the 12th FEBS Congress,” p. 285. Pergamon, Oxford, 1979.
76. D. C. Schweisguth, B. S. Chelladurai, A. W. Nicholson and P. B. Moore, NARes 22, 604 (1994).
77. B. S. Chelladurai, H. Li and A. W. Nicholson, NARes 19, 1759 (1991).
78. S. Altuvia, H. Locker-Giladi, S. Koby, O. Ben-Nun and A. B. Oppenheim, PNAS 84, 6511 (1987).
79. R. J. Roberts and S. E. Halford, in “Nucleases” (S. M. Linn, R. S. Lloyd and R. J. Roberts, eds.), 2nd Ed., p. 35. CSHLab, Cold Spring Harbor, New York, 1993.
80. J. J. Dunn, JBC 251, 3807 (1976).
81. G. Gross and J. J. Dunn, NARes 15, 431 (1987).
82. O. O. Favorova, F. Fasiolo, G. Keith, S. K. Vassilenko and J.-P. Ebel, Bchem 20, 1006 (1981).
83. P. E. Auron, L. D. Weber and A. Rich, Bchem 21, 4700 (1982).
84. M. Libonati and S. Sorrentino, MCBchem 117, 139 (1992).
85. S. Sorrentino and M. Libonati, ABB 312, 340 (1994).
86. M. Libonati, BBA 228, 440 (1971).
87. M. Libonati, M. C. Malorni, A. Parente and G. D’Alessio, BBA 402, 83 (1975).
88. P. Kindler, T. U. Keil and P. H. Hofschneider, MGG 126, 53 (1973).
89. P. Gegenheimer and D. Apirion, Microbiol. Rev. 45, 502 (1981).
90. F. W. Studier, J. J. Dunn and E. Buzash-Pollert, in “From Gene to Protein: Information Transfer in Normal and Abnormal Cells” (T. R. Russell, K. Brew, H. Farber and T. Schultz, eds.), p. 261. Academic Press, New York, 1979.
91. H. Saito and C. C. Richardson, Cell 27, 533 (1981).
92. G. Koraimann, C. Schroller, H. Graus, D. Angerer, K. Teferle and G. Hogenauer, Mol. Microbiol. 9, 717 (1993).
93. J. J. Dunn and F. W. Studier, JMB 166, 477 (1983).
94. M. Gottesman, A. Oppenheim and D. Court, Cell 29, 727 (1982).
95. P. Regnier and M. Grunberg-Manago, Biochimie 72, 825 (1990).
96. J. C. A. Bardwell, P. Regnier, S. M. Chen, Y. Nakamura, M. Grunberg-Manago and D. Court, EMBO J. 8, 3401 (1989).
97. E. Hajnsdorf, A. J. Carpousis and P. Regnier, JMB 239, 439 (1994).
98. D. R. Gitelman and D. Apirion, BBRC 96, 1063 (1980).
99. J. E. Mayer and M. Schweiger, JBC 258, 5340 (1983).
100. R. A. K. Srivastava, N. Srivastava and D. Apirion, Int. J. Biochem. 24, 737 (1992).
101. E. G. H. Wagner and R. W. Simons, ARBchem 48, 713 (1994).
102. Y. Eguchi, T. Itoh and J.-I. Tomizawa, ARBchem 60, 631 (1991).
103. K. M. Takayama and M. Inouye, CRC Crit. Rev. Biochem. Mol. Biol. 25, 155 (1990).
104. J.-I. Tomizawa, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 419. CSHLab, Cold Spring Harbor, New York, 1993.
105. Y. Eguchi and J.-I. Tomizawa, JMB 220, 831 (1991).
106. H. Masai and K. Arai, NARes 16, 6493 (1988).
107. C. Persson, E. G. H. Wagner and K. Nordstrom, EMBO J. 7, 3279 (1988).
108. P. Blomberg, K. Nordstrom and E. G. H. Wagner, EMBO J. 11, 2675 (1992).
109. P. Blomberg, H. M. Engdahl, C. Malmgren, P. Romby and E. G. H. Wagner, Mol. Microbiol. 12, 49 (1994).
110. E. G. H. Wagner, P. Blomberg and K. Nordstrom, EMBO J. 11, 1195 (1992).
111. P. Blomberg, E. G. H. Wagner and K. Nordstrom, EMBO J. 9, 2331 (1990).
112. K. Gerdes, A. Nielsen, P. Thorsted and E. G. H. Wagner, JMB 226, 637 (1992).
113. T. Thisted and K. Gerdes, JMB 223, 41 (1992).
114. T. Thisted, N. S. Sorensen, E. G. H. Wagner and K. Gerdes, EMBO J. 13, 1960 (1994).
115. T. Thisted, A. K. Nielsen and K. Gerdes, EMBO J. 13, 1950 (1994).
116. R. W. Simons and N. Kleckner, Cell 34, 683 (1983).
117. C. Ma and R. W. Simons, EMBO J. 9, 1267 (1990).
118. C. C. Case, E. L. Simons and R. W. Simons, EMBO J. 9, 1259 (1990).
119. J. D. Kittle, R. W. Simons, J. Lee and N. Kleckner, JMB 210, 561 (1989).
120. C. C. Case, S. M. Roels, P. D. Jensen, J. Lee, N. Kleckner and R. W. Simons, EMBO J. 8, 4297 (1989).
121. L. Krinke and D. L. Wulff, Genes Dev. 4, 2223 (1990).
122. L. Krinke, M. Mahoney and D. L. Wulff, Mol. Microbiol. 5, 1265 (1991).
123. T. van Biesen, F. Soderbom, E. G. H. Wagner and L. S. Frost, Mol. Microbiol. 10, 35 (1993).
124. S. H. Lee, L. S. Frost and W. Paranchych, MGG 235, 131 (1992).
125. M. Citron and H. Schuster, NARes 20, 3085 (1992).
126. A. L. Biere, M. Citron and H. Schuster, Genes Dev. 6, 2409 (1992).
127. M. Faubladier, K. Cam and J.-P. Bouche, JMB 212, 461 (1990).
128. F. Tetart and J.-P. Bouche, Mol. Microbiol. 6, 615 (1992).
129. T. Mizuno, M.-Y. Chou and M. Inouye, PNAS 81, 1966 (1984).
130. J. Andersen, S. A. Forst, K. Zhao, M. Inouye and N. Delihas, JBC 264, 17961 (1989).
131. J. Andersen and N. Delihas, Bchem 29, 9249 (1990).
132. J. Coleman, P. J. Green and M. Inouye, Cell 37, 429 (1984).
133. S. Pestka, B. L. Daugherty, V. Jung, K. Hotta and R. K. Pestka, PNAS 81, 7275 (1984).
134. B. L. Daugherty, K. Hotta, C. Kumar, Y. H. Ahn, J. Zhu and S. Pestka, Gene Anal. Techn. 6, 1 (1989).
134a. T. A. H. Hjalt and E. G. H. Wagner, NARes 23, 571 (1995).
135. T. A. H. Hjalt and E. G. H. Wagner, NARes 20, 6723 (1992).
135a. T. A. H. Hjalt and E. G. H. Wagner, NARes 23, 580 (1995).
136. M. Homann, K. Rittner and G. Sczakiel, JMB 233, 7 (1993).
137. W. F. Lima, B. P. Monia, D. J. Ecker and S. M. Freier, Bchem 31, 12055 (1992).
138. J.-C. Chuat and F. Galibert, BBRC 162, 1025 (1989).
139. S. Altman, PNAS 90, 10898 (1993).
140. Y. Inokuchi, N. Yuyama, A. Hirashima, S. Nishikawa and J. Ohkawa, JBC 269, 11361 (1994).
141. W. Jelinek and J. E. Darnell, PNAS 69, 2537 (1972).
142. H. D. Robertson, E. Dickson and W. Jelinek, JMB 115, 571 (1977).
143. H. D. Robertson and E. Dickson, MCBiol 4, 310 (1984).
144. W. R. Jelinek, T. P. Toomey, L. Leinwand, C. H. Duncan, P. A. Biro, P. V. Choudary, S. M. Weissman, C. M. Rubin, C. M. Houck, P. L. Deininger and C. W. Schmid, PNAS 77, 1398 (1980).
145. C. M. Rubin, C. M. Houck, P. L. Deininger, T. Friedmann and C. W. Schmid, Nature 284, 372 (1980).
146. J. P. Calvet and T. Pederson, PNAS 74, 3705 (1977).
147. J. P. Calvet and T. Pederson, PNAS 76, 755 (1979).
148. P. I. Marcus and I. Yoshida, J. Cell. Physiol. 143, 416 (1990).
149. J. Huet, A. Sentenac and P. Fromageot, FEBS Lett. 94, 28 (1978).
150. D. J. Mead and S. G. Oliver, EJB 137, 501 (1983).
151. H.-P. Xu, M. Riggs, L. Rodgers and M. Wigler, NARes 18, 5304 (1990).
152. D. Frendewey, M. Gillespie and I. Tarnok, J. Cell. Biochem. 17C, 177 (1993).
153. R. Stern, BBRC 41, 608 (1970).
62

ALLEN W. NICHOLSON

154. J. Rech, G. Cathala and P. Jeanteur, NARes 3, 2055 (1976). 155. H. D. Robertson and M. B. Mathews, PNAS 70, 225 (1973). 156. A. L. M. Bothwell and S. Altman, JBC 250, 1451 (1975). 157. J. Rech, C. Brunel and P. Jeanteur, BBRC 88, 422 (1979). 158. I. Grummt, S. H. Hall and R. J. Crouch, EJB 94, 437 (1979). 159. B. A. Peculis and J. A. Steitz, Cell 73, 1233 (1993). 160. G. Shanmugam, BBRC 70, 818 (1976). 161. G. Shanmugam, Bcheni 17, 5052 (1978). 162. B. K. Saha and D. Schlessinger, BBRC 79, 1142 (1977). 163. B. K. Saha and D. Schlessinger, JBC 253, 4537 (1978). la. K. Ohtsuki, Y. Groner and J. Hurwitz, JBC 252, 483 (1977). 165. P. Palese and 6 . Koch, PNAS 69, 698 (1972). 166. H. S. Kang and B. R. McAuslan, J. Virol. 10, 202 (1972). 167. P. P. Hung, Virology 51, 287 (1973). 168. J. M. Meegdn and P. I. Marcus, Science 244, 1089 (1989). 169. H. Ben-Artzi, E. Zeelon, M. Gorecki and A. Panet, PNAS 89, 927 (1992). 170. H. Ben-Artzi, E. Zeelon, S. F. J. Le-Grice, M. Gorecki and A. Panet, NARes 20, 5115 (1992). 171. S. W. Blain and S. P. Goff, JBC 268, 23585 (1993). 172. Z. Hostomsky, 6. 0. Hudson, S. Rahmati and Z. Hostomska, NARes 20, 5819 (1992). 173. B. L. Bass and H. Weintraub, Cell 48, 607 (1987). 174. M. R . Rebagliati and D. A. Melton, Cell 48, 599 (1987). 175. R. W. Wagner and K. Nishikura, MCBiol 8, 770 (1988). 176. Y. A. W. Skeiky and K. Iatrou, JMB 218, 517 (1991). 177. B. L. Bass and H. Weintraub, Cell 55, 1089 (1988). 178. A. G. Polson, P. F. Crain, S. C. Pomerantz, J. A. McCloskey and B. L. Bass, Bchem 30, 11507 (1991). 179. R. W. Wagner, J. E. Smith, B. S. Cooperman and K. Nishikura, PNAS 86, 2647 (1989). 180. B. L. Bass, in “The RNA World (R. F. Gesteland and J. F. Atkins, eds.), p. 383. CSHLab, Cold Spring Harbor, New York, 1993. 181. R. F. Hough and B. L. Bass, JBC 269, 9933 (1994). 182. U. Kim, T. L. Garner, T.Sanford, D. Speicher, J. M. Murray and K. Nishikura,JBC 269, 13480 (1994). 183. K. Nishikura, C. Yoo, U. Kim, J. M. Murray, P. A. Estes, F. 
E. CashandS. A. Leibhaber, EMBO J. 10, 3523 (1991). 183a. M. A. O’Connell and W. Keller, PNAS 91, 10596 (1994). 183b. U. Kim, Y. Wang, T. Sanford, Y. Zeng and K. Nishikura, PNAS 91, 11457 (1994). 184. K. Nishikura, Ann. N.Y. Acad. Sci. 60, 240 (1992). 185. D. Kimelman and M. W. Kirschner, Cell 59, 687 (1989). 186. B. L. Bass, H. Weintraub, R. Cattaneo and M. Billeter, Cell 56, 331 (1989). 187. S. M. Rataul, A. Hirano and T. C. Wong, J. Virol. 66, 1769 (1992). 188. M. Higuchi, F. N. Single, M. Kohler, B. Siimmer, R. Sprengel and P. H. Seeberg, Cell 75, 1361 (1993). 788a. S. M. Rueter, C. M. Burns, S. A. Coode, P. Mookherjee and R. B. Erneson, Science 267, 1491 (1995). 788b. J.-H. Yang, P. Sklar, R. Axel and T. Maniatis, Nature 374, 77 (1995). 189. A. G. Polson and B. L. Bass, EMBOJ. 13, 5701 (1994). 789a. L. Saccomanno and B. L. Bass, MCBiol 14, 5425 (1994). 190. L. M. Morrissey and K. Kirkegaard, MCBiol 11, 3719 (1991).

DOUBLE-STRANDED RNA

63

191. P. Belhuineur, J. Lanoix, Y. Blais, D. Forget, A. Steyaert and D. Skup, MCBiot 13, 2846

(1993). 192. B. K. Ray, T. G. Lawson, J. C. Kramer, M. H. Cladaras, J. A. Grifo, R. D. Abramson, W. C. Merrick and R. E. Thach, JBC 260, 7651 (1985). 193. R. Iggo, S . Picksley, J. Southgate, J. McPheat and D. P. Lane, NARes 18, 5413 (1990). 194. D. A. Wassarnian and J. A. Steitz, Nature 349, 463 (1991). 195. W. M. Toone, K. E. Rudd and J. D. Friesen, J. Bact. 173, 3291 (1991). 196. C.-G. Lee and J. Hunvitz, JBC 268, 16822 (1993). 197. H . Flores-Rosa and J. Hunvitz, JBC 268, 21372 (1993). 198. F. V. Fuller-Pace, S. M. Nicol, A . D. Reid and D. P. Lane, EMBO J. 12, 3619 (1993). 199. S. Teigelkamp, M. McGarvey, M. Plumpton and J. D. Beggs, EMBO J. 13, 888 (1994). 200. C.-G. Lee and J. Hunvitz, JBC 267, 4398 (1992). 201. K. Yamanaka, T. Ogura, E. V. Koonin, H. Niki and S. Hiraga, MGG 243, 9 (1994). 202. Q . Xiao, T. V. Sharp, I. W. Jeffrey, M. C. James, G. J. M. Pruijn, W. J. van Venrooij and M. J. Clemens, NARes 22, 2512 (1994). 203. K. Nishi, F . Morel-Deville, J. W. B. Hershey, T. Leighton and J. Schnier, Nature 336,496 (1988). 204. E . J. Steinmetz and T. Platt, PNAS 91, 1401 (1994). 205. H. Flows-Rozas and J. Hurwitz, JBC 268, 21372 (1993). 206. D. S. Portman and G . Dreyfus, E M B O 1. 13, 213 (1994). 207. H. Idriss, A. Kumar, J. R. Casas-Finet, H. Guo, Z. Damuni and S. H. Wilson, Bchem 33, 11382 (1994). 208. T. Hunter, T. Hunt and R. J. Jackson, JBC 250, 409 (1975). 209. M. J. Clemens, J. W. B. Hershey, A. Hovanessian, B. C. Jacobs, M. 6. Katze, R. J. ,Kaufman, P. Lengyel, C. E. Samuel, G. Sen and B. R. 6. Williams, J. Znterferon Res. 13, 241 (1993). 210. E . Meurs, K. Chong, J. Galabru, N . S. B. Thomas and A. Hovanessian, Cell 62, 379 (1990). 211. 6. N . Barber, M. Wambach, M.-L. Wong, T. E. DeverandA. G. Hinnebusch, PNAS90, 4621 (1993). 211a. P. R. Romano, S. R. Green, 6. N . Barber, M. B. Mathews and A. 6. Hinnebusch, MCBiol 15, 365 (1995). 212. A. 6. Hovanessian, J. Interferon Res. 9, 641 (1989). 213. D. C. Thomis and C. E. Samuel, J. Virol. 
67, 7695 (1993). 214. L. Manche, S . R. Green, C. Schmedt and M. B. Mathews, MCBiol 12, 5238 (1992). 215. M. G. Katze, M. Wambach, M. L. Wong, M. Garfinkel, E. Meurs, K. Chong, B. R. 6. Williams, A. G. Hovanessian and C . N . Barber, MCBiol 11, 5497 (1991). 216. G . 3 . Feng, K. Chong, A. Kumar and B. R. 6. Williams, PNAS 89, 5447 (1992). 217. S . R. Green and M. B. Mathews, Genes Dev. 6, 2478 (1992). 218. S. J. McCormack, L. G . Ortega, J. P. Doohan and C. E. Samuel, Virology 198, 92 (1994). 218a. S . R. Green, L. Manche and M . 8 . Mathews, MCBiol 15, 358 (1995). 218b. S . B. Lee, S . R. Green, M. B. Mathews and M. Esteban, PNAS 91, 10551 (1995). 219. A. 6. Hovanessian, J. Interferon Res. 11, 199 (1991). 220. C. E. Samuel, Virology 183, l(1991). 221. A. Zhou, B . A. Hassel and R. H . Silverman, Cell 72, 753 (1993). 222. G . Grihaudo, D. Lernbo, 6. Cavallo, S. Landolfo and P. Lengyel, J. Virol. 65, 1748 (1Bl). 223. T. W. Nilsen, P. A. Maroney, H. D. Robertson and C. Baglioni, MCBiol 2, 154 (1982). 224. J. Sperling, J. Chebath, H. Arad-Dann, D. Offen, P. Spann, R. Lehrer, D. Goldblatt, B. Jolles and R. Sperling, PNAS 88, 10377 (1991).

64

ALLEN W. NICHOLSON

M. A. Minks, D . K. Weset, S. Benvin and C. Baglioni, JBC 254, 10180 (1979). H. Samanta, J. P. Dougherty and P. Lengyel, JBC 255, 9807 (1980). J. Chehath, P. Benech, A. Hovanessian, J. Galabru and M. Revel, JBC 262, 3852 (1987). I. Marie and A. G. Hovanessian, JBC 267, 9933 (1992). I. Marie, J. Svab, N. Robert, J. Galabru and A. G. Hovanessian, JBC 265, 18601 (1990). H. C. Schroder, R. Wenger, Y. Kuchino and W. E. G. Muller, JBC 264, 5669 (1989). P. G. Milhaud, M. Silhol, T. Salehzada and B. Lehleu, J. Gen. Virol. 68, 1125 (1987). K. A. Kelley and P. M. Pitha, Virology 147, 382 (1985). C. Colby and M. Chamherlin, PNAS 63, 160 (1969). J. Vilcek, M. H. Ng, A. E. Friedman-Kien and T. Krawciw, J. Virol. 2, 648 (1968). I. Yoshida, M. Azuma, H. Kawaii, H. W. Fisher and T. Suzutani, Acta Virol. 36, 347 (1992). 236. P. I. Marcus and M . J. Sekellick, Nature 266, 815 (1977). 237. K. Zinn, A. Keller, L.-A. Whittemore and T. Maniatis, Science 240, 210 (1988). 238. P. I. Marcus and M. J. Sekellick, J . Gen. Virol. 69, 1637 (1988). 239. P. J. Farrell, K. Balkow, T Hunt, R. J. Jackson and H. Trachsel, Cell 11, 187 (1977). 240. Y. Hu and T. W. Conway, J . Interferon Res. 13, 323 (1993). 241. K. V. Visvanathan and S. Goodbourn, EMBO]. 8, 1129 (1989). 242. P. A. Bauerle, BBA 1072, 63 (1991). 243. K. H. Mellits, R. T. Hay and S. Goodbourn, NARes 21, 5059 (1993). 244. A. Kumar, J. Haque, J. Lacoste, J. Hiscott and B. R. G . Williams, PNAS 91, 6288 (1994). 245. S. Ghosh and D. Baltimore, Nature 344, 678 (1990). 246. S. Li and J. M. Sedivy, PNAS 90, 9247 (1993). 247. T. Henkel, T. Machleidt, I. Alkalay, M. Kronke, Y. Ben-Neriah and P. Bauerle, Nature 365, 182 (1993). 248. T. Decker. J . Interferon Res. 12, 445 (1992). 249. C. Daly and N. Reich, MCBiol 13, 3756 (1993). 250. W. A. Carter and E. De Clercq, Science 186, 1172 (1974). 250a. S. B. Lee and M. Esteban, Virology 199, 491 (1994). 251. J. A. Majde, R. K. Brown and M. W. Jones, Microb. Pathogen. 10, 105 (1991). 252. M. 
Kimura-Takeuchi, J. A. Majde, L. A. Toth and J. A. Krueger,J. Infect. Dis. 166, 1266 (1992). 253. V. Juraskova, N. Dyatlova and V. Brabec, Eur. J . Phannacol. 221, 107 (1992). 254. S. Garfinkel, D. S. Haines, S. Brown, J. Wessendorf, D. H. Gillespie and T. Maciag, JBC 267, 24375 (1992). 255. H. R. Hubbell, J. E. Boyer, P. Roane and R. M. Burch, PNAS 88, 906 (1991). 256. J. N. Zullo, B. H. Cochran, A. S. Huang and C. D. Stiles, Cell 43, 793 (1985). 257. J. Vilcek, M. Kohase and D. Henrikson-DeStefano, J . Cell. Physiol. 130, 37 (1987). 258. M. K. Chelbi-Mix and C. E. Sripati, Exp. Cell Res. 213, 383 (1994). 259. D. S. Haines, R. J. Suhadolnik, H. R. Hubhell and D.H. Gillespie, JBC 267, 18315 (1992). 260. C. W. Hendrix, J. B. Margolick, B. G. Petty, R. B. Markham, L. Nerhood, H. Farzadegan, P. 0. P. Ts'o and P. S. Lietman, Antimicrob. Agents Chemother. 37,429 (1993). 261. P. G . Milhaud, P. Machy, S. Colote, B. Lehleu and L. Lesernian, J. Integeron Res. 11, 261 (1991). 262. D. Gillespie and W. A. Carter, Med. Hypotheses 37, 1 (1992). 263. H. Ushijima, P. G. Rytik, F. Schacke, H. U. Scheffer, W. E. G. Muller and H. C. Schroder, J . Interferon Res. 13, 161 (1993). 264. A. G. Laurent-Crawford, B. Krust, E. Deschamps de Paillette, L. Montagnier and A. Hovanessian, AIDS Res. H u m n Retrouir. 8, 285 (1992).

225. 226. 227. 228. 229. 230. 231. 232. 233. 234. 235.

DOUBLE-STRANDED RNA

65

265. B. Krust, C. Callebaut and A. Hovanessian, AZDS Res. Human Retrmir. 9, 1087 (1993). 266. S. B. Larson, S. Koszelak, J. Day, A. Greenwood, J. A. Dodds and A. McPherson, Nature 361, 179 (1993). 267. A. J. Fisher and J. E. Johnson, Natvre 361, 176 (1993). 268. J . G. Stevens, E. K. Wagner, G. B. Devi-Rao, M. L. Cook and L. T. Feldman, Science 235, 1056 (1987). 269. S . Khochbin and J.-J. Lawrence, E M B O J. 8, 4107 (1989). 270. M. Hildebrandt and W. Nellen, Cell 69, 197 (1992). 271. B. J. Dolnick, NARes 21, 1747 (1993). 272. R. C. Lee, R L. Feinbaum and V. Ambrose, Cell 75, 843 (1993). 273. M. Wickens and K. Takayama, Nature 367, 17 (1994). 274. R. Nowak, Science 263, 608 (1994). 275. Y. L. Lyubchenko, B. L. Jacobs and S. M. Lindsay, NARes 20, 3983 (1992). NOTE ADDEDIN PROOF:(1)A dsRNA persistence length of 720 i 70 A was determined by transient electric birefringence (TEB) [Kebbekus et al., Bchem 34, 4354 (1995)], and is consistent with earlier measurements, in that dsRNA is stiffer than DNA. TEB was also used to measure bulge-loop bending of dsRNA [Zacharias and Hagerman, J M B 247, 486 (1995)l. The angles range from 7-93", with an increasing number of nt (A or U) in the bulge loop. (2) Using immunocytochemical techniques, PKR was localized to the mammalian nucleus and nucleolus, in addition to the cytoplasm. Interferon treatment selectively increases cytoplasmic PKR levels, and a nuclear function of PKR is suggested [Jeffrey et al., Exp. Cell Res. 218, 17 (1995)l. (3) Additional evidence indicates accurate in oitro editing by d s U D of GluR-B pre-mRNAs [Melcher et al., JBC 270, 8566 (1995)l. (4) Biochemical studies indicate that the PKR dsRBD interacts with one turn of dsRNA [Schmedt et a l . , J M B 249, 29 (1995)l. (5) N M R was used to solve the structure of the dsRBD of Drusuphila Staufen protein [Bycroft et al., E M B O J. 14, 3563 (1995)]and ofE. Culi RNase 111 [Khdratt et at., EMBO J. 14, 3572 (1995)]. 
Both dsRBDs are closely similar, compact ellipsoids, and exhibit an alPlP2P3a, tertiary fold, with the two a helices packed on one side of the antiparallel P sheet. Direct interaction wih dsRNA is proposed to occur near the N terminus of helix aP.(6) Yeast RNase H I binds dsRNA in its N-terminal domain [Cerritelli and Crouch, RNA 1, 246 (1995)l. This domain, separate from the RNase H catalytic domain, contains two dsRBD-like motifs. Also, dsRNA binding is distinct from RNADNA hybrid binding and cleavage. (7) The HIV-1 reverse-transcriptase-associatedRNase H can cleave dsRNA under conditions of arrested reverse transcription [Gotte et al., E M B O J . 14, 833 (1995)], or in the presence of Mn2+ [Cirino et al., Bchem 34, 9936 (1995)l. (8) dsRNA induces adherence of sickle erythrocytes to the vascular endothelium [Smolinski et al., Bloud 85, 2945 (1995)], providing a connection between viral infection, dsRNA production, and resultant microvascular occlusion that precipitates sickle cell-associated pain.

Evolution, Expression, and Possible Function of a Master Gene for Amplification of an Interspersed Repeated DNA Family in Rodents

PRESCOTT L. DEININGER¹
Department of Biochemistry and Molecular Biology, Louisiana State University Medical Center, New Orleans, Louisiana 70112, and Laboratory of Molecular Genetics, Alton Ochsner Medical Foundation, New Orleans, Louisiana 70121

HENRI TIEDGE
Departments of Pharmacology and Neurology, State University of New York Health Science Center at Brooklyn, Brooklyn, New York 11203

JOOMYEONG KIM
Department of Biochemistry and Molecular Biology, Louisiana State University Medical Center, New Orleans, Louisiana 70112

JÜRGEN BROSIUS
Institut für Experimentelle Pathologie, Zentrum für Molekularbiologie der Entzündung (ZMBE), Westfälische Wilhelms-Universität, 48149 Münster, Germany

I. Evolution of the BC1 RNA Gene
II. The BC1 RNA Gene As a Master Gene for ID Repeats
III. Anatomical and Subcellular Distribution of BC1 RNA
IV. Transcriptional Regulation of the Rat BC1 RNA Gene
V. Speculations on BC1 RNA Function
References

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

PRESCOTT L. DEININGER ET AL.

BC1 RNA was originally identified (1-4) as a small cytoplasmic RNA species, found primarily within the brain of rats, that hybridized with a major family of repetitive DNA sequences (short interspersed repetitive elements; SINEs) in the rat genome. The elements of this family, termed identifier (ID) elements, were initially found associated with neural-specific genes, prompting the idea that they might be involved in cell-type-specific gene expression. Subsequently, ID elements were detected in various nonneural genes (including housekeeping genes), and the notion of an ID-dependent regulation of brain-specific gene expression was challenged (5).

BC1 RNA is transcribed by RNA polymerase III (3). Because most SINEs have RNA polymerase III promoters (6), it was originally thought that the abundant transcription product detected in Northern blots with ID sequence probes was due to cumulative transcription from many dispersed ID loci (1). It was later found, through cDNA cloning experiments designed to clone the full-length RNA polymerase III-derived transcript, that the BC1 RNA was actually generated almost exclusively from a single gene (7). All of the BC1 cDNAs cloned had not only the identical ID-related sequence at their 5' ends and the expected A-rich region found in SINE transcripts, but also a short segment at the 3' end that was not related to the ID repeats (see Fig. 1). When this segment was used to probe rat genomic Southern blots, it was found to be unique and was then used to isolate the BC1 genomic locus (8).

A
ID region: GGGGUUGGGGAUUUAGCUCAGUGGUAGAGCGCUUGCCUAGCAAGCGCAAGGCCCUGGGUUCGGUCCUCAGCUCCG
A-rich region: AAAAAAAAAAAAAAAAAAAAAAGACAAAAUAACAAAAGACC-
Unique region: CAAGGUAACUGGCACACACAACCUUU 3'

B
[Schematic not reproduced: ID body, A-rich region, and unique region, with an arrow marking the transcript and its 3' end.]

FIG. 1. Sequence and schematic of the rat BC1 RNA gene. (A) The coding region of the gene includes a 75-nt ID-body region, a 51-nt A-rich region, and a 26-nt unique region, and terminates with a typical RNA polymerase III terminator consisting of 4 T residues. The arrow in B shows the length and direction of the BC1 transcript. The A-rich region is not pure A and T, but includes a few other bases interspersed. Only a few bases at each end of the A-rich region are shown. The last three bases of the unique region are also shown, and typical transcripts terminate with two to four U residues coded from the terminator. (B) The sequence elements are shown schematically with the transcript and its orientation indicated with an arrow.

A number of proposals have been made concerning the relationship of this single BC1 locus and the other dispersed ID elements, both in terms of functional models and evolutionary relationships. Although no specific function has been demonstrated for either the BC1 RNA or ID elements, a number of lines of investigation suggest that the BC1 RNA gene plays some functional role, probably within neurons, throughout the rodent order. The BC1 RNA gene has also been shown to be a master gene for ID repeat amplification and evolution (10).
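The region lengths quoted in the legend to Fig. 1 can be checked by counting the sequences printed in panel A. The sketch below is illustrative only. The variable and function names are ours, the illegible character in the printed ID region is read as UU (an assumption, chosen because it gives the 75 nt stated in the legend), and the A-rich length is taken from the legend because panel A shows only the ends of that region.

```python
# ID-body sequence as printed in Fig. 1A, written in blocks of ten.
# ASSUMPTION: the illegible character after "CCCUGGG" is read as "UU".
ID_BODY = (
    "GGGGUUGGGG" "AUUUAGCUCA" "GUGGUAGAGC" "GCUUGCCUAG"
    "CAAGCGCAAG" "GCCCUGGGUU" "CGGUCCUCAG" "CUCCG"
)

# Unique region as printed in Fig. 1A.
UNIQUE = "CAAGGUAACU" "GGCACACACA" "ACCUUU"

# Length taken from the figure legend; the full sequence is not shown.
A_RICH_LENGTH = 51


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C residues in an RNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


region_lengths = {
    "ID body": len(ID_BODY),
    "A-rich": A_RICH_LENGTH,
    "unique": len(UNIQUE),
}
coding_length = sum(region_lengths.values())

if __name__ == "__main__":
    for name, length in region_lengths.items():
        print(f"{name}: {length} nt")
    print(f"coding region: {coding_length} nt")
    print(f"ID-body GC fraction: {gc_fraction(ID_BODY):.2f}")
```

Under these assumptions the three regions sum to a coding region of 152 nt, before the terminator-encoded U residues.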

I. Evolution of the BC1 RNA Gene

A. The BC1 RNA Gene Is Rodent-specific

The BC1 RNA is a major RNA species in rat, mouse, hamster, guinea pig (11), squirrel, and Peromyscus (D. Kass and P. Deininger, unpublished). Thus, it appears to exist in all rodent genomes. A small transcript with similar expression patterns, found in the primate brain (12), is related to a totally independent repeated DNA family specific for primates, and has no direct relationship with the rodent BC1 RNA gene (13). Other extensive hybridization experiments at the RNA level also failed to discover a related sequence in rabbit, bovine, or primates (14). Although the sequencing of the region orthologous to the BC1 locus would be necessary to demonstrate unambiguously the rodent specificity of the BC1 RNA gene, the existing experiments make a very strong case for the origin of this gene specifically in the rodent genome. Some investigators have suggested that the guinea pig should not be considered a rodent (15). However, the presence of BC1 RNA and of a clearly identifiable BC1 genomic locus in the guinea pig, but in no nonrodent species, makes a strong argument, along with other data, for the guinea pig's relationship to the rodents (14).

B. Origin of BC1 RNA

If the BC1 RNA gene originated early in rodent evolution, where did it come from? It almost certainly arose by evolution from a tRNA progenitor (16). Although there is some question about the specific tRNA species that gave rise to the BC1 RNA gene, it seems most likely that it was derived from a tRNAAla gene or pseudogene. Figure 2 presents a comparison of the BC1 RNA sequence to a mouse tRNAAla gene (17). The origin of SINEs from different tRNA genes has been reviewed (18, 19).

The presence of the A-rich region immediately adjacent to the 3' end of the region having homology to a mature tRNA transcript suggests that the tRNA gene copy that eventually gave rise to the BC1 RNA gene was generated through a retroposition process. However, although the BC1 RNA gene was one of the very first ID-related sequences in the rodent genome (10), it cannot be determined whether the BC1 RNA gene was directly derived from a tRNA gene, or whether there were one or more intermediate gene duplications in the process.

[Secondary-structure drawings, panels A-C, not reproduced.]

FIG. 2. The tRNA origin and RNA structure of the BC1 RNA. (A) The traditional tRNA cloverleaf structure of a mouse tRNAAla (dashes show the base-pairing). Post-transcriptional base modifications are not shown. The unpaired sequences at the 5' end of the transcript are from the gene sequence and are normally processed off in the mature tRNA. (B) A portion of the BC1 RNA placed into the same structure. Lowercase letters represent bases of the BC1 RNA that differ from the sequence of the tRNAAla. No gaps need be placed in the sequence to maintain this alignment. (C) An alternate, and much more stable, structure that can be drawn for the BC1 RNA. The arrows point to positions that have mutated, as shown for the specific ID subfamily sequences (subfamily shown in parentheses).

There is some similarity between the 5' end of the tRNA transcript that should be processed off the mature tRNA species, and the 5' end of the BC1 RNA gene. This region is too short to determine whether the similarity reflects a direct origin of the BC1 RNA gene from this sequence, chance, or some selective constraint on the initiation sequences of some RNA polymerase III-directed transcripts. It should be noted that this tRNA gene and the BC1 RNA gene match throughout their length without the need for any gaps to align their sequences. Thus, it seems likely that if this was not the particular tRNAAla gene that gave rise to the BC1 RNA gene, it was a closely related one.

However, if the BC1 RNA gene was created through a retroposition process, there are no longer any clear flanking direct repeats to serve as a typical hallmark of that process. These may have been lost, considering the age of the gene, and their absence would also mark it as much older than the majority of the ID copies. It is quite possible that the event giving rise to the original BC1 locus occurred prior to the divergence of rodents from the other mammalian orders. However, in this case, we must propose that it was lost from those other orders and specifically conserved in the rodents. As discussed in Section I,D, the BC1 RNA gene seems to be under selective pressure in rodents, consistent with the possibility of an older tRNA-derived pseudogene having obtained functional significance (20) only in the rodent lineage.
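The ungapped comparison described above, with differing bases shown in lowercase as in Fig. 2B, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name is ours, and the two input sequences are hypothetical toy strings, not the tRNAAla or BC1 sequences.

```python
def compare_ungapped(reference: str, query: str) -> tuple[str, float]:
    """Compare two equal-length sequences position by position.

    Returns the query with mismatched bases in lowercase (the convention
    used in Fig. 2B) together with the percent identity.
    """
    if not reference or len(reference) != len(query):
        raise ValueError("ungapped comparison needs non-empty, equal-length sequences")
    marked = []
    matches = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base == query_base:
            matches += 1
            marked.append(query_base)
        else:
            # A differing base is reported in lowercase.
            marked.append(query_base.lower())
    return "".join(marked), 100.0 * matches / len(reference)


if __name__ == "__main__":
    # HYPOTHETICAL toy sequences, for illustration only.
    marked, identity = compare_ungapped("ACGUACGU", "ACGAACGC")
    print(marked, identity)
```

Because no gaps are introduced, the comparison is a simple position-by-position scan; sequences that could only be aligned with insertions or deletions would need a gapped alignment method instead.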

C. Duplication of the BC1 RNA Gene in Guinea Pig

As discussed in Section II, the BC1 RNA gene is responsible for a large portion of the amplification of ID repeats via retroposition. However, we know that in the guinea pig the BC1 RNA gene and its flanking regions were duplicated at least once by some other mechanism, probably a DNA-mediated recombination mechanism (10). In contrast, retroposition events that generate an ID repeat derived from the BC1 RNA gene will only encode the ID portion of the repeat and a similar A-rich 3' region, not the unique portion of the BC1 RNA molecule, nor the flanking sequences. Thus, the ID copies will be missing any contribution of the BC1 3' unique region and the flanking regions to gene expression and potential function (see Section II,D). Both guinea pig BC1 RNA genes, having retained these regions, were thus more likely to be functional initially. The coding regions of the genes have been relatively well conserved, but the flanking regions have been subjected to extensive deletions and mutations. We do not know whether any of these changes have significantly changed the expression pattern or potential functions of these genes. However, having a duplicate gene available has important implications for the generation of new master genes and the potential divergence of one copy to a slightly different expression pattern or function.

D. Conservation of the BC1 RNA Gene

There are two important aspects of the conservation of the BC1 RNA gene. The first is the conservation of tRNA-like (or other RNA structure) features within the BC1 RNA gene; the second is the evidence that conservation provides regarding regions that may be under functional selection.

Figure 2 shows the relationship of the BC1 RNA gene to a tRNAAla gene. Although the sequence conservation is very strong, particularly in the 5' two-thirds of the RNA, many of the standard tRNA features have not been conserved. The anticodon and aminoacyl stems can no longer form a stable structure. On the other hand, the D-loop and pseudouridine (Ψ) loop stems are still structurally sound. The Ψ loop stem has apparently had at least one compensating mutation to maintain its structure. Caution is needed, however, because the A and B promoter elements associated with the D and Ψ loops, respectively, rather than RNA stability, may account for the conservation of these regions. Nevertheless, it is tempting to propose that the BC1 RNA has simply evolved to a somewhat modified structure. A much more stable possible structure for the BC1 RNA is shown in Fig. 2C. This structure has not been confirmed by biochemical studies. However, the relationship of the four ID subfamily mutations to this structure is quite interesting. The one base change in the Type 2 subfamily strengthens the base-pairing by changing a G·U to a G-C base-pair. The diagnostic mutation associated with the Type 3 subfamily would destabilize a base-pair near the loop region. Last, the two changes in the Type 4 subfamily would increase the base-pair stability in the same stem as the Type 2 mutation and would affect the same base-pair as the Type 3 mutation. Thus, it seems likely that RNA structural considerations have played an important role in the evolution of both the BC1 RNA gene and the specific subfamilies of ID repeats in rat.

Figure 3 shows the actual conservation between the rat BC1 locus and that of mouse, hamster, and guinea pig. In all cases, the RNA coding region is much better conserved than are the flanking sequences, with sequence identity decreasing with distance from the coding region. This is very strong evidence that functional selection is acting on the coding region. Surprisingly, when the coding sequences are analyzed in more detail (10), there is no obvious difference in conservation between the ID body of the gene and the A-rich or unique regions. This suggests that all portions of the RNA are subject to selection. The higher level of similarity in the immediate flanking sequences (Fig. 3) suggests that the flanking regions of the gene may also be under some selective constraints. These would almost certainly have to be selected because of effects on the expression of the gene. Analysis of the 5' flank of the rat BC1 gene shows a TATA-like sequence at position -28, which is also conserved in the other rodent genes. It seems likely that this and other conserved stretches play a role in the high levels of expression or tissue specificity of this gene (see Section IV).

[Bar chart not reproduced. Horizontal axis: position relative to the BC1 coding region (-200, -100, BC1 coding, 100, 200); vertical axis: percent identity; key: rat/mouse, rat/hamster, rat/guinea pig.]

FIG. 3. Conservation of the BC1 RNA locus. The top line presents a schematic of the BC1 RNA gene locus, with the shaded box representing the RNA-coding portion of the gene and the numbering representing bases either 5' flanking to the gene (negative numbers) or 3' flanking to the gene (positive numbers). The scale on the left represents the percent identity between the various regions of the locus shown, with the key representing the comparison of various rodent genomes with that of rat. Bars represent sequence similarity in the coding region or in 100-bp flanking segments. The black bars represent comparisons between rat/mouse; gray bars, between rat/hamster; and open boxes, between rat/guinea pig. The rat/guinea pig 5' and 3' flanking sequences represent identity of only the first 52 and 70 bases, respectively, because alignments beyond those points were difficult without excessive gaps. In all cases, the coding region has diverged significantly less than either flanking region.
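The structural argument about the ID subfamily mutations rests on whether a base change strengthens or weakens a base-pair, as in the Type 2 change from G·U to G-C. The sketch below makes that bookkeeping explicit. It is a deliberately coarse illustration: the three-level ranking (mismatch, wobble, Watson-Crick) and all names are assumptions of ours, and it ignores stacking and sequence context, which a real stability calculation would include.

```python
# Canonical pairs, stored as unordered sets of the two bases.
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}

# ASSUMPTION: a coarse ordinal ranking, not a free-energy model.
RANK = {"mismatch": 0, "wobble": 1, "watson-crick": 2}


def pair_type(base_a: str, base_b: str) -> str:
    """Classify an RNA base-pair as watson-crick, wobble, or mismatch."""
    pair = frozenset((base_a.upper(), base_b.upper()))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def effect_of_change(before: tuple[str, str], after: tuple[str, str]) -> str:
    """Report whether a change makes a pair stronger, weaker, or neither."""
    delta = RANK[pair_type(*after)] - RANK[pair_type(*before)]
    if delta > 0:
        return "stabilizing"
    if delta < 0:
        return "destabilizing"
    return "neutral"


if __name__ == "__main__":
    # The Type 2 subfamily change described in the text: G.U becomes G-C.
    print(effect_of_change(("G", "U"), ("G", "C")))
```

On this ranking a G·U to G-C change counts as stabilizing, while a change that breaks a Watson-Crick pair counts as destabilizing, matching the qualitative reasoning in the text.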

II. The BC1 RNA Gene As a Master Gene for ID Repeats

A. The BC1 RNA Gene Is a Master Gene

Two features suggest that the BC1 RNA gene might represent a master gene controlling ID family amplification. The most important is that the SINE amplification mechanism almost certainly requires an RNA intermediate (21). Thus, the very high levels of expression that can be generated from the BC1 RNA gene and, conversely, the relatively low levels of expression that must be coming from other ID loci make the BC1 RNA a likely intermediate in the amplification process. The presence of the BC1 RNA gene in all rodents, without the traditional direct repeats associated with retroposons, also suggests that it has the appropriate age to have founded this SINE family.

There are only about 200 ID sequences in the guinea pig genome. Thus, the BC1 RNA gene was among the first of the ID-related sequences. In fact, although the guinea pig genome has two BC1 RNA genes, it has the lowest copy number of ID repeats of any of the rodents examined (Section II,B). Analyses of guinea pig ID sequences show that some have diagnostic sequence differences specific to one of the BC1 RNA genes, and some to the other (10). This evolutionary pattern suggests that both copies of the guinea pig BC1 RNA gene have been able to make ID copies at some time during guinea pig evolution.

The mouse genome has about 10,000 copies of ID elements. Once again, the sequences of these ID elements closely reflect the sequence of the BC1 gene in mouse. This confirms that the BC1 RNA gene has controlled the evolution of the ID family of repeats and represents a master gene for ID amplification (10). A similar analysis of ID repeats and the BC1 locus from Peromyscus (D. Kass and P. Deininger) is consistent with this role of BC1 as a master gene of rodent ID amplification.

The dominance of the BC1 RNA gene as a master gene for ID elements in these rodents demonstrates the extremely low probability that a new ID insertion will be highly active at retroposition. Most such insertions are probably pseudogenes from the start (22), and any copies that are initially active will be silenced relatively quickly. It is very likely that the selective evolutionary constraints placed on the BC1 RNA gene have been important in maintaining its amplification potential throughout rodent evolution. This has allowed it to continue to make copies and therefore dominate the amplification process (23).

The relationship between BC1 and ID sequences in the rat is much more complicated. About 10,000 copies of the rat ID elements have sequences consistent with having been generated by a BC1 master gene. However, several newer subfamilies (see Section II,C) are inconsistent with amplifications using the BC1 RNA as the intermediate. Thus, although the BC1 RNA gene has dominated the evolution of ID family members in most rodent genomes, in the rat genome, other loci have also contributed significantly. This suggests that there may be one or more ID loci in the rat that became highly effective master genes.
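The master-gene inference above depends on diagnostic positions: sites at which candidate source genes differ, so that an ID copy can be attributed to whichever gene it matches there. The sketch below shows one simple way such an assignment could be scored. It is not the analysis used in the cited work; the function names are ours, the sequences are hypothetical toy strings, and all sequences are assumed to be pre-aligned and of equal length.

```python
def diagnostic_positions(masters: dict[str, str]) -> list[int]:
    """Return the aligned positions at which the master sequences differ."""
    seqs = list(masters.values())
    return [
        i for i in range(len(seqs[0]))
        if len({seq[i] for seq in seqs}) > 1
    ]


def assign_source(copy: str, masters: dict[str, str]) -> str:
    """Name the master gene that a copy matches at the most diagnostic sites.

    Returns "ambiguous" when two or more masters tie.
    """
    positions = diagnostic_positions(masters)
    scores = {
        name: sum(copy[i] == seq[i] for i in positions)
        for name, seq in masters.items()
    }
    best = max(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else "ambiguous"


if __name__ == "__main__":
    # HYPOTHETICAL toy data: two master genes differing at positions 3 and 7.
    masters = {"gene_1": "ACGUACGU", "gene_2": "ACGAACGC"}
    # A copy carrying an unrelated private mutation at position 0.
    print(assign_source("UCGUACGU", masters))
```

Private mutations that arose after insertion do not affect the assignment, because only positions that distinguish the candidate master genes are scored.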

B. Identifier-element Copy Numbers and Times of Amplification

There is tremendous variation in the copy number of ID repeats found within various rodent species (see Fig. 4). This ranges from a minimum of about 200 copies in the guinea pig to about 130,000 in the rat, with numbers around 2,000-10,000 found in mouse and hamster (24). This seems consistent with a steady increase in amplification rate in the lineage leading up to the rat species. However, the situation is much more complicated than this, with significant copy number differences found even within specific rodent families. The ID family members that have been analyzed in most species also appear to be quite homogeneous in sequence, suggesting very recent times of amplification. This is particularly true with the very high copy number of rat ID repeats (10, 24), but is also true for the mouse ID sequences studied. These two observations (the copy-number variation and the recent formation of most ID repeats) are most easily explained with a model in which the ID repeats had very little amplification capability early in rodent evolution and certain stochastic events have increased amplification rates at different times and in different species.

FIG. 4. Evolutionary relationship of the BC1 gene and ID repeats. The BC1 gene, represented as a heavy line, was founded early in the rodent lineage. Different mutations can be found in the modern BC1 gene in different species, as shown by the different geometric shapes on the BC1 gene. Two BC1 genes are present in guinea pig, with independent mutations. The lighter lines represent the ID elements present in those genomes, with the number representing the approximate copy number. Essentially the same diagnostic mutations are found in the ID elements as have occurred in the BC1 gene. One exception occurs in rat, where only the 10,000-copy Type 1 ID element matches the BC1 gene, and a series of subfamilies 2-4 have a successive series of newer diagnostic mutations.

C. Identifier-element Subfamilies

There are many ID sequences available in the rat database. Analysis of these sequences demonstrates that there are distinct subfamilies of ID sequences, similar to those seen for a number of other mammalian SINEs (22, 23). These subfamilies can also be arranged in a sequential manner, in which each subfamily sequence has one or more diagnostic positions relative to the previous one (10). These subfamilies also show progressively less sequence divergence, consistent with increasingly younger average sequence age. These younger subfamilies, termed Types 2-4, represent over 100,000 copies and show an average divergence of less than 3% from the consensus. This suggests extremely rapid and recent amplification. Thus, because these sequences are inconsistent with amplification with the BC1 RNA gene as the master gene, there must be one or more new master genes formed in rat that are even more efficient than the BC1 RNA gene. We believe that the major reason that rat has such a significantly higher copy number of ID repeats relative to other rodent genomes is the presence of additional master gene(s). We do not know whether such master genes were made through BC1 RNA gene-duplication events, such as seen for the guinea pig gene (10), or whether an ID element inserted into a favorable locus for transcription and for further amplifications. The possible influences and limits of such a site have been extensively reviewed (26a). However, as discussed in Section II,A, new highly active master genes for ID repeats have not been detected in other rodent genomes and seem to be a rare event for other SINEs (22). Thus, it seems that the chance formation of one or two new master genes in rat has been responsible for this rapid increase in amplification rate.

Although these subfamilies have been made over a relatively short evolutionary period, it is tempting to consider the possibility that the master loci for these subfamilies represent duplicates of the BC1 locus or ID copies that have adapted to a slightly different function than that of the BC1 RNA. This would allow such elements to be maintained by selection and perhaps they could adapt to a new expression pattern that allowed higher expression in the germ line, where sequence amplifications must occur. J. Kim, D. H. Kass and P. Deininger (26b) and others (7) have not detected any other major RNA species in rat cells that might represent the RNA intermediate for these newer subfamily copies. We cannot be sure that such transcripts do not exist in some specific germ-line cell type. We have detected a particular variant locus that represents the major form of RNA present in the BC2 fraction of ID transcripts (26b). This is a relatively divergent ID copy that does not seem to be actively making ID copies, but does continue to show a significant level of brain specificity in its expression. Because of the lack of detection of any major transcript(s) specific to the newer subfamilies, we believe it likely that the master gene(s) making these ID elements must be relatively more efficient than the BC1 RNA gene at other steps in the retroposition process.
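The two measurements used in the subfamily analysis above (percent divergence from a consensus, and assignment by diagnostic positions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the 20-nt consensus, the two diagnostic positions, and the base profiles are toy values, not real ID sequences or the diagnostic positions reported in Ref. 10, and the sequences are assumed to be already aligned to equal length.

```python
from typing import Dict, Sequence


def percent_divergence(seq: str, consensus: str) -> float:
    """Percentage of mismatched positions between two pre-aligned sequences."""
    if len(seq) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq, consensus))
    return 100.0 * mismatches / len(consensus)


def assign_subfamily(seq: str,
                     positions: Sequence[int],
                     profiles: Dict[str, str]) -> str:
    """Assign a copy to the subfamily whose diagnostic-base profile it matches.

    `positions` are 0-based indices of the diagnostic positions; `profiles`
    maps a subfamily name to the bases expected at those positions.
    """
    observed = "".join(seq[i] for i in positions)
    for name, profile in profiles.items():
        if observed == profile:
            return name
    return "unassigned"


# Hypothetical toy data (NOT real ID element sequences).
TOY_CONSENSUS = "GGGGTTGGGGATTTAGCTCA"
TOY_POSITIONS = (4, 11)
# Each later type keeps the earlier change and adds one more (sequential pattern).
TOY_PROFILES = {"Type 1": "TT", "Type 2": "CT", "Type 3": "CG"}

if __name__ == "__main__":
    toy_copies = [
        "GGGGTTGGGGATTTAGCTCA",
        "GGGGCTGGGGATTTAGCTCA",
        "GGGGCTGGGGAGTTAGCTCA",
    ]
    for copy in toy_copies:
        subfamily = assign_subfamily(copy, TOY_POSITIONS, TOY_PROFILES)
        divergence = percent_divergence(copy, TOY_CONSENSUS)
        print(subfamily, f"{divergence:.1f}%")
```

In a real analysis the divergence of each copy would be measured against its own subfamily consensus rather than a single founder sequence, so that the diagnostic changes themselves are not counted as random divergence.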

D. Mechanistic Considerations of the BC1/ID Master Gene

The finding that the BC1 RNA gene has served as a master gene for ID amplification and evolution demonstrates that this gene has a significant advantage in amplification capability relative to the many dispersed ID loci. It is obvious that the expression of the BC1 RNA is a prerequisite to its amplification. However, in other tissues, BC1 RNA expression is significantly reduced and the relative level of expression from other dispersed ID loci is more likely to be important. Thus, it seems likely that other factors, in addition to transcription, may also be important in selecting the active copies.

We find it likely that the 3' end of the BC1 RNA also plays a significant role in the amplification capability. One proposal for the remarkable amplification capabilities of SINEs was that the 3' terminal uridines on the RNA polymerase III transcripts might efficiently prime reverse transcription on the 3' A-rich region in an intramolecular reaction (26c) (see Figs. 1 and 5). We have tested the ability of BC1 RNA to undergo such an intramolecular priming event (M. R. Shen and P. Deininger, unpublished) and found that the RNA does undergo an extremely efficient self-priming reaction. However, this self-priming was found not to be a generalized priming on the A-rich region, but was instead found to involve a longer stretch of the 3' end of the RNA, forming a very specific hairpin structure at the extreme 3' end of the A-rich region (Fig. 5). It is likely that the templates for this self-priming are preferentially the subset of BC1 RNAs ending with only two U residues, as three or four U residues would result in mismatched bases near the 3' end. As each ID copy will have a different 3' end, depending on the site of integration, it is unlikely that most copies could undergo an efficient self-priming reaction. Although this finding does not demonstrate the use of self-priming in the authentic retroposition mechanism, the finding that a demonstrated master gene for amplification is able to carry out a self-priming reaction efficiently is strong circumstantial evidence for the importance of this process to SINE amplification in general.

The potential involvement of 3' terminal sequences in SINE amplifications also implicates several aspects of RNA stability in the efficiency of amplification. It is obvious that, if a germ-line RNA intermediate is needed in retroposition, a more stable RNA will build up to higher steady-state levels and therefore have a potential amplification advantage. Because the principal difference between the SINE transcripts that might form lies in the different 3' unique sequences they might contain, these sequences are the most likely to play a role in differential stability of ID transcripts. In addition, a number of SINE RNAs undergo a 3' processing or specific degradation reaction (27). If the 3' end of the RNA is removed in this way, the potential self-priming sequences will also be removed. The BC1 RNA is clearly very stable in some cells. We have studied both BC1 transcripts and ID transcripts in rat brain and testes and found little processing of the BC1 transcript, but extensive processing of other ID transcripts to forms with complete removal of the 3' sequences (26b). These studies suggest that the structure and stability of the BC1 RNA may be significant factors in its ability to serve as an ID master gene.

[Fig. 5 diagram labels: ID body; unique region; AAAAAAAAAAACAAGGT; 3' end.]

FIG. 5. Self-priming of BC1 RNA. The primary transcript shown in Fig. 1 is folded into a structure that would allow self-priming of reverse transcription. The putative reverse transcript is represented by the dotted line. The number of U residues at the 3' end is expected to vary somewhat.


III. Anatomical and Subcellular Distribution of BC1 RNA

BC1 RNA was discovered about 13 years ago (1-4) in rat brains as a small cytoplasmic RNA. Subsequently, similar small cytoplasmic RNAs were found at much lower levels in a broad range of other cell types (28). Extensive mapping of BC1 RNA expression in the adult rat brain has established that it is expressed in neurons but not in glial cells and, significantly, that it is located not only in neuronal somata but also in dendrites of neuronal subpopulations (29). Studies with acutely isolated neurons have clearly confirmed the neuron-specific expression and the somatodendritic location of this RNA (29). Although primate BC200 RNA is not a homologue of rodent BC1 RNA (12, 13), it is interesting to note that in the human nervous system its distribution is very similar to that of BC1 RNA, even on a subcellular level (30). The onset of BC1 expression in the developing rat brain has also been extensively studied; significantly, we found that the beginning of BC1 expression in several types of neurons coincided with periods of developmental synaptogenesis (V. Liu, J. Brosius and H. Tiedge, unpublished).

Using in situ hybridization techniques, the expression pattern of BC1 RNA in the adult rat nervous system was established with a probe that recognizes only BC1 RNA. Examples of these localizations are presented in Fig. 6. Strongly labeled were elements of the amygdaloid complex, including nuclei in the olfactory, medial, central, and basolateral amygdala, as well as the bed nucleus of the stria terminalis. Intense labeling was also observed in the septal nuclei; however, only moderate labeling was evident in the corpus striatum. The neocortex was labeled with medium intensity. A number of thalamic nuclei were strongly labeled, among them the paraventricular thalamic nucleus, the paratenial thalamic nucleus, and the medial habenular nucleus of the epithalamus. A similarly strong hybridization signal was observed in several hypothalamic nuclei, including the supraoptic nucleus, the paraventricular hypothalamic nucleus, the dorso- and ventromedial hypothalamic nuclei, and several preoptic nuclei. In the visual system, intense labeling was observed in the ventral lateral geniculate nucleus (the dorsal lateral geniculate nucleus was only moderately labeled) and in the superior colliculus, here especially in the zonal layer. Other strongly labeled midbrain areas include the inferior colliculus, in particular the dorsal cortex, and the central gray. In the cerebellum, BC1 labeling was low to moderate. White matter areas throughout the brain, such as the lateral olfactory tract, the optic nerve, the anterior and posterior commissure, corpus callosum, the internal capsule, the sensory root of the trigeminal nerve, and the pyramidal tract, showed little or no labeling. This indicates that BC1 RNA is expressed at low levels, if at all, in axons or glial cells. Likewise, no more than background labeling was detected in a number of nonneural tissues, including liver, lung, kidney, spleen, and skeletal and cardiac muscle. However, developing germ cells in male and female gonads were found to express BC1 RNA at appreciable levels (Z. Zakeri, J. Brosius and H. Tiedge, unpublished data). Germ-line expression of BC1 RNA supports the BC1 RNA gene as the founder of ID repetitive elements (see Section II,A).

BC1 RNA was found to be localized in the inner plexiform layer of the rodent retina. We then tested whether the BC1 labeling signal can be attributed to any particular type of neurite, in particular to differentiate between dendrites of ganglion cells and other neuritic processes in the inner plexiform layer. This area of the retina contains a dense neuritic plexus with synaptic contacts between axons of bipolar cells, dendrites of ganglion cells, and dendritelike processes of amacrine cells. Because these processes cannot be differentiated by light-microscope observation alone, we used an electric-lesion protocol to sever the optic nerve unilaterally shortly after birth. This procedure results in the eventual degeneration of retinal ganglion cells, including their dendritic trees. When we compared the BC1 labeling signal in the inner plexiform layer of a retina 6 weeks after the operation with the signal in the contralateral control eye, we found a significant reduction of the grain density. The signal remaining in the inner plexiform layer after transection of the optic nerve may be attributable to dendritic processes of amacrine cells (31).

We have recently found the only exception (thus far) to the somatodendritic location of BC1 RNA in neurons: BC1 RNA is axonally transported from magnocellular hypothalamic neurons to neurosecretory nerve endings in the posterior lobe of the rat pituitary (32). Recently, axonal messenger RNAs have also been identified in the pituitary. They include mRNAs for oxytocin, vasopressin, dynorphin, and neurofilament (33-38).

FIG. 6. Location of BC1 RNA and GAP-43 mRNA in acutely isolated spinal cord neurons. Spinal cord neurons were isolated as described in Ref. 62. Epiluminescence micrographs (B, D-F, H) show the location of autoradiographic silver grains over individual neurons. B, D, F, and H show single neurons; E shows a group of neurons. Phase contrast micrographs (A, C, G), corresponding to epiluminescence micrographs B, D, and H, show the nerve cells with their processes. Overexposure (exposure times of > 8 weeks) of neurons hybridized with the probe complementary to GAP-43 mRNA produced little or no specific labeling of neurites, although it resulted in heavy labeling of neuronal perikarya and in higher levels of unspecific background labeling. The respective "sense strand" control probes (BC1 and GAP-43) failed to produce any specific labeling of acutely isolated cells (data not shown). Cells were counterstained with cresyl violet and methylene blue. Magnification, 240×. From Ref. 29 with permission.

IV. Transcriptional Regulation of the Rat BC1 RNA Gene

On its discovery, rodent brain cytoplasmic BC1 RNA was thought to be a transcription product from many ID repetitive elements (1-4). This belief was, in part, based on the presence of internal RNA polymerase III promoter elements in the ID elements, at that time thought to be necessary and sufficient for all genes transcribed by RNA polymerase III. Later, it was shown that BC1 RNA is a homogeneous RNA transcribed from a single gene (7) and that most of the ID repetitive elements are transcriptionally silent and only found in transcripts when located on larger hnRNAs or mRNAs (5).

The notion that BC1 RNA has been recruited (or exapted; see Ref. 39a) into a function and is not an RNA product that is fortuitously expressed in a few rodent species is furthermore supported by its cell-type-specific transcription. The prevalent expression of BC1 RNA in the nervous tissue of rodents (apart from lower level expression in reproductive organs; Section II,D) occurs in both sciurognathid rodents and guinea pig (14). As tissue-specific expression patterns are thus identical in both rodent suborders (Sciurognathi and Hystricognathi), its transcriptional regulation is also a conserved feature prevailing for about 55 million years. Our in vitro studies indicate that there are several control elements within the gene and in the 5' flanking sequences (39b). Most of these elements, as shown by alterations on an individual basis, are necessary for efficient transcription. Thus, the persistence of the nerve-cell-specific transcription pattern of BC1 RNA for about 55 million years cannot merely be explained away by the presence of a “robust promoter.”

From analysis of the genomic structure, we identified several putative elements that previously had been shown to be necessary for RNA polymerase III transcription of various small RNA genes. Apart from the typical internal promoter elements, referred to as box A and box B, we detected octamer transcription-factor-binding sequences, a proximal sequence element (PSE, -53), and a TATA box (-27) upstream from the gene (Fig. 7). The latter three elements are also present upstream from the genes for 7SK RNA and U6 snRNA (40, 41) and, interestingly, they are not only necessary but also sufficient for transcription by RNA polymerase III. It was expected, therefore, that a subcloned SacI-MaeII fragment (pKK 415-1), spanning positions -429 to -4 relative to the BC1 RNA coding region, would analogously support RNA polymerase III transcription when used as a template for in vitro transcription in a HeLa cell extract (a gift from S. Murphy and R. Roeder). Surprisingly, no transcription was observed using the upstream region alone in a HeLa cell extract (Fig. 8A, lane 1) or in a rat brain extract at various conditions (not shown). Furthermore, the BC1 upstream region could not functionally replace the 7SK gene upstream sequences (Fig. 8A, lane 4; Fig. 8B, lane 8) whereas, conversely, the 7SK upstream region was active when fused to the BC1 gene (Fig. 8A, lane 5). Unlike with U6 or 7SK genes, the corresponding region from the BC1 gene is therefore not sufficient for transcription.

The above results prompted us to test whether the internal promoter regions (box A and box B) were important for in vitro transcription of the BC1 RNA gene using the homologous rat brain extract. Deletions of either box A or B alone, or in combination, virtually eliminated transcription (Fig. 8B, lanes 2-4). However, as is the case for the upstream promoter elements, the internal RNA polymerase III transcription elements are also not sufficient for transcription by themselves. Unlike many studied tRNA genes, the upstream region of the BC1 gene is clearly important, because a fragment corresponding to the coding region only (with box A + B intact) is not transcribed in vitro (Fig. 8C, lane 2). To further delineate which combinations of regulatory elements are necessary using the homologous rat brain extract in vitro, nested deletions of the 5' flanking region were generated. It could be demonstrated that the octamer sequences are not necessary in vitro, but that the PSE (to a lesser extent) and the TATA box are important for efficient transcription in vitro (Fig. 8C).

In order to study the effect of various control elements on cell-type-specific BC1 RNA expression, we are currently using various rat BC1 RNA gene constructs in transgenic mice. In these in vivo experiments, we find that transcription efficiency strongly requires the presence of upstream sequences that include both octamers (42). This demonstrates that at least some of the elements found upstream from the BC1 RNA gene strongly modulate its expression in vitro. This fact is also supported by the following experiment. As shown in Fig. 8B, lane 5, a tRNALeu gene (43) is only weakly transcribed in vitro. When the upstream sequence of the tRNA gene was replaced with that of the BC1 gene, a strong enhancement of transcription of the tRNALeu gene was observed (Fig. 8B, lane 6).

The BC1 gene (PSE, TATA, and box A + B necessary) belongs to the class of RNA polymerase III genes that shares elements with RNA polymerase II genes. However, it must be grouped into yet another subclass because it differs from the 7SK RNA and U6 RNA genes (no internal elements necessary; see Ref. 44) and the selenocysteine tRNA(Ser)Sec gene (PSE, TATA, and box B necessary; see Ref. 45) in that it requires, in addition to the upstream elements, at least the internal elements (box A and box B). Our results in this in vitro analysis are also consistent with the observation that ID repetitive elements per se are transcriptionally silent, as the retroposition process will not carry these important flanking sequences to the new insertion locus. In addition to the above identified elements (TATA box, internal box A, and box B) that are vital for BC1 RNA transcription in vitro, we expect additional promoter elements to be present that are responsible for the developmental and nerve-cell-specific RNA polymerase III transcription of the BC1 RNA gene. We are currently using transgenic mouse models to identify such element(s).

FIG. 7. Map of upstream regulatory region of the BC1 RNA gene. The 433-bp segment between the SacI site and the 5' end of the gene (angled bar) is shown enlarged. The putative octamer transcription factor binding sites (OCTA), the proximal sequence element (PSE), and the TATA box are indicated. The positions that correspond to deletion points (deletions starting upstream) are marked above the enlarged map portion.

FIG. 8. In vitro transcription of the BC1 RNA gene. The transcripts, radiolabeled with [32P]GTP (800 Ci/mmol), were separated on 6% acrylamide, 0.3% (bis)acrylamide gels containing 7 M urea. After drying, the gels were exposed for about 12 hours with an intensifier screen. (A) HeLa cell extract was used for the following plasmid (p) templates (concentrations [mg/ml] in the reactions are given in brackets): (1) pKK415-1 [5]; (2) pBC1:KS [5]; (3) pBC1:SK [5]; (4) pBC1/7SK [10]; (5) p7SK/BC1 [10]; (6) p7SK [20]; (7) p7SK(pUC) [20]; (8) pBluescript KS [20]; (9) pBluescript SK [20]. pBC1 plasmids contain the entire 1453-bp SmI-BamHI fragment (see Fig. 7) in either orientation. pBC1/7SK and p7SK/BC1 are hybrid genes with swapped regulatory regions. (B) Rat-brain whole-cell extract with the following templates (all 5 mg/ml): (1) pBC1:KS; (2) pBC1:-Abox; (3) pBC1:-Bbox; (4) pBC1:-A/Bbox; (5) ptRNA:XP; (6) pBC1/tRNA; (7) p7SK; (8) pBC1/7SK; (9) p7SL (see also text for template descriptions). (C) Rat-brain whole-cell extract with the following deletion (see Fig. 7) templates (all 5 mg/ml): (1) pBC1:KS; (2) pBC1:0; (3) pBC1:-17; (4) pBC1:-33; (5) pBC1:-53; (6) pBC1:-73; (7) pBC1:-97; (8) pBC1:-129; (9) pBC1:-173; (0) pBC1:-186; (A) pBC1:-273; (B) pBC1:-313.
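The promoter classification described above can be summarized as a small lookup table. The following Python sketch is only a bookkeeping aid under stated assumptions: the element names (`OCTA`, `PSE`, `TATA`, `boxA`, `boxB`) and the function `predicted_active` are labels invented here, and the all-or-nothing rule is a deliberate simplification of graded in vitro results (for example, the PSE is reported as important only to a lesser extent).

```python
from typing import Dict, FrozenSet, Iterable

# Elements stated in the text as necessary for each gene class.
# Names are illustrative labels, not identifiers from any database.
REQUIRED_ELEMENTS: Dict[str, FrozenSet[str]] = {
    "7SK/U6": frozenset({"OCTA", "PSE", "TATA"}),          # upstream elements only
    "tRNA(Ser)Sec": frozenset({"PSE", "TATA", "boxB"}),
    "BC1": frozenset({"PSE", "TATA", "boxA", "boxB"}),
}


def predicted_active(gene_class: str, present: Iterable[str]) -> bool:
    """Return True if every element required by the class is present.

    Simplified boolean model: real transcription levels are graded.
    """
    return REQUIRED_ELEMENTS[gene_class] <= frozenset(present)


if __name__ == "__main__":
    full_bc1 = {"OCTA", "PSE", "TATA", "boxA", "boxB"}
    upstream_only = {"OCTA", "PSE", "TATA"}   # like the upstream fragment alone
    coding_only = {"boxA", "boxB"}            # like a retroposed ID copy
    print(predicted_active("BC1", full_bc1))
    print(predicted_active("BC1", upstream_only))
    print(predicted_active("BC1", coding_only))
    print(predicted_active("7SK/U6", upstream_only))
```

Under this simplification, a dispersed ID copy that carries only box A and box B is predicted to be silent, which matches the argument that retroposition does not carry the upstream flanking elements to the new locus.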


V. Speculations on BC1 RNA Function

The concept of local protein synthesis in dendrites has received increasing experimental support in recent years (46). Polyribosomes are located beneath synaptic sites, most prominently at the base of dendritic spines, in dentate granule cells of the hippocampus (47, 48). It has also been demonstrated that RNA is actively transported into dendrites but not into axons of hippocampal neurons in culture (49, 50). Consistent with this, mRNAs for a limited number of dendritic proteins have recently been detected in dendrites (most mRNAs, whether they encode dendritic proteins or other components of nerve cells, are restricted to the cell body). One of the dendritic mRNAs codes for the large form of microtubule-associated protein 2 (MAP2; see Ref. 51). The large MAP2 is a tubulin-binding protein specifically associated with the dendritic cytoskeleton (52). Another dendritic mRNA encodes the α-subunit of Ca2+/calmodulin-dependent protein kinase type II (CaM-KII; see Ref. 53). CaM-KII is found at high concentrations in postsynaptic densities and has been implicated in signal transduction mechanisms and in the induction of long-term potentiation (54). Furthermore, the mRNA for the type I inositol 1,4,5-trisphosphate receptor has been detected at substantial levels in Purkinje cell dendrites in mice (55). Recent reports demonstrating active protein biosynthesis in a preparation of dendrites isolated from cultured hippocampal neurons or in a preparation containing synaptosomes (56, 57) strongly emphasize the importance of specialized protein synthetic machinery in postsynaptic domains of dendrites. Such a mechanism would enable neurons to synthesize selected dendritic proteins locally, close to the respective postsynaptic sites where they are required. This would facilitate a decentralized and more flexible regulation of protein repertoires in postsynaptic domains, for example, in response to local synaptic stimuli.

A precondition for localized protein synthesis is that components of the translation apparatus are also localized within the same subcellular compartment. In addition, for temporal and conditional regulation of this process, special mechanisms are required to prevent constitutive translation. Our current working hypothesis is that rodent BC1 RNA may (as ribonucleoprotein complexes; see Refs. 58-60) regulate translation in postsynaptic compartments. Synapses target both dendrites and cell bodies. Location of the small RNAs (such as BC1 RNA) both in dendrites and in cell bodies is therefore consistent with our hypothesis. BC1 RNA is derived from tRNAAla (16). Thus its ancestry supports our hypothesis that the RNA molecule may be involved in regulatory aspects of dendritic protein biosynthesis, possibly before or during phases of translation.

BC1 RNA, as an example of a recent gene duplication yielding an RNA species with novel distributions and potentially novel function, demonstrates that RNA molecules are not merely remnants or fossils from the RNA world. On the contrary, just as with proteins, new RNA species can be generated at any time during evolution (20, 39a). Another concept is emerging from our studies. In the past, retroposition has been thought to produce mainly “junk DNA” in the form of retropseudogenes and middle repetitive sequences, but it now seems likely that these mechanisms can occasionally give rise to novel genes or regulatory elements (20, 39a). Our work suggests that variants of existing RNAs have been co-opted into specialized functions by the evolving nervous system, just as variant proteins have been. Although many molecules that are important for nerve cell function are quite ancient (kinases, phosphatases, receptors, channels), other neuronal-specific molecules, such as microtubule-associated protein or growth-associated protein (GAP-43), have so far not been detected in invertebrates. The young age (on an evolutionary scale) of BC1 suggests two possible scenarios. Either these RNA molecules have been recruited into an existing functional protein or RNP complex to enhance efficiency, or these RNPs play a role that is entirely novel to nerve cells. This modification or novel role has become indispensable and is now under selective pressure. Although this hypothetical function may not be essential for all nerve cells from invertebrates to primates, it is tempting to consider that nervous systems and some of their neurons must have undergone significant changes and “improvements,” even over the last tens of millions of years, which hardly could have been achieved without recruitment of “novel” macromolecules from the existing repertoire in the genome.

The elucidation of the neural function for BC1 RNA is of particular interest, because (i) it constitutes the first neuron-specific nonmessenger RNA, and (ii) it exhibits an unusual subcellular distribution. In addition, knowledge of BC1 RNA function will foster studies on the more recent evolution of the nervous system in mammals. This may lead to recognition of parallels between the evolutionary appearance of this novel RNA and structural and/or functional features of the expressing neurons. As Arbas, Meinertzhagen and Shaw (61) have stated in their chapter on evolution in nervous systems, “Evolution is the unifying theme of biological thought. It is therefore surprising that until recently it has little shaped the ideas of those who have sought principles among the cells and circuits of nervous systems.”

Abbreviations

BC1 RNA         major small, discrete brain cytoplasmic RNA species related to ID elements in rodents
BC2 RNA         less abundant ID element-related RNA species that is smaller and more heterogeneous than BC1 RNA
BC200 RNA       small (200-base) brain cytoplasmic RNA related to Alu elements in primates
SINE            Short INterspersed repetitive Element in DNA
ID element      a SINE family found in rodents, termed “identifier elements,” initially thought to mark brain-specific genes
D-loop          dihydrouridine loop
Ψ-loop          pseudouridine loop
PSE             proximal sequence element
snRNA           small nuclear RNA
tRNA(Ser)Sec    selenocysteine transfer RNA
MAP2            microtubule-associated protein 2
CaM-KII         Ca2+/calmodulin-dependent protein kinase type II
GAP-43          a 43-kDa growth-associated protein
OCTA            octamer transcription factor binding site

REFERENCES

1. J. G. Sutcliffe, R. J. Milner, F. E. Bloom and R. A. Lerner, PNAS 79, 4942 (1982).
2. J. G. Sutcliffe, R. J. Milner, J. M. Gottesfeld and W. Reynolds, Science 225, 1308 (1984).
3. J. G. Sutcliffe, R. J. Milner, J. M. Gottesfeld and R. A. Lerner, Nature 308, 237 (1984).
4. R. J. Milner, F. E. Bloom, C. Lai, R. A. Lerner and J. G. Sutcliffe, PNAS 81, 713 (1984).
5. G. P. Owens, N. Chaudhari and W. E. Hahn, Science 229, 1263 (1985).
6. P. L. Deininger, in “Mobile DNA: SINES Short Interspersed Repeated DNA Elements in Higher Eucaryotes” (M. Howe and D. Berg, eds.), p. 619. American Society for Microbiology, Washington, D.C., 1989.
7. T. M. DeChiara and J. Brosius, PNAS 84, 2624 (1987).
8. J. A. Martignetti, Ph.D. Thesis. The Mount Sinai School of Medicine, City University of New York, New York, 1992.
10. J. Kim, J. A. Martignetti, M. R. Shen, J. Brosius and P. Deininger, PNAS 91, 3607 (1994).
11. K. Anzai, S. Kobayashi, Y. Suehiro and S. Goto, Mol. Brain Res. 2, 43 (1987).
12. J. B. Watson and J. G. Sutcliffe, MCB 7, 3324 (1987).
13. J. A. Martignetti and J. Brosius, PNAS 90, 11,563 (1993).
14. J. A. Martignetti and J. Brosius, PNAS 90, 9698 (1993).
15. D. Graur, W. A. Hide and W.-H. Li, Nature 351, 649 (1991).
16. G. R. Daniels and P. L. Deininger, Nature 317, 819 (1985).
17. T. Russo, F. Constanzo, A. Oliva, R. Ammendola, A. Duilio, F. Eposito and F. Cimino, EJB 158, 437 (1986).
18. P. L. Deininger and G. R. Daniels, Trends Genet. 2, 76 (1986).
19. N. Okada and K. Ohshima, J. Mol. Evol. 37, 167 (1993).
20. J. Brosius, Science 251, 753 (1991).
21. A. M. Weiner, P. L. Deininger and A. Efstratiadis, ARB 55, 631 (1986).
22. P. L. Deininger, M. A. Batzer, C. A. Hutchison, III and M. H. Edgell, Trends Genet. 8, 307 (1992).
23. P. L. Deininger and M. A. Batzer, in “Evolutionary Biology: Evolution of Retroposon” (M. H. Hecht, R. J. MacIntyre and M. T. Clegg, eds.), Vol. 27, p. 157. Plenum, New York, 1993.
24. C. Sapienza and B. St.-Jacques, Nature 319, 418 (1986).
26a. C. Schmid and R. Maraia, Curr. Opin. Genet. Dev. 2, 874 (1992).
26b. J. Kim, D. H. Kass and P. L. Deininger, NARes 23, 2245 (1995).
26c. P. Jagadeeswaran, B. G. Forget and S. M. Weissman, Cell 26, 141 (1981).
27. R. J. Maraia, NARes 19, 5695 (1991).
28. R. D. McKinnon, P. Danielson, M. A. D. Brow, F. E. Bloom and J. G. Sutcliffe, MCBiol 7, 2148 (1987).
29. H. Tiedge, R. T. Fremeau, Jr., P. H. Weinstock, O. Arancio and J. Brosius, PNAS 88, 2093 (1991).
30. H. Tiedge, W. Chen and J. Brosius, J. Neurosci. 13, 2382 (1993).
31. H. Tiedge, U. C. Drager and J. Brosius, Neurosci. Lett. 141, 136 (1992).
32. H. Tiedge, A. Zhou, N. Thorn and J. Brosius, J. Neurosci. 13, 4214 (1993).
33. D. Murphy, A. Levy, S. Lightman and D. Carter, PNAS 86, 9002 (1989).
34. G. F. Jirikowski, P. P. Sanna and F. E. Bloom, PNAS 87, 7400 (1990).
35. E. Lehman, J. Hanze, M. Pauschinger, D. Ganten and R. E. Lang, Neurosci. Lett. 111, 170 (1990).
36. J. T. McCabe, E. Lehman, N. Chastrette, J. Hanze, R. E. Lang, D. Ganten and D. W. Pfaff, Mol. Brain Res. 8, 325 (1990).
37. E. Mohr, A. Zhou, N. A. Thorn and D. Richter, FEBS Lett. 263, 332 (1990).
38. E. Mohr and D. Richter, Eur. J. Neurosci. 4, 870 (1992).
39a. J. Brosius and S. J. Gould, PNAS 89, 10,706 (1992).
39b. J. Martignetti and J. Brosius, MCBiol 15, 642 (1995).
40. S. Murphy, C. Di Liegro and M. Melli, Cell 51, 81 (1987).
41. P. Carbon, S. Murgo, J.-P. Ebel, A. Krol, G. Tebb and I. W. Mattaj, Cell 51, 71 (1987).
42. W. Chen, Ph.D. Thesis. City University of New York, New York, 1994.
43. D. R. Makowski, R. A. Haas, K. P. Dolan and D. Grunberger, NARes 11, 8609 (1983).
44. S. Murphy, B. Moorefield and T. Pieler, Trends Genet. 5, 122 (1989).
45. P. Carbon and A. Krol, EMBO J. 10, 599 (1991).
46. O. Steward and G. A. Banker, Trends Neurosci. 15, 180 (1992).
47. O. Steward and W. B. Levy, J. Neurosci. 2, 284 (1982).
48. O. Steward and T. M. Reeves, J. Neurosci. 8, 176 (1988).
49. L. Davis, G. Banker and O. Steward, Nature 330, 447 (1987).
50. L. Davis, B. Burger, G. Banker and O. Steward, J. Neurosci. 10, 3056 (1990).
51. C. C. Garner, R. P. Tucker and A. Matus, Nature 336, 674 (1988).
52. D. W. Cleveland, Cell 60, 701 (1990).
53. K. E. Burgin, M. N. Waxham, S. Rickling, S. A. Westgate, W. C. Mobley and P. T. Kelly, J. Neurosci. 10, 1788 (1990).
54. M. B. Kennedy, Cell 59, 777 (1989).
55. T. Furuichi, D. Simon-Chazottes, I. Fujino, N. Yamada, M. Hasegawa, A. Miyawaki, S. Yoshikawa, J.-L. Guénet and K. Mikoshiba, Receptors and Channels 1, 1124 (1993).
56. E. R. Torre and O. Steward, J. Neurosci. 12, 762 (1991).
57. A. Rao and O. Steward, J. Neurosci. 11, 2881 (1991).
58. S. Kobayashi, S. Goto and K. Anzai, JBC 266, 4726 (1991).
59. S. Kobayashi, N. Higashi, K. Susuki, S. Goto, K. Yumoto and K. Anzai, JBC 267, 18,291 (1992).
60. J. G. Cheng, H. Tiedge and J. Brosius, Soc. Neurosci. Abstr. 18, 624 (1992).
61. E. A. Arbas, I. A. Meinertzhagen and S. R. Shaw, Annu. Rev. Neurosci. 14, 9 (1991).
62. K. Murase, P. Ryu and M. Rudic, Neurosci. Lett. 103, 56 (1989).

Nutritional and Hormonal Regulation of Expression of the Gene for Malic Enzyme¹

ALAN G. GOODRIDGE,² STEPHEN A. KLAUTKY, DOMINIC A. FANTOZZI, REBECCA A. BAILLIE, DEAN W. HODNETT, WEIZU CHEN, DEBBIE C. THURMOND, GANG XU AND CESAR RONCERO

Department of Biochemistry
University of Iowa
Iowa City, Iowa 52242

I. Nutritional State Regulates Fatty-acid Synthesis and Activities of Lipogenic Enzymes
II. The Animal Model
III. Physiological Mechanisms
IV. Molecular Mechanisms
V. Mechanisms for Regulating Transcription
VI. Chromatin Structure
VII. cis-Acting Elements in the Malic-enzyme Gene
VIII. Summary
References

¹ Abbreviations: T3, 3,3′,5-triiodo-L-thyronine; IGF-1, insulin-like growth factor-1; CAT, chloramphenicol acetyltransferase; CMV, cytomegalovirus; RSV, Rous sarcoma virus; LTR, long terminal repeat; HSV-TK, herpes simplex virus thymidine kinase; CPT, chlorophenylthio; HNF4, hepatic nuclear factor 4; MLTF, major late transcription factor; CREB, cyclic-AMP response-element binding protein; T3RE, T3 response element.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Nutritional State Regulates Fatty-acid Synthesis and Activities of Lipogenic Enzymes

The de novo synthesis of long-chain fatty acids is high in well-fed animals, especially if their diets contain high percentages of carbohydrate, and is low in starved animals and in those fed diets with a low percentage of carbohydrate (1, 2). Similarly, activities of the “lipogenic” enzymes are high in animals on high-carbohydrate diets and low in starved animals or those on low-carbohydrate diets (1, 2). We have concentrated on two lipogenic enzymes, malic enzyme and fatty-acid synthase. We work mainly with liver because this organ is the primary anatomic site for the de novo synthesis of fatty acids in birds (3-6). Malic enzyme [L-malate-NADP+ oxidoreductase (decarboxylating), EC 1.1.1.40] catalyzes the oxidative decarboxylation of malate to pyruvate, simultaneously generating NADPH from NADP+. Fatty-acid synthase (EC 2.3.1.85) is a multifunctional polypeptide that catalyzes the final reactions in the synthesis of long-chain fatty acids. Starting with a primer of one molecule of acetyl-CoA, the enzyme catalyzes condensation with one molecule of malonyl-CoA, producing a compound lengthened by two carbons plus a molecule of CO₂. The lengthened chain is then reduced with two molecules of NADPH and dehydrated. This process is repeated seven times, thus producing a molecule of the 16-carbon saturated fatty acid, palmitate, seven molecules of CO₂, eight molecules of CoA, and 14 molecules of NADP+. In chicken liver, virtually all of the 14 molecules of NADPH required for this reaction are furnished by the reaction catalyzed by malic enzyme (3, 6). In this essay, we concentrate on malic enzyme.
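The stoichiometry described above can be tallied with a short sketch. The function name `palmitate_balance` and its dictionary keys are illustrative assumptions, not terminology from the text; the counts follow only from the cycle description given here (one acetyl-CoA primer, then one malonyl-CoA and two NADPH per elongation cycle).

```python
def palmitate_balance(cycles: int = 7) -> dict:
    """Tally substrates and products for repeated fatty-acid synthase cycles.

    Assumes, as stated in the text, one acetyl-CoA primer (2 carbons) and,
    per cycle, one malonyl-CoA condensed (chain +2 carbons, 1 CO2 released,
    1 CoA released) followed by reduction with 2 NADPH.
    """
    return {
        "acetyl_CoA_used": 1,
        "malonyl_CoA_used": cycles,
        "NADPH_used": 2 * cycles,
        "chain_carbons": 2 + 2 * cycles,  # primer plus two carbons per cycle
        "CO2_released": cycles,
        "CoA_released": 1 + cycles,       # from the primer and each malonyl-CoA
        "NADP_plus_formed": 2 * cycles,
    }


balance = palmitate_balance()
print(balance)
```

With the default of seven cycles, the tally reproduces the figures quoted in the paragraph: a 16-carbon chain, seven CO₂, eight CoA, and 14 NADPH consumed.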

II. The Animal Model

Our objectives are to understand the physiologic and molecular mechanisms by which nutritional state regulates hepatic fatty-acid synthesis. An ideal system for this analysis would display a low basal rate of fatty-acid synthesis in the starved state and a high rate in the fed state. Unfortunately, enzyme activity or enzyme concentration decreases slowly, and starved animals do not survive long enough to achieve the basal state. We circumvented this problem by using unfed, newly hatched chicks as our model. The embryonic chick develops in a low-carbohydrate, high-fat environment; the rate of fatty-acid synthesis and the activities of the lipogenic enzymes are low in the liver and adipose tissue (5-6). Chicks feed on a mash diet high in carbohydrate and low in fat almost immediately after they hatch. Furthermore, newly hatched birds grow rapidly, depositing most of the stored calories as fat. Selected meat chickens can grow from 50 g at hatching to 2 kg in 7 weeks and contain more than 85% of calories as triacylglycerol. When hatched chicks are fed, the rate of fatty-acid synthesis increases rapidly to 500 to 1000 times that in unfed chicks (5). Concomitantly, malic-enzyme activity increases 70-fold (Fig. 1) (6). There is little or no fatty-acid synthesis or lipogenic enzyme activity in adipose tissue of fed chicks (5, 6). Both the increased rate of hepatic fatty-acid synthesis and malic-enzyme activity are regulated by feeding, not by developmental state, because both processes were inhibited when food was withheld from newly hatched chicks or withdrawn from fed chicks (7). Thus, regulation of hepatic enzyme activity in the newly hatched chick is similar physiologically to that observed in adult animals undergoing the transition between the starved and fed states.

[Figure 1. Left panel: FATTY ACID SYNTHESIS; right panel: MALIC ENZYME; horizontal axis: AGE (DAYS).]

FIG. 1. Correlation between fatty-acid synthesis from glucose and malic-enzyme activity in the liver of neonatal chicks. Left panel: incorporation of [U-¹⁴C]glucose into total fatty acids in liver slices. Right panel: total activity of hepatic malic enzyme. Results are for normally fed birds (○-○), starved birds (○···○), and birds refed after 2 or 3 days of starvation (×---×). Each point represents the average of 4 to 11 experiments (from Ref. 24, with permission). TPN (NADP), Triphosphopyridine nucleotide (nicotinamide adenine dinucleotide phosphate).

III. Physiological Mechanisms

A. Insulin and Glucagon

Macronutrients in the diet, or the products of their digestion in the gut, regulate the secretion of hormones that, in turn, regulate metabolic function in the liver and other organs. One of our goals has been to identify the humoral factors that regulate hepatic malic-enzyme activity during the transition between the starved and fed states. Insulin stimulates, and glucagon inhibits, fatty-acid synthesis both in vivo and in isolated liver preparations. Furthermore, administration of insulin in vivo causes an increase in the activity of malic enzyme, and glucagon blocks the increase in malic-enzyme activity caused by refeeding starved animals (1, 2). Consistent with roles in the regulation of fatty-acid synthesis, plasma insulin levels are elevated in fed animals and lowered in starved ones. The opposite pattern is true for glucagon (8-13). Thus, insulin and glucagon are candidates to communicate the state of alimentation of the whole animal to its liver.

B. Thyroid Hormone

The activity of hepatic malic enzyme and hepatic lipogenesis are elevated in hyperthyroid animals and decreased in hypothyroid animals (14, 15). Plasma levels of T3, the active form of thyroid hormone, are increased by feeding and decreased by starvation (13, 16, 17). Thus, T3 is also a candidate to mediate the effects of diet on malic-enzyme activity.

C. Unesterified Long-chain Fatty Acids

Hormones may not be the only agents that regulate malic-enzyme activity in the liver. The blood levels of several metabolic fuels are regulated by dietary state and are potential candidates for regulating fatty-acid synthesis in the liver. For example, the concentration of unesterified fatty acids in the blood is increased by starvation and decreased by feeding. Unesterified long-chain fatty acids inhibit the rate of fatty-acid synthesis in isolated hepatocytes (18). Furthermore, long-chain fatty-acyl-CoAs, which are direct metabolites of unesterified fatty acids, inhibit the activity of acetyl-CoA carboxylase, the probable rate-limiting enzyme for the de novo synthesis of long-chain fatty acids (19-21). Thus, unesterified fatty acids also are potential regulators of hepatic malic-enzyme activity.

D. Development of a Cell-culture System

When we began our studies, the direct effects of insulin, glucagon, T3, or long-chain fatty acids on malic-enzyme activity in hepatocytes had not been analyzed. The increase in malic-enzyme activity is not maximal for several days after feeding is initiated. Preparations of hepatocytes then in use survived only a few hours, so the direct effects of humoral agents on hepatic malic-enzyme activity could not be tested. With the technical advice of a colleague, B. P. Schimmer of the Banting and Best Department of Medical Research, University of Toronto, we developed a tissue-culture system for chick embryo hepatocytes in which the direct effects of hormones and fuels could be tested. Initially, these studies utilized a medium enriched with serum, but later we switched to a chemically defined but highly enriched medium, Waymouth MD 705 (22). With or without serum in the medium, T3 and insulin increased and glucagon inhibited the rate of fatty-acid synthesis and the activity of malic enzyme (Table I) (23-25). Stearate inhibited the increases in fatty-acid synthesis and malic-enzyme activity in cells incubated with serum- and albumin-supplemented medium (23). Rapid metabolism of unesterified long-chain fatty acids in hepatocytes made it difficult to analyze their effects. In subsequent experiments, described later (Section V, C, 2) in this essay, transcription of the malic-enzyme gene was assayed during short incubations and was inhibited by unesterified long-chain fatty acids. These results are consistent with insulin and T3 being humoral agents that mediate stimulation of malic-enzyme activity by the fed state, and glucagon and unesterified long-chain fatty acids being humoral agents that communicate the starved state to the liver.

TABLE I
MALIC ENZYME IN HEPATOCYTES IN CULTUREᵃ

Measurement          No addition   Insulin   T3    Insulin + T3   Insulin + T3 + glucagon
Enzyme activity      1             2         40    120            3
Enzyme synthesis     1             1         73    125            2
Transcription rate   1             1         -     50             4

ᵃ Hepatocytes were isolated from the livers of 19-day-old chick embryos and incubated in serum-free Waymouth medium MD 705/1 containing no additions, insulin (300 ng/ml), T3 (1 µg/ml), insulin plus T3, or insulin plus T3 plus glucagon (1 µg/ml) for 3 days. Enzyme activities and rates of enzyme synthesis were determined as described (27) and then recalculated as fold-increases by setting the results for hepatocytes incubated without hormones at 1.0. In the transcription experiments, cells were incubated with insulin (300 ng/ml) for about 20 hours. The medium was then changed to one of the same composition with or without T3 (1 µg/ml) or glucagon (1 µg/ml). This change in protocol did not change the magnitude of the effects of the hormones on malic-enzyme activity. Transcription rates were determined as described (41) and then recalculated as fold-increases by setting the values for insulin alone to 1.0.
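The fold-increase normalization described in the footnote to Table I can be illustrated with a minimal sketch. The function name `fold_changes` and the raw values below are hypothetical and chosen only for illustration; they are not measurements from the cited experiments.

```python
def fold_changes(raw: dict[str, float], reference: str) -> dict[str, float]:
    """Express each measurement relative to a chosen reference condition.

    The reference condition is set to 1.0, mirroring the recalculation
    described for Table I.
    """
    ref_value = raw[reference]
    if ref_value == 0:
        raise ValueError("reference value must be nonzero")
    return {condition: value / ref_value for condition, value in raw.items()}


# Hypothetical raw enzyme activities in arbitrary units (illustration only).
raw_activity = {"no addition": 0.5, "insulin": 1.0, "insulin + T3": 60.0}
folds = fold_changes(raw_activity, reference="no addition")
print(folds)
```

For the transcription row, the same function would be called with the insulin-only condition as the reference, since those values were normalized to insulin alone.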

IV. Molecular Mechanisms

A. Strategy

For hormones that regulate hepatic malic-enzyme activity during the transition between the starved and fed states, we want to determine the molecular nature of each event between binding of the hormone to its relevant hepatic receptor and altered malic-enzyme activity. For fuels such as fatty acids, we want to determine the molecular nature of events between uptake of the fuel by the hepatocyte and altered malic-enzyme activity. This includes determining whether the active signaling molecule is the fuel molecule itself or a molecule produced during the metabolism of the fuel. If the latter, we want to identify the metabolic intermediate that is the active signaling molecule. In other words, we want to define each of the intracellular signaling pathways that regulate malic-enzyme activity. Our strategy has been to work backward along the signaling pathways, starting with the change in enzyme activity.

B. Enzyme Activity

The activity of an enzyme can be regulated by controlling the catalytic efficiency of that enzyme, for example, by allosteric mechanisms or covalent modifications. Alternatively, enzyme activity can be regulated by controlling the number of enzyme molecules per cell. Chicken malic enzyme was purified, and a rabbit antibody was raised against the purified enzyme. Using immunological techniques, we showed that the increase in malic-enzyme activity that occurs when newly hatched chicks are fed is due exclusively to an increase in the concentration of malic enzyme in the liver (26). In culture, the increases in activity caused by insulin and T3 and the decrease in activity caused by glucagon also were due to altered enzyme concentration (27).

C. Enzyme Concentration

The concentration of an enzyme can be regulated by controlling the rate constants for either synthesis or degradation of that enzyme. Using the antibodies raised against chicken malic enzyme as reagents for rapid purification of the enzyme, we measured the rate constants for synthesis and degradation of hepatic malic enzyme in newly hatched chicks that were fed or unfed, and in hepatocytes in culture that were treated with no hormone or with insulin, T3, insulin plus T3, or insulin plus T3 plus glucagon. Degradation of malic enzyme was unaffected by either dietary manipulation in vivo (26) or hormonal manipulation in culture (27). The magnitudes and directions of the changes in rates of synthesis of malic enzyme were the same as those for malic-enzyme concentration during both dietary manipulations in vivo and hormonal manipulation in culture (Table I). Thus, the concentration of malic enzyme is controlled by regulating its rate of synthesis.
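The reasoning in this paragraph can be made concrete with a simple turnover model. The sketch below assumes zero-order synthesis (rate `k_s`) and first-order degradation (rate constant `k_d`), so that the steady-state concentration is `k_s / k_d`. The model, the function name, and the numbers are assumptions for illustration, not values from the cited studies.

```python
def steady_state_concentration(k_s: float, k_d: float) -> float:
    """Steady state of dE/dt = k_s - k_d * E, which is E_ss = k_s / k_d.

    k_s: synthesis rate (amount per unit time), assumed zero-order.
    k_d: first-order degradation rate constant (per unit time).
    """
    if k_d <= 0:
        raise ValueError("degradation rate constant must be positive")
    return k_s / k_d


# Hypothetical values: synthesis rises 100-fold while degradation is unchanged.
basal = steady_state_concentration(k_s=1.0, k_d=0.25)
induced = steady_state_concentration(k_s=100.0, k_d=0.25)
fold_change = induced / basal
print(basal, induced, fold_change)
```

Under this model, when `k_d` is constant the fold change in concentration equals the fold change in synthesis rate, which is the pattern the paragraph reports for malic enzyme.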

D. Enzyme Synthesis

Synthesis of an enzyme can be regulated by controlling either the abundance of the mRNA for that enzyme or the efficiency with which that specific mRNA is translated into protein. We cloned the cDNA for avian malic enzyme and used that cDNA in hybridization assays to determine the abundance of malic-enzyme mRNA in starved and fed chicks and in hepatocytes in culture incubated with no hormones or various combinations of insulin, T3, and glucagon. The nutritionally and hormonally induced changes in enzyme synthesis were accompanied by comparable changes in the abundance of malic-enzyme mRNA (Figs. 2 and 3) (28, 29).

E. Other Animal Models

Similar studies have been carried out using intact rats and rat hepatocytes in culture. Changes in the activity of rat hepatic malic enzyme caused by starvation, refeeding, high-carbohydrate diets, induction of the diabetic state, and treatment of diabetic animals with insulin also correspond primarily to changes in enzyme concentration. These alterations, in turn, are due to changes in the synthesis rate of the rat enzyme that, in turn, are due to changes in abundance of rat malic-enzyme mRNA (30-33). Although the changes are much smaller in magnitude, insulin, T3, and glucagon also regulate the concentration, synthesis, and mRNA abundance of malic enzyme in adult rat hepatocytes in culture (34, 35).

[Figure 2. Blot of chick liver RNA; lane markings: 3940, 1830, and DF.]

FIG. 2. The effects of feeding on the level of hepatic malic-enzyme mRNA. Total polyadenylylated RNA was separated by size by electrophoresis in agarose gels, blot-transferred to nitrocellulose, and hybridized to ³²P-labeled, single-stranded malic-enzyme cDNA. RNA was extracted from the livers of 2-day-old chicks starved from hatching or fed for 24 hours as indicated on the figure. Each lane contained 20 µg of RNA. DF, Dye front; Or, origin (modified from Ref. 28; taken from Ref. 28a, with permission of The Journal of Nutrition).

[Figure 3. NORTHERN BLOT, HEPATOCYTE POLY(A+) RNA; PROBE: MALIC ENZYME cDNA; lane markings: OR, 3940, 620, and DF.]

FIG. 3. The effects of T3 and glucagon on the level of malic-enzyme mRNA. Total polyadenylylated RNA was separated by size and hybridized to ³²P-labeled malic-enzyme cDNA as described in the legend to Fig. 2. RNA was extracted from hepatocytes isolated from the livers of 19-day-old chick embryos and incubated in serum-free Waymouth medium MD 705/1 containing insulin (300 ng/ml) (control), insulin plus T3 (1 µg/ml), or insulin plus T3 plus glucagon (1 µg/ml) for 3 days. Each lane contained 20 µg of RNA. DF, Dye front; Or, origin (modified from Ref. 28; taken from Ref. 28a, with permission of The Journal of Nutrition).

F. Abundance of mRNA

1. APPEARANCE RATE OF CYTOPLASMIC mRNA

Our next objective was to determine the mechanism by which diet and hormones regulate mRNA abundance. This is formally similar to the analysis of mechanisms involved in regulation of enzyme concentration; the abundance of an mRNA can be regulated by controlling either synthesis or degradation of that mRNA. In actuality, however, it is somewhat different because the mRNA population relevant to synthesis rate of an enzyme is cytoplasmic mRNA. Synthesis of mRNA takes place in the nucleus. Thus, the abundance of a cytoplasmic mRNA is a function of its rate of appearance in the cytoplasm and its rate of degradation. The appearance rate of cytoplasmic mRNA is controlled by nuclear processes, including transcription of the gene, processing the primary transcript, and transport of the mature mRNA from the nucleus to the cytoplasm. We first examined degradation of cytoplasmic mRNA.

2. DEGRADATION OF mRNA in Vivo

A kinetic method was used to estimate the half-life of hepatic malic-enzyme mRNA in fed and starved chicks. The extent of the difference between basal and induced levels of an enzyme or mRNA is a function of changes in both the synthesis rate and the degradation rate constant (ln 2/t½) of that enzyme or mRNA. The time to progress from one steady-state concentration to another is exclusively a function of the half-life of the enzyme or mRNA. When the abundance of an mRNA is caused to change, the half-life of that mRNA can be calculated from the rate of approach of mRNA abundance to its new steady state (36). This was determined in birds that were refed after a period of starvation, or in starved birds after ad libitum feeding. The calculated half-life of hepatic malic-enzyme mRNA in fed chicks was 3 to 5 hours; in starved chicks, it was about 1 hour (36, 37). This result suggested that part of the more than 50-fold increase in mRNA level could be attributed to regulation of the rate constant for degradation of malic-enzyme mRNA.
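The kinetic approach described above can be sketched numerically. The example assumes that abundance approaches its new steady state as a single exponential, A(t) = A_ss + (A_0 - A_ss)·exp(-kt), so that ln|A(t) - A_ss| is linear in time with slope -k and t½ = ln 2/k. The function name, the least-squares fit, and the synthetic data are illustrative assumptions and do not reproduce the calculation used in Ref. 36.

```python
import math


def half_life_from_approach(times, abundances, steady_state):
    """Estimate half-life from the approach of abundance to a new steady state.

    Fits a straight line to ln|A(t) - A_ss| versus t by ordinary least squares.
    Points that equal the steady state exactly must be excluded by the caller,
    because their logarithm is undefined.
    """
    logs = [math.log(abs(a - steady_state)) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    rate_constant = -slope  # first-order degradation rate constant
    return math.log(2) / rate_constant


# Synthetic time course (hours, arbitrary units) generated with a 4-hour half-life.
k_true = math.log(2) / 4.0
a_start, a_ss = 1.0, 50.0
times = [0.0, 2.0, 4.0, 8.0]
abundances = [a_ss + (a_start - a_ss) * math.exp(-k_true * t) for t in times]

estimate = half_life_from_approach(times, abundances, a_ss)
print(round(estimate, 6))
```

Because the time course depends only on the rate constant, the starting and final levels affect the intercept of the fit but not the estimated half-life.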

3. DEGRADATION OF mRNA IN CULTURE

We used a similar approach to estimate the half-life of malic-enzyme mRNA in hepatocytes in culture incubated with and without glucagon (38). Malic-enzyme mRNA decayed with a half-life of 8 to 11 hours in cells treated with the transcription inhibitors, actinomycin D or α-amanitin. In glucagon-treated cells, malic-enzyme mRNA decayed with a half-life of 1.5 hours. These results suggested that part of the decrease in malic-enzyme mRNA caused by glucagon was due to an effect on mRNA stability.

4. TRANSCRIPTION in Vivo

We used the transcription “run-on” assay (39) to estimate the rate of transcription. We encountered technical problems in our measurements of the transcription rate of the malic-enzyme gene. A long “GC” tail added to the 5′ end of our largest cDNA during cloning and a small repetitive element in one of our genomic DNAs led to an initial, erroneous, conclusion that diet and hormones have no major effect on transcription of the malic-enzyme gene (37). When our DNA probes were freed of repetitive elements, we discovered that diet had a major effect on transcription. Feeding caused a 40- to 50-fold increase in transcription of the malic-enzyme gene (Fig. 4); the maximum rate of transcription was achieved within 3 hours after feeding starved chicks. Starvation of fed birds caused an equally rapid decrease in transcription rate (40). The increase in transcription rate caused by feeding was paralleled by a comparable increase in the concentration of nuclear precursors of malic-enzyme mRNA, consistent with a primary action of feeding on transcription of the malic-enzyme gene (40).

[Figure 4. (A) Slot-blot autoradiograph; probes: ME-4.8-5′, ME-4.8-3′, pUC19 wt, ME-2.6, M13mp18 RF wt, and β-actin. (B) Gene map marking the Transcription Start Site, the Polyadenylation Signal, and probes 4.8-5′, 4.8-3′, and 2.6; scale bar, 5 kb.]

FIG. 4. Stimulation of transcription of the malic-enzyme gene by refeeding in chick liver. (A) Nuclei were isolated from livers of 12- to 14-day-old chicks that were starved for 48 hours and then either starved for an additional 6 hours or fed for 6 hours. Nuclear run-on assays were performed as described (40). Strips of GeneScreen membrane containing identical amounts of the indicated probes in slots were hybridized with 2 × 10⁷ cpm/ml each of ³²P-labeled nascent RNA isolated from liver nuclei of starved or refed chicks. The membranes were washed and subjected to autoradiography. Vector DNAs (M13mp18RF and pUC19) were controls for nonspecific hybridization. The cDNA for β-actin was a control for selectivity; the level of hepatic β-actin is unaffected by starvation or refeeding. wt, Wild type; ME, malic enzyme. (B) Location of the various DNA probes within the malic-enzyme gene. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases.
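One common way to turn slot-blot signals of this kind into a relative transcription rate is to subtract the vector signal as background and express the result relative to an unregulated control such as β-actin. The sketch below illustrates that general idea only: the source does not state how the autoradiographs were quantified, and the function name and all numbers are hypothetical.

```python
def relative_transcription(signal: float, vector_background: float,
                           control_signal: float) -> float:
    """Background-corrected probe signal relative to a control probe.

    signal: densitometric signal for the gene-specific probe.
    vector_background: signal for the vector-only slot (nonspecific hybridization).
    control_signal: signal for an unregulated control probe (e.g., beta-actin).
    """
    specific = max(signal - vector_background, 0.0)
    reference = control_signal - vector_background
    if reference <= 0:
        raise ValueError("control signal must exceed background")
    return specific / reference


# Hypothetical densitometry readings in arbitrary units (illustration only).
starved = relative_transcription(signal=12.0, vector_background=10.0,
                                 control_signal=110.0)
refed = relative_transcription(signal=110.0, vector_background=10.0,
                               control_signal=110.0)
fold_induction = refed / starved
print(starved, refed, fold_induction)
```

Dividing by the control probe compensates for differences in the amount of labeled RNA applied to each membrane, which is why an unregulated gene is a useful reference.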


5. TRANSCRIPTION IN CULTURE

Our earliest measurements of transcription of the malic-enzyme gene in hepatocytes suggested that T3 and glucagon have little or no effect (38). Unfortunately, these experiments used a cDNA probe that contained a long G·C tail at the 5′ end. Use of probes free of repetitive elements revealed a 30- to 40-fold stimulation of transcription by T3; 80% of the maximal increase was achieved within 1 hour after adding T3, and ongoing protein synthesis was not required for the effect. The T3-induced increase in transcription was completely blocked by dibutyryl cAMP, an analog of cAMP, the intracellular mediator of the action of glucagon (Fig. 5) (41). These results established that regulation of transcription of the malic-enzyme gene was responsible for the effects of T3 and glucagon on malic-enzyme activity (Table I). Our results also indicate that the gene for malic enzyme is an immediate-early response gene with respect to the T3- and cAMP-mediated increases in transcription.

We also analyzed the role of insulin in the transcriptional response of the malic-enzyme gene to T3 (41). Results of experiments measuring enzyme concentration and enzyme synthesis suggested that insulin had a small positive effect when added to the medium by itself but a much larger amplifying effect on the response caused by T3 (27). Insulin alone had no effect on transcription of the malic-enzyme gene. It amplified the response to T3 in the first few hours after adding T3 but did not alter its maximal effect. The time courses of the responses of the abundance of malic-enzyme mRNA to T3, or T3 plus insulin, suggested a similar conclusion; in the absence of insulin the T3-induced increase in abundance of malic-enzyme mRNA was delayed but eventually achieved essentially the same maximum level (D. A. Mitchell and A. G. Goodridge, unpublished results).
IGF-1 and insulin have similar effects on T3-induced accumulation of malic-enzyme mRNA and transcription of the malic-enzyme gene; IGF-1 acts at more physiological concentrations (41, 42). In vivo, IGF-1 may be more relevant than insulin with respect to the regulation of the malic-enzyme gene. The mechanisms involved in these effects are unknown.

V. Mechanisms for Regulating Transcription

A. Protein Phosphorylation

Before beginning an analysis of the promoter regions involved in regulating transcription of the malic-enzyme gene, we investigated the requirements for the T3- and glucagon-mediated responses and the extracellular factors that modulate those responses. When we began the phosphorylation experiments, there was debate as to whether the regulation of transcription caused by cAMP used a phosphorylation mechanism, as observed for all known effects of cAMP on enzyme activities in eukaryotes, or a protein-binding mechanism, as observed for the effect of cAMP on transcription in prokaryotes (43). We tested the phosphorylation hypothesis with isoquinoline sulfonamides H8 and H7 (44) and microbial alkaloid (staurosporine) protein-kinase inhibitors (45). We were surprised to find that H8 (Fig. 6) and staurosporine were potent and selective inhibitors of the stimulatory effect of T3 on transcription of the malic-enzyme gene (46). Because induction by T3 is required before the inhibitory effect of cAMP can be observed, we were unable to test the hypothesis as initially stated. From the work of others, it is now apparent that the positive transcriptional effects of cAMP in vertebrate tissues are mediated by the catalytic subunit of protein-kinase A, the same type of phosphorylation event that mediates the effect of this intracellular signaling agent on enzyme activity (47, 48). The negative effects of cAMP may use the same intracellular signaling pathway, but definitive experimental evidence is lacking. The selective requirement for ongoing phosphorylation suggests that some component of the T3 response machinery must be phosphorylated to be active. It also suggests a potential mechanism by which the T3 response could be regulated by other signaling pathways.

[Figure 5. TRANSCRIPTION RUN-ON ASSAY, HEPATOCYTES IN CULTURE; gene map of probe locations with a 2-kb scale bar.]

FIG. 5. Stimulation of transcription of the malic-enzyme gene by T3 and inhibition by dibutyryl cAMP. Chick embryo hepatocytes were isolated as previously described (41). The hepatocytes were incubated for 20 hours in Waymouth medium supplemented with insulin (300 ng/ml). The medium was then changed to one of the same composition. After 42 hours of incubation, T3 (1 µg/ml) was added to some of the plates without a medium change. After an additional 24 hours, dibutyryl cAMP (50 µM) was added to some plates without a medium change, and the cells were harvested at 24 (control and T3), 24.5, and 26 hours after adding T3 (0, 0.5, and 1.0 hours after adding dibutyryl cAMP). Nuclear run-on assays were performed as described (40). Strips of GeneScreen membrane containing identical amounts of the indicated probes in slots were hybridized with 2 × 10⁷ cpm/ml each of ³²P-labeled nascent RNA isolated from liver nuclei from either starved or refed chicks. The membranes were washed and subjected to autoradiography. ME, Malic enzyme; FAS, fatty-acid synthase; GAPD, glyceraldehyde-3-phosphate dehydrogenase. Vector DNAs (M13mp18RF and pUC19) were controls for nonspecific hybridization. β-Actin, glyceraldehyde-3-phosphate dehydrogenase, and fatty-acid synthase cDNAs were controls for selectivity. Transcription rates of the hepatic genes for β-actin and glyceraldehyde-3-phosphate dehydrogenase are unaffected by T3 or cAMP. Transcription of the fatty-acid-synthase gene is stimulated by T3 and cAMP (modified from Ref. 41, with permission of The Journal of Biological Chemistry). The map at the bottom of the figure indicates the locations of the various DNA probes within the malic-enzyme gene. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases.

[Figure 6. Probes: ME-2.6, M13mp18RF, ME-4.8-5′, ME-4.8-3′, and pUC19. Lanes: C, T3-1h, T3-1h H8-1h (left panel); C, T3-24h, T3-24h H8-1h (right panel).]

FIG. 6. H8 inhibits T3-induced transcription of the malic-enzyme gene. Chick-embryo hepatocytes were isolated and maintained in culture in Waymouth medium supplemented with insulin (300 ng/ml). After 48 hours in culture, the medium was changed to one of the same composition with or without T3 (1 µg/ml) for 1 hour (left panel) or 24 hours (right panel). H8 (25 µM) was added at the same time as T3 (left panel) or after 23 hours with T3 (right panel). Hepatocytes were harvested and nuclei prepared and incubated in vitro with [³²P]UTP as described (46). Labeled transcripts were isolated and hybridized to 2 µg of DNA fixed to GeneScreen membranes. The membranes were washed and subjected to autoradiography. Control DNAs and abbreviations are the same as in the legend to Fig. 5, except that GAD is glyceraldehyde-3-phosphate dehydrogenase DNA and C is control (no T3 or H8) (from Ref. 46, with permission of The Journal of Biological Chemistry). Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.

B. Regulation of Responsiveness to T3

1. RESPONSIVENESS TO T3 DECREASES WITH TIME IN CULTURE

If T3 is added to the culture medium between 20 and 68 hours after the isolated hepatocytes are put into culture, malic-enzyme activity increases 30- to 40-fold. This response decreases with time in culture; after 7 days, a 48-hour incubation with T3 has no effect on malic-enzyme activity (49). The change in responsiveness of enzyme activity is mediated by a decrease in the ability of T3 to stimulate transcription of the malic-enzyme gene. These results suggest that a protein or metabolite essential for the T3 response is present in excess in vivo before the isolated cells are prepared, but is not made, or is made only very slowly, in the hepatocytes in culture. Alternatively, a negative-acting protein or metabolite may accumulate in hepatocytes in culture. The rate at which responsiveness to T3 decreases (half-life of about 24 hours) is consistent with the rate-limiting component being a protein.
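The quoted half-life of about 24 hours can be related to the loss of response after 7 days with a short calculation. The sketch assumes simple first-order decay of responsiveness, which is an idealization rather than a model stated in the text; the function name is hypothetical.

```python
def fraction_remaining(hours: float, half_life_hours: float = 24.0) -> float:
    """Fraction of the initial responsiveness left after first-order decay."""
    return 0.5 ** (hours / half_life_hours)


after_7_days = fraction_remaining(7 * 24)
print(after_7_days)  # 0.0078125, i.e., under 1% of the initial response
```

Seven half-lives leave less than 1% of the starting value, which is in line with the observation that T3 has no measurable effect on malic-enzyme activity after 7 days in culture.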

2. GLUCOCORTICOIDS PROLONG RESPONSIVENESS TO T3

Corticosterone has no effect on the activity, mRNA abundance, or transcription rate of the malic-enzyme gene when added in the absence of T3, whether the cells are incubated for 3 or 5 days. In cells incubated with T3 from 20 to 68 hours of incubation, corticosterone has little or no effect on the response to T3. In cells incubated from 68 to 116 hours with T3, however, corticosterone causes a substantial increase in the responsiveness (Fig. 7) (49). The response to T3 is eventually lost whether corticosterone is present or not; it just takes substantially longer when cells are incubated with the glucocorticoid.

[Figure 7 image not reproduced. Panel titles: T3, 20 to 68 hours; T3, 68 to 116 hours.]

FIG. 7. Malic-enzyme activity in hepatocytes treated with T3 for 48 hours and treated with or without corticosterone for the entire incubation period. Hepatocytes were isolated and incubated with insulin (INS, 50 nM) or insulin plus corticosterone (CORT, 1 µM) (49). After 20 hours of incubation, the medium was changed to one of the same composition, and T3 (1.5 µM) was added to one set of plates with corticosterone and to one set without corticosterone. At 68 hours of incubation, sets of plates with or without corticosterone and with or without T3 were harvested. At 68 hours of incubation, media in additional sets of plates with or without corticosterone were changed to ones of the same composition with or without T3 (1.5 µM); cells from these sets of plates were harvested at 116 hours. Malic-enzyme activity and DNA were measured as described (49). The results are expressed as units of malic-enzyme activity per milligram of DNA and represent the mean ± SE of four experiments, each of which was performed in duplicate (from Ref. 49, with permission of Archives of Biochemistry and Biophysics).

Intracellular accumulation of long-chain fatty acids or long-chain acyl-CoAs probably does not cause the loss of responsiveness to T3 or the stimulation of that responsiveness by corticosterone, because adding 0.5% serum albumin (to lower the concentration of unbound fatty acids) or long-chain fatty acids (0.25-0.5 mM) to the medium is without effect at the concentrations of T3 used in these experiments. Nuclear binding of T3 did not decrease in these cells in the absence of corticosterone, nor did corticosterone cause an increase in T3 binding. Thus, changes in the levels of the T3 receptor are unlikely to be involved in the loss of responsiveness that occurs as a function of time in culture or in the increase in responsiveness caused by corticosterone. Our working hypothesis is that corticosterone stimulates production of a factor required for T3 responsiveness, or inhibits production of an inhibitor of that process. In preliminary experiments, the glucocorticoid-sensitive cis-acting element appears in the same 200-bp fragment of the malic-enzyme gene that mediates the T3 response. Identification of the corticosterone-regulated factor may provide a greater understanding of the factors involved in the ability of the ligand-bound T3 receptor to stimulate transcription of linked genes.

3. CARNITINE PROLONGS THE RESPONSIVENESS TO T3

Carnitine, a cofactor involved in the oxidation of fatty acids, also stimulates responsiveness to T3 (49). The effects of carnitine and corticosterone are at least additive and possibly synergistic, suggesting different mechanisms. Carnitine may increase the rate of fatty-acid oxidation, suggesting that a fatty-acid metabolite may regulate responsiveness to T3. Alternatively, a metabolite, the concentration of which is controlled by the rate of fatty-acid oxidation, may regulate responsiveness to T3.

C. Unesterified Fatty Acids Inhibit T3-induced Transcription

1. LONG-CHAIN FATTY ACIDS

Unesterified long-chain fatty acids inhibit the de novo synthesis of fatty acids in hepatocytes incubated in simple solutions of buffered salts (50). In addition, the levels of fatty-acyl-CoAs, the immediate product of fatty-acid activation in hepatocytes, are elevated by starvation or induction of diabetes (51-53) or in hepatocytes in culture treated with glucagon (50), all conditions associated with inhibition of fatty-acid synthesis. The activity of the probable pace-setting enzyme in fatty-acid synthesis, acetyl-CoA carboxylase, also is inhibited by fatty-acyl-CoA (19-21). These observations suggest that the concentration of plasma unesterified fatty acids may play an important role in regulating fatty-acid synthesis. It seems reasonable, therefore, that these agents may regulate transcription of the lipogenic genes. However, at the concentrations of T3 used in most of our experiments, unesterified long-chain fatty acids have no effect on transcription of the malic-enzyme gene, even under conditions where the concentration of the fatty acid is unlikely to be affected significantly by its relatively rapid rate of metabolism (49, 54). The concentration of T3 in our experiments was 1.6 µM, about 10³ times higher than that required to saturate the T3 receptor.

In our early experiments we measured enzyme activity. Due to the enzyme's long half-life, the hepatocytes had to be incubated with T3 for 2 or 3 days to achieve a substantial degree of induction. T3 is degraded rapidly in serum-free Waymouth medium; after 24 hours in culture, with or without cells, T3 is undetectable in the medium (A. G. Goodridge, unpublished results). In order to maintain a significant level of hormone for a prolonged period of time, we add high concentrations of the hormone to the hepatocytes in culture. When we discovered that T3 caused near maximal induction of transcription of the malic-enzyme gene within 2 hours after adding the hormone, we performed a series of experiments at physiological concentrations of T3. Binding of T3 to its nuclear receptor and transcription of the malic-enzyme gene were measured in parallel tissue-culture plates during a 2-hour incubation with T3. The dose-response relationships between binding of T3 to its receptor (Fig. 8A) and T3-mediated stimulation of transcription of the malic-enzyme gene (Fig. 8B) are very similar. Furthermore, at 200 pM T3 (sufficient to occupy 80% of the nuclear T3 receptors), 0.5 mM dodecanoate inhibited both transcription of the malic-enzyme gene and binding of T3 to its nuclear receptor (Fig. 9).

Long-chain fatty acids and their acyl-CoA derivatives inhibit binding of T3 to its nuclear receptor (55, 56). Their effects are competitive with T3, so that it is unlikely that inhibition would be observed at concentrations of T3 that are 10³ times greater than those necessary to saturate the receptor. This may explain why long-chain fatty acids are inhibitory at physiological concentrations of T3 but ineffective at high concentrations. Our results suggest that, in vivo, the changes in plasma levels of unesterified fatty acids that occur during the transitions between the fed and starved states may play important roles in the regulation of transcription of the lipogenic genes.
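The argument that a competitive inhibitor is effective at 200 pM T3 but not at 1.6 µM T3 can be illustrated with a one-site competitive binding model. This is a minimal sketch under stated assumptions, not the authors' analysis: the dissociation constant (50 pM) is chosen only so that 200 pM T3 gives the 80% occupancy quoted above, and the inhibition constant for the fatty acid is a purely hypothetical value.

```python
def fractional_occupancy(t3, kd, inhibitor=0.0, ki=float("inf")):
    """One-site competitive binding: [T3] / ([T3] + Kd * (1 + [I] / Ki))."""
    apparent_kd = kd * (1.0 + inhibitor / ki)
    return t3 / (t3 + apparent_kd)


KD = 50e-12          # M; assumed, so that 200 pM T3 occupies 80% of receptors
KI = 0.1e-3          # M; hypothetical inhibition constant for the fatty acid
FATTY_ACID = 0.5e-3  # M; the dodecanoate concentration quoted in the text

for label, t3 in (("200 pM T3", 200e-12), ("1.6 uM T3", 1.6e-6)):
    without = fractional_occupancy(t3, KD)
    with_fa = fractional_occupancy(t3, KD, FATTY_ACID, KI)
    print(label, round(without, 3), round(with_fa, 3))
```

With these illustrative constants, occupancy at 200 pM T3 falls from 0.8 to 0.4 in the presence of the inhibitor, whereas at 1.6 µM T3 it stays above 0.99 either way. That is the qualitative behavior expected of a competitive inhibitor.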

2. MEDIUM-CHAIN FATTY ACIDS

When we initiated the analysis of the actions of fatty acids, we were concerned about maintaining effective concentrations of long-chain fatty acids during incubations of 1 or 2 days duration. The physiologically important long-chain fatty acids are quite insoluble in aqueous media. To achieve even the modest concentrations found in the plasma of fed animals, it is necessary to bind the fatty acids to albumin. Medium-chain fatty acids such as hexanoate or octanoate, on the other hand, are much more soluble; it is possible to achieve concentrations of 5 or 10 mM without adding albumin. As a result, we decided to test the effects of unesterified medium-chain fatty acids in our cells.

Octanoate and hexanoate inhibit the induction of malic-enzyme activity by T3 (54). These effects are mediated at the level of transcription and are manifest within 30 minutes after adding the fatty acid. Inhibition by such fatty acids is specific with respect to the structure of the fatty acid, selective with respect to the genes that are inhibited, and readily reversible by changing the medium to one lacking the fatty acid (54). Saturated fatty acids with chain lengths of six to eight carbons are the most effective inhibitors. Butanoate and decanoate are less effective than hexanoate or octanoate. 2-Bromooctanoate, 2-bromopyruvate, six- and eight-carbon dicarboxylates, and branched-chain fatty acids and keto acids derived from the metabolism of branched-chain amino acids are slightly stimulatory, ineffective, or only slightly inhibitory (54).

[Figure 8 image not reproduced. Panel A: T3 binding versus T3 concentration. Panel B: transcription run-on signals versus T3 concentration for the probes ME-2.6, M13mp18Rf, ME-4.8-5', ME-4.8-3', pUC19, FAS, and GAD.]

FIG. 8. T3 binding to nuclear receptor (A) and transcription of the malic-enzyme and fatty-acid synthase genes (B) as a function of T3 concentration. Hepatocytes were isolated and incubated for 3 days with insulin (50 nM) plus corticosterone (1 µM) (49). On day 3, the medium was changed to one of the same composition; 1 hour later, [125I]T3 (A) or unlabeled T3 (B) was added for an additional 2 hours at the concentrations indicated. After the cells were harvested, nuclei were isolated and assayed for radioactivity (A) or transcription (B). Assay procedures were as described (49). Nonspecific binding was measured by simultaneous incubation of cells with a 1000-fold molar excess of nonradioactive T3. Nonspecific binding was less than 5% of total binding (at 200 pM T3) and was subtracted from total binding to obtain the specific binding shown in A. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.

[Figure 9 image not reproduced. Probe labels: ME-2.6, M13mp18Rf, ME-4.8-5', ME-4.8-3', pUC19, FAS, GAD, and β-actin. Values printed beneath the lanes for [125I]T3 bound: 263 ± 7, 238 ± 1, 139 ± 16, 62 ± 12.]

FIG. 9. Inhibition of the transcription of the malic-enzyme and fatty-acid synthase genes in the presence of 200 pM T3. Hepatocytes were isolated and incubated in a chemically defined medium containing insulin (50 nM) and corticosterone (1 µM) (54). At about 20 hours of incubation, the medium was changed to one of the same composition. At 66 hours of incubation, T3 (200 pM), with or without fatty acid (0.5 mM), was added to the incubation medium. The cells were harvested at 68 hours of incubation. Isolation of nuclei, transcription run-on assays, and binding of [125I]T3 to nuclear receptors were carried out as described in the legend to Fig. 8. Results of the binding assays are expressed as femtomoles T3 bound per milligram of DNA. HEX, Hexanoate; OCT, octanoate; DODEC, dodecanoate; ME, malic enzyme; M13mp18Rf, replicative form of M13mp18 vector DNA; FAS, fatty-acid synthase; GAD, glyceraldehyde-3-phosphate dehydrogenase (from Ref. 54, with permission of The Journal of Biological Chemistry). Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.

Subsequently, we tested a wide variety of related compounds for their effects on the stimulation of malic-enzyme activity by T3. Compounds with inhibitory effects similar in magnitude and potency to those of medium-chain fatty acids are those that are structurally similar to hexanoate or octanoate or that can be converted to hexanoate or octanoate by intracellular metabolism (Table II). Despite the fact that medium-chain fatty acids are not present in chicken blood at concentrations that inhibit transcription of the malic-enzyme gene, this inhibition may reflect a physiological regulatory mechanism.

TABLE II
EFFECT OF 0.5 mM OF COMPOUND ON MALIC-ENZYME ACTIVITYᵃ

Inhibit (>50%): Hexanoate; Hexanal; Heptanoate; Octanoate; Octanal; 1-Octanol; 2-Hydroxyoctanoate; Lipoate; Dihydrolipoate; Monooctanoylglycerol; Octyl-β-glucoside; 1,2-Dioctanoylglycerol; 1,3-Dioctanoylglycerol

No effect (<25%): Butyrate; Suberate; Adipate; Bromohexanoate; 2-Octanol; Bromooctanoate; Monooctanoyl-PC; 2-Aminooctanoate; 8-Aminooctanoate; Lipoamide; Octanoylcarnitine; Dioctanoyl-PC; Benzoate; Dithiothreitol; Acetylsalicylate; Octyl-β-thioglucoside

Minor effect (25-40%): Monobutyrin; Decanoate; Decanoylglycine; Butyrylcholine; 2-Octynoate; Octyl-α-glucoside

ᵃ Chick embryo hepatocytes were isolated as previously described (41). The hepatocytes were incubated for 20 hours in Waymouth medium supplemented with insulin (300 ng/ml). At 24 hours, the medium was changed to one of the same composition, with or without T3 (1 µg/ml) or the indicated compounds at 0.5 mM. The cells were harvested at 48 hours and assayed for malic-enzyme activity and protein (27).

The mechanism by which medium-chain fatty acids regulate transcription of the malic-enzyme gene is probably different from that for long-chain fatty acids. At 200 pM T3, hexanoate has no effect on binding of T3 to its nuclear receptor, and octanoate has only a small effect (Fig. 9). Furthermore, the medium-chain fatty acids are equally inhibitory at 1.6 µM and 200 pM T3. Inhibition by hexanoate is partially reversed by adding carnitine to the medium, consistent with the active inhibitor being a product of the metabolism of octanoate (54).

VI. Chromatin Structure

A. DNase-I Hypersensitivity in Vivo

In intact chromatin, constitutive and inducible DNase-I-hypersensitive sites often indicate sites of protein-DNA interactions involved in the regulation of transcription (57). We mapped four such regions centered at about -200 bp, -1.65 kb, -3.1 kb, and -3.8 kb in the 5'-flanking DNA of the malic-enzyme gene (40, 58). Regions 2, 3, and 4 are detected in nuclei from liver, but not in nuclei from heart, brain, or kidney; DNase-I hypersensitivity of these regions is the same in hepatic nuclei from fed and starved animals. In contrast, region 1, which includes the start site for transcription, is barely detectable in hepatic nuclei from starved chicks. This region becomes very sensitive to DNase-I in hepatic nuclei from fed chicks. DNase-I hypersensitivity of region 1 in nuclei from heart, brain, and kidney of fed or starved chicks is similar to that in hepatic nuclei from starved chicks. This result is consistent with a low and unregulated rate of transcription of the malic-enzyme gene in the nonhepatic organs.

Feeding causes a 40- to 50-fold increase in transcription that achieves a new steady state about 3 hours after feeding the animals. Substantial stimulation of transcription is observed within 1.5 hours after feeding is started (40). DNase-I hypersensitivity of region 1 increases with kinetics similar to that for the increase in transcription. By 4.5 hours after the withdrawal of food from fed birds, transcription has returned to the level in birds starved 24 or 48 hours. A decrease in DNase-I hypersensitivity at region 1 occurs with the same kinetics. These changes in chromatin structure appear to take place in the absence of replication.

The hypersensitive region near the start site for transcription was analyzed in more detail using both DNase-I and specific endonucleases, and shorter parent fragments were used to increase the accuracy with which we could map the positions. Mapping of chromatin with DdeI and HaeIII revealed sites at +72 bp and at -320 bp and further upstream that are not affected by nutritional state. Other DdeI and HaeIII sites between -50 and -245 bp, as well as DNase-I sites at -220, -170, -130, and -70 bp, are altered by feeding (Fig. 10) (58). Using micrococcal nuclease, we detected nucleosomes associated with this region of DNA in the hepatic nuclei of starved birds, but not in those from fed chicks.
The length of the region of DNA that undergoes nutritionally regulated changes in chromatin structure is sufficient to accommodate two nucleosomes. Thus, feeding may induce removal of or substantial alteration in the structure of two nucleosomes positioned at and just upstream of the start site for transcription (58). The observed change in chromosome structure could cause or be caused by the accompanying change in transcription of the malic-enzyme gene. Unfortunately, transcription rates and DNase-I hypersensitivity were measured in the livers of different birds. When refeeding starved birds, it takes different amounts of time for each bird to find food and begin to eat. In addition, there may have been significant bird-to-bird differences in the time for transit of food from mouth to small intestine. Furthermore, when initiating a period of starvation, it is difficult to withdraw food from each bird at the same time relative to its last period of food intake. As a consequence, we did not determine whether the changes in chromatin structure preceded the changes in transcription or vice versa.

[Figure 10 image not reproduced. Panel A: DdeI at 400, 2000, and 4000 U/ml, with lanes from starved (S) and fed (F) chicks. Panel B: map of DNase-I hypersensitivity and restriction endonuclease sensitivity between the flanking PmlI sites.]

FIG. 10. Accessibility of the proximal region of the promoter of the malic-enzyme gene to restriction endonucleases. (A) Nuclei from livers of chicks starved for 24 hours or fed continuously were treated with DdeI at the indicated concentrations for 1 hour on ice. After purification, DNA (10 µg) from starved (lanes S) or fed (lanes F) chicks was digested with PmlI to completion, separated on a 3% NuSieve agarose gel, and transferred to a Nytran membrane. DdeI fragments were detected by indirect end-labeling using an RsaI/PstI probe (+4 bp to +222 bp) (58). The DdeI sites are shown on the map of 5'-flanking DNA of the malic-enzyme gene at the right. (B) A summary of the nuclease digestion data in A and from other sources (see Ref. 58). The scale of the drawing is shown at the top. The numbers refer to positions relative to the start site for transcription of the malic-enzyme gene. DNase-I-hypersensitive sites are depicted by arrows. An open arrow indicates a site not affected by nutritional status, whereas the solid arrows indicate sites induced by feeding. Circles and squares mark the positions of DdeI and HaeIII sites, respectively. Solid symbols indicate cleavages repressed in the starved state; open symbols indicate cleavages unaffected by nutritional status (from Ref. 58, by permission of Oxford University Press).


B. DNase-I Hypersensitivity in Hepatocytes in Culture

Chick embryo hepatocytes incubated with insulin have low rates of transcription of the hepatic malic-enzyme gene. T3 causes a 30- to 40-fold increase in transcription; the further addition of glucagon or cAMP returns the transcription rate to the basal level (41). The pattern of DNase-I sensitivity in nuclei from hepatocytes in culture is the same as that in hepatic nuclei from fed chicks, irrespective of the hormonal milieu of the cells in culture (41). Thus, region 1 is hypersensitive to DNase-I whether the transcription rate is high (insulin plus T3) or low (insulin alone, or insulin plus T3 plus glucagon or cAMP). Transcription rates in nuclei isolated from hepatocytes incubated in culture with insulin alone, or insulin plus glucagon or cyclic AMP, are similar in magnitude to those in nuclei from the livers of starved chicks. The rate of transcription in nuclei from hepatocytes incubated with insulin plus T3 is similar to that in nuclei from the livers of fed chicks. These results suggest that, in culture at least, transcription rate does not regulate the DNase-I hypersensitivity of this region of the gene, or vice versa. Nevertheless, a DNase-I-hypersensitive structure of the proximal region of the malic-enzyme promoter may be necessary but not sufficient for elevated rates of transcription.

Region 1 is not DNase-I hypersensitive in nuclei prepared from the livers of 19-day-old chick embryos, consistent with the low rate of transcription of the malic-enzyme gene under these conditions. The change in hypersensitivity occurs during preparation of the isolated cells. We have not identified conditions of cell preparation that prevent the increase in DNase-I hypersensitivity or culture conditions that restore a low level of hypersensitivity to region 1.

C. DNase-I Hypersensitivity in the Rat Malic-enzyme Gene

DNase-I hypersensitivity of the rat malic-enzyme gene has also been examined (59, 60). Three hypersensitive sites near the start site for transcription are in locations similar to three of the sites in region 1 of the chicken gene. Two of these sites (-310 and -50 bp) are present in the liver nuclei of both hypothyroid and hyperthyroid rats; one site (at -170 bp) is observed only in liver nuclei from hypothyroid rats. Thus, thyroid status alters chromatin structure near the start site for transcription in the rat malic-enzyme gene (59). The effects of starvation or feeding a high-carbohydrate diet were not examined. The rat gene contains a second area of hypersensitivity about 4.1 kb upstream from the start site for transcription. Thyroid status had no effect on this region, which is located at about the same distance upstream from the start site as region 4 of the chicken malic-enzyme gene.


D. Role of Glucagon and T3 in Vivo

Based on the large body of evidence reviewed earlier (Section III,A) in this essay, it appears that glucagon, a hormone characteristic of the starved state, plays an important role in communicating the state of alimentation of the whole animal to its liver. The finding that starvation alters both transcription and chromatin structure of the hepatic malic-enzyme gene, whereas glucagon in culture alters transcription but not chromatin structure, casts doubt on a major role for glucagon. A similar argument could be used to question the role of T3 in the changes in malic-enzyme expression that accompany the transition between the fed and starved state.

One additional piece of evidence is inconsistent with a major role for glucagon in mediating the effects of diet on the hepatic lipogenic enzymes. Starvation and feeding of chicks cause altered rates of transcription of the gene for fatty-acid synthase that are similar in kinetics and magnitude to those for transcription of the malic-enzyme gene (61). However, in culture, glucagon or cAMP do not inhibit transcription of the fatty-acid synthase gene and, in fact, usually lead to a substantial stimulation of transcription (42, 54). The glucagon-mediated decrease in fatty-acid synthase activity observed in chick embryo hepatocytes (62) appears to be a post-transcriptional phenomenon. This result is not consistent with a major role for glucagon in regulating expression of fatty-acid synthase during the transition between the starved and fed states. If the same hormones regulate both malic enzyme and fatty-acid synthase during this transition in vivo, these results suggest that glucagon may not play a major role. The "other" humoral factors that regulate transcription of the fatty-acid synthase gene are also likely to regulate the malic-enzyme gene.

VII. cis-Acting Elements in the Malic-enzyme Gene

A. Identifying Sequence Elements

Our observations that changes in chromatin structure may play a role in the nutritional regulation of the malic-enzyme gene and that a number of factors in addition to insulin, T3, and glucagon regulate transcription of the malic-enzyme gene may provide constraints on any proposed regulatory mechanism(s). However, these findings do not carry us any further along the intracellular signaling pathway from changes in malic-enzyme activity to interaction of a factor with its relevant receptor. The next step in that analysis is to define the DNA sequences that bestow responsiveness of the malic-enzyme gene to each of the extracellular regulatory factors or conditions that we have defined.


In our analysis of cis-acting elements, we have introduced chimeric genes into chick embryo hepatocytes in culture, using lipofection (63). Our DNA constructs contain the bacterial chloramphenicol acetyltransferase (CAT) gene linked downstream to DNA fragments from the 5'-flanking region of the malic-enzyme gene. As controls for transfection efficiency, we cotransfect chimeric DNAs containing either β-galactosidase or luciferase as reporter genes linked downstream of hormone-neutral promoter/enhancers from CMV or the RSV LTR (63). Our unpublished studies are summarized in the following paragraphs.

[Figure 11 image not reproduced. Top: map of ME-5.8CAT from -5.8 kb to +1, with EcoRI, PstI, BstXI, and HindIII sites, the CAT gene, the small t intron, and the poly A signal. Middle: [test fragment]-TKCAT, with minimal promoter, small t intron, and poly A signal. Bottom: [test fragment]-ME147CAT, with minimal promoter, small t intron, and poly A signal.]

FIG. 11. Top: Structure of pME-5.8CAT, the chimeric expression plasmid used to construct the 5' deletion mutants described in Fig. 12. CAT, Structural gene for chloramphenicol acetyltransferase; small t intron and polyadenylation signal are derived from SV40. Sequences from the malic-enzyme gene include 5.8 kb of 5'-flanking DNA and 31 bp of 5' untranslated DNA (up to, but not including, the start site for translation). Middle: Structure of p[test fragment]-TKCAT, a chimeric expression plasmid used to test upstream fragments of the 5'-flanking DNA of the malic-enzyme gene linked to a minimal promoter from the thymidine kinase gene of herpes simplex virus; C. A. transferase, structural gene for chloramphenicol acetyltransferase. Other abbreviations are as described above for pME-5.8CAT. Bottom: Structure of p[test fragment]-ME147CAT, the chimeric expression plasmid used to test upstream fragments of the 5'-flanking DNA of the malic-enzyme gene linked to a minimal promoter from the chicken malic-enzyme gene. Abbreviations are as described above.

[Figure 12 image not reproduced. Restriction map of the 5'-flanking DNA of the chicken malic-enzyme gene (scale -5 to -1 kb; PstI, StuI, BglII, and EcoNI sites) with the 5' deletion constructs. Data columns: Ins; Ins + T3; Ins + T3 + cAMP; Ins + T3 + Hex.]

FIG. 12. Effects of T3, dibutyryl cAMP, and hexanoate on expression of CAT activity in hepatocytes transfected with pME-5.8CAT and several 5' deletions derived therefrom. Hepatocytes were incubated in chemically defined medium (2 ml/plate) containing insulin (300 ng/ml) and corticosterone (1 µM). At 24 hours of incubation, lipofectin or TransfectACE (30 µg), ME.CAT DNA (2.5 µg of ME-5.8CAT or molar equivalent of shorter constructs), and RSV-luciferase or CMV-β-GAL DNA (0.5 µg) were added to each plate. Each 30-mm tissue culture plate contained about 0.8 x 10" hepatocytes. After an overnight incubation (15 hours), the lipofectin or TransfectACE and DNA were removed with a medium change. The new medium contained insulin and corticosterone plus (1) no additions, (2) T3 (1.6 µM), (3) T3 plus dibutyryl cAMP (50 µM), or (4) T3 plus hexanoate (5 mM). After an additional 48 hours of incubation, the cells were harvested and assayed for CAT, luciferase, or β-GAL activities and protein. Activities of the promoter constructs were initially expressed as CAT activity divided by luciferase or β-GAL activity (each per milligram of protein). The results are expressed as a percentage of the corrected CAT activity in hepatocytes transfected with ME-5.8CAT and treated with T3. Each result is the average ± S.E. of three to seven independent sets of hepatocytes. The ovals on the restriction map of pME-5.8CAT DNA represent DNase-I-hypersensitive sites (40). Ins, Insulin; Hex, hexanoate.

[Figure 13 image not reproduced. Map of the 5'-flanking DNA of the chicken malic-enzyme gene (scale -5 to -1 kb) with the constructs tested. Values for INS+T3 and INS+T3+cAMP, respectively: ME-5800, 100 and 19 ± 2; ME-4120/-2700ME-147, 144 ± 29 and 31 ± 1; ME-147, 34 ± 4 and 27 ± 3.]

FIG. 13. The DNA element that confers responsiveness of the malic-enzyme gene to T3 and cAMP is located between 2700 and 4120 bp upstream of the start site for transcription. Isolation and incubation of the hepatocytes, transfection and assay of the reporter gene activities, and expression of the results are described in the legend to Fig. 12. ME, Malic enzyme.

The initial construct contained 5.8 kb of 5'-flanking DNA from the chicken malic-enzyme gene linked to CAT (Fig. 11). After transfection of this construct into hepatocytes, T3 caused a 10- to 20-fold increase in CAT activity. Unlike many other studies of T3 response elements, this result did not depend on cotransfection of an expression vector encoding the T3 receptor. Dibutyryl cAMP and chlorophenylthioether cAMP (CPT cAMP) inhibited T3-induced CAT activity by about 80 and 90%, respectively. Hexanoate inhibited CAT activity by about 75% (63).
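The normalization described in the legend to Fig. 12 (CAT activity divided by the activity of the cotransfected control reporter, then expressed as a percentage of the value for ME-5.8CAT plus T3) can be sketched as follows. The function names and the raw readings are hypothetical and serve only to show the arithmetic.

```python
def corrected_cat(cat_activity, control_reporter_activity):
    """CAT activity divided by the cotransfected control reporter activity."""
    return cat_activity / control_reporter_activity


def percent_of_reference(sample, reference):
    """Corrected CAT activity of a sample as a percentage of the reference."""
    return 100.0 * corrected_cat(*sample) / corrected_cat(*reference)


# Hypothetical readings: (CAT activity, luciferase activity), each per mg protein.
reference = (800.0, 40.0)  # stands in for ME-5.8CAT, insulin plus T3
sample = (90.0, 45.0)      # stands in for ME-5.8CAT, insulin alone

print(percent_of_reference(sample, reference))  # 10.0
```

Dividing by the control reporter corrects for plate-to-plate differences in transfection efficiency before constructs or treatments are compared.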

B. T3 Response Unit

We constructed several plasmid DNAs that contained deletions from the 5' end of the malic-enzyme sequence in pME-5.8CAT³ DNA. Constructs containing at least 3.9 kb of DNA showed a 10- to 20-fold response to T3. Constructs containing no more than 2.7 kb of DNA failed to respond to T3 (Fig. 12). These results provide negative evidence for a T3 response element(s) between 3.9 and 2.7 kb upstream of the start site for transcription. In other experiments, a fragment of DNA from +31 to -147 bp of the malic-enzyme gene supported a low and T3-unresponsive rate of transcription. When DNA from -147 to -60 bp was deleted, transcription became undetectable. We linked DNA from -4.1 to -2.7 kb to the 5' end of the malic-enzyme sequence in pME-147CAT DNA (Fig. 11) to determine if the putative regulatory region would confer responsiveness to T3 on a minimal hormone-insensitive promoter. Responsiveness of the -4.1/-2.7-kb fragment linked to ME-147CAT in either orientation was equal to or larger than that for pME-5.8CAT (Fig. 13). In a similar experiment, we linked the -4.1/-2.7-kb fragment to a heterologous minimal promoter, the thymidine kinase promoter from herpes simplex virus (HSV-TK) (Fig. 11). Responsiveness to T3 was 60-fold or greater (Fig. 14).

³ Plasmids containing 5'-flanking DNA of the malic-enzyme gene are named as follows: ME designates malic enzyme; a single number designates the 5' end of the fragment; the 3' end is at +31 bp. CAT is the chloramphenicol acetyltransferase reporter gene.

Further deletions from both the 5' and 3' ends of the regulatory DNA fragment defined a region from -3.9 to -3.7 kb that bestowed T3 responsiveness to a linked CAT gene. This fragment confers a 60- to 90-fold response to T3 when linked to TKCAT and is contained within DNase-I-hypersensitive region 4 of the avian malic-enzyme gene. As noted earlier (Section VI,C), the rat malic-enzyme gene also has a hypersensitive site at about the same location with respect to the start site for transcription. It has not been tested for functionality.

Working with both natural and artificial response elements, other investigators have defined two types of T3REs. Both contain two copies of the hexameric sequence, RGGYCA. In one, the 6-bp half-sites are direct repeats separated by 4 bp of any kind. In the other, the half-sites are inverted repeats with no spacing between them (reviewed in Ref. 61). In many natural T3REs, one of the half-sites shows considerable variation from the consensus sequence. Additional flanking sequence also may be important to the functionality of some T3REs (64, 65). The 200-bp T3 response region of the malic-enzyme gene contains several sequences that are identical or very similar to a half-site of a consensus T3RE. None of these half-sites have the inverted repeat structure, but several have upstream or downstream direct repeats that have varying degrees of degeneracy and are separated from the consensus half-site by 4 bp. We have used 5' and 3' deletions and site-specific mutations in the half-sites to define the relative activity of these putative T3REs. One site, centered on

[Figure 14 image not reproduced. Map of the 5'-flanking DNA of the chicken malic-enzyme gene (scale -5 to -1 kb) with the constructs ME-5800, ME-4120/-2700TK, and TK. Data columns: INS; INS+T3; INS+T3+HEX.]

FIG. 14. The DNA element that confers responsiveness of the malic-enzyme gene to T3 and hexanoate is functional with a heterologous minimal promoter. Isolation and incubation of the hepatocytes, transfection and assay of the reporter gene activities, and expression of the results are described in the legend to Fig. 12. ME, Malic enzyme; TK, thymidine kinase.

-3856 bp, is responsible for most of the stimulation caused by T3. Mutations in the 5' or 3' half-sites or deletion of both sites cause at least a 90% decrease in responsiveness to T3. A 26-bp fragment containing a single copy of this T3RE bestows a 30- to 40-fold response when linked to TKCAT. Five additional half-sites appear to be responsible for T3 responsiveness of up to twofold each. Thus, this region appears to be a T3 response unit composed of one major and several minor T3REs. Additional weak T3REs may be present between -3.46 and -2.70 kb. The most active T3RE we found is not the sequence most similar to the consensus T3RE determined by others. Two of the T3REs downstream from the most active site are more closely related to the sequence of a consensus T3RE than the site at -3856 bp. Nevertheless, mutations to 3 bp of each site that should eliminate binding of the T3 receptor cause only twofold changes in responsiveness to T3. All three sites contain two pairs of Gs separated by 10 bp, suggesting that this cannot be the only important determinant for a robust T3 response (66, 67). The optimal consensus T3RE may not be the same for all genes or all cell types. The T3 receptor appears to function most effectively as a heterodimer, and a large number of proteins can dimerize with the T3 receptor (68). The exact sequence of the most effective T3RE for a particular cell type may be related to the levels of expression of proteins that heterodimerize with the T3 receptor. Flanking sequences also may be important (64, 65).
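The two consensus T3RE arrangements described above (direct repeats of RGGYCA separated by 4 bp, and inverted repeats with no spacing) can be searched for with a short script. This is a minimal sketch for exact consensus matches only. The test sequences are invented for illustration and are not taken from the malic-enzyme gene, and degenerate half-sites of the kind discussed in the text would not be found by it.

```python
import re

# IUPAC codes needed for the consensus half-site RGGYCA (R = A/G, Y = C/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def to_regex(motif):
    """Translate an IUPAC motif into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in motif)


HALF = "RGGYCA"
HALF_RC = "TGRCCY"  # reverse complement of RGGYCA

# Lookaheads allow overlapping matches to be reported.
DR4 = re.compile("(?=(" + to_regex(HALF + "NNNN" + HALF) + "))")
IR0 = re.compile("(?=(" + to_regex(HALF + HALF_RC) + "))")


def find_sites(seq):
    """Return (type, 0-based start, matched sequence) for each consensus site."""
    seq = seq.upper()
    hits = []
    for name, pattern in (("DR4", DR4), ("IR0", IR0)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Invented example sequences, one per arrangement.
print(find_sites("TTAGGTCACTGAAGGTCATT"))
print(find_sites("CCAGGTCATGACCTGG"))
```

Because the script demands a perfect match in both half-sites, it would serve only as a first pass; scoring degenerate half-sites would need a position weight matrix or a mismatch-tolerant search.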

C. The Negative cAMP Response Element

The ability of cAMP or agents that elevate intracellular cAMP levels to stimulate transcription has been analyzed in detail for several genes. Cyclic AMP response elements have been identified; they bind CREB (cAMP response-element binding protein) or related proteins. The transcriptional activation activity of these binding proteins is regulated by phosphorylation catalyzed by the catalytic subunit of protein kinase A (reviewed in Ref. 69). A number of mechanisms that modulate the ability of CREB to regulate basal or cAMP-mediated transcription have evolved; these include complex mechanisms that repress cAMP-activated genes (69). Nevertheless, little is known about mechanisms by which cAMP inhibits the expression of genes. Malic enzyme (41) and L-pyruvate kinase (70) are examples of genes negatively regulated by elevated levels of intracellular cAMP. The negative element for the rat L-pyruvate kinase has been localized to a region that contains adjacent elements that bind the transcriptional activating factors, HNF4 and MLTF (71). The mechanism for the negative regulation is not known. The ability of the chimeric malic-enzyme-CAT gene to be inhibited by cAMP was lost when sequences between -3.9 and -2.7 kb were removed from the 5' end of malic-enzyme flanking DNA linked to CAT (Fig. 12).


When the -3.9/-2.7-kb fragment was linked to either homologous (ME-147CAT) (Fig. 13) or heterologous (HSV-TKCAT) minimal, hormone-insensitive promoters, the ability of cAMP to inhibit the T3-induced response was restored. Responsiveness to cAMP is inhibited by overexpression of the peptide inhibitor of protein kinase A and is mimicked by overexpressing the catalytic subunit of protein kinase A. It is thus probable that the action of glucagon is mediated by a phosphorylation/dephosphorylation event. Additional experiments provide evidence that there are two or more negative cAMP elements, one between -3.4 and -2.7 kb and another within the 200-bp T3 response unit. We have searched the nucleotide sequence in this region for sites similar to consensus CREs or HNF4 binding sites, but have found none that are sufficiently similar to warrant consideration.

[Bar graph: CAT activity directed by ME-5800CAT, expressed as a percentage of the insulin + T3 value without cotransfection, for hepatocytes treated with insulin, insulin + T3, insulin + T3 + cAMP, or insulin + T3 + glucagon, without and with cotransfection of T3 receptor DNA.]

FIG. 15. Overexpression of the T3 receptor prevents expression of the inhibitory action of dibutyryl cAMP. Isolation, incubation, and transfection of the hepatocytes were the same as described in the legend to Fig. 12, except that half of the plates were cotransfected with 2.0 µg pRSVc-ErbAα DNA. This plasmid is an expression vector for the α isoform of the chicken T3 receptor. Assay of the reporter gene activities and expression of the results are described in the legend to Fig. 12. The results are the averages of two (glucagon) to five independent experiments.


When an expression vector encoding the T3 receptor is cotransfected with our test constructs, the expected decrease in basal expression and increase in T3-induced expression are observed (72). More importantly, however, the ability of cAMP to inhibit T3-induced expression is diminished or lost (Fig. 15). This result and the localization of one of the negative cAMP elements in the T3 response region raise the possibility that cAMP inhibits T3-induced transcription of the malic-enzyme gene by inhibiting some aspect of the function of the T3 receptor essential for transcriptional activation of linked genes. We constructed a synthetic T3RE that contains five copies of a "perfect" direct repeat of the T3RE with 4-bp spacing. A second synthetic T3RE that contains two copies of the perfect inverted repeat motif was obtained from R. M. Evans (72). Both constructs were linked to TKCAT. CAT activity in chick embryo hepatocytes transfected with either construct responded robustly to T3, but did not respond to dibutyryl cAMP. This result suggests that cAMP does not work by altering the function of the T3 receptor per se. It remains possible that cAMP regulates the activity of a protein that heterodimerizes with the T3 receptor; this assumes that the ability of the cAMP-sensitive heterodimer to bind to a T3RE depends on the sequence of the T3RE. Alternatively, cAMP may cause phosphorylation of a protein that binds to the T3 receptor or one of its heterodimeric partners and "squelches" its transcription-activating activity. A third possibility is that the cAMP-activated protein binds near the T3RE and interferes with binding of the T3 receptor.
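The two synthetic response elements described above differ only in how their half-sites are arranged. As a rough illustration, the following Python sketch scans a sequence for direct repeats with 4-bp spacing and for inverted repeats with no spacing. The half-site AGGTCA is the commonly cited consensus and is an assumption here, because the text does not give the sequences that were used; the spacer, the linker, and all function names are likewise hypothetical.

```python
import re
from typing import List

# Assumed consensus half-site; the chapter does not state the sequence used.
HALF_SITE = "AGGTCA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_direct_repeats(seq: str, half: str = HALF_SITE, spacing: int = 4) -> List[int]:
    """Start positions of: half-site, `spacing` arbitrary bases, half-site."""
    # A lookahead is used so that overlapping matches are not skipped.
    pattern = re.compile(f"(?={half}[ACGT]{{{spacing}}}{half})")
    return [m.start() for m in pattern.finditer(seq)]


def find_inverted_repeats(seq: str, half: str = HALF_SITE, spacing: int = 0) -> List[int]:
    """Start positions of: half-site, spacer, reverse-complemented half-site."""
    pattern = re.compile(f"(?={half}[ACGT]{{{spacing}}}{reverse_complement(half)})")
    return [m.start() for m in pattern.finditer(seq)]


# Hypothetical constructs: five direct repeats and two inverted repeats,
# joined by an arbitrary 6-bp linker so that neighboring copies do not
# themselves form a 4-bp-spaced repeat.
LINKER = "GATATC"
direct_unit = HALF_SITE + "CAGG" + HALF_SITE
inverted_unit = HALF_SITE + reverse_complement(HALF_SITE)
dr4_construct = LINKER.join([direct_unit] * 5)
ir0_construct = LINKER.join([inverted_unit] * 2)

print(find_direct_repeats(dr4_construct))
print(find_inverted_repeats(ir0_construct))
```

A scan of this kind only reports matches to an idealized motif. As the preceding section notes, the most active natural T3RE is not the one closest to the consensus, so a sequence match should not be read as a prediction of activity.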

D. Inhibition by Medium-chain Fatty Acids

The ability of the chimeric malic-enzyme-CAT gene to be inhibited by hexanoate was lost when sequences between -3.9 and -2.7 kb were removed from the 5' end of malic-enzyme flanking DNA linked to CAT (Fig. 12). When the -3.9/-2.7-kb fragment was linked to either homologous (ME-147CAT) or heterologous (HSV-TKCAT) minimal, hormone-insensitive promoters, the ability of hexanoate to inhibit the T3-induced response was restored (Fig. 14). Unlike the result with cAMP, overexpression of the T3 receptor had no effect on the ability of hexanoate to inhibit CAT activity directed from constructs containing the T3 response region described in Section III,B. Furthermore, T3-induced rates of expression of CAT activity directed by synthetic T3 response units containing either five copies of the direct repeat with 4-bp spacing or two copies of the inverted repeat with no spacing also were inhibited by hexanoate, although to a somewhat lesser extent than the natural T3RE. Those deletion and site-specific mutations in the T3 response unit that altered responsiveness to T3 had little or no effect on responsiveness to hexanoate. Our inability to localize the negative element to specific sequences within the T3 response region suggests that hexanoate may inhibit an action of the T3 receptor itself. As noted in Section III,B, constructs that contain less than 3.9 kb of the 5'-flanking DNA of the malic-enzyme gene fail to respond to T3 when transfected into chick embryo hepatocytes. However, when a construct with its 5' end at -412 bp was cotransfected with an expression vector that encoded the T3 receptor, T3 caused a 6- to 20-fold increase in CAT activity. The "cryptic" T3RE responsible for this response is located between -236 and -147 bp. Hexanoate did not inhibit T3-induced expression of CAT when the test construct was 412 bp of 5'-flanking DNA linked to CAT, and it was cotransfected with the expression vector encoding the T3 receptor. Therefore, it is unlikely that hexanoate or the active metabolite derived therefrom interacts directly with the T3 receptor, unless the interaction is dependent on the nucleotide sequence of the T3RE to which the T3 receptor is bound. Different heterodimeric forms might yield such a result.

VIII. Summary

We have provided a historical and personal description of the analysis of physiological and molecular mechanisms by which diet and hormones regulate the activity of hepatic malic enzyme. For the most part, our analyses have been reductionist in approach, striving for increasingly simpler systems in which we can ask more direct questions about the molecular nature of the signaling pathways that regulate the activity of malic enzyme. The reductionist approaches that were so successful at analyzing molecular mechanisms in cells in culture may now provide the means to analyze more definitively questions about the physiological mechanisms involved in nutritional regulation of gene expression. In addition to physiological questions, however, there are still many aspects of the molecular mechanisms that have not been elucidated. Despite considerable effort from many laboratories, the molecular mechanisms by which T3 regulates transcription are not clear. Similarly, the molecular details for the mechanisms by which glucagon, insulin, glucocorticoids, and fatty acids regulate gene expression remain to be determined. The role of fatty acids is particularly interesting because it may provide a model for mechanisms by which genes are regulated by metabolic intermediates; this is a form of transcriptional regulation widely used by prokaryotic organisms and extensively analyzed in prokaryotic systems, but poorly understood in higher eukaryotes. At any specific time, there is, of course, only one rate of transcription for each copy of the malic-enzyme gene in a cell. Our long-term objective is to understand how signals from all of the relevant regulatory pathways are integrated to bring about that rate.


ACKNOWLEDGMENTS

Research from the Goodridge laboratory reported in this essay was supported by grants from the National Institutes of Health (DK21594 and AA08738) and by the Core Facilities of the Diabetes and Endocrinology Research Center in the College of Medicine at the University of Iowa (DK25295).

REFERENCES

1. S. J. Wakil, J. K. Stoops and V. C. Joshi, ARB 52, 537 (1983).
2. F. B. Hillgartner, L. M. Salati and A. G. Goodridge, Physiol. Rev. 75, 47 (1994).
3. A. G. Goodridge and E. G. Ball, Am. J. Physiol. 211, 803 (1966).
4. A. G. Goodridge and E. G. Ball, Am. J. Physiol. 213, 245 (1967).
5. A. G. Goodridge, BJ 108, 655 (1968).
6. A. G. Goodridge, BJ 108, 663 (1968).
7. A. G. Goodridge, BJ 108, 667 (1968).
8. S. J. H. Ashcroft, Ciba Found. Symp. 41, 117 (1976).
9. B. Szepesi, A. K. Kamara and S. D. Clarke, J. Nutr. 119, 161 (1989).
10. G. F. Cahill, M. G. Herrera, A. P. Morgan, J. S. Soeldner, J. Steinke, P. L. Levy, G. A. Reichard, Jr. and D. M. Kipnis, J. Clin. Invest. 45, 1751 (1966).
11. G. Sitbon and P. Mialhe, Horm. Metab. Res. 10, 117 (1978).
12. G. Sitbon and P. Mialhe, Horm. Metab. Res. 11, 123 (1979).
13. A. G. Goodridge, D. W. Back, S. B. Wilson and M. J. Goldman, Ann. N.Y. Acad. Sci. 478, 46 (1986).
14. R. Frenkel, Curr. Top. Cell. Reg. 9, 157 (1975).
15. H. M. Tepperman and J. Tepperman, Am. J. Physiol. 206, 357 (1964).
16. G. I. Portnay, J. T. O'Brian, J. Bush, A. G. Vagenakis, F. Azizi, R. A. Arky, S. H. Ingbar and L. E. Braverman, J. Clin. Endocrinol. Metab. 39, 191 (1974).
17. A. G. Vagenakis, G. I. Portnay, J. T. O'Brian, M. Rudolph, R. A. Arky, S. H. Ingbar and L. E. Braverman, J. Clin. Endocrinol. Metab. 45, 1305 (1977).
18. A. G. Goodridge, JBC 248, 4318 (1973).
19. S. Numa, E. Ringelmann and F. Lynen, Biochem. Z. 343, 243 (1965).
20. A. G. Goodridge, JBC 247, 6946 (1972).
21. C. A. Carlson and K.-H. Kim, ABB 64, 490 (1974).
22. P. A. Kitos, R. Sinclair and C. Waymouth, Exp. Cell Res. 27, 307 (1962).
23. A. G. Goodridge, A. Garay and P. Silpananta, JBC 249, 1469 (1974).
24. A. G. Goodridge, FP 34, 117 (1975).
25. A. G. Goodridge, in "Molecular Basis of Insulin Action" (M. P. Czech, ed.), p. 369. Plenum, New York, 1985.
26. P. Silpananta and A. G. Goodridge, JBC 246, 5754 (1971).
27. A. G. Goodridge and T. G. Adelman, JBC 251, 3027 (1976).
28. L. K. Winberry, S. M. Morris, Jr., J. E. Fisch, M. J. Glynias, R. A. Jenik and A. G. Goodridge, JBC 258, 1337 (1983).
28a. A. G. Goodridge, J. F. Crish, F. B. Hillgartner and S. B. Wilson, J. Nutr. 119, 299 (1989).
29. S. M. Morris, Jr., L. K. Winberry, J. E. Fisch, D. W. Back and A. G. Goodridge, MCBchem 64, 63 (1984).
30. A. Katsurada, N. Iritani, H. Fukuda, T. Noguchi and T. Tanaka, BBRC 112, 176 (1983).
31. J. J. Li, C. R. Ross, H. M. Tepperman and J. Tepperman, JBC 250, 141 (1975).
32. M. A. Magnuson and V. M. Nikodem, JBC 258, 12712 (1983).
33. H. C. Towle, C. N. Mariash and J. H. Oppenheimer, Bchem 19, 579 (1980).
34. C. Molero, M. Benito and M. Lorenzo, J. Cell. Physiol. 155, 197 (1993).
35. H. C. Towle and C. N. Mariash, FP 45, 2406 (1986).
36. C. M. Berlin and R. T. Schimke, Mol. Pharmacol. 1, 149 (1965).
37. M. J. Goldman, D. B. Back and A. G. Goodridge, JBC 260, 4404 (1985).
38. D. W. Back, S. B. Wilson, S. M. Morris, Jr. and A. G. Goodridge, JBC 261, 12555 (1986).
39. G. S. McKnight and R. D. Palmiter, JBC 254, 9050 (1979).
40. X.-J. Ma, L. M. Salati, S. E. Ash, D. A. Mitchell, S. A. Klautky, D. A. Fantozzi and A. G. Goodridge, JBC 265, 18435 (1990).
41. L. M. Salati, X.-J. Ma, C. C. McCormick, S. R. Stapleton and A. G. Goodridge, JBC 266, 4010 (1991).
42. S. R. Stapleton, D. A. Mitchell, L. M. Salati and A. G. Goodridge, JBC 265, 18442 (1990).
43. Y. Nagamine and E. Reich, PNAS 82, 4606 (1985).
44. H. Hidaka, M. Inagaki, S. Kawamoto and Y. Sasaki, Bchem 23, 5036 (1984).
45. J. M. Herbert, E. Seban and J. P. Maffrand, BBRC 171, 189 (1990).
46. J. Swierczynski, D. A. Mitchell, D. S. Reinhold, L. M. Salati, S. R. Stapleton, S. A. Klautky, A. E. Struve and A. G. Goodridge, JBC 266, 17459 (1991).
47. J. R. Grove, D. J. Price, H. M. Goodman and J. Avruch, Science 238, 530 (1987).
48. R. N. Day, J. O. Walder and R. A. Maurer, JBC 264, 431 (1989).
49. C. Roncero and A. G. Goodridge, ABB 295, 258 (1992).
50. A. G. Goodridge, JBC 248, 4318 (1973).
51. W. M. Bortz and F. Lynen, Biochem. Z. 339, 77 (1963).
52. P. K. Tubbs and P. B. Garland, BJ 93, 550 (1964).
53. A. L. Greenbaum, K. A. Gumaa and P. McLean, ABB 143, 617 (1971).
54. C. Roncero and A. G. Goodridge, JBC 267, 14918 (1992).
55. W. M. Wiersinga and M. Platvoet-ter Schiphorst, Int. J. Biochem. 22, 269 (1990).
56. A. Inoue, N. Yamamoto, Y. Morisawa, T. Uchimoto, M. Yukioka and S. Morisawa, EJB 183, 565 (1989).
57. D. S. Gross and W. T. Garrard, ARB 57, 159 (1988).
58. X.-J. Ma and A. G. Goodridge, NARes 20, 4997 (1992).
59. S. J. Usala, W. S. Young III, H. Morioka and V. M. Nikodem, Mol. Endocrinol. 2, 619 (1988).
60. H. Morioka, G. E. Tennyson and V. M. Nikodem, MCBiol 8, 3542 (1988).
61. D. W. Back, M. J. Goldman, J. E. Fisch, R. S. Ochs and A. G. Goodridge, JBC 261, 4190 (1986).
62. S. B. Wilson, D. W. Back, S. M. Morris, Jr., J. Swierczynski and A. G. Goodridge, JBC 261, 15179 (1986).
63. R. A. Baillie, S. A. Klautky and A. G. Goodridge, J. Nutr. Biochem. 4, 431 (1993).
64. R. W. Katz and R. J. Koenig, JBC 268, 19392 (1993).
65. R. W. Katz and R. J. Koenig, JBC 269, 18915 (1994).
66. B. Desvergne, K. J. Petty and V. M. Nikodem, JBC 266, 1008 (1991).
67. A. Farsetti, B. Desvergne, P. Hallenbeck, J. Robbins and V. M. Nikodem, JBC 267, 15784 (1992).
68. M. A. Lazar, Endocrine Rev. 14, 184 (1993).
69. E. Lalli and P. Sassone-Corsi, JBC 269, 17359 (1994).
70. S. Vaulont, A. Munnich, J. Marie, G. Reach, A.-L. Pichard, M.-P. Simon, C. Besmond, P. Barbry and A. Kahn, BBRC 125, 135 (1984).
71. M.-O. Bergot, M.-J. M. Diaz-Guerra, N. Puzenat, M. Raymondjean and A. Kahn, NARes 20, 1871 (1992).
72. K. Damm, C. C. Thompson and R. M. Evans, Nature 339, 593 (1989).

Oxidative Chemical Nucleases

DAVID M. PERRIN, ABHIJIT MAZUMDER AND DAVID S. SIGMAN¹

Department of Biological Chemistry, School of Medicine
Department of Chemistry and Biochemistry
Molecular Biology Institute
University of California, Los Angeles
Los Angeles, California 90024

I. Chemistry of Scission
II. Reactivity of the Untargeted 2:1 1,10-Phenanthroline-cuprous Complex
   A. Specificity for Metal Ion and Phenanthroline Structure
   B. Clinical Applications of the Chemical Nuclease
III. Conformational Sensitivity
IV. (OP)2Cu+ As a Footprinting Reagent
V. Discovery of a New Class of Transcription Inh
VI. Site-specific Targeting of DNA Scission
VII. Nucleic-acid-directed Scission
VIII. Protein-targeting of DNA Scission
IX. Conclusion
References

Nature has been extolled for her wisdom in selecting the phosphodiester bond as the backbone for DNA and RNA (1). Because it is hydrolytically stable and charged, the molecule containing the master code will neither degrade nor diffuse readily from cellular compartments. But while chemical stability is important in living systems, the capacity for change is also essential for evolution. The sensitivity of the four bases and the (deoxy)ribose moiety to oxidation may be partially responsible for the spontaneous mutations essential for improvement and adaptation. For a particular individual, however, these processes have a downside and contribute to the biochemical changes associated with aging and cancer.

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52

Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.


Chemical nucleases are redox-active coordination complexes that cleave the phosphodiester bond by oxidative attack on the deoxyribose moiety. Despite the heuristic example provided by the crystal structures of the active sites of nucleases (2), chemists have not been able to develop simple bioorganic systems that hydrolyze phosphodiester bonds facilely (3, 4). For example, it takes 120 hours to observe a 26% reaction of the "labile" β-γ bond of ATP with the nucleophilic hydroxymethyl group of the zinc complex of 1,10-phenanthroline-2-carbinol via an intracomplex nucleophilic phosphate transfer reaction (5). Nevertheless, many coordination complexes that can cleave DNA and RNA by an oxidative mechanism have now been identified (6). In contrast to the inefficiency of hydrolytic nucleases, the "inert" phosphodiester backbone of DNA and RNA is completely degraded on incubation with a fraction of the concentration of 1,10-phenanthroline-copper in the presence of thiol and oxygen (7, 8). The phosphodiester bond is not directly attacked, but the series of elimination reactions triggered by the oxidative attack results in cleavage of the C-O bond of the phosphodiester backbone. This review focuses on the nucleic acid chemistry of 1,10-phenanthroline-copper. Its ability to degrade DNA was found through a series of serendipitous discoveries made during the course of examining the inhibition of Escherichia coli DNA polymerase by 1,10-phenanthroline (OP) (7, 9, 10). Our studies showed that the inhibition of this enzyme as well as other DNA polymerases is due to the oxidative scission of DNA by the 2:1 1,10-phenanthroline-cuprous complex in the presence of hydrogen peroxide. The inhibition of the polymerase activity could be explained by the generation of 3' phosphorylated termini that are dead-end inhibitors of the enzyme (11) (Fig. 1).

FIG. 1. Structure of 1,10-phenanthroline (OP) and 2,2'-bipyridine (BP). The cuprous complex of 1,10-phenanthroline, but not bipyridine, forms nucleolytically active chelates.


I. Chemistry of Scission

The chemical nuclease activity of 1,10-phenanthroline-copper was novel chemistry and prompted us to study this reaction because we felt that it would provide a new approach to examine DNA and RNA structure/function relationships. Precedent for the oxidative scission of DNA was provided by the mechanism of action of the antitumor drug bleomycin (12) and much earlier studies by Zamenhof and colleagues, who reported that ferrous-EDTA destroys the transforming ability of Haemophilus influenzae DNA (13). Research on the chemical nuclease activity of 1,10-phenanthroline-copper has developed in two directions. One line of investigation has focused on the scission reaction by the tetrahedral 2:1 1,10-phenanthroline-cuprous complex. The second has involved the site-specific cleavage of DNA by the 1:1 1,10-phenanthroline-copper complex linked to a targeting ligand such as a protein or an oligonucleotide in the presence of a reductant (e.g., mercaptopropionic acid or ascorbic acid) and hydrogen peroxide (Fig. 2).

FIG. 2. The 2:1 1,10-phenanthroline-cuprous complex (left) and the 2:1 2,9-dimethyl-1,10-phenanthroline-cuprous complex, (NC)2Cu+ (right).

FIG. 3. Chemical mechanism of DNA scission by (OP)2Cu+. MF, 5-Methylene furanone.

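In both cases the reaction proceeds through a noncovalent intermediate [Eq. (1)]:

$$
\mathrm{CN} + \mathrm{DNA} \;\overset{K_d}{\rightleftharpoons}\; \text{CN-DNA} \;\xrightarrow{\;k_1\;}\; \text{nicked products},
\qquad
K_d = \frac{(\mathrm{CN})(\mathrm{DNA})}{(\text{CN-DNA})}
\tag{1}
$$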

where CN is chemical nuclease. However, in the former case, the specificity of scission is dictated by the binding of the hydrophobic tetrahedral cation (OP)2Cu+, whereas in the latter case, the high-affinity binding of the carrier ligand predominantly dictates the scission specificity. In the untargeted cleavage, the second phenanthroline is presumably essential for generation of a coordination complex with affinity for B-DNA (14). Because the 1:1 phenanthroline complex is competent for cleavage when linked to a high-affinity ligand such as a protein or a hybridizing nucleotide (15, 16), the second phenanthroline is not essential for the chemical reactivity of the copper. It does not play an essential role in modulating the redox potential of the cuprous ion as ligands do in other circumstances (17). The chemical mechanism of the scission of DNA by the untargeted coordination complex has been investigated. Based on the detailed analysis of the reaction products, the C-1 hydrogen has been inferred to be the initial site of oxidative attack (18). The key finding in these studies was the identification of 5-methylenefuranone as the product of deoxyribose oxidation (19). An additionally compelling finding was the trapping of the intermediate indicated in the reaction scheme (18) (Fig. 3).

II. Reactivity of the Untargeted 2:1 1,10-Phenanthroline-cuprous Complex

A. Specificity for Metal Ion and Phenanthroline Structure

The curious feature of the chemical nuclease activity of the untargeted 1,10-phenanthroline-copper is its specificity for both 1,10-phenanthroline and copper. For example, no other ligand system substitutes for 1,10-phenanthroline. The cognate ligands bipyridine and terpyridine are inactive, as are hydroxyquinoline and calcein analogs. Nevertheless, not all phenanthroline derivatives cleaved DNA (9, 14, 20). Functionalization of the 2 or 9 position with any group destroyed the nucleolytic activity (7). Negatively charged substituents such as carboxylates or sulfonates (21) on the phenanthroline also destroyed the cleavage activity (20, 22). Substitution of copper by iron or cobalt yielded no cleavage-active coordination complexes. Even with hydrogen peroxide, these coordination complexes were not as reactive as the cuprous complex. An additional explanation for their lack of reactivity is that they form octahedral coordination complexes, which would not be expected to have the same binding specificity as the tetrahedral complexes. The noncovalent intermediate of Eq. (1) is essential for scission, and copper is the only redox-active metal ion that can form a tetrahedral chelate with 1,10-phenanthroline. The dependence of the cleavage reaction on ligand substituents can be explained in the context of the kinetic scheme. The second-order rate constant for the cleavage reaction is the ratio k1/Kd. Negatively charged phenanthrolines are unreactive because the electrostatic repulsion between these cuprous chelates and the negatively charged phosphodiester backbone destabilizes the noncovalent intermediate and causes an increase of Kd. The dissociation constant will also be increased by substituents that block the binding of the chelate to the minor groove, the site of the redox-sensitive hydrogen. Molecular models indicate that phenyl groups at the 4 and 7 positions and methyl groups at the 3 and 8 positions inhibit the access of the coordination complex to the minor groove (20).
However, substituents at the 2 and 9 positions reduce the magnitude of k1 because a steric clash develops in the square-planar cupric complex with substituents ortho to the chelating nitrogens (Fig. 4).

Although the reaction of (OP)2Cu+ with H2O2 on the surface of DNA is formally equivalent to Fenton chemistry, diffusible hydroxyl radicals are not generated as they are in the reaction of ferrous EDTA with hydrogen peroxide. The failure of small organic quenching reagents such as glycerol, mannitol, and guanidine to block the reaction provides experimental evidence for this conclusion (14). In addition, oxidative products produced by (OP)2Cu+ and H2O2 are formed as much as four orders of magnitude less than hydroxyl radicals generated by pulse radiolysis. The precise chemical structure of the oxidative species responsible for scission is not known, and it has been suggested that it may be formed at concentrations too low to be detected (23).

The chemical nuclease activity is remarkably hardy. Although the reaction is inhibited by chelating agents, the addition of excess copper ion reactivates the reaction. If EDTA is present, magnesium ion can be added to sequester the EDTA as the unreactive magnesium chelate. Catalase inhibits the reaction under all conditions because it destroys the essential coreactant H2O2 (7, 24). As long as a source of hydrogen peroxide is assured, the reaction proceeds readily. The reactivity of the chemical nuclease is not inhibited by protein denaturants (e.g., guanidine or urea), which irreversibly inactivate biological catalysts such as DNase I or micrococcal nuclease.

FIG. 4. Crystal structure of the 2,9-dimethyl-1,10-phenanthroline-cuprous complex bromide (46).
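The statement in this section that the second-order rate constant is the ratio k1/Kd follows from Eq. (1) if the noncovalent complex is taken to be in rapid equilibrium with free nuclease and free DNA and to account for only a small fraction of either. That assumption is ours and is made only to sketch the algebra:

```latex
\begin{align*}
v &= k_1\,(\text{CN-DNA}) \\
  &= k_1\,\frac{(\mathrm{CN})(\mathrm{DNA})}{K_d}
   = \frac{k_1}{K_d}\,(\mathrm{CN})(\mathrm{DNA})
\end{align*}
```

Read this way, a substituent can lower the observed rate by either of two routes: by raising K_d (weaker binding, as argued for negatively charged or groove-blocking groups) or by lowering k_1 (as argued for substituents at the 2 and 9 positions).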

B. Clinical Applications of the Chemical Nuclease

The chemical nuclease activity of (OP)2Cu+ has two interesting medical/clinical applications. The fact that the nuclease is resistant to chaotropic reagents, which normally inactivate biological materials, is the basis of one important diagnostic use of (OP)2Cu+. It has been used to prepare minicircles of Trypanosoma cruzi kinetoplast DNA for analysis by the polymerase chain reaction (25). The diagnostically important minicircles are normally resistant to PCR techniques due to their high degree of concatenation and supercoiling. Because it is necessary to lyse and store whole-blood specimens collected in the field in 6 M guanidine-HCl/0.2 M EDTA, biological nucleases (e.g., DNase I) cannot be used to linearize the minicircles prior to amplification. In contrast, the nuclease activity of (OP)2Cu+ is not inhibited by guanidine and can be used to digest the minicircles. The (OP)2Cu+/PCR method of analysis has proved more sensitive than xenodiagnosis or serology and should be useful in clinical and epidemiological studies of chronic Chagas' disease (Fig. 5).

The second clinically relevant application of the nuclease activity of 1,10-phenanthroline-copper is in the inactivation of RNA and DNA viruses present in blood products isolated from large numbers of donors. Because (OP)2Cu+ preferentially reacts with DNA and RNA relative to protein, viral nucleic acids can be cleaved without destroying the functional and structural integrity of the protein. In effect, the chemical nuclease can be used to "sterilize" proteins without grossly destroying their intrinsic activity.

FIG. 5. Scission of Trypanosoma cruzi kinetoplast minicircles by (OP)2Cu+. Frequency of single-strand nicks in (OP)2Cu+-linearized minicircle DNA. Trypanosoma cruzi kinetoplast DNA (kDNA) networks in a guanidine-EDTA lysate of blood were cleaved with (OP)2Cu+ for increasing periods of time (lanes: 0, 10, 20, 30, 40, 50, and 60 min). After cleavage, the DNA was electrophoresed in a denaturing glyoxal gel (1% agarose), blotted onto a nylon membrane, and probed with [32P]kDNA. At time 0, a few decatenated minicircles undergoing replication can be seen as a 1.4-kb band. After (OP)2Cu+ cleavage, the minicircles are released from the kDNA networks as double-stranded linearized molecules. By denaturing the linearized minicircles, it can be seen that the minicircles are increasingly being nicked as a function of digestion time. After 30 min, 90% of the minicircle fragments are larger than 310 bp. After 60 min, 50% of the fragments are larger than 310 bp and 80% of the fragments are larger than 118 bp.


Membrane-enveloped RNA and DNA viruses such as HIV, influenza, Epstein-Barr, and cytomegalovirus are the most sensitive to (OP)2Cu+. Nonenveloped viruses, such as poliovirus, were not inactivated, presumably because the protein coat served as an effective barrier to the approach of the (OP)2Cu+ to the viral RNA. Hepatitis B virus treated with (OP)2Cu+ lost its infectivity when tested in chimpanzees. All plasma proteins treated with (OP)2Cu+, including Factor VIII preparations, retained their activity. All the reaction conditions essential for the inactivation of the viruses corresponded to those required for the in vitro cleavage of RNA and DNA.

III. Conformational Sensitivity

In its reaction with DNA, (OP)2Cu+ has shown a reaction preference for DNAs of different secondary structure in addition to a sequence-specific variability in its reactivity with B-DNA (11, 14, 26). For example, B-DNA is the preferred substrate whereas A-DNA is substantially less reactive. Z-DNA and single-stranded DNA, blocked from forming metastable loops by glyoxal modification, are unreactive (21). Examination of molecular models indicates that only B-DNA has a minor groove that can serve as a favorable binding site for the coordination complex near the redox-sensitive C-1 hydrogen. The minor groove of A-DNA is flat and shallow and so provides only a poor binding site for (OP)2Cu+. Z-DNA and single-stranded DNAs have no minor groove comparable to that present in B-DNA and are therefore unreactive to the chemical nuclease. These findings are consistent with the postulated reaction paradigm of minor-groove binding and C-1 hydrogen attack proposed in Fig. 3. Because the experiments that explored the reactivity of (OP)2Cu+ with DNAs of different secondary structure were carried out in the absence of carrier DNAs, the lack of cleavage of Z-DNA (even though the C-1 hydrogens are accessible to solvent) indicates that the oxidative species formed by (OP)2Cu+ is not diffusible and must react with DNA at its site of formation. Unless (OP)2Cu+ binds to DNA, no cleavage will be observed.

The reaction of (OP)2Cu+ with B-DNA exhibits sequence-dependent variability (26-31). Although cleavage can occur at all four nucleotides because the chemistry is directed at the deoxyribose, not all nucleotides are equally reactive. Hyper- and hyporeactive sites are observed, as seen in pancreatic DNase I digests. The principal determinant in the cleavage preference of B-DNA is the nucleotide that is 5' to the site of attack.
Because the DNA is double-stranded, the kinetic effect may actually be exerted by the nucleotide that is base-paired to the 5' neighbor. If the cleavage patterns of several DNAs are analyzed statistically, 5' cytosines are most inhibitory, possibly because the hydrogen bond with G in the minor groove may perturb the binding of (OP)2Cu+. NMR has been used to examine the structure of the noncovalent complex of (OP)2Cu+ with DNA under anaerobic conditions. These studies have suggested an intercalative component in the binding of (OP)2Cu+ to the DNA in the minor groove (32). However, it is not possible to determine if the spectroscopically observable complex formed (CN-DNA') is the structure through which the reaction is funneled or in rapid equilibrium with the kinetically competent complex, CN-DNA [Eq. (2)]:

$$
\begin{array}{ccccc}
\mathrm{CN} + \mathrm{DNA} & \overset{K_d}{\rightleftharpoons} & \text{CN-DNA} & \longrightarrow & \text{nicked products} \\
 & & \updownarrow & & \\
 & & \text{CN-DNA}' & &
\end{array}
\tag{2}
$$

The fact that substitution at the 5 position does not greatly alter reactivity argues against the likelihood of an intercalative component involving the central phenanthroline ring, but this remains a viable possibility. Detailed kinetic studies of the chemical nuclease activity have not been pursued in our laboratory because of the extraordinary number of equilibrium and rate constants that must be determined. A partial list includes the stability constants for the formation of the cupric and cuprous chelates, the affinity of each of these complexes for the DNA at a given sequence position, the rates for the oxidation of the complex on the DNA-bound chelate, and the binding affinity of 3-mercaptopropionate or ascorbate for the complex (33). Preformed (OP)2Cu+ could be used, but this would require meticulously anaerobic conditions for the experiments to be reliable.

IV. (OP)2Cu+ as a Footprinting Reagent

The chemical nuclease activity of 1,10-phenanthroline-copper makes the complex useful as a footprinting reagent. In conjunction with dimethyl sulfate and DNase I, (OP)2Cu+ can be used to demonstrate whether a high-affinity DNA ligand binds to the major or minor groove or both. For example, if the major groove is the principal binding site for a ligand, a footprint will be obtained using dimethyl sulfate and DNase I, but not (OP)2Cu+. On the other hand, if a ligand such as netropsin binds in the minor groove (18), the reactivity of dimethyl sulfate will be unaffected, but scission by DNase I and (OP)2Cu+ will be inhibited. Because DNase I binds to the major groove, but hydrolyzes the phosphodiester bond from the minor groove, it is the best footprinting reagent for identification of binding sites (34). However, it cannot pinpoint which groove may be involved. Because (OP)2Cu+ and DNase I


footprints with DNA-binding proteins are similar, the accessibility of the (OP)2Cu+ to the minor groove must be blocked even if the principal sites of interaction are in the major groove (35). Hydroxyl-radical footprinting using ferrous-EDTA has proved valuable because uniform cleavage patterns have been obtained (36). However, despite its widespread use, its chemical mechanism of scission is not known (37). Therefore, the component of the nucleotide structure protected from scission by ferrous-EDTA cannot be specified. (OP)2Cu+ has two unique features as a footprinting reagent. The first is that because it is small it can readily diffuse into acrylamide matrices and

FIG. 6. Scission of E. coli RNA polymerase-lacUV-5 complex by the cuprous complexes of OP, 5-phenyl-OP, and 3,4,7,8-tetramethyl-OP. Lane 1, Maxam-Gilbert G + A sequencing lane; lanes 2-4, scission of the open complex by (5φOP)2Cu+ (lane 2), (4φOP)2Cu+ (lane 3), or (3,4,7,8-Me4OP)2Cu+ (lane 4).


react with DNA-protein complexes after they have been separated by the widely used gel-retardation assay (38). As a result, the sequence-specific contacts of electrophoretically discrete species can be obtained. This application has been reviewed and is not discussed further here (39-42). The second feature is its unanticipated reactivity with kinetically competent open complexes formed between RNA polymerases and promoters. This reaction preference was initially discovered when the binding of E. coli RNA polymerase to the E. coli wild-type, Ps, and lacUV-5 promoters was footprinted simultaneously with DNase I and (OP)2Cu+ (43). These promoters vary in their dependence on the cyclic-AMP receptor protein (CRP) for optimal transcription. The wild type is the most sensitive to this positive activator. The Ps promoter, which differs by a single base change at position -9 within the Pribnow box, shows an intermediate dependence, and lacUV-5 is fully independent of CRP for its transcriptional activity (44). In all cases, the footprinting experiments with DNase I and (OP)2Cu+ faithfully reflect the expected dependency of RNA polymerase on the presence of CRP for high-affinity binding to the promoter. However, the scission pattern in the presence of (OP)2Cu+ contained strong scission sites on the template strand only, at positions -4 to -6, under conditions where maximal transcription is observed. No corresponding scission sites were observed on the nontemplate strand. Subsequent studies that coupled gel retardation and (OP)2Cu+ footprinting indicated (Fig. 6) that these hypersensitive sites were not observed when the polymerase-promoter complexes were footprinted in the presence of high concentrations of EDTA (38). The protection pattern obtained under these conditions was that of the closed complex. Excess magnesium ion is required to observe the intense reactivity on the template strand. These sites of cleavage correspond to sequence positions previously demonstrated to be single-stranded, using base-specific modification reagents (45).
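The groove-assignment reasoning described at the start of this section can be summarized as a small decision function. The sketch below only illustrates that reasoning: the function name and its boolean inputs are hypothetical, not part of any published protocol, and real footprints are rarely this clear-cut.

```python
def infer_groove(dms: bool, dnase: bool, opcu: bool) -> str:
    """Suggest a binding groove from three footprinting outcomes.

    Each argument is True when the ligand protects the DNA from that
    reagent (a footprint is seen) and False when reactivity is unchanged.
    The argument names are illustrative assumptions.
    """
    if dms and dnase and not opcu:
        # Major-groove ligand: dimethyl sulfate and DNase I report it,
        # (OP)2Cu+ does not.
        return "major groove"
    if not dms and dnase and opcu:
        # Minor-groove ligand such as netropsin: dimethyl sulfate
        # reactivity is unaffected.
        return "minor groove"
    if dms and dnase and opcu:
        # Typical of DNA-binding proteins: minor-groove access is blocked
        # even when the main contacts lie in the major groove.
        return "both grooves or major-groove ligand occluding the minor groove"
    if not (dms or dnase or opcu):
        return "no binding detected"
    return "indeterminate"
```

For example, `infer_groove(False, True, True)` returns "minor groove", the pattern described above for netropsin.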

V. Discovery of a New Class of Transcription Inhibitors

Why does the tetrahedral cuprous complex of 1,10-phenanthroline cleave the kinetically competent complex formed with RNA polymerase? One explanation posits that the transcription bubble formed at the active site of RNA polymerase creates a binding site or sites for this chelate. This hypothesis predicts that the cuprous complex of 2,9-dimethyl-1,10-phenanthroline [or neocuproine; (NC)2Cu+] should be a potent inhibitor of transcription because it forms an exchange- and redox-stable tetrahedral coordination complex (46-48). However, the 3:1 chelate composed of 1,10-phenanthroline and ferrous ion, (OP)3Fe2+, should neither inhibit the cleavage reaction observed in the open complex nor block transcription because it forms an octahedral complex (Fig. 7). This is precisely what has been observed (49). (OP)3Fe2+ neither inhibits transcription nor blocks cleavage of the template strand. Moreover, the cuprous complexes of 2,9-dimethyl derivatives of 1,10-phenanthrolines whose redox-active analogs do not generate the hypersensitive sites, such as 4,7-diphenyl-1,10-phenanthroline, neither inhibit transcription nor protect against open-complex-specific scission. What is the generality of open-complex-specific scission by the cuprous complexes of 1,10-phenanthroline and its derivatives? The open complexes of other prokaryotic transcription units, such as gal and merR, are cleaved (50, 51). The adenovirus late promoter also shows these hypersensitive sites

FIG. 7. Competitive binding of (OP)2Cu+ and (NC)2Cu+ to the E. coli RNA polymerase-lacUV-5 open complex.


(52). However, not all kinetically competent open complexes are hyperreactive to (OP)2Cu+. One interpretation of these findings is that the same binding site is created for the tetrahedral complex but the coordination complex is not oriented for oxidative attack at the C-1 H of the deoxyribose. Alternatively, the steady-state level of the open complex is low relative to the closed complex and the redox chelate does not bind with sufficient stability to cause the polymerase-promoter complexes to accumulate. In the case of eukaryotic transcription units, only a fraction of the input DNA will form catalytically competent open complexes. As a result, it would be difficult experimentally to detect hypersensitive bands in the background scission of the added DNA. The cleavage of the single-stranded DNA formed on the enzyme surface would have to be at least 100-fold more intense than that of the free DNA, and the steady-state level of the open complex would have to be high relative to the closed complex. One approach to determine whether all transcription units form a binding site for a tetrahedral coordination complex is to establish whether (NC)2Cu+ is a general inhibitor of transcription. The results are unambiguous. These complexes inhibit transcription by all RNA polymerases examined (53), including the bacteriophage T7 and SP6 RNA polymerases, the RNA polymerase II of HeLa extracts, and the E. coli RNA polymerase. Inhibition is observed for all promoters transcribed by these enzymes. Inhibition by (NC)2Cu+ is a more general phenomenon than open-complex scission by (OP)2Cu+. In the case of RNA polymerase II of HeLa cell extracts, transcription inhibition is observed with TATA-containing promoters as well as promoters that contain only the initiator element (54). Synthetic promoters activated by chimeric transcription factors are inhibited, as are biologically functional promoters with multiple binding sites for diverse transcription factors (55). The experiments presently available suggest that the Hill coefficient for the inhibition of in vitro RNA synthesis is roughly 2. The I50 for inhibition by (NC)2Cu+ is about 40 µM. The effects of substitution of the phenanthroline nucleus are currently being explored. The cuprous complexes of the 5-phenyl and 4-phenyl derivatives have I50 values of approximately 4 µM. Generally, the I50 values of the cuprous complexes of various neocuproine derivatives parallel the efficiency of their redox-active analogs in cleaving the lacUV-5 open complex. The sole exception discovered to date is the chelate of 2,3,4,7,8,9-hexamethyl neocuproine, which has an I50 of about 80 µM but whose redox-active analog [3,4,7,8-tetramethyl (OP)2Cu+] is an exceptionally strong cleaver of the open complex. In all cases, the redox-inactive analogs protect the single-stranded DNA of the open complex from cleavage by their redox-active isosteres. This suggests


FIG. 8. Protection of (5φOP)2Cu+ lacUV-5 open-complex scission by (5φNC)2Cu+. In all cases, the template strand of lacUV-5 is 5'-phosphorylated. Lane 1, Maxam-Gilbert G + A; lane 2, DNase I digestion; lane 3, DNase footprint of open complex; lane 4, (5φOP)2Cu+ in the absence of RNA polymerase; lanes 5-10, (5φOP)2Cu+ in the presence of the indicated concentrations of (5φNC)2Cu+.

that all the tetrahedral analogs bind at overlapping sites. Because the redox-active analogs cleave the DNA, this site must be at or within the transcription bubble (Fig. 8). The (NC)2Cu+ analogs are unique inhibitors of prokaryotic transcription. In comparison to other DNA ligands (such as netropsin, which binds to the minor groove; 9-aminoacridine, which is a classical intercalating agent; and daunomycin, which binds by intercalation and to the minor groove), (NC)2Cu+ does not have high affinity for DNA. It binds only to a site created on open-complex formation. When gel-retardation assays of the E. coli RNA polymerase-lacUV-5 promoter complex are carried out, (NC)2Cu+ stabilizes the binding of the enzyme to the DNA. In contrast, the other DNA ligands promote the dissociation of the enzyme from the DNA. The only other ligand that


forms a stable ternary complex as measured by gel retardation is rifampicin, but this potent inhibitor of transcription binds to RNA polymerase exclusively. However, it does induce footprintable changes in the open complex (56) (Fig. 9). The development of inhibitors targeted to the single-stranded DNA of the open complex promises to provide an interesting direction for the development of gene-specific inhibitors. The reactivity of (OP)2Cu+ for transcription start sites and the inhibition of transcription by (NC)2Cu+ reinforce insights into the mode of action of the chemical nuclease. The demonstration that the redox-inactive cuprous complexes bind to the transcription start site supports the conclusion that binding dictates scission specificity. To a first approximation, the observed reaction preferences primarily reflect the binding specificity of the tetrahedral hydrophobic cuprous complex. However, orientation must also play a role in the scission because (1) the different derivatives of (OP)2Cu+ do not cleave the lacUV-5-E. coli RNA polymerase open complex identically, and (2) (NC)2Cu+ inhibits transcription units, including those that do not generate hypersensitive sites. Additional support for the importance of the orientation effect is that cleavage is observed only on the template strand. If the reaction were mediated by a diffusible species, cleavage would be expected on both strands.
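The inhibition figures quoted earlier in this section (a Hill coefficient of roughly 2 and an I50 of about 40 µM for (NC)2Cu+) can be turned into a rough dose-response curve. The sketch below assumes a simple Hill-type inhibition model, which the text does not state explicitly; the function name and default parameters are illustrative only.

```python
def remaining_activity(inhibitor_uM: float,
                       i50_uM: float = 40.0,
                       hill: float = 2.0) -> float:
    """Fraction of uninhibited transcription left at a given inhibitor level.

    Assumes a simple Hill model: activity = 1 / (1 + ([I] / I50) ** n).
    The defaults are the approximate values quoted in the text for
    (NC)2Cu+; both the model and the defaults are assumptions.
    """
    if inhibitor_uM < 0 or i50_uM <= 0:
        raise ValueError("concentrations must be non-negative, I50 positive")
    return 1.0 / (1.0 + (inhibitor_uM / i50_uM) ** hill)
```

Under this model, activity falls to one half at the I50 by construction, and a Hill coefficient of 2 makes the transition steeper than for simple one-site inhibition. Passing `i50_uM=4.0` gives the corresponding curve for the phenyl derivatives.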

FIG. 9. Stability of the (NC)2Cu+ open complex determined by gel retardation. Gel retardation of open and initiation complexes in the presence of (Me2OP)2Cu+. Lane 1, 5'-32P-labeled lacUV-5 DNA; lane 2, 5'-32P-labeled lacUV-5 incubated with RNA polymerase; lanes 3-5, open complex incubated with 22 µM (lane 3), 67 µM (lane 4), or 200 µM (lane 5) (Me2OP)2Cu+; lane 6, open complex incubated with ApA and UTP; lanes 7-9, open complex incubated with ApA, UTP, and 22 µM (lane 7), 67 µM (lane 8), or 200 µM (lane 9) (Me2OP)2Cu+.


VI. Site-specific Targeting of DNA Scission

Our research has focused on two approaches for targeting the chemical nuclease activity of 1,10-phenanthroline-copper. In the first, oligonucleotides have been derivatized with 1,10-phenanthroline at the 5' end, either by coupling the terminal phosphate with 5-glycylamido-1,10-phenanthroline using a carbodiimide coupling method (57), or by using 5-iodoacetamido-1,10-phenanthroline to alkylate a thiophosphate attached to the 5' end with polynucleotide kinase (58). Interior labeling of nucleic acids has been achieved by synthesizing oligonucleotides either chemically or enzymatically with abiological nucleotides containing pendant amino groups. They are then derivatized with N-succinimidyl 3-(2-pyridyldithio)propionate as a cross-linking reagent, which can be alkylated with 5-iodoacetamido-1,10-phenanthroline following reduction with dithiothreitol (59). The second approach for targeting scission has been to convert DNA-binding proteins into site-specific scission reagents (60, 61). The first step in this protocol is the identification of a residue in the amino-acid sequence adjacent to the minor groove and its conversion into a cysteine residue using site-specific mutagenesis. The second step is to derivatize the cysteine with either 5-iodoacetamido-1,10-phenanthroline or 5-iodoacetamido-β-alanyl-1,10-phenanthroline.

VII. Nucleic-acid-directed Scission

The first example of the targeting of the chemical nuclease activity of 1,10-phenanthroline was accomplished by linking 5-glycylamido-1,10-phenanthroline to the 5' end of a deoxynucleotide complementary to sequence positions +1 to +20 of the nontemplate strand of the E. coli lac operon in single-stranded M13 (15). Multiple sites (six) of scission were observed at the expected sequence positions. These six cleavage points could be due to the flexibility of the linker arm or the diffusibility of the oxidative species. Because cleavage was observed at all four nucleotides (A, C, T, and G), the cleavage chemistry must have been deoxyribose-oriented. However, in targeted scission, the titer of sites is necessarily low, and it is difficult to isolate the oxidized deoxyribose product. Nevertheless, these experiments strongly suggest that the 1:1 phenanthroline-copper complex is cleavage competent. The chemical mechanism of the cleavage targeted by nucleic acids and proteins is under investigation. Presently, all of our experiments suggest that the mechanism established in the untargeted case is applicable to the targeted scission as well.


In this review, we summarize experiments in which oligonucleotides derivatized at the 5' end have been used to establish that oligonucleotides complementary to the template strand of open complexes can hybridize with high stringency (62). The purpose of these experiments was to establish the feasibility of using complementary oligonucleotides as gene-specific transcription inhibitors. Other experiments using nucleic acids to target scission focused on developing a method for the double-stranded scission of DNA that could be used to cleave any preselected target sequence (59). This method would find use in genome mapping.

The accessibility of the template strand in the open complex composed of E. coli RNA polymerase and the lacUV-5 promoter to (OP)2Cu+ and (NC)2Cu+ indicated that ligands in solution can bind to this steady-state intermediate. We reasoned that the template strand should also be accessible to a complementary oligonucleotide. Such hybridization suggests a new direction for the synthesis of gene-specific transcription inhibitors in which recognition depends on the catalytic activity of RNA polymerase and the sequence of the hybridizing oligonucleotide. Because the inhibition of transcription by (NC)2Cu+ is general, this approach would be applicable to eukaryotic and prokaryotic transcription units.

FIG. 10. Binding of UGGAA measured by protection of (5φOP)2Cu+ scission. 5-Phenyl-1,10-phenanthroline-copper footprinting of UGGAA associated with the open complex. Lane 1, Maxam-Gilbert G + A sequencing ladder; lanes 2 and 3, DNase footprint of promoter and open complex, respectively; lanes 4 and 5, (5φOP)2Cu+ footprint of promoter and open complex, respectively; lanes 6-8, (5φOP)2Cu+ footprint of UGGAA association with open complex with increasing concentrations of UGGAA (0.42, 2.1, and 5.9 µM, respectively); (5φOP)2Cu+ is 10 µM in all cases.

FIG. 11. lacUV-5-specific scission by OP-UGGAA. UGGAA-directed oxidative scission of the open complex. (A) Structure of targeted scission reagent, OP-NAc-SP-UGGAA. (B) Experimental design: schematic of oligo-directed scission. (C) Results: autoradiograph. Lane 1, 186-bp lacUV-5 promoter fragment labeled on template strand; lane 2, Maxam-Gilbert G + A sequencing ladder; lanes 3 and 4, DNase footprint of promoter and open complex, respectively; lanes 5 and 6, (5φOP)2Cu+ (10 µM) footprint of promoter and open complex, respectively; lane 7, negative copper control, 0.75 µM CuSO4; lane 8, negative OP-Cu control, 0.75 µM CuSO4, 1 µM OP; lane 9, 0.75 µM CuSO4, 1 µM OP-NAc-SP-UGGAA. Scission in lanes 7-9 was initiated with 2.5 mM sodium ascorbate, continued for 25 minutes at 37°C, and then quenched by 3 µl of 40 mM 2,9-dimethyl-1,10-phenanthroline. The scission sites in lanes 6 and 9 are at positions -6 to -4 relative to the start of transcription.

A variety of experimental approaches were used to establish the validity of this concept. The simplest method for establishing the binding of a short oligoribonucleotide to the open complex exploits the perturbation of the characteristic scission pattern observed with the cuprous complex of 5-phenyl-1,10-phenanthroline, (5φOP)2Cu+. The concentration dependence of the protection of the scission by UGGAA indicates that the pentamer binds with a Kd of approximately 1 µM (Fig. 10). Binding can also be demonstrated by a gel-retardation assay in which the pentamer is added to the open complex and the mixture is then subjected to electrophoresis. If UGGAA is the only component of this ternary system that is labeled with 32P, the migration of the radioactivity is demonstrated to depend on the concentration of UGGAA. Both these assays established high-affinity binding of UGGAA and also indicated that the corresponding oligodeoxyribonucleotide did not interact with comparable affinity. Binding affinity to the open complex is clearly not completely governed by Watson-Crick base pairing. The active site of RNA polymerase may preferentially stabilize the A-helix of RNA-DNA heteroduplexes within the open complex. However, neither experiment establishes unequivocally whether UGGAA binds in a parallel or an antiparallel orientation with respect to the template strand. This important question was resolved by preparing the OP-linked derivative of UGGAA (Fig. 11). When its scission reaction is activated by the addition of cupric ion and ascorbate, cleavage upstream of the start of transcription is observed, as anticipated for antiparallel orientation. The concentration range in which the pentamer exerts all the effects described is the same as that in which it inhibits nascent transcript synthesis using γ-labeled ATP (62).
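To give a feel for what a Kd of approximately 1 µM implies for the UGGAA titration in Fig. 10, the sketch below computes the expected occupancy of the open complex at the three concentrations listed in the caption. It assumes simple one-site binding with the pentamer in large excess over the open complex, which the text does not state; the function and variable names are illustrative.

```python
def fraction_bound(ligand_uM: float, kd_uM: float = 1.0) -> float:
    """One-site occupancy, theta = [L] / (Kd + [L]).

    Assumes free ligand is approximately equal to total ligand, that is,
    UGGAA is in excess over the open complex (an assumption, not a
    statement from the source).
    """
    if ligand_uM < 0 or kd_uM <= 0:
        raise ValueError("ligand must be non-negative and Kd positive")
    return ligand_uM / (kd_uM + ligand_uM)


# UGGAA concentrations from the Fig. 10 caption (lanes 6-8), in micromolar.
FIG10_UGGAA_UM = (0.42, 2.1, 5.9)
occupancies = [fraction_bound(c) for c in FIG10_UGGAA_UM]
```

With these assumptions the three lanes span roughly 30% to 85% occupancy, which is the range over which progressive protection from (5φOP)2Cu+ scission would be easiest to see.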

VIII. Protein-targeting of DNA Scission

There are several goals in developing methods for converting DNA-binding proteins into site-specific nucleases (60, 61). The first is the generation of a new family of restriction enzymes that would be useful for digestion of genomic DNA into larger fragments than is currently possible with restriction enzymes. DNA-binding proteins typically recognize sequences of 16 bp, roughly twice the length of the most specific restriction enzyme. The second goal is the development of a chemical method for the evaluation of structural inferences based on models derived from X-ray crystallography and NMR. Finally, the transformation of a DNA-binding protein into a scission reagent would provide a method for identifying its recognition site within a genome. For example, if genomic DNA were used as a substrate, the site of scission could in principle be identified by ligation-mediated PCR (63). The most efficient scission reagent that we have generated is derived from the E. coli Trp repressor. This protein, responsible for the tryptophan-dependent regulation of the trpEDCBA, aroH, and trpR operons (64, 65), is composed of two 108-amino-acid subunits and requires tryptophan for high-affinity binding. The X-ray structures of both the holoprotein and a cocrystal with trpEDCBA have been solved (66, 67). The cocrystal structure has provoked controversy because it proposes the indirect readout of sequence information and a stereochemistry of binding on the trpEDCBA that others claim to be incorrect (68). Examination of molecular models suggests that the amino acid at position 49, which is a glutamate residue in the wild-type protein, is a promising residue to convert into a cysteine residue for modification by 5-iodoacetamido-1,10-phenanthroline. The amino acid at position 49 does not contribute directly to protein binding, yet both subunits place the 1,10-phenanthroline moieties at the center of the dyad axis at an optimal distance for reaction with the C-1 hydrogen if the cysteine, modified with 1,10-phenanthroline, is directed into the minor groove. The modified Trp repressor E49C-OP binds to TrpR-regulated operators with a Kd of approximately 7 × 10⁻⁹ M under conditions wherein nonspecific binding is observed at 5 × 10⁻⁷ M. Both single- and double-stranded cleavage of the recognition sequence approach 90%. The scission reaction is so efficient that the acrylamide gel slice containing the TrpR-aroH complex, isolated using conditions typical for a mobility-shift assay, shows strong sites of scission in the presence of copper without the addition of any reducing agent (69). At first, we wondered if hydrolytic scission of the phosphodiester bond was being observed. However, further work indicated that the cleavage is inhibited by catalase. The scission was therefore activated by reducing equivalents within the gel matrix that generated trace amounts of the coreactant hydrogen peroxide. This experiment indicates that a redox-active metal ion held in close proximity to the phosphodiester backbone can cause the nicking of DNA by an oxidative mechanism, and reinforces the idea that hydrolytic stability does not make DNA impregnable to strand scission (Fig. 12). Although copper is the most efficient metal ion for cleaving the DNA

FIG. 12. Activation of scission by reducing equivalents in the acrylamide matrix. Scission of 3'-labeled aroH within the gel slice without added thiol and hydrogen peroxide. Following isolation of the Trp repressor E49C-OP-aroH complex by a gel-retardation assay, the slice of the acrylamide matrix was incubated for 16 hours. Single-stranded scission was assayed using a denaturing sequencing gel. Lanes 1 and 6, scission at 25°C after 5 hours (lane 1) or 16 hours (lane 6); lanes 2 and 7, scission at 37°C after 5 hours (lane 2) or 16 hours (lane 7); lanes 3 and 8, scission at 50°C after 5 hours (lane 3) or 16 hours (lane 8); lanes 4 and 9, scission at 61°C after 5 hours (lane 4) or 16 hours (lane 9); lane 5, Maxam-Gilbert G + A chemical sequencing lane.


with TrpR E49C-OP, iron also promotes scission at approximately 1/4 to 1/3 the rate of copper ion (69) (Fig. 13). In either case, both ascorbate and 3-mercaptopropionate can activate the scission. Other redox-active cations, such as Pb(II), Hg(II), and Mn(II), exhibited marginal reactivity, whereas Co(II) and Ni(II) gave no detectable cleavage. The possible interference of trace levels of copper and iron in these results cannot be excluded, even though neocuproine and deferoxamine were added to sequester them (69).
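Two of the numbers quoted earlier in this section lend themselves to a back-of-the-envelope check: the recognition-site length (16 bp for a DNA-binding protein versus about 8 bp for the most specific restriction enzymes) and the specific and nonspecific binding constants of E49C-OP. The sketch below assumes random DNA with equal base frequencies, counts one strand only, and ignores palindromic symmetry; the function names are illustrative.

```python
def expected_site_spacing_bp(site_length_bp: int) -> int:
    """Average distance between occurrences of one fixed site.

    Assumes random sequence with equal base frequencies and counts one
    strand only; real genomes deviate from this idealization.
    """
    if site_length_bp < 0:
        raise ValueError("site length must be non-negative")
    return 4 ** site_length_bp


def fold_specificity(kd_specific_M: float, kd_nonspecific_M: float) -> float:
    """Ratio of nonspecific to specific dissociation constants."""
    if kd_specific_M <= 0 or kd_nonspecific_M <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_nonspecific_M / kd_specific_M


eight_bp_spacing = expected_site_spacing_bp(8)     # 65,536 bp
sixteen_bp_spacing = expected_site_spacing_bp(16)  # about 4.3 x 10^9 bp
e49c_op_specificity = fold_specificity(7e-9, 5e-7) # roughly 70-fold
```

On these assumptions, doubling the site length from 8 to 16 bp increases the expected spacing by a further factor of 65,536, which is why such reagents are attractive as rare cutters.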

FIG. 13. Scission of aroH by TrpR E49C-OP-Fe2+. Scission of 3'-labeled aroH by Trp repressor E49C-OP activated by Fe2+, isolated by gel retardation, in the presence of 3 mM 3-mercaptopropionic acid and 3 mM H2O2 for 3.5 hours. Analysis of single-stranded nicks: lane 1, probe alone; lane 2, scission by E49C-OP complexed with Fe2+; lane 3, Maxam-Gilbert G + A chemical sequencing lane.


The lack of reactivity of Co(II) and Ni(II) may be due to their 10² to 10³ higher avidity for the protein-bound OP than those of copper and iron. The other metals do not bind OP as copper and iron do, and may therefore be unable to inhibit the coordination of these redox-competent metals. The scission pattern of TrpR E49C-OP has provided unambiguous evidence in support of the positioning of TrpR on the trpEDCBA operator proposed in the cocrystal structure (60, 70). This stereochemistry is not an artifact of crystallization, as critics of the X-ray structure have claimed (68). The dyad axis contacts the recognition sequence precisely as postulated. The scission pattern also reveals that there are two additional binding sites on this operator. Their existence had been anticipated from the work of Gunsalus and colleagues (71), who proposed tandem binding sites for the protein on the aroH and trpEDCBA operators, using methylation interference and DNase I footprinting. Our results, as well as X-ray studies (72), have further confirmed this hypothesis. The DNA-binding specificity of the Trp repressor has been the subject of a series of genetic studies by Youderian et al. (73), who devised a "phage challenge assay" to identify TrpR mutants with new or extended binding specificities (74). Position 79 has been a particular focus of interest. Mutation of the wild-type isoleucine to a lysine residue changes the preferred recognition sequence from

5'-TCATCGAACTAGTTAACTAGTACGCAAG-3' to

5'-TCATCGACCTAGTTAACTAGGACGCAAG-3'.

Genetic studies show that the wild-type protein has weak affinity for the recognition sequence of TrpR I79K, and the mutant protein has weak affinity for the recognition sequence of the wild-type TrpR. We wondered if the cleavage patterns of the chimeras prepared from TrpR E49C and TrpR E49C I79K reflect this genetically determined specificity (75). The gel-retardation assays and the scission pattern are fully consistent with these conclusions and provide additional evidence for the tandem binding model (Fig. 14). The gel-retardation studies indicate the anticipated complementary affinities of the proteins with their respective operators. The cleavage confirms the altered binding specificity of the mutant protein because the second site in the mutant protein does not have an A·T to C·G change, which increases the affinity of the central site. The decreased efficiency of scission in the second pocket in the 2:1 complex could be due to a new interaction of the mutant repressor with the mutant operator at or near base-pair 7 of the prime dimer-binding site that inhibits cleavage at the second binding site. Other possible causes include a poorer or less intimate interaction between the mutant repressor and this second site, or a combination of these factors. In any case, these experiments demonstrate that once a protein can be transformed into a highly efficient nuclease, a family of efficient cleavers with unique cleavage sites can be generated by altering the DNA-binding specificity.

FIG. 14. Redesigning the scission specificity of TrpR-OP chimeras (75). (A) Ribbon diagram of the TrpR dimer structure, showing the Ile-79 and Ala-80 side-chains and the 1,10-phenanthroline-copper adduct with Cys-49 (OP-Cu) aligned with the consensus operator (courtesy of Ralf Landgraf). The consensus, minimal trp operator sequence carried by a challenge phage (center) corresponds to the central of three tandem binding sites for Trp holorepressor in the natural trp operator with highest affinity (bottom). The symmetric "7C" change inactivates the minimal, consensus trp operator, as well as the central of three sites in the natural trp operator. (B) Gel retardation of E49C-OP (wt) and I79K E49C-OP mutant Trp repressors bound to the natural (wt) and mutant ("7C") trp operators (75). All lanes contain a restriction fragment with the 5'-labeled trpEDCBA operator. Lanes a and b, labeled wild-type operator alone; lanes c and d, complexes of 0.2 and 0.6 nM wild-type TrpR E49C-OP protein with wild-type operator, respectively; lanes e and f, complexes of 0.2 and 0.6 nM mutant I79K protein with wild-type operator, respectively; lanes g and h, labeled mutant ("7C") operator alone; lanes i and j, complexes of 0.2 and 0.6 nM wild-type protein with mutant operator; lanes k and l, complexes of 0.2 and 0.6 nM mutant protein with mutant operator. In both cases (wild type + wild type and mutant + mutant), two protein-DNA complexes are formed. The higher (more slowly migrating) complex has 2:1 repressor dimer/operator fragment stoichiometry, whereas the complex with intermediate mobility has 1:1 stoichiometry. (C) Scission patterns of 1,10-phenanthroline-copper endonucleases within the wild-type and mutant trp operators. All scission chemistry was carried out within the gel matrix. Lane a, Maxam-Gilbert G + A chemical sequencing lane of the wild-type operator; lane b, wild-type operator; lane c, wild-type operator + wild-type endonuclease, 2:1 complex; lane d, wild-type operator + wild-type endonuclease, 2:1 complex; lane e, Maxam-Gilbert G + A chemical sequencing lane of the mutant operator; lane f, mutant operator; lane g, mutant operator + mutant endonuclease, 1:1 complex; lane h, mutant operator + mutant endonuclease, 2:1 complex. Note that the efficiency of cleavage is high for all four complexes; uncleaved operator fragment (top of gel) accounts for less than 20% of the total label in lanes c, d, g, and h.

The strategy for transforming TrpR into a site-specific nuclease involves attachment of the 1,10-phenanthroline at rigidly fixed sites adjacent to the minor groove near the dyad axis of the protein's binding site. Ebright et al. pioneered an alternate approach that attaches the 1,10-phenanthroline to an amino acid that is close to the DNA in a specific high-affinity complex, but remote from the DNA in a nonspecific complex (76). The protein used in testing this paradigm was the E. coli catabolite activator protein (CAP), which binds DNA at its correct recognition sequence. They could cleave kilobase and megabase DNA substrates efficiently without any detectable background. The absence of nonspecific scission is further evidence that the oxidative species produced by 1,10-phenanthroline-copper is not diffusible. Our group has shown that the E. coli factor for inversion stimulation (Fis) also can be converted into a cutter by using a similar strategy, although the efficiency is not as impressive as that obtained with CAP (77). However, the cutting yield was sufficient to discover new binding sites for this accessory protein.
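The two 28-nucleotide recognition sequences quoted earlier for the wild-type and I79K repressors can be compared position by position to locate the changes. In the sketch below the two central thymines of each sequence are a reconstruction of characters that are unclear in the printed text, so the strings should be treated as an assumption; the helper name is illustrative.

```python
# Operator sequences as quoted in the text (5' to 3'). The "TT" at
# positions 13-14 is reconstructed from a garbled character pair.
WILD_TYPE = "TCATCGAACTAGTTAACTAGTACGCAAG"
I79K_SITE = "TCATCGACCTAGTTAACTAGGACGCAAG"


def substitutions(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


changes = substitutions(WILD_TYPE, I79K_SITE)
```

Under this reconstruction the sequences differ at two positions, an A to C change at position 8 and a T to G change at position 21. The two positions are mirror images about the center of the 28-mer and the substituted bases are complementary, as expected for a symmetric A·T to C·G change in a palindromic operator.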

IX. Conclusion

The chemical nuclease activity of 1,10-phenanthroline-copper has provided interesting insights into nucleic acid structure/function relationships as an untargeted 2:1 1,10-phenanthroline-cuprous complex or when linked to a high-affinity specific carrier. Untargeted, the specificity has been dictated by the binding of the tetrahedral hydrophobic cation. This apparently featureless cation has exhibited a remarkable set of preferential binding sites. First, it is a minor groove ligand. Second, and more surprisingly, this cation has an affinity for the single-stranded structure that forms at transcription start sites. This interaction appears to be unprecedented in nucleic acid chemistry. The structural determinants that govern the affinity of the cation to DNA bound to the active sites of RNA polymerase are unknown. The specificity that the coordination complex has demonstrated for single-stranded RNAs is possibly related to this binding, observed at kinetically competent transcription start sites.

The efficiency of the scission of the 1:1 complex when linked to a high-affinity DNA ligand promises to lead to a new family of rare cutters, which should facilitate the manipulation of genomic DNA. Possibly, the conversion of physiologically important proteins, such as those containing homeodomains, might facilitate the identification of the binding sites of these transcription factors, which are essential for development in multicellular organisms.

ACKNOWLEDGMENTS

Research from our laboratory has been supported by NIH GM21199 and the Office of Naval Research.

REFERENCES

1. F. H. Westheimer, Science 235, 1173 (1987).
2. T. A. Steitz, Q. Rev. Biophys. 23, 208 (1990).
3. L. S. Beese and T. A. Steitz, EMBO J. 10, 25 (1991).
4. J. Chin, Accts. Chem. Res. 24, 145 (1991).
5. D. S. Sigman, G. M. Wahl and D. J. Creighton, Bchem 11, 2236 (1972).
6. D. S. Sigman, A. Mazumder and D. M. Perrin, Chem. Rev. 93, 2295 (1993).
7. D. S. Sigman, D. R. Graham, V. D’Aurora and A. M. Stern, JBC 254, 12269 (1979).
8. D. S. Sigman, Accts. Chem. Res. 19, 180 (1986).
9. V. D’Aurora, A. M. Stern and D. S. Sigman, BBRC 78, 170 (1977).
10. V. D’Aurora, A. M. Stern and D. S. Sigman, BBRC 80, 1025 (1978).
11. L. M. Pope, K. A. Reich, D. R. Graham and D. S. Sigman, JBC 257, 12121 (1982).
12. E. A. Sausville and S. B. Horwitz, in “Effects of Drugs on the Cell Nucleus” (H. Busch, S. T. Crooke and Y. Daskal, eds.), p. 181. Academic Press, New York, 1979.
13. S. Zamenhof, H. E. Alexander and G. Leidy, J. Exp. Med. 98, 373 (1953).
14. L. E. Marshall, D. R. Graham, K. A. Reich and D. S. Sigman, Bchem 20, 244 (1981).
15. C.-h. B. Chen and D. S. Sigman, PNAS 83, 7147 (1986).
16. C.-h. B. Chen and D. S. Sigman, Science 237, 1197 (1987).
17. S. J. Lippard and J. M. Berg, “Principles of Bioinorganic Chemistry.” University Science Books, Mill Valley, CA, 1994.
18. M. Kuwabara, C. Yoon, T. E. Goyne, T. Thederahn and D. S. Sigman, Bchem 25, 7401 (1986).
19. T. E. Goyne and D. S. Sigman, JACS 109, 2846 (1987).
20. T. B. Thederahn, M. D. Kuwabara, T. A. Larsen and D. S. Sigman, JACS 111, 4941 (1989).
21. D. S. Sigman, A. Spassky, S. Rimsky and H. Buc, Biopolymers 24, 183 (1985).
22. D. S. Sigman, Mol. Struct. Bioenergetics 10, 281 (1989).
23. G. R. A. Johnson and N. B. Nazhat, JACS 109, 1990 (1987).
24. D. R. Graham, L. E. Marshall, K. A. Reich and D. S. Sigman, JACS 102, 5419 (1980).
25. H. A. Avila, D. S. Sigman, L. M. Cohen, R. C. Millikan and L. Simpson, Mol. Biochem. Parasitol. 48, 211 (1991).
26. C. Yoon, M. D. Kuwabara, A. Spassky and D. S. Sigman, Bchem 29, 2116 (1990).
27. H. R. Drew and A. A. Travers, Cell 37, 491 (1984).
28. H. R. Drew, JMB 176, 535 (1984).
29. C. Yoon, M. D. Kuwabara, R. Law, R. Wall and D. S. Sigman, JBC 263, 8458 (1988).
30. J. M. Veal and R. L. Rill, Bchem 27, 1822 (1988).
31. J. M. Veal and R. L. Rill, Bchem 28, 3243 (1989).
32. J. M. Veal and R. L. Rill, Bchem 30, 1132 (1991).
33. J. M. Veal, K. Merchant and R. L. Rill, NARes 19, 3383 (1991).
34. D. Suck and C. Oefner, Nature 321, 620 (1986).
35. D. S. Sigman, M. D. Kuwabara, C.-h. B. Chen and T. W. Bruice, in “Methods in Enzymology,” Vol. 208, p. 414. Academic Press, New York, 1991.
36. T. D. Tullius, Nature 332, 663 (1988).
37. R. P. Hertzberg and P. B. Dervan, Bchem 23, 3934 (1984).
38. M. D. Kuwabara and D. S. Sigman, Bchem 26, 7234 (1987).
39. D. S. Sigman and C.-h. B. Chen, ARB 59, 207 (1990).
40. A. Mazumder, in “Footprinting of Nucleic Acid-Protein Complexes,” p. 45. Academic Press, New York, 1993.
41. M. J. Garabedian, F. LaBaer, W.-H. Liu and J. R. Thomas, in “Gene Transcription-A Practical Approach” (B. D. Hames and S. J. Higgins, eds.). IRL Press, Oxford, 1993.
42. A. G. Papavassiliou, Methods Mol. Biol. 30, 43 (1994).
43. A. Spassky and D. S. Sigman, Bchem 24, 8050 (1985).
44. H. Buc, M. Amouyal, M. Buckle, M. Herbert, A. Kolb, D. Kotlarz, M. Menendez, S. Rimsky, A. Spassky and E. Yeramian, in “RNA Polymerase and the Regulation of Transcription. A Steenbock Symposium” (W. S. Reznikoff, ed.), p. 115. Elsevier, New York, 1987.
45. K. Kirkegaard, A. Spassky, H. Buc and J. Wang, PNAS 80, 2544 (1983).
46. J. F. Dobson, B. E. Green, P. C. Healy, C. H. L. Kennard, C. Pakawatchai and A. H. White, J. Chem. Soc. Dalton Trans. 1985 37, 649 (1984).
47. C. L. Luke and M. E. Campbell, Anal. Chem. 25, 1588 (1953).
48. A. A. Schilt, “Analytical Applications of 1,10-Phenanthroline and Related Compounds.” Pergamon, Oxford, 1969.
49. A. Mazumder, D. M. Perrin, K. H. Watson and D. S. Sigman, PNAS 90, 8140 (1993).
50. A. Spassky, S. Rimsky, H. Buc and S. Busby, EMBO J. 7, 1871 (1988).
51. B. Frantz and T. V. O’Halloran, Bchem 29, 4747 (1990).
52. S. Buratowski, M. Sopta, J. Greenblatt and P. A. Sharp, PNAS 88, 7509 (1991).
53. D. M. Perrin, L. Pearson, A. Mazumder and D. S. Sigman, Gene 149, 173 (1994).
54. S. T. Smale and D. Baltimore, Cell 57, 103 (1989).
55. W. Wang, M. Carey and J. D. Gralla, Science 255, 450 (1992).
56. A. Mazumder, D. M. Perrin, D. McMillen and D. S. Sigman, Bchem 33, 2262 (1994).
57. B. C. F. Chu, G. M. Wahl and L. E. Orgel, NARes 11, 6513 (1983).
58. C.-h. B. Chen, A. Mazumder, J.-F. Constant and D. S. Sigman, Bioconjugate Chem. 4, 69 (1993).
59. C.-h. B. Chen, M. B. Gorin and D. S. Sigman, PNAS 90, 4206 (1993).
60. D. S. Sigman, T. W. Bruice, A. Mazumder and C. L. Sutton, Accts. Chem. Res. 26, 98 (1993).
61. C. Q. Pan, R. Landgraf and D. S. Sigman, Mol. Microbiol. 12, 335 (1994).
62. D. M. Perrin, A. Mazumder, F. Sadeghi and D. S. Sigman, Bchem 33, 3848 (1994).
63. G. P. Pfeifer, S. D. Steigerwald, P. R. Mueller, B. Wold and A. D. Riggs, Science 246, 810 (1989).
64. C. Yanofsky, Nature 289, 751 (1981).
65. R. P. Gunsalus, A. G. Miguel and G. L. Gunsalus, J. Bact. 167, 272 (1986).
66. R. W. Schevitz, Z. Otwinowski, A. Joachimiak, C. L. Lawson and P. B. Sigler, Nature 317, 782 (1985).
67. Z. Otwinowski, R. W. Schevitz, R. G. Zhang, C. L. Lawson, A. Joachimiak, R. Q. Marmorstein, B. F. Luisi and P. B. Sigler, Nature 335, 321 (1988).
68. D. Staacke, B. Walter, B. Kisters-Woike, B. von Wilcken-Bergmann and B. Muller-Hill, EMBO J. 9, 1963 (1990).
69. A. Mazumder, C. L. Sutton and D. S. Sigman, Inorg. Chem. 32, 3516 (1993).
70. C. Sutton, A. Mazumder, C.-h. B. Chen and D. S. Sigman, Bchem 32, 4225 (1993).
71. A. A. Kumamoto, W. G. Miller and R. P. Gunsalus, Genes Dev. 1, 556 (1987).
72. C. L. Lawson and J. Carey, Nature 366, 178 (1993).
73. S. Bass, V. Sorrells and P. Youderian, Science 242, 240 (1988).
74. P. Youderian, A. Vershon, S. Bouvier, R. T. Sauer and M. M. Susskind, Cell 35, 777 (1983).
75. J. Pfau, D. N. Arvidson, P. Youderian, L. L. Pearson and D. S. Sigman, Bchem 33, 11391 (1994).
76. P. S. Pendergrast, Y. W. Ebright and R. H. Ebright, Science 1994, 959 (1994).
77. C. Q. Pan, J. Feng, S. E. Finkel, R. Landgraf, R. Johnson and D. S. Sigman, PNAS 91, 1721 (1994).

The Decay of Bacterial Messenger RNA¹

DONALD P. NIERLICH*,² AND GEORGE J. MURAKAWA†

*Department of Microbiology and Molecular Genetics, and Molecular Biology Institute, University of California, Los Angeles, Los Angeles, California 90024
†Department of Dermatology, University of California, San Francisco, San Francisco, California 94143

I. Kinetics of Decay and Decay’s Basic Paradigm
   A. The Discovery of mRNA
   B. Messenger Turnover-Early Studies
   C. Measurement of Functional Decay
   D. Functional Decay
   E. Measurement of Chemical Decay
   F. Chemical Decay
   G. Rate of Bulk mRNA Decay
   H. Assaying the Amount of mRNA
   I. Amount of mRNA and Its Regulation
II. How mRNA Decays: Pathways and Determinants
   A. Different mRNAs Decay in Different Ways
   B. Decay of Some Representative Messengers
   C. Regulation of mRNA Decay
III. Determinants of Decay
   A. 5’ Determinants
   B. 3’ Determinants
   C. Internal Determinants and the Accumulation of Intermediates in Decay
   D. Role of Translation and Ribosomes in Decay
   E. Interactions among Determinants
IV. Enzymes of mRNA Decay
   A. RNase II and PNPase
   B. RNase E
   E. Other RNA Binding Proteins
V. Mechanism of mRNA Decay
   A. Models: Killer Ribosomes, the “Degradosome,” and the Runoff of Ribosomes
   B. Concerted Decay
References

¹ Abbreviations and terminology: IPTG, isopropyl β-D-thiogalactoside; NMP, NDP, NTP, nucleoside mono-, di-, and triphosphate; PNPase, polynucleotide phosphorylase; RBS, ribosome binding site; r-protein, ribosomal protein; UTR, untranslated region. It is customary to distinguish between RNA decay and processing, a step that converts an RNA precursor to a product, e.g., the processing of tRNA. Although we find this distinction often useful, it is sometimes difficult to make. Cleavages of polycistronic mRNAs often lead to both functional and inactive products, or to products whose functions are not established.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

The turnover of messenger RNA is central to the mechanism allowing single-celled organisms to adapt rapidly to environmental change. Not surprisingly, studies of the physiology, enzymology, and molecular mechanisms of mRNA decay have been underway since its discovery in 1961. However, despite this long history, only recently have mechanistic insights emerged. The many demonstrations that the Escherichia coli rne gene product (RNase E) is involved in mRNA decay have given real impetus to the study of this unusual protein’s properties and role. Indeed, recent models that envisaged a relatively straightforward process by which enzymes in the cytoplasm attack a somehow exposed messenger now seem perilously restrained. Continuing studies of the RNase E protein and a broader view of the decay process itself suggest a complex mechanism. Moreover, the recent attention given to the polyadenylylation of bacterial mRNAs and the discovery that polyadenylylation plays a role in the turnover of an E. coli plasmid-specific RNA show that bacterial mRNA decay has similarities to the eukaryotic process. Further along this same theme, there are hints that messenger decay, both in bacteria and in higher organisms, involves complex multicomponent interactions.

A few reviews of messenger decay have appeared. Some of these deal with the topic generally (1-3); others focus on either the molecular process (4, 5) or the enzymes involved (6-9). Several reviews address stable RNA processing but also encompass mRNA decay (10-13). In addition, a monograph (14) deals broadly with the topic.

Functional decay of a messenger refers to the inactivation of a messenger’s template activity; chemical decay refers to the degradation of the mRNA itself. Functional decay, as a process distinct from the cleavages that initiate degradation, has been demonstrated for only a few mRNAs. Bacterial messengers are inactivated and degraded rapidly: for E. coli, which can grow at 37°C with doubling times from 20 minutes to over 12 hours, the two processes appear exponential with average half-lives of 1-2 minutes. As a consequence, changes in the expression of specific genes occur rapidly. Specific messengers have half-lives that range from about 0.5 to 20 minutes. The rates of decay of many messengers are actively regulated, thus affecting the yields of the proteins synthesized. Unique to bacteria, portions of polycistronic mRNAs may decay at different rates, independent of the size of the segment or its position in the transcript. Overall, the decay of mRNA in bacterial cells is a major metabolic function. The turnover of mRNA constitutes about half of the cells’ RNA synthesis; the actual amount is related to their growth rate (15).

I. Kinetics of Decay and Decay’s Basic Paradigm

The study of mRNA decay is challenging in two basic ways. First, decay is inextricably a dynamic process, and thus the subject revolves about kinetics and their measurement. Second, the view that one obtains of the mechanism differs a great deal depending on the vantage point, whether it be of the messengers of different bacterial operons, or of bacterial and eukaryotic genes. There is, however, one unifying perspective: that of the “RNA world.” Decay shares with messenger splicing, ribozymes, and post-transcriptional regulation the intricacies of complex nucleic-acid structures, protein-nucleic acid recognition, and possibly, nucleic-acid templates.

A. The Discovery of mRNA

As many readers of this chapter know, messenger RNA was discovered after periods of both groping and systematic searching. Many researchers, including notably Monod, Jacob, Spiegelman, Brenner, Watson, and their associates, recognized that the composition of newly made bacterial protein could change rapidly as cells responded to different stimuli, and yet the components likely to be involved in the synthesis of the new proteins, DNA and ribosomes, appeared stable (16-18). The ease of functional reprogramming suggested that some component is rapidly degraded, perhaps even after a single round of translation (19). In fact, a number of experimental results had presaged the discovery of messenger: these experiments indicated the presence of a labile RNA fraction in bacteriophage T4-infected E. coli as well as in E. coli itself (16a, 17a). For example, cells that had incorporated radioactive uracil into RNA after a very short labeling redistributed it between uracil and cytosine within minutes. Similarly, a culture that received fluorouracil, a uracil analog, synthesized altered proteins after only a short lag (10). Such observations support the notion of a uracil-containing, labile intermediate in protein synthesis.

B. Messenger Turnover-Early Studies

As a consequence, when messenger was discovered, its rapid turnover was fully appreciated; its metabolic lability had effectively hidden it from easy detection (16, 17a). An example is seen in Fig. 1, a classical "pulse-chase" experiment. The results show that when cells are labeled for a short period (the pulse) and then allowed continued growth for 20 minutes without isotope (the chase), the label is present in the stable RNA species, ribosomal and transfer RNA (19). However, when the chase is short, 2 or 6 minutes, the labeled RNA sediments as a broad peak. Because it was presumed (and proved true) that 2 to 6 minutes is sufficient to allow the completion of most transcripts, the redistribution of species proved that RNA degradation and resynthesis were occurring.

[Figure 1: three sucrose-gradient profiles, each plotted against tube number.]

FIG. 1. Sucrose gradient separation of E. coli RNA labeled in a pulse-chase experiment. A culture growing at 28°C was labeled with [3H]uridine and an excess of nonradioactive uridine was added after 45 seconds. At the times indicated, portions were harvested, the cells disrupted, and the ribosomes sedimented in a sucrose gradient containing 0.1 mM Mg2+ so that ribosome complexes were dissociated (sedimentation right to left). Plotted are RNA in fractions (optical density at 260 nm) and 3H incorporation in nucleic acid. (A) 2 minutes; (B) 6 minutes; (C) 20 minutes. Redrawn from Ref. 19.

The heterodispersed peak of pulse-labeled RNA observed (Fig. 1A) includes growing and completed molecules of ribosomal and transfer RNA, and growing, completed, and decaying molecules of the cells’ diverse messengers (19-21). The lack of a means to separate the different fractions effectively has arguably been the major impediment in characterizing mRNA decay in bacteria. Measurements as simple as the determination of half-life are indirect and dependent on assumptions (22). As a consequence, some ingenuity has been used to measure messenger turnover. We think of the methods as falling into two classes:

1. The most convenient way to measure mRNA turnover, whether functional or chemical, is to stop the ongoing synthesis of the messenger and then measure loss of its function (23), or its mass disappearance (20). (Examples are in the following pages.) For specific messengers, the fraction remaining at different times generally follows an exponential function, $e^{-kt}$ (20, 24, 25). From this, the messenger’s half-life can be obtained as $\ln 2/k$, or graphically by plotting the time course on semilog paper. Sometimes decay commences only after a lag. Such lags may have simple experimental explanations or more complicated mechanistic implications (Section II,B,1).
2. In steady-state growth, messenger synthesis equals decay plus a generally small increment proportional to the rate of growth. This relationship can be exploited to measure decay in different ways (21, 24, 26-28). One example is to measure the concentration of a newly produced mRNA as it approaches steady state. A gene produces mRNA at a constant rate whereas the messenger decays exponentially, and this difference over time can be used to determine the turnover rate.
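The two classes of measurement can be sketched numerically. The following is a minimal illustration, not a method taken from the cited studies: the function names and the synthetic time course are hypothetical, first-order decay is assumed throughout, and the steady-state function ignores the small growth-rate increment mentioned in item 2.

```python
import math


def decay_constant(times_min, fraction_remaining):
    """Return k (per minute) from a least-squares line through ln(fraction) vs time.

    This is the numerical equivalent of reading the slope off a semilog plot.
    """
    logs = [math.log(f) for f in fraction_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return -sxy / sxx  # the slope of the fitted line is -k


def half_life(k):
    """Half-life of a first-order decay: ln 2 / k."""
    return math.log(2) / k


def level_after_induction(t_min, synthesis_rate, k):
    """mRNA level t minutes after synthesis starts at a constant rate.

    Assumes first-order decay and ignores dilution by growth, so the level
    rises as (s / k) * (1 - exp(-k t)) toward the steady state s / k.
    """
    return synthesis_rate / k * (1.0 - math.exp(-k * t_min))


# Hypothetical time course generated from a 1.5-minute half-life.
k_true = math.log(2) / 1.5
times = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
fractions = [math.exp(-k_true * t) for t in times]

k_fit = decay_constant(times, fractions)
print(f"fitted half-life: {half_life(k_fit):.2f} min")
steady_state = 1.0 / k_fit
reached = level_after_induction(1.5, 1.0, k_fit) / steady_state
print(f"fraction of steady state reached after one half-life: {reached:.2f}")
```

Under these assumptions the approach to steady state has the same time constant as the decay itself, which is why either kind of experiment can yield the turnover rate.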

C. Measurement of Functional Decay

In characterizing the functional decay of the lac operon messenger (Fig. 2), the classic experiments of Kepes and contemporaries revealed basic attributes of the mechanism of decay that are valid today (21, 23, 29-32). Figure 3 shows an experiment in which E. coli cells were induced briefly with IPTG and the “elementary wave” of appearance of β-galactosidase was measured (24). The rate of appearance of enzyme, delayed by the time of transcription and translation, increases and then decreases. The first reflects the appearance of messenger, and the second mirrors its decay. Replotting the data at each time point to show the (log) amount of enzyme yet to appear, $(E_\infty - E_t)/E_\infty$, yields the time course of functional decay; this has also been referred to as decay in the capacity for enzyme synthesis (Fig. 3, right) (4, 21, 31, 33). It also shows that decay is exponential, implying a random inactivation of individual molecules.

[Figure 2: maps of the lac (a) and trp (b) operons drawn against a scale of 0 to 5000 bases.]

FIG. 2. The organization of the lacZYA (a) and trpEDCBA (b) operons. The stop and start codons of lacZY and lacYA are separated by 51 and 63 bases, respectively. Stable hairpins follow each of the three lac cistrons; that of the lacYA sequence is a REP sequence (155). The stop and start codons of the trpD and trpC genes are separated by zero bases and those of the trpC and trpB genes are separated by 11 bases; the remaining genes have overlapping stop and start codons. Data from GenBank files ECOLAC and ECTRPX.

Notwithstanding its historical importance, the use of an “elementary wave” of synthesis is not the best way to measure decay rate nor to establish that inactivation follows a random pattern (34). For lacZ message, both determinations are, however, supported by other experiments (Section I,D) (4, 23, 30, 35-37). A related procedure is better, because it measures directly the amount of template remaining (21, 36). In this case, after deinduction or otherwise blocking further messenger synthesis, the remaining template is assayed by transferring cell samples at intervals to individual culture tubes containing a radioactive amino acid. After further incubation, the amount of isotope incorporated into protein is measured.

[Figure 3: left panel, enzyme measured versus time of addition of chloramphenicol (min); right panel, replotted data versus time (min).]

FIG. 3. The elementary wave of β-galactosidase synthesis after a brief induction. A culture was induced with IPTG for 20 seconds and then the culture was diluted 50-fold with prewarmed medium to restrict further induction. Samples were taken at the times indicated and were added to warmed tubes containing chloramphenicol. (This step effectively stops polypeptide synthesis but allows completed molecules to polymerize to the active form.) Enzyme was measured in all samples at 18 minutes (left). Data plotted as $(E_\infty - E_t)/E_\infty$ (right). Redrawn from Ref. 24.
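The replot used for the elementary wave can be illustrated with a short calculation. This is a sketch under stated assumptions: the enzyme readings are synthetic values invented for the example (not data from Ref. 24), the function names are hypothetical, and the lag caused by transcription and translation is ignored.

```python
import math


def capacity_remaining(enzyme_at_t, enzyme_final):
    """Fraction of the final enzyme yield that has not yet appeared at each time."""
    return [(enzyme_final - e) / enzyme_final for e in enzyme_at_t]


def half_life_between(t1, f1, t2, f2):
    """Half-life implied by two points on an exponential decay curve."""
    return math.log(2) * (t2 - t1) / math.log(f1 / f2)


# Synthetic enzyme readings (arbitrary units) for a 1.25-minute functional
# half-life; real data would come from samples stopped with chloramphenicol.
e_final = 100.0
times = [0.0, 1.25, 2.5, 3.75]
enzyme = [e_final * (1.0 - 2.0 ** (-t / 1.25)) for t in times]

fractions = capacity_remaining(enzyme, e_final)
for t, f in zip(times, fractions):
    print(f"{t:4.2f} min  remaining capacity {f:.3f}")

estimate = half_life_between(times[1], fractions[1], times[2], fractions[2])
print(f"functional half-life: {estimate:.2f} min")
```

A straight line on a semilog plot of these fractions is what supports the inference of random, first-order inactivation; curvature or a shoulder would argue against it.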

The Achilles heel of such measurements is the means by which RNA synthesis is blocked to expose mRNA decay. The goal is, of course, to minimize the extent to which metabolism is otherwise altered. For inducible or repressible operons, such as lac, it is often possible to use inducers or repressors, or their analogs (“gratuitous” compounds), without affecting metabolism significantly. Alternatively, synthesis of all RNA can be inhibited by various means (23, 35), for example, with antibiotics such as rifampicin or actinomycin D (36, 38, 39). Rifampicin has an advantage in that it blocks the initiation of transcription but allows transcribing molecules to be completed (39).

D. Functional Decay

Such measurements, as described in the previous section, have been made for a number of messengers of E. coli and Bacillus subtilis, the prototypes of Gram-negative and Gram-positive organisms, respectively, and a moderate but diverse number of other eubacteria (Bacteria). In the most general terms, the results resemble those obtained with the lac genes; some examples of special interest are discussed in Section II. Little has been done with members of the Archaea either in terms of functional or chemical decay (40, 41).

The lacZ messenger, one of the best characterized, has a functional half-life of about 1-1.5 minutes at 37°C in E. coli B strains; it decays somewhat faster (40-60 seconds) in K-12 strains (42, 43). The decay rate falls with temperature (44), but is remarkably unaffected by growth rate (that is, by nutrition) at a given temperature. Indeed, the rate of message decay (as with the rate of RNA and protein chain elongation) is constant when E. coli is grown with doubling times between 60 minutes and 10 hours; it falls by less than a half when growing with a doubling time of 24 hours (37). Similar results have been obtained for the biosynthetic tryptophan (trp) operon mRNA of E. coli (45). For the lacZ messenger, decay is unaffected by the presence of inducer (23, 30). For the trp messenger, decay is slowed to about a half under conditions in which the genes are strongly expressed; however, it is possible that this is an effect of amino-acid deprivation (46). There are, however, cases in which messenger decay is markedly regulated; examples are described below.

Altogether, the functional decay of many messengers as well as that of bulk protein synthetic capacity have been measured. Generally this entails measurement of the time-course of incorporation of a radioactive amino acid into the residual protein synthesized in cells after mRNA synthesis is inhibited (47). Incorporation of isotope into bulk protein is measured after acid precipitation of the protein; the incorporation into individual proteins is measured by precipitation with antibody or after separation by gel electrophoresis (22). For a few enzymes, e.g., β-galactosidase, sensitive assays exist such that enzyme activity can be measured directly (Fig. 3). However, it has also been noted that errors can arise when nascent enzyme, in amounts of a few molecules per cell, is measured (36).

From many such experiments, several generalities arise. Different messengers are inactivated at different rates, which generally follow exponential kinetics. In a study of the mRNAs of 22 proteins isolated by two-dimensional gel electrophoresis, functional half-lives varied between 40 seconds and 18 minutes at 30°C; the mean half-life of total protein-synthetic capacity was 2.5 minutes (47). Although in most instances the rate of decay of a particular messenger influences the production of the encoded protein, no general correlation exists between the abundance of proteins and the decay of their messengers (47). However, some abundant outer membrane proteins have particularly stable messengers, e.g., the ompA mRNA (discussed in Section II,B,4). And whereas the molecular basis of the stability of the ompA mRNA is known, only a circumscribed group of mRNAs appear to share this feature. Further, one might assume that a group of related proteins controlled in concert by a common regulatory network might have mRNAs with similar half-lives. This idea appears simplistic. Among the ribosomal proteins resolved by two-dimensional electrophoresis, mRNA half-lives varied from 1 to 2 minutes, although they were regulated similarly (47).


One of the most important conclusions that can be drawn from these studies is that there is no relationship between the size of a transcript and its functional (or chemical) half-life (48). In addition, the functional half-lives of the lacZ and lacA templates are (modestly) different, notwithstanding that the two are cotranscribed (29, 30, 48, 49). That different segments of a polycistronic mRNA decay at different rates is also seen in the trpEDCBA mRNA as well as many other polycistronic transcripts (48, 50) (Section II,B,3). Moreover, whereas the longest lived segment of the lac mRNA is the lacZ transcript at its 5’ end, the longest lived segment of the trp mRNA, the trpAB message, is at its 3’ end (48, 50). This lack of a relationship between the basic physical parameters of a messenger and its functional (or chemical) half-life, for either monocistronic or polycistronic mRNAs, points to the presence on individual messengers of specific stability determinants. By contrast, if each phosphodiester linkage were equally susceptible to cleavage by cellular ribonucleases, then the longer coding regions would be more susceptible to an inactivating cleavage. The existence of messenger-specific determinants of decay has been verified by constructing fusions between long- and short-lived messengers, but the exact nature of these determinants has remained remarkably elusive.

The early studies of lac mRNA transcription and decay, of course, also provided many of the basic observations on bacterial gene expression (16, 20, 21, 23, 30). From these, it was learned that transcription and translation are concomitant processes, such that translational initiation occurs only seconds after the 5’ region of the lacZ messenger is transcribed and well before transcription of the 3100-base cistron is completed, 80-90 seconds later (30). It was also deduced that decay of the Z message might commence at the 5’ end before the 3’ end is transcribed, although a lag was observed in the onset of decay (21). This point is discussed further (Section II,B,1). It was also found that the Z and A messages decay at different rates, implying that the polycistronic E. coli mRNA is cleaved (21, 30). This, too, was borne out.
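The uniform-susceptibility hypothesis that this section argues against makes a simple quantitative prediction, which the sketch below spells out. It is an illustrative null model only: it assumes that every phosphodiester bond is cut independently at the same first-order rate and that a single cut inactivates the message. The function names are hypothetical, and the calibration half-life of 1.25 minutes is merely a value within the 1-1.5 minute range quoted above for the 3100-base lacZ message.

```python
import math


def intact_fraction(length_nt, cleavage_rate_per_bond, t_min):
    """Fraction of transcripts with no cut after t minutes (uniform-cleavage model)."""
    bonds = length_nt - 1
    return math.exp(-cleavage_rate_per_bond * bonds * t_min)


def predicted_half_life(length_nt, cleavage_rate_per_bond):
    """Half-life under the model: ln 2 divided by the total cleavage rate."""
    return math.log(2) / (cleavage_rate_per_bond * (length_nt - 1))


def rate_from_reference(length_nt, half_life_min):
    """Per-bond rate that reproduces a chosen reference half-life."""
    return math.log(2) / (half_life_min * (length_nt - 1))


# Calibrate on a 3100-base message with an assumed 1.25-minute half-life,
# then ask what the model predicts for other (hypothetical) lengths.
rate = rate_from_reference(3100, 1.25)
for length in (500, 1000, 3100, 6000):
    print(f"{length:5d} nt  predicted half-life "
          f"{predicted_half_life(length, rate):.2f} min")
```

In this model half-life is inversely proportional to length, so short messages would always outlive long ones. The absence of any such size dependence in the measured half-lives is what points to messenger-specific determinants.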

E. Measurement of Chemical Decay Ideally, one would like to know the rates of turnover and relative abundances of all of the unstable RNAs in the cell. Of the unstable RNAs, messenger certainly makes up the largest part, but the processed and degraded portions of the ribosomal and transfer RNA precursors, and some regulatory RNAs, also turn over rapidly (11,51, 52). However, lacking a suitable means to separate these species, measurements of mRNA decay have relied on global measurements of RNA turnover (22). Historically, to measure the decay of the unstable species, cells have been labeled for a period that is short compared to the doubling time of the cells (30 seconds to 2 minutes) with a precursor such as radioactive uracil or

162

DONALD P. NIERLICH AND GEORGE J. MURAKAWA

guanine (19, 20, 38). The labeling of RNA is then terminated, for example, with rifampicin, and incorporated isotope is measured in timed samples (39). An experiment in which [3H]uridine and rifampicin were added simultaneously to three E . coli cultures, which were growing at different rates, is presented in Fig. 4 (53).The results show a transient rise in isotope incorporated into RNA (X) as chains were synthesized and completed, followed by a loss, i.e., the decay. The residual stable counts (Y) reflect the amount of rRNA and tRNA synthesized. By recalculation of these results (see the legend to Fig. 4), one obtains exponential decay rates for the mRNA in the cultures with half-lives of 1.3-1.6 minutes. A similar protocol has been used to determine the decay of specific messengers. In this case, RNA is purified from labeled cells and hybridized to DNA carrying the gene of interest. Carefully carried out, this method is simple and effective (54);however, it requires close control over the conditions of hybridization. An abundant mRNA might constitute <0.04% of the total RNA and <0.4% of the pulse-labeled RNA of E . coli. Therefore, the assays can lead to substantial error. A second problem arises when one

.-

El

Acetate

S

51

- 0

0

10

TIME (minutes)

20

FIG.4. The synthesis and decay of RNA after rifampicin addition to E. coli cultures growing with different carbon sources. Cultures received [3H]uridine, rifampicin, and nalidixic acid (to prevent [3H]uridine incorporation into DNA) simultaneously, and samples were removed at the times indicated for the determination of incorporated isotope. See text (Section I,E) for explanation of X and Y. To obtain the rate of mRNA decay from the data, one plots (X - Y)/(X,,,= - Y) as a function of time in the same way as that shown in Fig. 3. The unlabeled curve (0)shows a culture that received [3H]uridine 5 minutes after rifampicin and nalidixic acid. Redrawn from Ref. 53.

DECAY OF BACTERIAL MESSENGER RNA


wishes to compare the synthesis and decay of mRNA in different cultures. As discussed in the following section, physiological conditions strongly influence the labeling of cellular RNA as well as the relative synthesis of different RNA species.

Methods that are now more widely used for measuring chemical decay include Northern blot hybridizations, primer extension detection of the 5' terminus of the messenger, and protection of complementary RNA or DNA probes against nucleases that are specific for single-stranded nucleic acids (22). In each case, exogenously labeled probes are used so that detection does not depend on the cellular labeling of the subject RNA. Northern blots have an advantage in that one visualizes the messenger. Probes as small as 20-40 bases can be used so that very specific classes of RNA can be detected, or longer probes can be used so that the fate of an entire sequence is seen (55, 56). Primer extension proves to be very sensitive and amenable to quantitation. With choice of primers, it also allows specific regions or cut sites to be visualized; with 5' proximal primers, it is often used to measure the kinetics of decay in cases wherein the disappearance of the 5' end of the mRNA reflects the decay of the entire messenger (57). Nuclease protection (with RNase I or S1 nuclease) is widely used to identify the 3' terminal sequence of an RNA; it is also used in innovative ways to measure the decay of different portions of a messenger or the decay of more than one mRNA simultaneously (58).

F. Chemical Decay

Chemical decay proceeds at rates generally comparable to functional decay (half-lives of E. coli mRNAs vary between 0.5 and 20 minutes), and the decay follows exponential kinetics (59). This might occur if both depended on the same event as the rate-determining step, whether chemical decay followed at some later time or was concomitant with functional decay (4). Interestingly, the mRNAs of a number of E. coli bacteriophages have clearly distinct rates of functional and chemical decay (see 60). Many phage messages are processed as well, a feature found in only a few bacterial mRNAs (3, 11).

Notwithstanding this, there have been several instances reported in which functional and chemical decays can be quite clearly separated. In gene fusions, when the trp mRNA is transcribed from the PL promoter of phage λ, the messenger is functionally inactivated at a normal rate, but its chemical decay is markedly slowed (60). A similar situation pertains in E. coli mutants carrying the ams(Ts) allele of rne (61). At the restrictive temperature, the chemical decay of several individual mRNAs as well as bulk mRNA is markedly slowed, whereas the rate of functional decay is unaffected (61). In a final example, the inactivation of the transposase mRNA of Tn10 involves a pairing with an antisense RNA that occludes its RBS. The pairing concomitantly creates an RNase III site whose cleavage leads to the removal of the mRNA (62). However, in rnc (RNase III) mutants, the chemical cleavage is slowed, but inactivation still occurs.

Northern blots reveal two important features of chemical decay. These are illustrated below in the context of the lacZ mRNA but apply to many of the messengers studied. The first is that the initial step of decay is rate-limiting and a random-hit process. This can be seen or stated in different ways: (1) mRNAs disappear with exponential kinetics, that is, a constant fraction of the population is degraded in each interval; (2) removal of individual molecules is “all or none”; there is no gradual shortening of the molecules with time; and (3) a full-length mRNA and its degradation fragments remain in constant proportion during decay (63). The second applies to monocistronic mRNAs and to those segments of polycistronic messengers that decay separately. For these, the decay of individual molecules is rapid. There are exceptions, but most often one sees little accumulation of intermediate fragments. If care has been taken in the isolation of the RNA (not a trivial task), there is strikingly little detected, either of specific-sized breakdown fragments or dispersed fragments of differing sizes.

These two observations can be interpreted in mechanistic terms. The exponential loss of messenger suggests that molecules are random targets for decay as soon as they are made, or as soon as the determining features of their decay are available. The second observation, that there is a relatively low concentration of decay intermediates, can be interpreted in different ways. This “concerted decay” (64) might occur due to the combined activities of endo- and exonucleases that act to remove the mRNA; it might be due to the action of a processive enzyme that functions in an all-or-none fashion; or it might stem from the existence of multiple pathways of decay (Section V,A). We believe the explanation lies in a mechanism that, in larger view, is processive, such that cleavages occur at several sites on the message simultaneously and the resulting fragments are concomitantly removed (Section V,B).
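The link between a rate-limiting first step and a constant proportion of fragments can be illustrated with a two-step kinetic sketch. This is not a model taken from the text: it assumes a full-length species lost with rate constant `k1` and a single hypothetical intermediate removed with a much larger rate constant `k2`, and both numerical values are chosen only for illustration.

```python
import math

def full_length(t, k1, f0=1.0):
    """Full-length mRNA remaining at time t (first-order loss, rate k1)."""
    return f0 * math.exp(-k1 * t)

def intermediate(t, k1, k2, f0=1.0):
    """Intermediate formed from full-length mRNA (k1) and removed at rate k2."""
    return f0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))

def fragment_ratio(t, k1, k2):
    """Ratio of intermediate to full-length mRNA at time t."""
    return intermediate(t, k1, k2) / full_length(t, k1)

# Assumed values: a 1.5-minute half-life for the first step and a
# second step 20 times faster (hypothetical, for illustration only).
k1 = math.log(2) / 1.5
k2 = 20 * k1

for t in (1.0, 2.0, 4.0):
    print(t, round(full_length(t, k1), 4), round(fragment_ratio(t, k1, k2), 4))
```

Under these assumptions the ratio settles quickly at k1/(k2 - k1), so the intermediate stays scarce and in fixed proportion to the full-length species while both disappear exponentially.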

G. Rate of Bulk mRNA Decay

Although individual mRNAs have half-lives spanning a wide range, measurements of total functional or mass decay yield monotonic exponential decay rates with half-lives <2 minutes. In the experiment shown in Fig. 4 and others like it, mRNA, assayed following a brief labeling with an RNA precursor, decays with a half-life in the range of 1.5-2 minutes (at 37°C). There appear to be two underlying reasons. First, in a brief labeling of cells, the mRNAs that turn over the fastest are preferentially labeled. Put another way, those mRNAs with long half-lives are replaced at a slower rate and thus, in a short labeling, are underrepresented (38). Second, as seen in the following discussion, one can surmise that most E. coli mRNA, in terms of abundance, does decay with a half-life on the order of 90 seconds.

When the capacity to synthesize protein is assayed following rifampicin addition, very similar half-lives are obtained (35, 47, 63). Measurements of functional half-life made on cultures similar to the three described in Fig. 4 gave values of 1.4-1.7 minutes (63). It appears that >90% of E. coli mRNA decays with a half-life close to the average value (47).³ This is in keeping with findings that some particularly abundant mRNAs, such as those of the ribosomal proteins, have half-lives in this range.
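The recalculation described in the legend to Fig. 4 can be sketched as follows: normalize each sample as (X - Y)/(X_max - Y) and fit a straight line to the logarithm of that fraction against time. The counts below are synthetic and hypothetical, generated from an assumed 1.5-minute half-life; they are not data from Ref. 53, and the function names are illustrative.

```python
import math

def normalized_unstable(counts, plateau):
    """Return (X - Y)/(X_max - Y) for each sample, with Y the plateau counts."""
    x_max = max(counts)
    return [(x - plateau) / (x_max - plateau) for x in counts]

def half_life(times, fractions):
    """Least-squares slope of ln(fraction) against time, expressed as a half-life."""
    points = [(t, math.log(f)) for t, f in zip(times, fractions) if f > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    return -math.log(2) / (sxy / sxx)

# Synthetic counts (hypothetical): plateau Y, peak X_max, and a decay
# with an assumed half-life of 1.5 minutes, sampled from the peak onward.
Y = 400.0
X_MAX = 1000.0
TRUE_HALF_LIFE = 1.5
times = [0.0, 1.0, 2.0, 3.0, 4.0]          # minutes after the peak
counts = [Y + (X_MAX - Y) * 0.5 ** (t / TRUE_HALF_LIFE) for t in times]

fractions = normalized_unstable(counts, Y)
print(round(half_life(times, fractions), 3))
```

Only samples taken after the peak should be passed to the fit, and samples that have reached the plateau (fraction zero) are skipped because their logarithm is undefined.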

³ By contrast, one report shows a rate of protein synthesis that declines smoothly following rifampicin addition and stops only at 45 minutes (48). Ironically, it might be argued that such a prolonged period for mRNA decay is the “expected” result. If translation protects mRNA from decay (Section III,C), it might be expected that the remaining mRNA would become more stable with time after rifampicin treatment because of the redistribution of ribosomes as the mRNA pool falls.

H. Assaying the Amount of mRNA

Because the size of the messenger pool does not change during steady-state growth, the amount of RNA turnover can equally be assayed as synthesis or decay. In a short interval, the amount of synthesis equals that decaying plus that amount added due to net growth. Because for most messengers the rate of turnover is 10 to 40 times the rate of growth of the bacteria, the second term can often be neglected. An approximation of the amount of RNA turnover can be obtained after the addition of rifampicin or actinomycin D, provided that the antibiotic does not induce the decay of stable species (38). This is also shown in the experiment described above and in Fig. 4. The ratio of counts present in the plateau (Y in the figure) to peak counts ($X_{\max}$) gives the fraction of stable RNA made during the pulse, and $(X_{\max} - Y)/X_{\max}$ gives the fraction of unstable RNA. These measurements are widely used, but, nonetheless, there are reasons for caution. Whereas the method can be used to estimate the fraction of the RNA in cells subject to turnover and the half-life(s) of this RNA, as discussed in the next paragraph, it cannot be used to determine the absolute rates of mRNA turnover in cultures in different physiological states without additional considerations (25, 38, 65, 66).

Although conceptually simple, the direct determination of the total rate of synthesis or decay of messenger is unexpectedly complicated. Early experiments clearly show that mRNA is preferentially labeled during short labeling times (Fig. 1); thus, it was thought that pulse labeling could be used to measure directly the rate of mRNA synthesis. This turned out not to be true (67). Such nucleic-acid precursors as uracil, uridine, and phosphate, which are used as tracers, do not enter bacteria freely; their uptake is regulated (68). For many of these precursors there is a dual control: first, there is a feedback inhibition by an intracellular nucleotide that is produced from the precursor, and, second, there is an inhibitory control by guanosine 5'-diphosphate 3'-diphosphate (ppGpp), a regulatory nucleotide whose concentration changes with growth rate and increases during cellular stress. These controls are only mitigated by a degree of pool expansion when the cells are initially exposed to the precursor (26, 69). In sum, these controls ensure that cells do not transport precursors beyond their needs for net (stable) RNA synthesis (25, 65, 69). Moreover, when growth is restricted, simple labeling methods are foiled by the cells' preferential internal recycling of mRNA decay products. For example, when ribosome synthesis is inhibited by stringent control, mRNA turnover may be grossly underestimated unless the entry of precursors into the cell is considered (69).
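The plateau and peak ratios, and the steady-state balance in which synthesis equals decay plus net growth, can be written as a short sketch. It assumes first-order decay and exponential growth; the function names and the example numbers are illustrative, not values reported in the text.

```python
import math

def rna_fractions(x_max, y):
    """Fractions of stable (Y/X_max) and unstable ((X_max - Y)/X_max) RNA."""
    stable = y / x_max
    unstable = (x_max - y) / x_max
    return stable, unstable

def turnover_to_growth_ratio(half_life_min, doubling_time_min):
    """Ratio of the decay rate constant to the growth rate constant."""
    k_decay = math.log(2) / half_life_min
    mu = math.log(2) / doubling_time_min
    return k_decay / mu

def synthesis_rate(pool, half_life_min, doubling_time_min):
    """Steady-state synthesis = decay + net growth = (k_decay + mu) * pool."""
    k_decay = math.log(2) / half_life_min
    mu = math.log(2) / doubling_time_min
    return (k_decay + mu) * pool

# Hypothetical peak and plateau counts.
print(rna_fractions(1000.0, 400.0))
# Assumed 1.5-minute half-life and 45-minute doubling time.
print(turnover_to_growth_ratio(1.5, 45.0))
```

With these assumed values the turnover term is about 30 times the growth term, which falls inside the 10 to 40 range quoted above and shows why the growth term can often be neglected.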

I. Amount of mRNA and Its Regulation

For E. coli cells growing at a moderately rapid rate ($t_d$ = 45 minutes), about 20% of the total (dry) mass is ribosomal and transfer RNA. The synthesis of these RNAs, however, accounts for only on the order of 50% of the rate of RNA synthesis (15). The balance is largely mRNA. Nevertheless, due to its rapid decay, mRNA only amounts to 3-4% of the total RNA mass. In Bacillus subtilis this value is higher, about 8% (20). The overall rate of RNA synthesis and the fraction of this that is unstable are regulated and vary with growth rate (Fig. 4) (15, 66, 69). It is assumed that the unstable RNA is largely mRNA, but other species are known to turn over (above), and it is not fully known what fraction they contribute. Moreover, in slowly growing or starved cells, rRNA appears to be degraded. Unfortunately, little is known about the latter process, either in terms of its regulation or whether or not enzymes are used that are common to mRNA turnover as well (15). The fraction of the RNA made in E. coli that is mRNA (i.e., unstable RNA) varies in the range of 30 to 70% for cell doubling times of 30-90 minutes (or longer) (15, 53, 66, 67, 69, 70).

The initial experiments demonstrating the existence of mRNA revealed that the amount of RNA turnover is substantial (Fig. 1). To some this result was counterintuitive. Because bacteria are thought to grow “efficiently,” it was surprising that they would expend so much energy on turnover. This impression was probably based on an incomplete view of protein synthesis. The cost of messenger turnover is only a small percentage of the amount of energy expended (26). The incorporation of one amino acid into protein, including amino-acid activation and the ribosome-associated steps of peptide-bond synthesis, expends four high-energy bonds. By contrast, based on estimates that E. coli mRNA is translated on average 20 to 30 times before being degraded (26, 70), the cost of synthesis of each triplet codon is 0.15 (3/20) ATP or 6/20 high-energy bond. This suggests that the percentage of the energy expended in protein synthesis that is given to turnover is about 7%. A more comprehensive calculation would also take into account the possibility that the phosphorolytic cleavage of mRNA by polynucleotide phosphorylase in the decay of mRNA reduces the energy cost of turnover by up to 50% (71) (Section IV,A), the cost of de novo amino-acid synthesis, and the synthesis of ribosomes. These factors reduce the energy consumption further.
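The arithmetic behind the figure of about 7% can be reproduced in a few lines. The sketch assumes two high-energy bonds per nucleotide polymerized, which is what the quoted 6/20 bond per codon implies; the parameter names and the optional `recovered_fraction` argument (standing in for the saving attributed to phosphorolysis) are illustrative.

```python
def turnover_energy_fraction(translations_per_mrna=20,
                             bonds_per_nucleotide=2,
                             bonds_per_amino_acid=4,
                             recovered_fraction=0.0):
    """Return (mRNA cost per amino acid, share of total cost due to turnover).

    Costs are in high-energy bonds. bonds_per_nucleotide=2 is an assumption
    inferred from the 6/20 bond per codon quoted in the text.
    """
    bonds_per_codon = 3 * bonds_per_nucleotide
    mrna_cost = bonds_per_codon / translations_per_mrna
    mrna_cost *= (1.0 - recovered_fraction)
    share = mrna_cost / (mrna_cost + bonds_per_amino_acid)
    return mrna_cost, share

for n in (20, 30):
    cost, share = turnover_energy_fraction(translations_per_mrna=n)
    print(n, round(cost, 3), round(100 * share, 1))
```

With 20 translations per messenger the sketch gives 0.3 bond per amino acid and a share of about 7%; with 30 translations, or with part of the cost recovered, the share is smaller.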

II. How mRNA Decays: Pathways and Determinants

A. Different mRNAs Decay in Different Ways

Early research on the decay of the lac and trp messengers suggested a common mechanism. It appeared that in each case the initial polycistronic transcript was cleaved endonucleolytically, that the 5' end of the transcript could begin decay before the 3' end of the operon was transcribed, and that distal regions of transcripts that were left untranslated by nonsense mutations were degraded very rapidly, if transcribed at all (5, 7, 72). Moreover, ribosomes frozen by translation-blocking antibiotics such as chloramphenicol slowed decay, and their release by the antibiotic puromycin destabilized the messengers (73).

Had these findings been universal, our understanding of decay might be greater today. A report on the effect of translational antibiotics on the functional decay of three bacteriophage T4 messengers displayed their individuality (74). The decay of each messenger responded differently as the concentrations of chloramphenicol, puromycin, and kasugamycin were varied. If the influence of translation on the rate-determining step differed for each, might not the path of decay differ for each? The clearest indication that this was so came initially from studies of the decay of the phage λ int mRNA (75). The int messenger is subject to “retroregulation”; that is, the decay of the mRNA appears to depend on the 3' site at which termination of transcription occurs. By contrast, decay of the trp mRNA can be made to depend on the promoter from which it is transcribed (60).

As a result, studies of different messengers tell different stories, pointing at different mechanisms and suggesting a panoply of rate-limiting determinants. In this section, we review several messengers and different decay determinants. In Section V,B, we advance a model for messenger decay that may reconcile some of the diverse observations.

B. Decay of Some Representative Messengers

1. DECAY OF THE lac OPERON mRNA

The decay of the lac messenger (Fig. 2) is among the best studied, both in historic and contemporary contexts. Many workers contributed, producing pioneering studies. Those of Gros (72), Magasanik (35), and Levinthal (32) come to mind; the work of Kennell is particularly extensive (4). The ease and sensitivity of the measurement of β-galactosidase and length of the lacZYA transcript (5100 bases) facilitate study of its decay, particularly where timing of events and resolution of intermediates are concerned (76). The transcription of the lacZ gene takes about 80 seconds and that of the operon requires about 2 minutes at 37°C (21, 30, 55).

a. Cleavage of the Polycistronic Transcript. Kepes' prediction (Section I,D) that the polycistronic transcript is cleaved to separate cistronic segments was confirmed by determinations of the size of the lac messenger at different times after induction and during decay (4, 21, 63, 77). At first, lacZYA and lacZ mRNAs predominate; these give way to messengers of the size of lacZ and smaller fragments, consistent with cleavages in the intercistronic regions. The individual messengers are seen clearly on Northern blots when gene-specific probes are used (55, 78). In steady state, species can be identified that correspond to the lacZ, lacA, lacZYA, lacZY, lacYA, and lacY messengers in approximately that order of abundance. The results indicate that the two intergenic regions are cleaved in random order and at approximately equal rates (T. Thai, G. J. Murakawa and D. P. Nierlich, unpublished). Although not predicted by earlier work, these studies also indicate that the lacY and lacYA messengers are degraded relatively rapidly after their formation (55). Prior work did, however, suggest that the lacY messenger is translationally inactivated following cleavage in the ZY intergenic region (77).

Several studies have undertaken to identify the site(s) of cleavage in the lacZY intergenic region. In one, mRNA fragments were isolated by differential hybridization, 5'- or 3'-end labeled, and their sequences determined (79). The results indicated that cleavages occurred at a number of sites, most often between Pyr-A bases, and, surprisingly, in the distal part of lacZ and proximal part of lacY as well as in the intergenic region. However, similar results were not obtained in other studies (55, 80). These showed, using different methods, that the 3' end of the lacZ messenger is located just distal to a stable hairpin sequence, which is in turn located in the intergenic region just 3' to the lacZ stop codons. And, whereas one might assume that the cleavage that gives rise to the 3' end of the Z messenger would also create a lacY species bearing a 5' end at the same site or downstream, no such 5' end was found (80). However, this result may arise due to the rapid degradation of lacY and lacYA messengers after their formation (55). [A weak 5' end was found just upstream of the above-mentioned hairpin when a ribosomal protein (S10')-'lac fusion transcript was characterized (80).]

b. Decay of the lacZ Messenger. Due to the relatively rapid cleavage of the polycistronic messenger in the lacZY region, the lacZ message is the predominant lac message in the steady-state cell (55). And, notwithstanding the vagaries as to the location and nature of the cleavage in this region, several studies show that the hairpin distal to lacZ, like stable hairpins at the 3' ends of other mRNAs, is required to stabilize the upstream messenger against exonucleolytic degradation (36, 55) (C. Kwan and D. P. Nierlich, unpublished). Such a 3' end might be formed by direct cleavage or by the action of a 3'-exonuclease on molecules cleaved (or terminated) at a downstream site.

The most striking feature of lac mRNA decay also relates to that of many other messengers. As can be seen in Fig. 5, the approximately 3150-base lacZ messenger disappears exponentially from the cell and the amounts of breakdown intermediates, to the (limited) extent that they are detectable, remain in fixed proportion to the full-length message (55, 77). This implies that the first step of chemical degradation is the rate-limiting step. Moreover, such results also argue that degradation overall occurs by a 5'-to-3' or 3'-to-5' process, rather than by random endonucleolytic attack (81). This follows from the observation that the size distribution of the apparent decay fragments is not gaussian (but see Section V,B).

c. A Lag in the Decay of the lacZ Messenger? In addition to the rates of functional and chemical decay, an additional parameter is needed to describe the decay of an individual messenger. This is the lag or time to onset. In cases in which decay is found to follow exponential kinetics, this can be defined as the time between the initiation of transcription (during a very short induction period) and the time that decay commences as estimated from the back-extrapolate of the exponential decay line (Fig. 3) (82). In this context, two types of lags can be distinguished: a fixed “delay,” or one that arises because initiation of decay requires several randomly timed events and thus displays multihit kinetics. Measurement of the lag could give insights into the mechanism of decay (4, 32, 33). For example, a delay might arise if a distal part of a messenger was the target of initial cleavage, or if


[Figure 5: Northern blot, lanes 1 to 8; band labels ZYA and ZY; numeric labels 2905, 1541, and 420.]

FIG. 5. The decay of the lac mRNA after rifampicin addition. Northern blot probed with an RNA probe to the 5'-lacZ region (G. J. Murakawa, T. Thai and D. P. Nierlich, unpublished observations; for methods, see Ref. 55). Lane 1, 3 minutes after rifampicin addition; lanes 2-4, samples taken at 30-second intervals.

decay involved a 3'-poly(A) tract; inactivation could not commence until the distal segment was made. However, in practice, only a few measurements of the lag have been made (33, 83). Such measurements generally require use of a long messenger, such as lacZ, and are limited by the need to stop transcription in a well-defined way.⁴ Different techniques also give subtle colorings to the actual measurements made; e.g., one might measure the cleavage of the 5' end of an mRNA or the disappearance of all hybridizable material; or one might measure the cessation of initiation of translation (“potential” according to Ref. 33) or cessation of translation (“capacity”; Ref. 24).

⁴ Experimentally, the measurements of the lag must often be corrected for the time of entry (removal) and action of rifampicin (inducer). Thus, the lag is often assumed to be short or its measurement overlooked, in part because of the difficulty in obtaining a good inhibition of transcription or in estimating its time accurately (58, 84, 85). Unfortunately, rifampicin is a problem in this regard. It penetrates Gram-negative bacteria only poorly: 250-400 µg/ml is often used in studies with E. coli whereas only 2 or 3 µg/ml is needed to inhibit RNA polymerase. Higher concentrations are a concern because of the chance of secondary effects due to the antibiotic or its solvent (ethanol, etc.). Alternatives have depended on use of strains made more permeable by treatment with EDTA or on those that are genetically variant (30, 33).

A lag in the functional decay of the lacZ message has been observed. In one study, a decay lag was found consistent with either a two-hit process or with a delay of the magnitude of the transcription time of the lacZ message, 80 seconds (33). In this same study, the kinetics of decay of two other mRNAs indicated that they are single-hit processes, but due to their (short) length, one could not rule out a delay as occurring. A further study from the same lab showed that cleavages of the 5' ends of lacZ mRNA commence (stochastically) as soon as they are synthesized (85). Moreover, the earliest cleavages occur with a characteristic half-life, implying that they are part of the same process that occurs later. Taken together, these studies suggest that the lacZ message decays by a two-hit process that results in cleavage at the 5' end and inactivation of its template activity; this process does not require the transcription of the 3' end of lacZ.

Some insight into the mechanism of the initial steps comes from another recent study of functional decay that made use of constructions in which the lacZ RBS and upstream region were replaced with several mutant 5' regions of the rplA gene that varied in translational strength (83). These experiments demonstrated a lag that varied with the strength of the RBS employed; the stronger the RBS, the longer the lag. These results also have bearing on other findings that suggest that translation influences the rate of lacZ messenger mass decay (Section III,D,1).

Notwithstanding that the decay of the lacZ messenger commences (randomly in time) before its 3' end is transcribed, another study indicates that a different situation might be observed with still intact lacZ messages that have been transcribed and cleaved free of the distal lacY sequences (36). When functional decay was assayed in a series of lacZ mutants that possessed altered RBS and/or altered 3'-sequence stabilizers, both features affected functional decay. Thus, the presence of the stabilizing 3' hairpin (Section III,B) does influence the rate of decay. A role for 3' and 5' ends has been advanced in the decay of eukaryotic mRNAs (86). Whereas in general terms, the significance of the 3' end might be less in a message as long as lacZ, it could be much more important for shorter transcripts in which the 3' end might be exposed for some period after transcription is complete.
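The two kinds of lag distinguished in this section, a fixed delay and a lag produced by several randomly timed events, can be compared with a small sketch. The multi-event form used here, 1 - (1 - e^(-kt))^n, is one conventional way to write such kinetics and is an assumption of this example, not a model stated in the text; the rate constant and delay are likewise hypothetical.

```python
import math

def survival_fixed_delay(t, k, delay):
    """No decay until `delay`, then single-hit exponential decay at rate k."""
    return 1.0 if t <= delay else math.exp(-k * (t - delay))

def survival_multi_event(t, k, n):
    """Fraction still intact when n independent events (rate k each) are all required."""
    return 1.0 - (1.0 - math.exp(-k * t)) ** n

def back_extrapolated_lag(survival, t1, t2):
    """Extend the late log-linear decay line back to 100% and return that time."""
    y1, y2 = math.log(survival(t1)), math.log(survival(t2))
    slope = (y2 - y1) / (t2 - t1)
    intercept = y1 - slope * t1
    return -intercept / slope

k = math.log(2) / 1.5          # assumed rate constant, per minute

lag_delay = back_extrapolated_lag(lambda t: survival_fixed_delay(t, k, 1.3), 10.0, 12.0)
lag_two_hit = back_extrapolated_lag(lambda t: survival_multi_event(t, k, 2), 10.0, 12.0)
print(round(lag_delay, 2), round(lag_two_hit, 2))
```

In this sketch the fixed-delay curve extrapolates back to the delay itself, whereas the two-event curve extrapolates to roughly ln(2)/k. Both give a similar apparent lag from late time points, which illustrates why decay curves alone may not distinguish the two cases.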

d. lacZ Messenger Decay Determinants. From several lines of study, the primary determinant of both functional and chemical decay appears to be at the 5' end of the lacZ messenger. Direct evidence for a 5' determinant comes from a variety of studies in which the 5' end of lacZ is altered. Certain changes in the RBS, start codon, adjacent codons, and upstream of the RBS each affect decay to a variable extent (36, 83, 87-90). However, the most compelling evidence is that the fusion of the lacZ sequence to a variety of upstream sequences that normally decay with long lives confers a longer life on the lacZ message (52, 91), even in heterologous cells (92, 93). However, because such alterations may mask the normal determinants of decay, these studies imply only that the 5' end might contain the rate-determining element. They do nonetheless show that sites internal to or at the 3' end of the messenger have no independent role.


e. Cleavage of the lacZY Intergenic Sequence. Site-specific mutagenesis of the 54-base lacZY intergenic sequence suggests that the distinguishing feature of this region that targets it for cleavage is that the sequence is untranslated (94) (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished). Alteration of a variety of potential recognition sites, e.g., the hairpin structure, and random mutagenesis did not reveal sites that influenced the cleavage. By contrast, alterations of the lacZ stop codons, such that translation passed into or through the intergenic region, suppressed cleavage.

2. DECAY OF THE trp mRNA

The five genes of the trp operon of E. coli encode the biosynthetic activities for the synthesis of tryptophan (Fig. 2). The genes of the operon, trpEDCBA, are organized in the reverse order of the steps that their products catalyze (46). Repression control is maintained through tandem mechanisms: the action of the TrpR repressor and a transcriptional attenuator in the leader sequence.

a. Cleavage of the Polycistronic Transcript. Like the lac mRNA, the trp polycistronic transcript undergoes internal cleavages to produce messages whose sizes roughly correspond to gene clusters; the precise sites of cleavage or the ends of these fragments have not been determined. The feature that distinguishes trp messenger decay is that the cleavages appear to occur sequentially from 5' to 3' along the message (46, 50). Labeling of E. coli cultures several minutes after repression of the operon gives RNA molecules labeled (presumably during the last pass of RNA polymerase) in the distal trpA gene transcript. In subsequent minutes, this message, detected by hybridization to an A gene sequence, shifts in size from predominantly full length, EDCBA, to A length, with intermediate fragments, e.g., DCBA, CBA, appearing and disappearing in accord with sequential 5' to 3' cleavages along the messenger. This contrasts to the lacZYA message in which the cleavages appear to be randomly ordered.

The observation of sequential cleavages is supported by several other findings (50). One merits mention. When the trp operon is fused to an upstream stabilizing sequence, the entire polycistronic messenger is stabilized. This argues against the existence of independently accessible cleavage sites along the mRNA. It is further likely that this relates to the fact that translation of the trp genes is “coupled.” That is, ribosomes translate the operon in large part by a full-length passage. Underlying this coupling is the fact that the intergenic sequences are very short; in fact, for two of the trp genes, their stop codons and start codons overlap (Fig. 2 legend) (46). Thus, one might rationalize the difference between the decay of the trp and lac mRNAs as being related to the absence and presence, respectively, of sizable intercistronic regions.
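The contrast between randomly ordered and sequential cleavage can be made concrete by listing the fragments each pattern is expected to produce. The sketch below is purely combinatorial: it treats an operon as a string of gene letters, assumes that cleavage occurs only at gene boundaries, and uses hypothetical function names.

```python
def random_order_fragments(genes):
    """All contiguous segments, as expected if boundaries are cut in any order."""
    n = len(genes)
    return ["".join(genes[i:j]) for i in range(n) for j in range(i + 1, n + 1)]

def sequential_5_to_3_fragments(genes):
    """Segments left as a 5'-to-3' wave removes one proximal gene at a time."""
    return ["".join(genes[i:]) for i in range(len(genes))]

print(random_order_fragments("ZYA"))          # lac operon
print(sequential_5_to_3_fragments("EDCBA"))   # trp operon
```

For lacZYA the first function returns the six species named above (Z, ZY, ZYA, Y, YA, A). For trpEDCBA the second returns EDCBA, DCBA, CBA, BA, and A, every one of which retains the distal A sequence and so would be detected by an A-gene probe.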

b. Determinants of Decay. As a consequence of work done in the labs of C. Yanofsky, F. Imamoto, and D. Schlessinger, the trp mRNA may be among the better examples of the role of 5' decay determinants (95-99). A model to explain the pattern of trp mRNA decay might propose a dominant influence on decay by the transcript's 5' region, and secondary roles for the 5' ends of successive downstream regions that function when upstream sequences are removed in the course of a 5'-to-3' wave of decay (46, 50, 95, 96). In early research, it was found that decay of the 5' end of the trp mRNA begins before 3'-end sequences are synthesized, and that the 3' trpBA messenger decays more slowly than the 5' trpE mRNA, that is, with over a twofold longer half-life (95, 97). Different strategies were carried out to determine the basis of the latter difference. For example, long, internal deletions in trp conferred a shorter half-life on the distal messenger, initially suggesting that its decay depends on its proximity to the 5' end of the message (50). However, the creation at internal sites of new promoters yielded distal transcripts whose decay rates paralleled those of their native counterparts (96). Thus it was argued that each segment possesses individual determinants, and that the slower decay of the 3' end depends on a regional, secondary determinant, not on its distance from the operon's promoter.

That the 5' end has a dominant role was directly demonstrated in a remarkable experiment (60). A λ transducing phage carrying the entire trp operon was found to express the trp genes both from the trp promoter (PT) as well as from the λ PL promoter. The phage carried trp distal to its N gene, also transcribed from PL. Cells infected with the phage could use either promoter or both together, depending on physiological circumstances, i.e., on the presence and state of the λ and trp repressors.
In these experiments, the trp message transcribed from PT was degraded normally in regard to both rate and pattern (60). Message transcribed from PL was stabilized by about 10-fold, the degree increasing with time after infection, such that 16 minutes post-infection the messenger had a half-life in excess of 20 minutes. Additionally, whereas messenger transcribed from the PL promoter was stabilized chemically, the functional decay of the messenger from either promoter remained near normal. These experiments argue that the primary determinant of chemical decay is at the 5' end of the trp messenger, and that there are no distal independent sites. Their results also support the notion that functional and chemical decay are independent, and that untranslated mRNA can persist for a long time under some circumstances. Most often untranslated messenger is degraded very rapidly. Moreover, although it might be thought that the change in the 5' sequence of the messenger could be enough to stabilize the messenger if it remained intact, it was reported, perhaps contrary to expectations, that the λ-trp messenger underwent some fragmentation. Fragmentation is a feature of the decay of λ late messenger (98).

Other experiments suggest indirectly that the secondary, internal determinants of trp mRNA decay might be the 5' ends or RBSs of the downstream genes, possibly exposed after promoter proximal segments are inactivated or removed (but see Ref. 46). This follows from experiments carried out to characterize the role of the trpE RBS in influencing the decay of the messenger (54). When a series of conservative changes was made to this RBS such that the translational yield of the trpE protein was altered, there was a parallel change in the half-life of the trpE messenger. Thus these results imply that secondary regulation can occur at internal RBSs (trpE is distal to the attenuation leader and its leader polypeptide) (Fig. 2B).

Another feature of trp mRNA decay distinguishes it from the decay of some other polycistronic mRNAs, notably lac. There appear to be few 3' stabilizing structures at the ends of the internal trp genes (Section III,B,1). This may be seen in the “antipolarity” or reverse polarity exhibited by some mutations in trp (46). For example, some nonsense or frameshift mutations in trpA lead to a diminished expression of trpB (99). The mechanism of this has not been verified, but it is likely that premature termination of transcription in trpA leads to the destruction of the upstream sequence by 3' exonucleases. Antipolarity is likely to be an extension of the better-studied phenomena relating to polarity mutations in trp, lac, and other operons. Messenger is often undetectable distal to nonsense mutations. Initially, it was difficult to determine whether such messenger was not made, or made and rapidly degraded. And whereas the induction of rho-dependent termination distal to nonsense mutations appears to be the most general explanation, a hyperlability of the distal, untranslated sequence is clearly a component (Section III,D,2). Where termination occurs in the absence of a 3' stabilizing hairpin, the 3' terminal messenger is very rapidly degraded. Antipolarity is observed in the trp operon, where it appears that the destabilization extends 5' from the nonsense mutation to result in the premature decay of upstream transcripts.

3. DECAY OF OTHER POLYCISTRONIC MESSENGERS

As with the lac and trp mRNAs, there is no unifying pattern for the degradation of other polycistronic transcripts; they can be degraded 5' to 3', 3' to 5', or segments can be freed to decay independently. Differential processing and decay of messenger is only one of several mechanisms employed by bacteria to ensure appropriate stoichiometric expression of each cistron; others include operon-internal attenuators and terminators, differential translational efficiencies and translational coupling, active regulation of translational initiation, and multiple transcriptional promoters. If there is a unifying theme for the decay of polycistronic transcripts, it is that often individual genes or small groups of genes incorporate their own determinants for decay. Besides lac and trp, examples of polycistronic messengers with such differential decay include atp (100, 101), ars (102), his (103), malEFG (104), CFA/I fimbriae (105), and pilin (papA) (106). There are exceptions to this generality. The galK operon (galETK) is one such example (107). Its mRNA appears to decay as a unit. The fact that translational coupling operates here suggests that the underlying mechanism relates to the translation of the mRNA as a unit, similar to the trp operon. Additional examples are found among those ribosomal protein operons in which a single ribosomal protein regulates translation and decay of an entire polycistronic transcript (although transcripts from each gene have not been studied individually) (28).

4. DECAY OF THE ompA AND bla mRNAs

The determinants of stability of the monocistronic ompA and bla messengers of E. coli have been extensively studied. The ompA messenger has an unusually long half-life, about 12-18 minutes, in rapidly growing cells; it is one-fourth as stable in slowly growing cells (59). The bla transcript, whose decay is not normally affected by growth rate, has a half-life of about 2-3 minutes (58, 108). Replacing the 5' region of bla with that of ompA confers a three- to fivefold increase in bla messenger stability in rapidly growing cells and sensitivity to the growth rate (109). Translation of the gene products is necessary to maintain messenger stability of both ompA and bla transcripts; termination codons introduced into the genes render the transcripts more susceptible to degradation (110). However, although the untranslated leader and trailer regions of the bla messenger do not decay at rates markedly different from those of the translated regions, translation of something less than the first 56 codons is necessary to bring about stabilization of the messenger (110). The entire messenger is destabilized or stabilized, respectively, when stop codons are inserted at the 26th or 56th codons. The exceptional stability of the ompA transcript is conferred by its 5' UTR, a region with a prominent secondary structure (111-113). To understand its function, Belasco and co-workers systematically altered the 5' region by deletion and insertion. This analysis showed that the proximal 5' hairpin is essential for messenger stability. Moreover, when unpaired nucleotides are introduced proximal to the 5' stem-loop structure, the transcript

is no longer stable (114). It also appears that under these conditions the ompA messenger decays in a 5' to 3' direction (115, 116). From these results, it has been proposed that ompA mRNA decay involves a net 5'-to-3' process (Section III,A,1) initiated by the binding of a ribonuclease, presumably RNase E, to the mRNA's single-stranded 5' end (1, 114, 117). Further, it is postulated that this is an essential step in messenger decay quite generally, and that the ompA hairpin interferes with this process (91). The 5'-hairpin structure is also required for the growth-rate-dependent control over the rate of ompA mRNA decay (111, 115). In cultures growing rapidly, the ompA mRNA decays slowly and degradation occurs 3' to 5', also slowly, such that partially degraded products can be detected (58). In cells growing slowly, decay is rapid, and it appears that individual molecules are degraded rapidly as well; in addition, it is inferred that decay is now 5' to 3' (91; but see Ref. 111).
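The half-lives quoted in this section can be put on a common footing with a short calculation. The sketch below assumes simple first-order (exponential) decay, which is a modelling convenience and not a claim made in the text; the function names and the 10-minute time point are illustrative choices only.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) corresponding to a half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of an mRNA population left after elapsed_min of exponential decay."""
    return math.exp(-decay_constant(half_life_min) * elapsed_min)


# Half-life ranges (minutes) quoted in the text for rapidly growing cells.
half_lives = {"ompA": (12.0, 18.0), "bla": (2.0, 3.0)}

if __name__ == "__main__":
    # Assumption: 10 minutes is an arbitrary illustrative time point.
    for name, (low, high) in half_lives.items():
        left_low = fraction_remaining(low, 10.0)
        left_high = fraction_remaining(high, 10.0)
        print(f"{name}: {left_low:.1%} to {left_high:.1%} remaining after 10 min")
```

Under this assumption, a messenger with a half-life of a few minutes is largely gone within the time that a messenger such as ompA takes to fall by half, which is the sense in which the ompA transcript is called exceptionally stable.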

C. Regulation of mRNA Decay

The rate of bacterial mRNA decay was initially thought to be passively mediated. However, in many instances now known, the converse is true: messenger processing and decay are active components of gene regulation. For example, expression of ribosomal proteins is coordinately regulated by conditions of growth; a component of this regulation is control over mRNA decay. Chemical effectors also mediate messenger decay: regulation of the erm and cat genes, which provide resistance to erythromycin and chloramphenicol in B. subtilis, involves control by the antibiotics over the stability of their messengers. Decay of several messengers, e.g., ompA, is regulated by the growth rate of the cell. Finally, the decay rates of the mRNAs encoding RNase E, PNPase, and RNase III are either autoregulated or regulated by other ribonucleases.

1. RIBOSOMAL PROTEINS

Synthesis of the 52 ribosomal proteins is post-transcriptionally regulated in regard both to growth rate and to providing appropriate amounts of each gene product (118, 119). Most of the genes are clustered into large operons, and in most cases studied a single gene product regulates the expression of all of the genes within its operon. Typically, this feedback regulation acts on a site within the 5' UTR of the mRNA and affects all of the downstream cistrons of the ribosomal protein (r-protein) operon. By using plasmids containing the L11 r-protein regulatory region under the control of the lac promoter, it has been shown that the translational regulation affects the decay rate of the r-protein mRNA (28). The presence of additional copies of transcripts with an intact feedback site increases the decay rate of the messenger; when the regulatory site is rendered nonfunctional, the messenger decays at a slower rate. Other examples of regulation affecting messenger stability are found in the spc, alpha, str, and S20 r-protein operons. The decay of the spc mRNA is thought to involve retroregulation (Section III,B,1); that is, the polycistronic transcript appears to be cleaved endonucleolytically, and 3' exoribonucleases remove the upstream transcript to provide for equimolar production of upstream and downstream r-proteins (120). In the case of each of these operons, additional copies in the cell (on plasmids) result in increased rates of mRNA synthesis and the concomitant destabilization of the overproduced transcripts so that the mRNA concentration is only marginally increased (121-124). Stability of the r-protein S20 transcript requires its distal transcriptional terminator; deletion of this region increases its rate of decay 2.5-fold (56, 125).
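The compensation described for plasmid-borne r-protein operons (more synthesis, faster decay, little change in concentration) can be illustrated with a steady-state balance. This is a minimal sketch assuming constant synthesis and first-order decay; the synthesis rates and half-lives below are hypothetical numbers chosen for illustration and are not taken from the studies cited.

```python
import math


def steady_state_level(synthesis_per_min: float, half_life_min: float) -> float:
    """mRNA level at which constant synthesis balances first-order decay."""
    decay_constant = math.log(2) / half_life_min
    return synthesis_per_min / decay_constant


def fold_change(synthesis_fold: float, half_life_fold: float) -> float:
    """Fold change in steady-state level; the two fold changes multiply."""
    return synthesis_fold * half_life_fold


# Hypothetical values: chromosomal copy only versus extra plasmid copies.
baseline = steady_state_level(synthesis_per_min=10.0, half_life_min=2.0)
with_plasmid = steady_state_level(synthesis_per_min=50.0, half_life_min=0.5)

if __name__ == "__main__":
    print(f"fold change in mRNA level: {with_plasmid / baseline:.2f}")
```

In this toy case a fivefold rise in synthesis combined with a fourfold shorter half-life leaves the steady-state level only modestly higher, which is the qualitative behavior reported for these operons.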

2. INDUCED STABILITY OF ANTIBIOTIC RESISTANCE GENES

The erm genes of B. subtilis and Staphylococcus aureus (ermC and ermA, respectively) encode resistance to the antibiotic erythromycin, an inhibitor of translational elongation. Under normal growth conditions, each messenger decays with a half-life of 2 to 3 minutes; however, in the presence of antibiotic, their half-lives are increased to 17-40 minutes, an increase of 7- to 20-fold (93, 111, 126). Within the 5' leader region of the ermC gene, a leader polypeptide is encoded that, under normal conditions, is efficiently translated. This translation, in effect, stabilizes a hairpin configuration in the downstream messenger that sequesters the RBS of the erm structural gene. The resulting inhibition of translation is accompanied by rapid mRNA degradation. In the presence of erythromycin, the ribosomes translating the leader polypeptide stall, and an alternate secondary structure of the downstream transcript forms that allows translation of the erm gene. Under these conditions, not only is the gene product produced, but the entire transcript is stabilized. A similar mechanism regulates ermA expression. Of two polypeptides encoded in the ermA leader, only the upstream one is necessary. Substantial research supports this model. Fusion of the erm leader region to the lacZ structural gene shows that the chimeric messenger is stabilized in the presence of erythromycin (126, 127). Studies using dimethyl sulfate to probe the secondary structure in vivo of the ermC messenger's leader demonstrate the conformational alteration of the transcript under the two conditions (128). Further, supporting the importance attributed to translation of the 5' polypeptide, deletion analysis and mutagenesis of the proximal portion of the erm leader reveal that sequences downstream of the initial 10 codons of the

leader's open reading frame are not required for mRNA stability (129, 130). Untranslated sequences added upstream of the leader polypeptide do not appear to affect messenger stability of ermA; however, such upstream additions to the transcript are unstable, indicating that messenger decay proceeds 5' to 3' and that the stalled ribosomes prevent decay of only the translated sequences (131, 132). The chloramphenicol-induced stability of the cat messenger in B. subtilis appears to operate in an analogous manner (133). In the ermC operon, both functional inactivation and chemical decay are prolonged by erythromycin; in the ermA operon, although the transcript is chemically stabilized, the rate of its functional inactivation is unaltered (126, 134). The basis of this difference is not known.

3. GROWTH-RATE REGULATION

Bacterial growth is accelerated in nutrient-rich media and slowed under nutrient-poor conditions. And whereas the expression of individual genes may vary by several orders of magnitude under different growth conditions, the actual rates of chain elongation of proteins and nucleic acids are remarkably little affected (15, 70). Thus, it may not be altogether surprising that, in general, messenger half-lives are relatively unaffected by growth conditions. As an extreme example, when exponentially growing bacteria (E. coli) are starved for exogenous glucose, total messenger half-life is prolonged only 2.2-fold (135). Further, the rate of decay of the lac, lpp, and bla messengers appears to be unaffected over a wide range of doubling times (37, 108). However, the decay rates of some transcripts respond differently. As mentioned previously, both the ompA and cat transcripts of E. coli decay more rapidly in nutrient-poor environments (59, 108). Cleavage of the 9-S rRNA precursor to 5-S rRNA is also modulated by growth rate, but it is slowed in a nutrient-poor environment (108). Because RNase E mediates the cleavage of both the ompA transcript and the 9-S precursor, this growth-rate effect is unlikely to be mediated solely by RNase E. An exception to these generalities may pertain to E. coli cultures growing exceptionally slowly, i.e., anaerobically with a generation time of 700 minutes. Based on the observation that under such slow-growth conditions both the bla and ompA transcripts exhibited longer half-lives (concomitant with a decrease in the rate of 9-S rRNA processing), it has been suggested that mRNA decay is slowed generally (136). Possibly the pool of mRNA is maintained in very slowly growing cells, in which rates of synthesis are low, by slowing the rate of mRNA decay (136). Recently, the GroEL protein has been implicated in this process (Sections IV,B,1 and IV,E).
Bacteria, as with cells of higher organisms, respond to a sudden increase in environmental temperature by inducing the synthesis of a set of proteins

and suppressing the synthesis of others. This heat-shock response allows the organism to respond to the stress; it is mediated at the transcriptional level. Messenger decay does not appear to be affected by heat shock in general, although a modest effect on the atp (unc) mRNA has been reported (137, 138).

4. AUTOREGULATION OF RNASE GENES

Regulation of the rne, rnc, and pnp genes, encoding RNase E, RNase III, and PNPase, respectively, is mediated post-transcriptionally by way of processing and degradation of their mRNAs. Moreover, the degradation of these messengers is autoregulated (8). RNase E cleaves its own transcript at a site about 220 nucleotides downstream from the 5' end; the cleavage is abolished in rne temperature-sensitive mutants (139). The autoregulation of RNase E modulates the half-life of the rne transcript over a range of less than 40 seconds to over 8 minutes (140). Consistent with this, increasing the copy number of the rne gene 21-fold yields only a small increase in cellular RNase E.

Expression of rnc (RNase III) and the downstream era gene is autoregulated by an RNase-III cleavage at a site in the 5' UTR of rnc (8, 141).

Studies by M. Grunberg-Manago and colleagues indicate that regulation of pnp messenger abundance is also influenced by RNase III, as well as by RNase E and PNPase (142, 143). The pnp transcript possesses an RNase-III site at its 5' end in much the same fashion as the rnc operon does. An rnc mutant expresses 11-fold more PNPase than does a wild-type strain. Under these same conditions, the pnp mRNA half-life is increased over 20-fold and there is a 10-fold increase in the abundance of pnp mRNA. Additionally, in vivo and in vitro studies reveal that there is an RNase E site in the upstream region of the pnp transcript. Very recent studies also indicate that PNPase modulates its own synthesis by a mechanism that does not involve its catalytic activity. Characterization of a mutant overexpressing PNPase reveals that the enzyme normally binds at a 5' site on the RNase-III-cleaved pnp messenger, affecting expression and decay of the transcript (C. Portier, personal communication).

III. Determinants of Decay

Determinants are those features of an RNA molecule that either promote decay or stabilize against decay (111, 117). Determinants may also influence what decay intermediates are present and their relative concentrations. A number of examples are given in the discussions of specific messengers in Section II. In this section, additional examples are given and some remarks

are made about how such determinants might function. In addition, the role that translation plays in decay is reviewed, and the recent evidence for a role of polyadenylylation in the decay of bacterial messenger is considered.

We know very little about the determinants of functional inactivation or their relationship to chemical decay. A priori, a message might be inactivated by cleavage near its RBS or at any point in the coding sequence; it might be inactivated by the binding of a factor (protein or RNA) such as to occlude the RBS; or a message might fold on itself to sequester the RBS and inactivate the message. It appears that examples of each of these occur, but as yet we have no evidence as to how general such mechanisms are.

The fact that decay can be affected in many instances by translation has turned attention to the 5' determinants of translation; the fact that decay can be affected by 5' stabilizers has brought scrutiny to the other features of the 5' end. The 5' region contains a number of candidate features: the RBS and its associated subsites, the untranslated leader, the 5' terminus per se, and the 5' triphosphate that is present on nascent transcripts. This region is also the first to be transcribed and is therefore the most likely to be affected in a random process.

Determinants influencing chemical decay appear to be present in 5' regions of a number of messengers. For monocistronic mRNAs and polycistronic, translationally coupled mRNAs, they seem most often to be the primary determinant or rate-limiting element of decay. Messengers often possess 3' hairpins that stabilize them from the 3' exonucleases of the cell. The latter, with notable exceptions, do not seem to have independent roles in determining rate of decay, perhaps because the cell most often eschews 3' to 5' decay as a primary mechanism. Such decay could lead to the synthesis of truncated polypeptides.
[On the other hand, there is evidence suggesting that 10-20% of translation products terminate prematurely (see Ref. 11, and references therein).] As indicated in the discussion of specific operons, internal determinants that specify endonucleolytic cleavages also occur. In some polycistronic mRNAs (Section II,B,3), these lead initially to the freeing of cistronic or other segments of the transcript, allowing them to decay at different rates. In many cases, these cleavages appear surprisingly gratuitous. In the operon encoding r-proteins L10, L7/L12, and the β and β' subunits of RNA polymerase, there is little visible effect of eliminating the cleavage distal to the L7/L12 transcript. Thus, we are left to assume that it plays a role in regulation of expression by some subtle means (144). In this and similar cases, it is not known whether these events are processing cleavages or are part of a decay pathway. We are aware of only a few instances in which likely primary cleavages have been identified within monocistronic messengers (145, 146), although

the complex set of fragments of the trx mRNA that accumulate in an E. coli mutant strain deficient in RNase E, RNase III, and PNPase includes fragments that most likely were formed by internal endonucleolytic cleavage (146). Although such examples are rare, we speculate that such paths occur frequently, particularly where translational inactivation occurs prior to and independent of any cleavages (Section V,B).

After the initial rate of decay is determined, secondary endonucleolytic cleavages may occur in a targeted molecule. In the decay of the trp mRNA, there is evidence that the messenger is cleaved at secondary sites in 5' to 3' order as the messenger decays. Internal sites have also been mapped in monocistronic mRNAs, such as r-protein S20 (rpsT), ompA, and trx (64, 146, 147) (Section III,C,3). It has not been fully established if these are secondary cleavages, dependent on a prior cleavage of the mRNA. However, in the case of the trx messenger, it has been proposed that inactivation involves the 3' end and cleavages progress from 3' to 5' (148).

At first glance it appears as if many of these determinants are simple features of sequence or structure. An RNA may be cleaved at a sequence rich in As and Us, as present in the RNase-E site of RNA I, or at a hairpin by RNase III, as in the case of the rnc messenger (Section II,C,4). However, it is important to stress that all such determinants are likely to be complex and modulated, perhaps in several ways. RNase-E sequences can be obscured by their folding into secondary structures. RNase-III sites, for which secondary structure is required, can be obstructed by alternative folding (as occurs in retroregulation) or suppressed by translation. The strong influence of secondary structure on an RNA's susceptibility to RNases (Section IV) provides a ready means by which primary and secondary cleavages can be regulated, and there is ample evidence that such regulation occurs.
Structure can be influenced by translation and in several other ways, direct or indirect, by the binding of proteins. For example, it has been proposed that the stabilization of mRNAs by 3‘ hairpins involves a protein factor (Section III,B,l). Thus it is unlikely that determinants are simple static sites on a messenger. Rather, it appears that they involve a number of elements that interact either over short or long distances, or through events that are propagated along a molecule’s backbone. Such propagation might depend on the movement of ribosomes, movement of enzymes, movement of factors (similar to rho or other helicases), or the cooperative binding of proteins (Section IV,E).

A. 5' Determinants

1. 5' STABILIZERS

Knowledge of the role of the 5' region comes in good part from study of 5' determinants that confer exceptional stability (111). Altering such 5' features destabilizes an mRNA, whereas fusing them onto some other mRNA confers greater stability on the recipient. These findings are revealing. They suggest that the 5' end of both the stable mRNAs and those RNAs that are stabilized as recipients contains a rate-determining element. This does not mean that these are the only elements involved in the initial step of decay, nor necessarily the site of initial cleavage, although both might pertain. In addition to the aforementioned ompA mRNA and the messengers of the antibiotic resistance genes, erm and cat, other transcripts that have 5' stabilizers include the λ PL transcript, the T4 gene product-32 messenger, and the sdh mRNA of B. subtilis. With the apparent exceptions of ompA and sdh, each is likely to involve a trans-acting RNA-binding protein that is required for stability. Each is discussed further below.

The decay characteristics of a λ PL-trp fusion messenger are discussed in Section II,B,2 (60). When the trp messenger was produced as a fusion from the λ PL promoter, it showed a 10-fold increase in stability. Interestingly, functional inactivation was unaffected, implying that it is determined independently of chemical decay. In retrospect, these results suggest that the binding of the λ N protein to the 5' end of the PL-derived transcripts stabilizes the mRNA (111).

Krisch and collaborators studied a somewhat similar situation occurring in T4-infected E. coli cells (3, 92, 149). The long, gene-32 transcript of phage T4 is cleaved to release the distal gene-32 message and an upstream message; the latter then decays rapidly. In infected cells, the gene-32 product, an RNA-binding protein, regulates its own expression by binding its messenger near the Shine-Dalgarno sequence and stabilizing it (149a). Consistent with this, when the leader region of the gene is attached upstream of lacZ, the fusion transcript has a >10-fold longer half-life in T4-infected cells than in uninfected cells (149).
The 5' region also appears to be dominant in regulating stability of the sdh mRNA in B. subtilis (150). Decay of this three-gene polycistronic transcript occurs in an all-or-none fashion, and translation of the 5' gene is required for downstream expression. As described above, the hairpin structure at the 5' terminus of the ompA mRNA is essential for its exceptional stability; indeed, the addition of only a few nucleotides to the 5' end of the hairpin destabilizes the mRNA. From these studies and by comparing similar dyad structures at the 5' end of E. coli papA, T4 gene-32, and R. capsulatus pufBA mRNAs, it has been suggested that E. coli contains a ribonuclease that binds to messengers with more than about four unpaired nucleotides at the 5' terminus (91, 106, 116, 151). Because RNase E is involved in the decay of some of these molecules, it is presumed to be this enzyme.

2. 5' DETERMINANTS OF DECAY

RNAs that can be stabilized by fusion to a 5' stabilizer include the lacZ, bla, and trp mRNAs and the plasmid regulatory RNA, RNA I. What feature normally determines their decay, in the absence of such a stabilizer? Presumably, whatever it may be, it is a feature of many messengers, those that decay with average half-lives and probably have their primary decay determinant at the 5' end (91). In a number of instances, mutational alteration of an RBS affects both the frequency of translation and the half-life of the mRNA (54, 90). In such cases, the faster decay of the mRNAs for poorly expressed proteins results in a constant relationship between the level of the mRNA in the cell and the protein produced. This observation has been interpreted to indicate that translation directly modulates decay, perhaps by a competition between the ribosomes and the decay system. However, this explanation appears limited because in some of these same cases, other alterations, equally affecting translation, have little effect on decay (88-90). Taken together, these results suggest that in some cases a 5' primary determinant may overlap with the RBS, but in others, perhaps in most others, it does not. It may also be that in the case of some polycistronic mRNAs, e.g., trp, the removal of the 5' regions of the mRNA exposes downstream RBSs that influence the rate of decay of the distal regions. Beyond these generalities, the nature of the 5' determinants is not clear. Many studies have been carried out in which a 5' UTR or RBS has been altered, but no simple conclusions can be drawn. In the lacZ mRNA, for example, mutations in the 5' proximal portion of the 38-base UTR, 22 to 34 bases proximal of the start codon, measurably alter the decay rate and translation, but there is no correlation between the two (87).
[The footprint of the ribosome on the lacZ mRNA is from about 23 to 63 (152).] Indeed, for the ompA mRNA, alterations of its 5' stabilizer, i.e., the 5' end and adjacent hairpin, as well as sequences near the RBS, affect decay rates (112, 114, 116). Moreover, the 5' stabilizer itself has no effect on translation (115). Only a few studies bear on the role that the 5' triphosphate of an mRNA might play in mRNA decay. In one study (153), the distribution of triphosphates on polysome-associated mRNA was investigated. About 15% of the polysome-associated mRNA molecules were estimated to possess 5' triphosphates, and this abundance was relatively constant and independent of the size of the polysomes. From this, it was concluded that, contrary to the rapid removal of the 5' ends that occurs with tRNA and rRNA precursors, the 5'-triphosphate termini of mRNAs probably persist until the molecules are degraded.

A second study (53), from Stanley Cohen's laboratory, exploited the fact that the initial, rate-limiting step of RNA-I degradation involves cleavage by RNase E to remove five nucleotides from its 5' end. Further, the distal decay intermediate produced has a structure resembling the 5' stabilizer of the ompA gene; that is, there are only four unpaired bases at the 5' end, preceding a stable hairpin structure. To characterize the fate of a decay-intermediate analog that possesses a 5' triphosphate instead of a 5' monophosphate, a deletion was made that removed five bases from the 5' end of the gene that encodes RNA I. The analog was stable compared to the intermediate, which is normally degraded so rapidly as to be undetectable in rne+ cells. Such stability is similar to that obtained when the RNase-E cleavage site in RNA I is replaced or when RNase E is inactive. If one assumes that the replacement of the 5' monophosphate by a triphosphate does not induce secondary changes, then one can conclude that RNA-I degradation involves distinct primary and secondary processes. Further, after the primary RNase-E cleavage, the secondary process requires a substrate not possessing a triphosphate. This secondary process, at least for RNA I, also appears to involve 3' polyadenylylation (153a). Finally, these two studies imply that, even when untranslated, nascent transcripts (i.e., those with 5' triphosphates) are stable in the cell until they enter into a specific process for their removal.

B. 3' Determinants

1. 3' HAIRPINS

The 3' ends of many bacterial transcripts contain hairpin structures that are essential for the stability of the upstream transcripts (117). Moreover, if these hairpins are not at the 3' end of the original transcript, they may come to form the 3' end through the action of 3' exonucleases. Indeed, it appears that the role of these distal hairpins is to impede the exonucleases, PNPase and RNase II. Such hairpins are also found internally, e.g., in intercistronic regions of polycistronic transcripts, and come to form the 3' end of a segmental mRNA after downstream cleavages. As presented in the following discussion, such stabilizers have been found in transcripts of E. coli, Salmonella typhimurium, a diversity of other bacteria, and several bacteriophages. In addition, hairpins whose function has not been directly established are also found at the 3' ends of many transcripts, or at the 3' ends of messengers formed by processing (80, 154, 155). Their function as stabilizers can only be surmised, but it appears that rho-independent terminators, which incorporate a stable hairpin, play roles both in termination and stabilization of transcripts (156). Indeed, there is reason to believe that some putative rho-independent terminators that have been identified by direct analysis of an in

vivo transcript's 3' end function primarily as stabilizers and are not terminators in vivo (55, 157). It appears that any stable hairpin can serve as a 3' stabilizer (117). However, in E. coli and S. typhimurium, some of these structures are members of a family, the REP (repeated extragenic palindrome) sequences, that possesses a 38-base conserved motif (117, 158, 159). There are an estimated 500 REP sequences in these genomes. Most often present in clusters of two or three, they are located at the ends of operons or in intergenic regions within operons. It is presumed that they have a function beyond the stabilization of mRNA, but what that might be is unknown. Although transcripts of REP elements have secondary structures similar to rho-independent terminators, they do not cause transcriptional termination (103, 104). REP sequences can be found in a number of the operons mentioned above, e.g., distal to the hisG and lacY genes (155, 158). Deletion of the REP elements distal to the malE gene in the E. coli malEFG operon destabilizes the upstream malE transcript; similarly, addition of REP sequences to messengers that do not have them can stabilize upstream transcripts and increase protein synthesis (103, 104). However, stabilization of upstream transcripts is not a uniform property of REP sequences: deletion of the cat messenger's REP sequences has no effect on its half-life (160).

Two termination sites have been found distal to the trp operon (46). The first, trpt, is a weak rho-independent terminator, whereas the second, trpt', is a strong rho-dependent site that accounts for most of the termination in vivo. However, when cellular transcripts are analyzed for their 3' termini, only ends that correspond to trpt are found. It has been shown that, in vitro, the full-length transcript can be processed to trpt by RNase II, and that the trpt sequence stabilizes the upstream messenger in vivo (157).
The importance of the 3' structures of the monocistronic glyA mRNA was discovered following an observation that the insertion of phage Mu cts in the gene's 3' UTR leads to a 70% reduction in expression (161). The glyA gene's 182-base 3' UTR contains two REP sequences and a rho-independent transcriptional terminator; the inserted prophage was found between the gene's stop codon and the first REP sequence. Deletion of either the REP sequences or the terminator had only slight effects. However, deletion of both features markedly reduced protein expression and messenger stability. Mutations in both pnp and rnb, but not either alone, restored messenger stability and protein expression of the REP-deleted constructs, consistent with these 3' hairpin structures preventing exonucleolytic decay. Notwithstanding the apparent importance of the 3' structures in glyA mRNA decay, cleavages in the 5' UTR can be detected during its decay. Chloramphenicol stabilizes both the wild-type and deletion-carrying transcripts, and further enhances the accumulation of the 5'-cleaved decay intermediates. Either the

186

DONALD P. NIERLICH AND GEORGE J . MURAKAWA

decay of glyA mRNA involves two distinct mechanisms, or the cleavages in the 5‘ UTR, being outside of the region that is traversed by ribosomes, are not a part of the decay process (161). The transcript coding for A int-gene expression is regulated by differential RNA processing and messenger stability (75,162). The int gene is transcribed from two promoters, P, and P,. When int is expressed from the PI promoter, the transcript synthesized terminates immediately distal to the sib site. This transcript is relatively stable, and high levels of Int are produced. When int is transcribed from the P, promoter, a longer transcript that extends well beyond sib is made. The transcript forms a hairpin structure at sib, which is cleaved by RNase I11 to generate a single-stranded 3’ end (163). This transcript is unstable; thus Int is produced at a low level (164). Such post-transcriptional regulation has been termed retroregulation. In contrast, processing of the T7 mRNA by RNase I11 results in the stabilization of the upstream transcript after cleavage and formation of a 3’ hairpin structure (165). Further examples of the stabilization of messengers by 3’ stem-loop structures can be found in the E . coli r-protein S20 messenger (56), lac2 messenger (36, 55); (C. Kwan and D. P. Nierlich, unpublished), and crp mRNA (156). Also included are transcripts of the his operon of S. typhimurium (166), the puf operon of Rhodobacter capsulatus (167-169), the cry gene from Bacillus thuringiensis (170), the mbhA gene from Myxococcus xanthus (171), and phage +X174 (172). In the case of the cry gene, it is the secondary structure, not the sequence itself, that is important for the messenger’s increased stability, and mRNA stabilization is accompanied by increased protein production (170). The m b u messenger of M . xanthus is unusual in that its half-life is extremely long, about 150 minutes at 18 hours of myxospore development. 
Further, deletion of the 3’ UTR of the transcript, which has an extensive, complex secondary structure, leads to a 10-fold increase in decay of the mphA mRNA (171). Similarly, sequences at the 3‘ end of the polycistronic 4x174 mRNA are important for both chemical and functional stability of the transcript (172). Although it is widely believed that these 3’ hairpins directly protect mRNAs from digestion by the 3’ exonucleases, which show a limited ability in uitro to degrade hairpins, Higgens (173) has proposed that, in uiuo, a factor plays an additional stabilizing role. This exoribonuclease impeding factor (EIF), it is suggested, interferes with the activity of PNPase at stemloop structures and enhances their resistance to degradation (Section IV, E).
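The half-life figures quoted above for the mbhA messenger can be put on a common footing with a short calculation. What follows is a minimal sketch, assuming simple first-order (exponential) decay and reading the "10-fold increase in decay" as a 10-fold larger rate constant; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of an mRNA population left after elapsed_min minutes."""
    return math.exp(-decay_constant(half_life_min) * elapsed_min)


# Value from the text: mbhA mRNA half-life of about 150 minutes.
WILD_TYPE_HALF_LIFE_MIN = 150.0
# Assumption: a "10-fold increase in decay" is a 10-fold larger rate constant,
# which shortens the half-life 10-fold.
FOLD_INCREASE_IN_DECAY = 10.0
deleted_utr_half_life_min = WILD_TYPE_HALF_LIFE_MIN / FOLD_INCREASE_IN_DECAY

if __name__ == "__main__":
    cases = (
        ("wild type", WILD_TYPE_HALF_LIFE_MIN),
        ("3' UTR deleted", deleted_utr_half_life_min),
    )
    for label, t_half in cases:
        left = fraction_remaining(t_half, 60.0)
        print(f"{label}: half-life {t_half:.0f} min, "
              f"fraction left after 60 min = {left:.3f}")
```

Under these assumptions the deletion would shorten the half-life from about 150 to about 15 minutes; a different reading of "10-fold" would change the numbers but not the form of the calculation.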

2. ROLE OF POLY(A)

Eukaryotic messengers are known to undergo extensive post-transcriptional modification in the nucleus, including addition of 5' cap structures, removal and splicing of introns, and addition at the 3' terminus of poly(A) sequences of 50 to 200 nucleotides. The capped and polyadenylylated termini are strongly implicated in one pathway of decay (86, 174). Early work showed that labile RNA species of Caulobacter crescentus are 3' polyadenylylated, whereas the stable RNAs are not (175). By contrast, RNA from E. coli appeared to possess poly(A) tails at a much lower level, i.e., <1% (176). This latter finding plausibly deterred further study, notwithstanding that E. coli was known to possess an active poly(A) polymerase; it is in fact still the enzyme of commerce. However, the presence of 3'-poly(A) tails on the labile RNAs of diverse eubacteria and several archaebacteria was subsequently demonstrated (e.g., 177, 178). In time, further research on E. coli and B. subtilis, particularly by N. Sarkar and co-workers, has brought a realization that polyadenylylation plays a role in the RNA metabolism of all bacteria (179-183).

It has now been shown that a few specific mRNAs of E. coli possess poly(A) tails, and that their apparent low abundance is the result of the short length of their tails and the procedures used in their isolation (148, 180-182). Although there have been suggestions that the poly(A) tails from some bacteria are longer or more abundant than those of E. coli (182), it is difficult to document consistent differences (177, 179, 184). It seems possible that the methods of growth and isolation, and the phenotype of the particular strain used, will prove to be as large factors as the variation between species in this regard (177, 179, 182, 185). When RNA is isolated from normally growing cultures of E. coli K12, the poly(A) tails are about 7 to 20 nucleotides in length, with a few extending to about 40 nucleotides (148). This length appears to be the result of the dynamic state of the synthesis and degradation of the poly(A) tracts. Support for this comes from the observations that the mRNAs of E. coli mutants deficient in RNase II and PNPase, or these two and RNase E, have poly(A) tails of 50 or more bases (148, 182). Such length is similar to that observed in other bacteria, e.g., Bacillus strains and E. coli strain B (186, 187). It might be noted that B. subtilis has been found to lack RNase II (71).

Values reported for the abundance of polyadenylylated species still vary widely, in the range of 1.3-15% for E. coli and perhaps to 40% for Bacillus species (179, 182, 186). Obtaining high values appears to depend on the use of low temperature (4°C) for the annealing step during oligo(dT) purification, and on the method of RNA extraction (179, 185). Abundance in this general range, i.e., 20%, suggests that most completed mRNA chains are polyadenylylated (171, 185). In a rapidly growing culture, label is distributed approximately equally between mRNA and ribosomal RNA transcripts, and a substantial fraction of the mRNA is incomplete at any time (53, 65, 76). A key assumption here is that rRNA and tRNA precursors are not transiently polyadenylylated. This is supported by the observation that, in E. coli, the fraction of the pulse-labeled RNA that is polyadenylylated is inversely related to the synthesis of the stable RNAs that occurs during normal growth and during amino-acid starvation (185).

Supporting the view that a substantial fraction of the pool of free messengers possess poly(A) tails, about 40% of the E. coli trpA and lpp messengers were shown to have poly(A) tails of 10 to 20 nucleotides (180, 181). However, whereas it does appear that many mRNAs in the pool are polyadenylylated (148), it does not appear likely that all species are adenylylated to the same extent or for the same time. Two cDNA clone libraries were constructed with RNA from the eubacterium Thermotoga maritima. One was randomly primed, and one was constructed with a poly(dT) primer. When more than 100 clones from each were sequenced, it proved that each library was quite distinct. The random-primed library contained mRNAs of proteins known to be particularly abundant; the poly(dT)-primed library contained a different set, some members of which might be thought to have regulatory roles (178).

In recent work, cDNA clones were obtained from polyadenylylated species containing the 3' regions of the E. coli trp and lpp transcripts and the hag mRNA of B. subtilis (182). Sequence analysis of individual molecules showed them to be polyadenylylated at sites just distal to 3' hairpins that are the genes' transcriptional terminators, but more often, at sites proximal to the 3' hairpins (182, 183). Based on these results, it was proposed that polyadenylylation plays an essential role in turnover by destabilizing mRNA (182). The limited results to date prevent one from obtaining a clear view of the role of polyadenylylation.
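The abundance argument made earlier in this section (roughly 20% of labeled RNA polyadenylylated, label split about equally between mRNA and rRNA, and a substantial fraction of mRNA still incomplete) can be written out as simple arithmetic. This is only a sketch: the text does not give the incomplete fraction, so the value of 0.5 used below is a hypothetical placeholder, and the function name is invented for illustration. It also adopts the text's stated assumption that stable RNA precursors are not polyadenylylated.

```python
def completed_mrna_polyadenylylated_fraction(
    poly_a_fraction_of_label: float,
    mrna_share_of_label: float,
    incomplete_mrna_fraction: float,
) -> float:
    """Estimate the fraction of completed mRNA chains that carry poly(A).

    Assumes that only completed mRNA chains (not nascent chains, and not
    rRNA or tRNA precursors) contribute to the polyadenylylated label.
    """
    completed_share = mrna_share_of_label * (1.0 - incomplete_mrna_fraction)
    if completed_share <= 0.0:
        raise ValueError("no completed mRNA under these inputs")
    return poly_a_fraction_of_label / completed_share


# From the text: about 20% of label polyadenylylated; about half of label in mRNA.
POLY_A_FRACTION = 0.20
MRNA_SHARE = 0.50
# Hypothetical value: the text says only "a substantial fraction" is incomplete.
INCOMPLETE_FRACTION = 0.50

estimate = completed_mrna_polyadenylylated_fraction(
    POLY_A_FRACTION, MRNA_SHARE, INCOMPLETE_FRACTION
)

if __name__ == "__main__":
    print(f"Estimated polyadenylylated fraction of completed mRNA: {estimate:.2f}")
```

With these illustrative inputs the estimate is 0.8, in line with the statement that most completed chains would be polyadenylylated; with no incomplete chains at all it would fall to 0.4.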
The discussion above, suggesting that much of the mRNA pool is polyadenylylated, might be misleading if the rate of turnover of polyadenylylated RNAs were different from (e.g., much slower than) that of other species. This leads to the possibility that rather than playing an essential role in mRNA decay, polyadenylylation plays a circumscribed role; e.g., poly(A) sequences might act primarily to remove molecules that persist. This possibility gains support from several observations. The first stems from the finding, detailed in the following paragraph, that although polyadenylylation is required for turnover of RNA I, mutant strains deficient in polyadenylylation do carry out the initial, 5' cleavage of the molecule (188). The second relates to the presence of molecules partially degraded at the 3' end among the polyadenylylated RNAs that were isolated from E. coli cells carrying RNase mutations (182, 188) (S. Kushner, personal communication). This might arise if slowly turning-over, relatively stable molecules were polyadenylylated. The third observation is that the pcnB gene, which encodes PAP I poly(A) polymerase, is not essential for growth (189, 190).

Poly(A) tails have been found to be present on the ColE1 regulatory RNA, RNA I (188-190). RNA I is an antisense RNA that is involved in the control of replication of the ColE1 group of plasmids. Although not translated, it has some mRNA-like properties: it is unstable with a half-life of <2 minutes (51), and its degradation is dependent on cleavage by RNase E (51, 192), which in turn is blocked by the 5' stabilizer of the ompA mRNA (114). The poly(A) tails appear directly involved in the turnover of RNA I; mutation of the gene pcnB (plasmid copy number) disrupts copy-number control and leads to the accumulation of RNA I cut at a 5'-proximal RNase-E site (188, 190). It is notable that only a small fraction of the RNA-I molecules isolated from wild-type cells have a poly(A) tail (188). It appears that only 1 of 40 molecules is polyadenylylated. This indicates that the poly(A) tail is only present for a fraction of the life of the molecule. Either nonpolyadenylylated molecules have a long life, or molecules, once polyadenylylated, decay rapidly.

As indicated, the E. coli pcnB gene encodes a poly(A) polymerase, PAP I (189). A pcnB mutation slows RNA-I turnover to & (188). Nevertheless, a strain with a deletion of the gene grows normally and maintains plasmids that are not regulated by antisense control at normal copy numbers (190). In spite of its strong effect on the turnover of RNA I, a defect in pcnB results in only slight effects on the turnover and patterns of decay of several E. coli mRNAs (148). One can detect, however, the accumulation of slightly shorter transcriptional products, presumably lacking their 3' extensions. On the other hand, in multiply mutant strains, a pcnB deletion produces a much stronger effect. Strains carrying mutations in the genes encoding PAP I, RNase II, PNPase, and RNase E show a striking slowing of the decay of several messengers, although not uniformly so (148). These observations have fostered a model in which polyadenylylation plays a role in a 3'-to-5' endonucleolytic pathway that complements one involving RNase II, PNPase, and RNase E (148).

A second poly(A) polymerase (PAP II) has been detected that constitutes about 10% of the total activity in wild-type E. coli (193). Consistent with this, pcnB mutants have a 90% reduced level of polyadenylylated RNA (148). There is as yet no evidence as to what the role of PAP II might be. Interestingly, in keeping with a role for PAP I in the polyadenylylation of mRNA, it has been noted that, among the diverse PAPs that have been characterized biochemically, it has greatest similarity to the mammalian PAP that also appears to act on mRNA (see Ref. 182, and references therein). Nonetheless, there is no sequence similarity between the two enzymes.
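The inference drawn above from the RNA I data, that a tail found on only 1 of 40 molecules must be present for only a fraction of the molecule's life, follows from a steady-state argument. The sketch below assumes a linear pathway in which every molecule is first untailed, then tailed, then destroyed; this pathway, the function names, and the 2-minute mean lifetime used in the example are illustrative assumptions, not values or mechanisms established in the text (which gives only a half-life of <2 minutes).

```python
def tailed_fraction(mean_time_untailed: float, mean_time_tailed: float) -> float:
    """Steady-state fraction of molecules carrying a tail.

    For a linear pathway (untailed -> tailed -> degraded) at steady state,
    the share of molecules in each state equals the share of the mean
    lifetime spent in that state.
    """
    return mean_time_tailed / (mean_time_untailed + mean_time_tailed)


def mean_time_tailed(total_mean_lifetime: float, observed_tailed_fraction: float) -> float:
    """Mean time a molecule spends tailed, given the observed tailed fraction."""
    return total_mean_lifetime * observed_tailed_fraction


# From the text: about 1 of 40 RNA I molecules is polyadenylylated.
OBSERVED_FRACTION = 1.0 / 40.0
# Hypothetical mean lifetime in minutes, chosen only for illustration.
ASSUMED_MEAN_LIFETIME_MIN = 2.0

tailed_minutes = mean_time_tailed(ASSUMED_MEAN_LIFETIME_MIN, OBSERVED_FRACTION)

if __name__ == "__main__":
    print(f"Time spent polyadenylylated: {tailed_minutes * 60:.0f} s "
          f"of a {ASSUMED_MEAN_LIFETIME_MIN:.0f}-min mean lifetime")
```

If instead only a subset of molecules ever acquired a tail, the same observation would be compatible with longer-lived tailed molecules, which is why the pathway assumption is flagged here.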

C. Internal Determinants and the Accumulation of Intermediates in Decay

1. INTERNAL PRIMARY DETERMINANTS

Many polycistronic messengers are cleaved internally, and the cleavage leads to the differential decay of the proximal and distal segments. Such instances suggest that differential decay is the function of the internal cleavages, but one can imagine other functions. Cleavage may lead to alternative secondary structures so that an RNA functions differently, or it might place an RBS in a different context so as to alter expression. It also suggests that the rates of further decay of the cleaved products are determined, at least in part, by the sequences and structures present in the products themselves, rather than the cleavages that form them. On the other hand, the discoveries that some of the ribonucleases with roles in decay occur in complexes, and that mutations in the gene of one such enzyme are potentiated by mutation in another, suggest that in some cases the action of one enzyme might be linked to cleavage by another (194-196) (Section IV). Conceivably, cleavage of an RNA by RNase E could destine the upstream product to degradation by PNPase.

In this context, it is not known to what extent the intercistronic cleavages of polycistronic mRNAs are independent of events occurring at the 5' end of the upstream messenger. In fact, a model for the decay of the lac operon mRNA incorporates the idea that the decay of lacY might be the result of two processes: one, a "wave" coming from the 5' end of the operon, and the second, an independent cleavage near the ZY junction (4). By contrast, recent work (91) advances a third possibility. The ompA 5' stabilizer and partial ompA gene, placed upstream in an operon fusion with the bla gene, stabilized the latter to the extent of perhaps half that given by an in-frame gene fusion. On the other hand, if an RNase III cleavage site was placed in the untranslated region separating the upstream ompA and downstream bla sequences, its cleavage was unhindered. Thus cleavages internal to a polycistronic messenger can, but need not, be influenced by distant 5' events.

In the discussion above (Section II,B,2) and in work on the cleavage of the lacZY intergenic sequence (94), it is argued that the effective uncoupler separating the decay of proximal and distal regions in polycistronic operons is the translational uncoupling afforded by the intercistronic regions. It may also be that some internal features, such as a strong RNase-III site, are relatively independent of effects coming from the 5' end, although they too may be affected by translation in the same situations.

2. INTERNAL CLEAVAGES OF MESSENGERS AT SECONDARY SITES

In general terms, identification of internal cleavage sites is difficult because of the rapidity of decay, once initiated. In a few instances, sites of cleavage have been identified by use of novel techniques or by careful application of standard techniques, such as S1 nuclease analysis. This has worked only with abundant messengers, such as lacZ mRNA, or messengers whose overproduction has been forced, e.g., ompA and bla (79, 81, 147, 197, 198). More often, it has been necessary to characterize intermediates that accumulate in RNase-E-deficient cells, and in several cases, it has been possible to observe that cleavages occurring in vivo can be duplicated in vitro with purified endonuclease (162, 198-200).

Such studies indicate that the decay of messenger is carried out by one or more broad-specificity endonucleases. Initially, this seemed to rule out RNase E as responsible for many of the endonucleolytic cleavages, because early studies suggested that it had a long consensus sequence (201). By contrast, the most recent studies show that RNase E has a weak sequence preference, and experiments with conditional lethal mutants implicate RNase E as an important activity (Section IV,B). It remains to be seen if other endonucleases share this role. The characterization of the decay of the relatively (A + T)-rich messengers of phage T4 in rne mutant cells, vis-a-vis the decay of E. coli messenger, suggests another enzyme(s) may be involved (3). A number of other studies, some already mentioned, also suggest that there may be pathways, not involving RNase E, that contribute significantly (91, 148, 196) (S. H.-Y. Wei and D. P. Nierlich, unpublished).

A characterization of cleavages in the lac mRNA found that they occurred at Y-A sequences and led to the isolation of several new, broad-specificity enzymes, RNases M, I*, and R, the first of which has Y-A specificity (79, 202). Based on these results, it has been suggested that these endonucleases might be the principal endonucleases of decay and/or active in removing the short fragments, those remaining after the degradation of longer fragments by the exonucleases PNPase and RNase II. The absence of mutant studies as yet and the fact that the three enzymes produce 3'-NMPs as cleavage products (Section III,C,3) leave the role of these enzymes open to question. There is no evidence that 3'-NMPs are part of the major pathways of nucleotide metabolism.
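To make the Y-A (pyrimidine-adenosine) specificity mentioned above concrete, the following sketch lists every candidate Y-A dinucleotide in an RNA sequence. It is purely illustrative: the function name and the example sequence are hypothetical, and real cleavage-site choice by an enzyme such as RNase M presumably depends on structure and context that this scan ignores.

```python
from typing import List

PYRIMIDINES = frozenset("CU")


def find_ya_sites(sequence: str) -> List[int]:
    """Return 0-based positions of the pyrimidine in each Y-A dinucleotide.

    DNA-style input is accepted; T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [
        i
        for i in range(len(rna) - 1)
        if rna[i] in PYRIMIDINES and rna[i + 1] == "A"
    ]


if __name__ == "__main__":
    example = "GCAUAGG"  # hypothetical sequence, not taken from lac mRNA
    for pos in find_ya_sites(example):
        print(f"candidate Y-A site at position {pos}: {example[pos:pos + 2]}")
```

Because Y-A dinucleotides are so frequent in any natural sequence, a scan like this mainly illustrates why such an enzyme would count as broad-specificity.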


3. INTERMEDIATES IN DECAY AND TERMINAL BREAKDOWN PRODUCTS

In wild-type cells, cleavages occurring within the coding regions of messengers, for the most part, can only be detected by sensitive means. Such cut sites can be found at intervals of

FIG. 6. The synthesis and decay of mRNA. Redrawn from Ref. 67. [Figure not reproduced; surviving labels: "Unstable", "Stable RNAs", "de novo synthesis".]

indicates that the two organisms differ, the first degrading mRNA by a hydrolytic pathway and the second, by a phosphorolytic one (204, 205). This finding has recently gained additional support (71) (Section IV,A). A markedly different view proposes that RNases M, I*, and R play central roles in mRNA decay. General evidence for a role for these enzymes in RNA turnover was provided by a characterization of the end groups on short oligonucleotides isolated from E. coli (206). A substantial fraction possessed 5'-OH groups, consistent with their formation by the three endonucleases.

D. Role of Translation and Ribosomes in Decay

The idea that translation is involved in messenger decay is based on a very long series of observations, which sometimes appear contradictory (82; for examples, see Ref. 38). The earliest work indicated that messengers might be truncated 3' to sites that correspond to nonsense mutations. In subsequent work aimed at distinguishing between premature termination as the cause, and synthesis of the segment and its rapid decay, it was found that following chloramphenicol treatment, the distal segment could be detected (72). Based on earlier work from which it was concluded that chloramphenicol allowed reattachment of ribosomes to an mRNA, it came to be thought that the distal as well as the proximal transcript might be protected by ribosomes or by translation. Other work suggested that deprivation of energy could also protect mRNA (20, 207). (These findings led to the inclusion of azide and chloramphenicol in the cell-harvesting buffers widely used.) Later work, both with antibiotics (208-210) and with strains possessing RBS mutations that altered the efficiency of translation, gave some further support to a role for translation in decay.

Whereas there is much to be gleaned from this extensive literature, it might be just as productive to discuss conclusions drawn only from more recent works. These indicate that untranslated messenger may be degraded rapidly, but that it is not inevitably so. Ultimately, it seems that to understand the effects of translation on decay it is necessary to focus on the different aspects of translation: formation of the initiation complex, initiation per se, and translation of the length or specific regions of the transcript (36). We venture that much of the difficulty in understanding the effect of translation on decay has been because of the difficulty in making these distinctions; this seems to be particularly the case when antibiotics such as puromycin and chloramphenicol are concerned. In retrospect, it appears that translation may influence decay in three ways, discussed in the following sections. It should be noted that this list is probably not complete (82); for example, there are instances in which translation appears necessary for cleavage (198).

1. TRANSLATION AND 5' DETERMINANTS OF MESSENGER DECAY

Although, as indicated, untranslated mRNA is generally rapidly degraded, several studies show that there is no obligatory relationship between translation and decay (36, 88, 89). An example derives from the characterization of a series of constructions that fused the lacZ gene 3' to RBSs of different strengths, which were derived from phage λ. The results showed that the decay of the chimeric mRNAs was not markedly affected by translation, so long as polarity effects in the weakly translated constructions were prevented (88). By contrast, studies of the lacZ and trpE mRNAs show that if changes in their RBSs are made conservatively, that is, in such a way as to change their efficiency without seemingly altering other nearby features, their decay rates are increased in approximate proportion to the decrease in their translation (54, 89, 90). Physiologically, this leads to a striking result: the spacing of ribosomes on the mRNAs remains the same independent of the strength of translation (54, 90, 211)! From the perspective of decay, this indicates that there is a competition in these instances between the initiation processes of decay and translation. However, taken with the broader base of observations, the results indicate that the rate-limiting step of decay need not reside at the RBS, and in the various constructions mentioned where a correlation was not observed, the rate-limiting determinants were elsewhere (e.g., 88, 212).

What is the basis of the competition that underlies this translational effect? Given the lack of a precise understanding of decay determinants, one can only speculate that it relates to the overlap of this site (90, 212, 212a) with the site of translational initiation. If correct, this overlap is somewhat better defined in a study in which different sequence changes in the lacZ RBS were made and found to affect decay and efficiency differently (89). Importantly, this work in particular makes the point that intermediate stages in the formation of a translational initiation complex may be more important in influencing the decay of a particular message than the translational efficiency itself.

Finally, insofar as decay of a particular mRNA is affected by translation, one can infer that the frequency of ribosome loading, as influenced by growth rate and metabolic circumstances, will affect decay. Thus, in this context, it has been suggested that translation affects the decay of different mRNAs differentially in a physiologically meaningful way (213).

2. LACK OF TRANSLATION INFLUENCES DECAY BY ALTERING OR EXPOSING 3' DETERMINANTS VIA TRANSCRIPTIONAL POLARITY EFFECTS

Premature termination redefines the 3' sequences of messengers in terms of the 3' determinants they possess, and thus affects decay (72, 88, 90). Alterations of the RBS or 5' UTR of an mRNA that reduce the frequency of translation, sequence alterations that introduce stop codons, or treatment with antibiotics such as chloramphenicol will lead to transcriptional polarity and change the 3' sequence of the mRNA. Insofar as many natural operon terminators contain features designed to stabilize the 3' end of the mRNA, it can be expected that the decay of the mRNA may well be affected. Because transcriptional polarity creates mRNAs that lack 3' stabilizing features, it is not surprising that weakly translated mRNAs often decay rapidly.

One example, which was mentioned above, pertains to a study in which stop codons were inserted at different sites in bla (110). This showed that between 26 and 56 codons of the bla mRNA had to be translated to obtain normal stability. From this, one might infer that there are no polarity terminators or primary decay determinants distal to codon 56. Relevant to this speculation is the observation that a highly unstable bla mRNA, containing a nonsense mutation in the fourth codon, was stabilized by fusion to the 5' stabilizer of the ompA gene (91). This suggests that 5'- and 3'-decay determinants may interact.

Finally, transcription, translation, and decay of the lacZ mRNA have been characterized in constructions that allow the lacZ gene to be expressed in vivo by either the E. coli or phage T7 RNA polymerase (214-216). T7 RNA polymerase has a very high rate of polymerization and is insensitive to polarity termination. When the mRNA in the constructions carried a functional RBS, the cells produced large amounts of full-length mRNA using the T7 polymerase, but produced relatively little β-galactosidase. Untranslated mRNA was also produced in large amounts, and for this mRNA, as with the translated mRNA, mass decay was normal as to pattern and exponential kinetics. Together, these observations indicate that chemical decay can be uncoupled from both translation and the transcriptional events associated with the use of the E. coli RNA polymerase. These are important results, notwithstanding the fact that effects on functional inactivation and processing of the chimeric transcripts were observed that are still to be understood (214). In regard to the small production of β-galactosidase, it appears that the T7 RNA polymerase, with its high elongation rate, far outpaced the ribosomes and exposed the mRNA to secondary inactivating events.

3. TRANSLATION MAY ALTER THE SUSCEPTIBILITY TO ATTACK OF SPECIFIC INTERNAL DETERMINANTS

Translation through regions containing stabilizing hairpins affects the stability of the messenger. This has been observed in a few situations in which the stabilizing structures are located near the 3' ends of transcripts, where they can be degraded by 3' exonucleases, or with regard to decay intermediates that are stabilized by 3' hairpins (169). It has also been seen in the cleavage of polycistronic mRNAs in intercistronic regions, where such sites are variably translated (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished). However, it seems probable that in many instances, internal sites, e.g., secondary RNase-III or RNase-E sites, will be modulated by translation, directly or indirectly (212, 212a).

E. Interactions among Determinants

The preceding discussion of individual determinants should not be taken to mean that individual determinants function in isolation. Determinants present in a defined region, such as the 5' end, are likely to be composed of several elements, and these may, in many situations, interact with other elements at a distance. Certainly the literature on rRNA processing, involving many of the same enzymes, indicates that this is the case (9-12). Alternative folding affects the cleavage of the λ sib site, and thus the decay of the int mRNA (Section III,B,1). And for the chemical decay of RNA I of ColE1, a plausible argument can be made for the interaction of a 5' site and a 3' poly(A) sequence (Section III,B,2). A similar situation exists for some eukaryotic mRNAs (86).

In this same vein, it was discussed in Section II,B,3 that many mono- and some polycistronic mRNAs decay in a unitary fashion, whereas for many or perhaps most polycistronic mRNAs, differential decay occurs for parts of the transcript. What is required for a messenger to decay in a unitary fashion? From the obvious inferences of the above discussion, this would include strong determinants at the 5' and 3' ends, one of which is a primary decay determinant; the absence of internal features that provide alternative primary determinants; and the absence of strong internal hairpins that could impede the removal of fragments by the 3' exonucleases. For a polycistronic mRNA to decay in a unitary fashion, the evidence indicates that coupling of translation from cistron to cistron is necessary.

IV. Enzymes of mRNA Decay

More than 20 exo- and endoribonucleases have been identified in E. coli. Of these, most were found in the course of studies of rRNA or tRNA processing, and the purified enzymes have been shown to act on rRNA or tRNA precursors in vitro. In contrast, as few as four RNases are thought to participate in important ways in messenger decay, although others may play ancillary roles, and additional enzymes continue to be discovered (6, 202, 217, 218).

Only two ribonucleases, RNase E and RNase P, appear essential for growth (9, 12). These two, both endonucleases, cleave both stable RNA precursors and mRNAs in vivo. Although RNase P is essential for the processing of tRNA, it also cleaves a few mRNAs, largely mRNA-tRNA (or tRNA-like) cotranscripts (198, 219). RNase E is clearly important for decay, notwithstanding uncertainty concerning its function(s) (8, 201). One relatively unique endonuclease, RNase III, which also cleaves both mRNAs and stable RNA precursors, plays a circumscribed but significant role (75).

The remaining ribonucleases are remarkably redundant in their activities. In some cases, mutations in each of several 3' exonucleases do not appreciably slow growth, and generally any one of several enzymes will allow growth at near normal rates in genetic backgrounds deficient in as many as four or five enzymes (6, 220, 221). There is longstanding evidence that the two 3' exonucleases, PNPase and RNase II, have roles in decay (222-224), but it is largely indirect, and their precise roles are unknown. The properties of the two enzymes in vitro correspond well with those inferred for the exonuclease activities observed in the cell. Further, double mutants that carry a temperature-sensitive mutation for one of the exonucleases accumulate at the restrictive temperature, however weakly, a broad range of degradation fragments, some of which are clearly the products of RNase-E cleavage (146, 196). Moreover, mutations affecting these two enzymes in combination with mutations in the genes for RNase E and PAP I have strong effects on mRNA decay (148). Further, as described in Section IV,B, PNPase and RNase E can be isolated in a complex.

Although not entirely our view (Section V,B), it is generally believed that endonucleases are responsible for functional inactivation, and broad-spectrum endonucleases and exonucleases mediate chemical degradation. The fact that strains bearing mutations in several candidate ribonucleases (rne, pnp, rnb) generally have weak phenotypes has made establishment of these enzymes' roles and this general model elusive (196, 225). Thus, the search for other enzymes that may function in messenger decay continues. Each of the enzymes mentioned above is discussed in the following sections. References to other ribonucleases that have been identified as important in RNA processing but could play a role in decay, and RNases whose roles are largely unknown, can be found elsewhere (6, 7, 10, 217, 218, 226).

A. RNase II and PNPase RNase I1 is an RNA hydrolase, producing nucleoside 5’-monophosphates from the 3’ terminus of a molecule (7, 8, 206). PNPase is an RNA phosphohydrolase, able to both synthesize and degrade RNA; degradation produces nucleoside 5’-diphosphates from the 3’ end of the substrate and Pi (8).RNA turnover, using the latter enzyme, presumably saves one-half of the energy

198

DONALD P. NIERLICH AND GEORGE J. MURAKAWA

needed to regenerate the NTPs needed in RNA synthesis. Consistent with this possibility, strains carrying mutations in both PNPase and RNase PH, a second, similar phosphohydrolase that is presumed to have its primary role in the processing of tRNA, grow at a much reduced rate when compared to some strains with several mutations in hydrolytic exoribonucleases (221). Both RNase II and PNPase are maximally active on long, single-stranded chains and digest them to a limit of one to four nucleotides (7, 8). Both show a processive mode: they degrade long substrates (e.g., for RNase II, >15 bases) before initiating attack on another molecule (227). In addition, both enzymes are inhibited by hairpin structures, although PNPase can digest through a structured RNA, such as a tRNA, if given a sufficient unpaired 3′ end (8). These features all have strong implications for the way in which these exonucleases might act in decay (227). A recent analysis of RNase II catalysis addresses its processive mechanism (227). The result was to demonstrate, as proposed also for other RNA and DNA 3′ exonucleases, that two sites exist: one, the “anchor,” holds the substrate; the second, which interacts with the substrate about 15 to 27 nucleotides from the anchor site, binds and hydrolyzes the 3′-terminal dinucleotide linkage. This analysis suggests that the substrate, bound at the anchor site, is pulled progressively through the anchor by the force developed by the cleavages occurring at the catalytic site (227). Neither PNPase nor RNase II has an activity unique in the bacterial cell, but there is no evidence to link the other 3′ exonucleases to mRNA decay. Early tracer studies on RNA turnover implicated nucleoside 5′-phosphates and, particularly, 5′ NDPs as products of mRNA turnover (26, 203) (Fig. 6). In fact, subsequent studies of the turnover of RNA in the presence of H₂¹⁸O were interpreted as indicating that turnover in E. coli is primarily hydrolytic, whereas that in B. subtilis is phosphorolytic (204, 205). These findings have gained support in recent work demonstrating that these bacteria differ markedly in their content of the two enzymes. Escherichia coli possesses 10-fold more RNase II activity than PNPase activity in extracts; B. subtilis has no detectable RNase II activity (71). Although such enzymes are present in eukaryotic cells, no 5′-to-3′ exoribonucleases have been identified in bacteria, in spite of the scrutiny resulting from the fact that many messengers appear to be degraded 5′ to 3′. Indeed, an E. coli in vitro system manifesting 5′-to-3′ decay that was coupled to translation was shown to depend on the RNase II in the extracts (228). Because of these and related observations, it has been postulated that endonucleases cleave messenger progressively from the 5′ terminus, and that 3′ exonucleases mediate the degradation of the freed RNA fragments to mononucleotides. More direct evidence for the participation of RNase II and PNPase in messenger decay comes from diverse sources (2, 161, 229). For example, the presence


of 3′-terminal hairpins on many transcripts is often attributable to the activity of these enzymes (157, 230). Escherichia coli strains that harbor mutations in either PNPase or RNase II alone do not differ significantly from wild type in growth rate, or in the decay of β-galactosidase or total mRNA (223, 231); however, double mutants are inviable (222, 232). Strains that lack PNPase activity and possess a temperature-sensitive RNase II have been constructed, although with difficulty. At the restrictive temperature, these strains weakly accumulate fragments ranging from 100 to 1500 nucleotides (196, 224), as well as some defined fragments of specific messengers. These include transcripts from the malEFG operon, the spc operon, the glyA operon, and the ribosomal protein S20 mRNA (64, 103, 120, 161). The observation that the E. coli pnp rnb(Ts) mutants do not have a markedly altered overall rate of mRNA decay is consistent with the presumption that these enzymes do not generally initiate decay but rather serve to remove intermediate fragments. However, in a triply mutant strain that includes a temperature-sensitive RNase E [pnp rnb(Ts) ams(Ts)], the bulk rate of mass decay is slowed to 4-4 at the nonpermissive temperature (196). Also, decay of certain specific messengers is much more strongly affected in the triple mutant. The synergism between pnp rnb mutations and ams might be explained in any of several ways. However, it is noteworthy that a complex of PNPase with RNase E has been demonstrated.

B. RNase E

1. IDENTIFICATION OF THE rne GENE AND PROPERTIES OF RNASE E

The determination that RNase E participates in messenger decay and the characterization of the purified enzyme have been perhaps the most significant recent advancements in the field. The rne gene is essential; no “knockout” mutations have been isolated. RNase E cleaves several bacterial and phage messengers, as well as influencing the stability of its own mRNA. Finally, RNase E appears to be both a large and complex enzyme and a component of a multiunit protein complex. The rne locus was identified by Apirion by way of a temperature-sensitive mutation that mapped at min 23.5 on the E. coli genome and that was defective in the synthesis of 5-S rRNA at the nonpermissive temperature (233). Initial biochemical studies revealed that RNase E converts the 9-S rRNA precursor into the 5-S rRNA species (2, 201, 234-236), cleaves RNA I of ColE1, and processes several T4 RNA species (192, 200, 237). At about the same time, Kuwano and Ono isolated mutants that were temperature sensitive for growth and whose mRNA had significantly prolonged chemical lifetimes; one of the mutations, designated ams (altered mRNA stability), was


chosen for further study (238-240). Subsequent work has shown that the original mutations, rne-3701 and ams (now also called rne-1), lie in the same gene and affect adjacent amino acids, being only six bases apart (241-244). The ams/rne-1 allele possesses a slightly stronger phenotype than does rne-3701; otherwise, the two appear much the same. The structural gene for RNase E has been sequenced and encodes a 1061 amino-acid protein (245-247) (G. Mackie and I. B. Holland, personal communications). The initial sequences of the rne gene obtained were incomplete; the complete sequence (save for some recently posted corrections) was obtained from an E. coli clone that had been selected with antibody directed to yeast heavy-chain myosin. It is unlikely that this was a spurious cloning (199, 245, 247a). The product of the hmp1 (high molecular weight protein) gene, Hmp1, possesses attributes of both a cell-structure protein and RNase E. Although RNase E does not appear related to other ribonucleases, sequence analysis identified a region very similar to the highly conserved 70-kDa snRNP protein involved in eukaryotic RNA splicing in the hydrophilic C-terminal half, and a myosin-like sequence and potential membrane insertion site in the N-terminal half, which also contains the ams/rne mutations (245).

The RNase E protein has been difficult to identify and purify, and different techniques have yielded quite different products (199, 201, 245, 247a). The size of the polypeptide, for example, has been variously reported as 60 to 180 kDa. RNase E, which, from DNA sequence analysis, has a subunit size of 118 kDa, migrates aberrantly as a 180-kDa polypeptide on SDS polyacrylamide gels, apparently because of a highly charged C terminus (199, 245, 247a). The protein, obtained following denaturing gel electrophoresis at about 95% purity, can, after renaturation, cleave natural substrates correctly, although it possesses a low specific activity (199). When purified without denaturation, RNase E can be obtained as a defined complex of about 500 kDa (248). The complex contains PNPase as well as two unidentified proteins of 50 and 48 kDa. Initial experiments with antibodies directed at the four polypeptides suggest that all of the RNase E and PNPase of the cell are present in the complex; this is not the case for the two smaller proteins (248) (A. J. Carpousis, personal communication). At various times, RNase E has been reported to be either membrane associated or not, and in a complex with other RNA processing endonucleases (249-251). It is now generally considered not to be membrane bound, but given its physical characteristics, it may well be associated in some way with cellular structures. RNase E does not appear to be an abundant protein in the cell, and what is present has an apparent size of 180 kDa on SDS gels, as described above (247a) (A. J. Carpousis, personal communication). In a number of studies,


active RNase E of 60-80 kDa was obtained; these polypeptides appear now to be in vitro degradation products that nonetheless retained some activity (202, 252). Similarly, the protein as isolated using different substrates for purification had distinct catalytic features. This “RNase K” activity also appears artifactual (Section IV,B,3). The bacterial chaperonin protein, GroEL, has also been reported to copurify with RNase E, although GroE subunits are not among the two unidentified polypeptides mentioned above (253). In an odd twist of research, a gene that complements the ams mutation was cloned and sequenced. The sequence was subsequently identified as that of GroEL, in keeping with the discovery that the complementing gene did not map at the same site as the ams gene (253, 254). Despite this finding, it has been difficult to establish that the association of the proteins is not fortuitous: GroEL has an affinity for other polypeptides. Nonetheless, highly purified GroEL appears to bind and protect RNA in vitro from RNase E cleavage in a reaction that is dependent on Mg²⁺ and ATP (255). However, a specific interaction with RNase E per se still remains to be established. Finally, EIF, which is inferred to bind hairpins as discussed in regard to 3′ stabilizers (Section III,B,1), purifies with a large complex that contains RNase E and PNPase (173, 256). These intriguing results lead to the conclusion that RNase E functions as part of a large multisubunit complex, perhaps a “degradosome.” However, they leave the composition of the complex unresolved.

2. CLEAVAGE OF RNA in Vivo AND in Vitro BY RNASE E

RNase E cleavage sites have been identified, first, by cataloging the RNAs that accumulate in rne(Ts) mutants, and second, by detailed comparisons of cleavage patterns of mRNAs in vivo and in vitro. Based on such analyses, RNase E appears to cleave many RNAs, perhaps at a frequency on the order of one site per 100 bases (2, 3, 201). The two substrates first characterized, the 9-S rRNA precursor and RNA I, possess two and one cleavage sites, respectively. These facts, as well as the finding that two of these sites have in common a 9- or 10-base sequence, led to the initial idea that the enzyme is quite specific. However, as additional substrates were characterized, it became increasingly clear that there is minimal similarity among the sites. When cleavage sites identified as occurring in vivo are tabulated and target sequences are analyzed by mutagenesis in T4 gene 32, E. coli r-protein S20, RNA I and other sequences, only a weakly defined pattern emerges (3, 92, 201, 257). Thus, it was concluded that RNase E preferentially cleaves regions of generally single-stranded RNA with a consensus sequence of RAUUW (R = A or G, W = A or U) (92) or simply AU (257); cleavage most often occurs 5′ to A. In some cases, the presence of a nearby secondary structure, often a


downstream hairpin, has appeared important to RNase E recognition, but this has not proved to be a consistent feature (92, 257a). Indeed, both randomly generated mutations and site-specific mutagenesis of the region surrounding the cleavage site of RNA I show that the sequence near the site of cleavage influences cleavage, whereas the relative locations of physical features (5′ end, hairpin loops) do not. However, there is a great deal of freedom in the sequence; e.g., in one case, the introduction of a G shifted and repressed the cleavage but did not abolish it (258-260). In keeping with this, recent studies of RNase E cleavage in vitro, using highly purified enzyme and, as substrates, both RNA I and synthetic oligonucleotides, indicate that it preferentially cleaves sites high in A + U content with little sequence constraint (260). Short, single-stranded oligoribonucleotides are readily cleaved by RNase E, and nearby stem-loop structures actually inhibit the activity in vitro. Many E. coli transcripts that have RNase E cleavage sites have been characterized; these include the mRNAs of the his operon (198), the rpsU-dnaG-rpoD operon (261), the pap operon (262), the rplKAJL-rpoBC operon (263), the dicB operon (264), the rpsO-pnp operon (143, 265), the unc operon (266), the r-protein S20 gene (257, 257a), and rne itself (140). Additional RNase-E-mediated cleavage sites occur in lacZ fusion transcripts (212), tetR-lacZ (267), a hybrid transcript of atpE/interferon-β (268), IS10 mRNA (269), the R. capsulatus puf mRNA in E. coli (270), and phage f1 mRNA (271).
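The RAUUW consensus quoted above (R = A or G, W = A or U) can be made concrete with a short scan. The following is a minimal sketch, not a predictor of cleavage: the function names and the test sequence are hypothetical, and, as the text stresses, a match marks only a candidate site because the enzyme shows little sequence constraint.

```python
import re

# Degenerate symbols used in the consensus quoted in the text:
# R = A or G, W = A or U. Plain bases map to themselves.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "W": "[AU]"}


def consensus_to_regex(consensus):
    """Translate a degenerate consensus such as 'RAUUW' into a regex.

    The pattern is wrapped in a lookahead so that overlapping matches
    (e.g. AAUUAAUUA) are all reported.
    """
    body = "".join(SYMBOLS[symbol] for symbol in consensus)
    return re.compile("(?=(" + body + "))")


def find_candidate_sites(rna, consensus="RAUUW"):
    """Return (0-based position, matched pentamer) pairs in an RNA sequence.

    DNA-style input is accepted: T is converted to U before scanning.
    """
    rna = rna.upper().replace("T", "U")
    pattern = consensus_to_regex(consensus)
    return [(match.start(), match.group(1)) for match in pattern.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise the scan.
    for position, site in find_candidate_sites("GGAAUUACCGAUUUGG"):
        print(position, site)
```

A scan of this kind only lists where the consensus occurs; whether a given occurrence is cleaved in vivo depends on features the pattern does not capture, such as single-strandedness and nearby structure.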
Although a large number of naturally occurring RNase E sites have been identified, it remains unclear whether the enzyme (1) mediates the functional inactivation of these transcripts or acts only to mediate chemical decay, (2) plays an important role in substrate-site selection vis-à-vis features of the substrate or the substrate’s environment, or (3) possesses multiple independent functions, or whether its myosin-like features are components of its function in RNA processing and decay. Only in the autoregulation of RNase E has this enzyme been shown to mediate functional inactivation (139, 140). RNase E cleavage sites have been identified in poorly transcribed chimeric lacZ fusions; however, we and others have found that RNase E mutations do not alter the functional or chemical decay of the natural lacZ transcript (61, 212) (S. H.-Y. Wei, unpublished results), although it appears that sites resembling RNase E cleavage sites are present in lacZ (S. H.-Y. Wei and D. P. Nierlich, unpublished). Interestingly, when rne-3701 was first characterized, its effect on general mRNA decay was missed, and further, when it was studied in the context of strains with defects also in RNase III and RNase P, 21 of 80 proteins had reduced, rather than increased, levels (272). It is the strong phenotype of the triple mutant, ams pnp rnb, that brings to focus the likely role of RNase E in


decay (196). Decay products then accumulate and the lac and other mRNAs are stabilized markedly (S. H.-Y. Wei, unpublished). It therefore appears that the rne/ams product will prove to have an important cellular role. In this regard, two reports present evidence for the presence of RNase-E-like enzymes in human cells. In one, an enzyme with the specificity of RNase E was isolated from cultured B-cells (273); in the second, an enzyme complementing rne mutations in E. coli was cloned from a human library and expressed. In the latter case, the protein obtained had enzymatic and immunological properties in common with E. coli RNase E (274).

3. “RNASE K”

An enzyme was purified that cleaved the ompA transcript in vitro at the same sites cleaved in vivo and was designated RNase K (147, 275). This enzyme had some properties distinct from those reported for RNase E, although it was not present in rne mutants. For a period, it was believed that RNase K might be a natural proteolysis product of RNase E (201). The original investigators now suggest that the properties found for RNase K are largely the consequence of the presence of GroEL in preparations of RNase E (Section IV,B,1) (A. von Gabain, personal communication).

C. RNase III

RNase III is specific for double-stranded RNA. It was identified as the endonuclease that cleaves one or both strands of the stems of hairpins present in phage T7 transcripts and E. coli rRNA precursors (75, 276, 277). Although RNase III sites are relatively infrequent, both perfectly and imperfectly paired sequences are cleaved, and the sites observed share few features. Thus, the basis of the enzyme’s specificity remains elusive (278). Notwithstanding its role in rRNA processing, the structural gene for RNase III, rnc, is not essential; the steps it normally catalyzes in rRNA synthesis appear to be made unnecessary by cleavages at other sites (279). Mutation in rnc alters the relative expression of a substantial number of proteins in E. coli, some positively and others negatively (272), and affects the stability of a number of messengers, among them pnp (142), the rnc operon (141), the metY-nusA-infB operon (280), the dicF gene (264), and the secE-nusG and rplKAJL-rpoBC operons (263). RNase III cleaves the sib hairpin and thus destabilizes the int message of bacteriophage λ, as discussed above (281). It cleaves mRNAs of phages f1 (271) and T7, the latter cleavage stabilizing the mRNA (165). RNase III also cleaves the “killer” mRNAs of the hok/sok, srnB, and pnd transcripts for plasmid maintenance (282). In studies in vitro, RNase III cleaved the 5′ region of the lacZ transcript


at discrete sites and inactivated its capacity to direct synthesis of the lacZ α-polypeptide (283, 284). Moreover, RNase III was the major activity inactivating the α-polypeptide template activity in E. coli extracts. Whether RNase III cleaves the lac mRNA in vivo is less clear. Strains carrying an rnc mutation produce β-galactosidase at a reduced rather than an enhanced rate, but this trait frequently reverts without affecting the RNase III deficiency (285). Although fusion of lacZ to an upstream sequence that possesses an RNase III cleavage site appreciably speeds the decay of lacZ messenger (286), decay of the native lac operon mRNA, when characterized on Northern blots, is not altered by rnc mutation (S. H.-Y. Wei, G. J. Murakawa and D. P. Nierlich, unpublished). These observations suggest that, in vivo, the RNase III site(s) in the 5′ coding region of lacZ is masked by some specific mechanism. Thus, it appears that RNase III cleaves a small but significant number of mRNAs in E. coli. Cleavage can result in the stabilization or destabilization of the mRNA, and such cleavages provide means by which decay may be modulated.

D. Other Endonucleases

RNase I is a broad-specificity endonuclease that degrades RNA to 3′ mononucleotides (287). Because of its periplasmic location, RNase I presumably does not participate in messenger degradation. A form of the enzyme, RNase I*, exists intracellularly in E. coli, and it was suggested that it might function in mRNA degradation (206, 288). Specifically, given that the exonucleases PNPase and RNase II leave short core products (Section IV,A), RNase I* might be particularly suited for their cleavage. If true, RNase I* can have only a participatory role, because it has been known for many years that rna (RNase I) mutants are not defective in mRNA decay (7). Another enzyme, oligonucleotidase, an RNase-II-like enzyme that acts on short substrates, could also perform this function (7). RNase M and RNase R were isolated in an effort to find other broad-specificity endonucleases (202). RNase M is a 26-kDa monomeric endoribonuclease that preferentially cleaves Y-A linkages in RNA (289). RNase R (for residual) is a minor activity in E. coli extracts. It is a 24-kDa endonuclease with a specificity resembling RNase I* (202). RNase I*, RNase M, and RNase R cleave RNA to produce 3′ nucleoside phosphates, which presumably could only be reutilized after dephosphorylation. This suggests that the role of these enzymes in decay might be quite limited, although observations supporting their action in vivo have been made (Section III,C,3) (206). On the other hand, there is work that suggests that the rna gene product(s) is involved in the degradation of rRNA in C-source-starved E. coli (290). This observation may mean that these enzymes function under those conditions in which the stable species are degraded. (This suggestion does not preclude the lysis of some fraction of the cells during starvation.)

E. Other RNA Binding Proteins

Bacteria contain a number of nonspecific RNA binding proteins and helicases. With the exception of the rho protein, which functions in transcriptional termination, and other termination factors, the roles of these proteins and helicases largely are not known (291, 292). Almost certainly some of these will function in mRNA decay. Along this line, results of a recent study suggest that, in E. coli, poorly translated mRNAs are stabilized by the DEAD-box proteins, encoded by deaD and srmB (219). The DEAD-box proteins are a ubiquitous group of RNA helicases. A still incompletely defined activity, exoribonuclease-impeding factor (EIF), is postulated to stabilize hairpins from attack by PNPase (Section IV,A). Recent attempts at its purification suggest that it is contained in an RNase E- and PNPase-containing complex of about 500 kDa that is much like that obtained by direct purification of the RNase E activity. Finally, GroEL, implicated as protecting mRNA from decay in slowly growing cells, binds RNA in vitro (255) (Section IV,B,1).

V. Mechanism of mRNA Decay

A. Models: Killer Ribosomes, the “Degradosome,” and the Runoff of Ribosomes

The long history of study of messenger decay is reflected in the models that have been proposed to explain its underlying mechanism. Based on the observations that messenger decay proceeds 5′ to 3′ and is in some way associated with translation, models were advanced in which a 5′ exonuclease degrades mRNA behind the last translating ribosome, or in which decay involves a ribosome with an associated exonuclease (a “killer” ribosome), which inactivates and degrades messenger after a random number of translations (293, 294). The observation that decay follows exponential kinetics pointed to a random attack on an exposed site (32). In this same vein, search for an mRNA-degrading activity in vitro led to a report of a new enzyme (RNase V) whose action was dependent on translation and had a 5′-to-3′ directionality (228, 294). Further research showed that this activity depends on the presence in the extracts of the 3′ exonuclease, RNase II (295, 296), and led in time to a focus on the 3′ ends of mRNAs as substrates for decay (297). When studies of rRNA processing revealed the number and importance of endonucleases in the cell, and as it became clearer that some mRNAs were degraded 5′ to 3′, the idea developed that decay came about through the combined action of endonucleases and exonucleases (10, 12). Further characterization of rRNA and tRNA processing activities [including RNase E, whose role in mRNA decay was not initially detected (272)] showed that some sedimented with other ribonucleases. Similarly, the construction of multiply ribonuclease-deficient strains showed that they possess distinct phenotypes that single mutants do not. This led to the idea of processing complexes: processosomes and degradosomes (248, 250). Studies of the decay of the lac messenger strengthened the notion that mRNA is not degraded by random nucleolytic hits. This led to a reinforcement of the idea that mRNA is degraded by the combined action of endonucleases and 3′ exonucleases. In this model, their action constitutes a 5′-to-3′ “net wave” acting on molecules initially protected by translation and then, after messenger inactivation, exposed to endonucleolytic cleavage behind the last translating ribosome (4). Focus on the role of the 5′ end as a prime determinant of decay for many mRNAs has continued (114), and a role for RNase E has gained greater support (196). This is reflected in the proposal that RNase E, perhaps as part of a degradosome, enters the messenger and traverses it from 5′ to 3′, either by sliding or skipping (1, 115, 298). This model is particularly attractive in the light of the finding that RNase E might have spatially distinct cleavage and RNA binding sites (245). Another model, based on the synergism found between the mutations in RNase E and other genes affecting mRNA decay, proposes that a decay-enzyme complex anchors to the 3′ terminal poly(A) sequences and cleaves internally from the 3′ end (148).
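The exponential kinetics mentioned above correspond to first-order decay, in which each molecule has the same chance of being inactivated per unit time. The following is a minimal sketch of that relationship and of how a half-life might be estimated from a time course; the function names and all numerical values are hypothetical and are not taken from the studies cited.

```python
import math


def fraction_remaining(t_min, half_life_min):
    """Fraction of mRNA left after t_min minutes of first-order decay."""
    k = math.log(2) / half_life_min  # first-order rate constant, per minute
    return math.exp(-k * t_min)


def half_life_from_timecourse(times_min, fractions):
    """Estimate a half-life by least-squares fit of ln(fraction) against time.

    For first-order decay, ln(fraction) falls linearly with slope -k,
    so the half-life is ln(2) / k.
    """
    logs = [math.log(f) for f in fractions]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    slope = sxy / sxx
    return -math.log(2) / slope


if __name__ == "__main__":
    # Hypothetical time course generated with an assumed 2-min half-life.
    times = [0.0, 1.0, 2.0, 4.0]
    fractions = [fraction_remaining(t, 2.0) for t in times]
    print(half_life_from_timecourse(times, fractions))
```

A straight line in the semilogarithmic plot is what one expects from random attack; a lag or curvature would instead point to a multistep process.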

B. Concerted Decay

We believe that these models, although accommodating many important features of decay, do not address three aspects that are strongly supported.

1. Messenger molecules are degraded rapidly once decay is initiated (55, 64, 203) (G. J. Murakawa, T. Thai and D. P. Nierlich, unpublished). This is contrary to the relatively slow movement of ribosomes, which are proposed to limit the 5′-to-3′ movement of the decay wave.
2. It is probable that there are alternative pathways of decay and participatory roles for additional enzymes. Even in double or triple mutants that combine mutations in pnp, rnb, and rne, messenger decay still occurs, sometimes only slightly slower than in the wild type, and there is only a modest increase in the accumulation of degradation intermediates of some mRNAs (146, 196).
3. Current models neither address the functional inactivation of messenger nor correctly reflect the role that translation plays in the overall


process (83, 90, 94, 212a) (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished).

The model shown in Fig. 7 has as a key element the formation of a translationally inactive complex that initially encompasses the 5′ region of a messenger but then sequesters distal sequences. Formation of the complex is brought about by a protein factor that binds near the RBS and inactivates it. The protein factor (it might be an RNA binding protein similar to rho or a ribonuclease like RNase E in an inactive state) binds to RNA randomly and with broad sequence specificity. However, when binding near the RBS, competition occurs between the factor and ribosomes, and when binding to actively translated regions, the factor is displaced by the movement of ribosomes. Where the decay of a polycistronic mRNA such as lacZYA is concerned (94), the factor might inactivate a distal messenger by binding in an intercistronic region, without affecting the decay of the upstream region. Where the intercistronic region is short, the factor could be displaced by ribosomes in the same way as was postulated to occur in translated regions. Further formation of the RNA·protein complex involves gathering the

FIG. 7. A model for the decay of bacterial mRNA. An RNA binding protein or RNase (in an inactive state, arrow) binds randomly to mRNA. Where binding occurs in translated regions, the factor is displaced by ribosomes. Where binding overlaps with an RBS, further translation is inhibited and a complex forms. Association with RNases occurs to form a complex that leads to the rapid formation of fragments; transfer of 3′ exonucleases to newly formed ends leads to rapid removal of fragments (POOF) (94) (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished).


inactivated mRNA into folds, the structure of the folds reflecting secondary-structure constraints of the RNA molecule. This complex formation is accompanied by the association of several ribonucleases, including, presumably, RNase E, RNase II, and PNPase. Triggering of the complex leads to cleavage of the messenger at several sites and the transfer of 3′ exonuclease(s) to the newly formed 3′ ends. The processive action of the exonuclease(s) reduces the size of the fragments quickly. Triggering might be randomly initiated or it might, in some cases, require the formation of a specific complex. The protection afforded by ribosomes is thus indirect, due to interference with the formation of the complex, potentially at several points. By this model, cleavage of messengers might not be entirely 5′ to 3′. The rapid decay of the messenger might, to some extent, hide features of the process. At the same time, the 5′ end might be cleaved before translation is fully run out or before transcription of the 3′ end is complete, particularly with long mRNAs. However, by the introduction of a lag between functional and mass decay, most often (particularly with shorter messages) translation will have run out when chemical degradation begins. In some cases, secondary regions of the messenger might be degraded faster or slower than the 5′ end because of differences in the rates of formation and degradation of different domains. In still other cases, the triggering event might depend on the formation of a specific complex, and an event involving the 3′ or 5′ end, or both, might be rate-limiting. Similarly, the presence of a stabilizing 5′ hairpin might prevent the nascent complex from encompassing the RBS. This model is distinct from some prior models in that decay is not a passive process in which mRNA is attacked randomly by cytoplasmic RNases as it becomes exposed by the runoff of ribosomes. Rather, current research, more and more forcibly, indicates that there is a core of alternative means to remove rapidly the bulk of an mRNA. The enzymes responsible for this appear to be contained in degradation complexes, and their activity is modulated in various ways, in part to make them physiologically responsive, but also, importantly, to direct the overall process to depend on the 5′ and 3′ determinants that are rate-limiting for different mRNAs. For this process to work, translated regions, even those infrequently translated, are not generally direct targets. On top of this core of processes are overlaid the processing and cleavage steps that distinguish individual mRNAs whose rate-limiting steps are dependent on RNase III or, perhaps, on RNase E or other endonucleases. Polyadenylylation is necessary for the complete degradation of one labile RNA species, but does not appear necessary for the general decay of mRNA. Thus we envision its role as being in an alternative path of decay, perhaps one that removes both translated and untranslated RNAs that persist after escaping other routes of degradation.
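The lag between functional and mass decay that the model invokes can be illustrated with a simple two-step scheme. The following is a sketch under assumptions that are not part of the model as described: both steps are treated as first-order, with a hypothetical rate constant `k_inact` for functional inactivation and `k_chem` for chemical degradation of the inactivated messenger, and the numerical values are arbitrary.

```python
import math


def two_step_decay(t, k_inact, k_chem):
    """Return (functional fraction, chemically intact fraction) at time t.

    Assumed scheme (illustrative only):
        functional mRNA --k_inact--> inactive, intact mRNA --k_chem--> fragments
    Both steps are first-order; the closed form requires k_inact != k_chem.
    """
    if math.isclose(k_inact, k_chem):
        raise ValueError("closed form used here requires k_inact != k_chem")
    functional = math.exp(-k_inact * t)
    # Inactivated molecules that have not yet been degraded.
    waiting = (k_inact / (k_chem - k_inact)) * (
        math.exp(-k_inact * t) - math.exp(-k_chem * t)
    )
    return functional, functional + waiting


if __name__ == "__main__":
    # Hypothetical rate constants (per minute): inactivation is rate-limiting,
    # and chemical degradation is fast once it begins.
    for t in (0.0, 1.0, 2.0, 4.0):
        functional, intact = two_step_decay(t, k_inact=0.5, k_chem=2.0)
        print(t, round(functional, 3), round(intact, 3))
```

In this scheme the chemically intact fraction is never smaller than the functional fraction, and the gap between the two curves shrinks as the degradation step becomes faster relative to inactivation, which is one way to picture rapid removal of the bulk of a messenger once decay has been triggered.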


ACKNOWLEDGMENTS

We thank the many colleagues who sent reprints and preprints, or who otherwise communicated unpublished results. We also thank Gregory Charlop, Daniel Behroozan, Wayne Chang, Diane Brandt, Mimi Chan and Rainier Griiang for help in the preparation of the manuscript and figures. D. P. N. thanks the students and associates who over the years have helped make these studies so pleasurable a pursuit. Work in his laboratory was supported by grants from the NIH and NSF and research funds of the University of California.

REFERENCES

1. J. G. Belasco, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 1. Academic Press, San Diego, 1993.
2. C. P. Ehretsmann, A. J. Carpousis and H. M. Krisch, FASEB J. 6, 3186 (1992).
3. A. J. Carpousis and H. M. Krisch, in “Molecular Biology of Bacteriophage T4” (J. D. Karam, ed.), Chap. 14. American Society for Microbiology, Washington, D.C., 1994.
4. D. Kennell, in “Maximizing Gene Expression” (W. S. Reznikoff and L. Gold, eds.), p. 101. Butterworths, Stoneham, MA, 1986.
5. P. Geiduschek and R. Haselkorn, ARB 38, 647 (1969).
6. M. P. Deutscher, This Series 39, 209 (1990).
7. A. K. Datta and S. K. Niyogi, This Series 17, 271 (1976).
8. M. Grunberg-Manago and A. von Gabain, NATO ASI series, in press (1995).
9. N. R. Pace, in “Processing of RNA” (D. Apirion, ed.), Chap. 1. CRC Press, Boca Raton, FL, 1984.
10. T. C. King, R. Sirdeshmukh and D. Schlessinger, Microbiol. Rev. 50, 428 (1986).
11. T. C. King and D. Schlessinger, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 47. American Society for Microbiology, Washington, D.C., 1987.
12. D. Apirion and P. Gegenheimer, in “Processing of RNA” (D. Apirion, ed.), Chap. 2. CRC Press, Boca Raton, FL, 1984.
13. D. Apirion and A. Miczak, Bioessays 15, 113 (1993).
14. J. G. Belasco and G. Brawerman (eds.), “Control of Messenger RNA Stability.” Academic Press, San Diego, 1993.
15. D. P. Nierlich, Annu. Rev. Microbiol. 32, 393 (1978).
16. F. Jacob and J. Monod, CSHSQB 26, 193 (1961).
16a. F. Jacob and J. Monod, JMB 3, 318 (1961).
17. S. Spiegelman, CSHSQB 26, 75 (1961).
17a. E. Volkin and L. Astrachan, Virology 2, 149 (1956).
18. S. Brenner, CSHSQB 26, 101 (1961).
19. F. Gros, W. Gilbert, H. H. Hiatt, G. Attardi, P. F. Spahr and J. D. Watson, CSHSQB 26, 111 (1961).
20. C. Levinthal, D. P. Fan, A. Higa and R. A. Zimmermann, CSHSQB 28, 183 (1963).
21. M. Jacquet and A. Kepes, JMB 60, 453 (1971).
22. J. G. Belasco and G. Brawerman, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 18. Academic Press, San Diego, 1993.
23. D. Nakada and B. Magasanik, JMB 8, 105 (1964).
24. A. Kepes, BBA 76, 293 (1963).
25. A. L. Koch, J. Theor. Biol. 32, 429 (1971).

26. W. Salser, J. Janin and C. Levinthal, JMB 31, 237 (1968).
27. N. S. Petersen, C. S. McLaughlin and D. P. Nierlich, Nature 260, 70 (1976).
28. J. R. Cole and M. Nomura, JMB 188, 383 (1986).
29. A. Kepes, BBA 138, 107 (1967).
30. L. Leive and V. Kollin, JMB 24, 247 (1967).
31. L. Hartwell and B. Magasanik, JMB 7, 401 (1963).
32. M. Adesnik and C. Levinthal, CSHSQB 35, 451 (1970).
33. D. Kennell and V. Talkad, JMB 104, 285 (1976).
34. A. L. Koch, JMB 60, 12 (1971).
35. R. O. Kaempfer and B. Magasanik, JMB 27, 475 (1967).
36. C. Petersen, J. Bact. 173, 2167 (1991).
37. R. L. Coffman, T. E. Norris and A. L. Koch, JMB 60, 1 (1971).
38. L. Leive, JMB 13, 862 (1965).
39. G. Lancini, R. Pallanza and L. G. Silvestri, J. Bact. 97, 761 (1969).
40. K. R. Sowers, T. T. Tung and R. P. Gunsalus, JBC 268, 23172 (1993).
41. A. N. Hennigan and J. N. Reeve, Mol. Microbiol. 11, 655 (1994).
42. T. Kivity-Vogel and D. Elson, BBA 138, 66 (1967).
43. C. Petersen, MGG 209, 179 (1987).
44. M. Wice and D. Kennell, JMB 84, 649 (1974).
45. J. K. Rose and C. Yanofsky, JMB 69, 103 (1972).
46. C. Yanofsky and I. P. Crawford, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 90. American Society for Microbiology, Washington, D.C., 1987.
47. S. Pedersen, S. Reeh and J. D. Friesen, MGG 166, 329 (1978).
48. M. Blundell, E. Craig and D. Kennell, Nature NB 238, 46 (1972).
49. D. Kennell and C. Simmons, JMB 70, 451 (1972).
50. D. Schlessinger, K. A. Jacobs, R. S. Gupta, Y. Kano and F. Imamoto, JMB 110, 421 (1977).
51. S. Lin-Chao and S. N. Cohen, Cell 65, 1233 (1991).
52. P. J. Green and M. Inouye, JMB 176, 431 (1984).
53. M. L. Pato and K. von Meyenberg, CSHSQB 35, 497 (1970).
54. K.-O. Cho and C. Yanofsky, JMB 204, 51 (1988).
55. G. J. Murakawa, C. Kwan, J. Yamashita and D. P. Nierlich, J. Bact. 173, 28 (1991).
56. G. A. Mackie, J. Bact. 169, 2697 (1987).
57. G. Klug and S. N. Cohen, J. Bact. 172, 5140 (1990).
58. A. von Gabain, J. G. Belasco, J. L. Schottel, A. C. Y. Chang and S. N. Cohen, PNAS 80, 653 (1983).
59. G. Nilsson, J. G. Belasco, S. N. Cohen and A. von Gabain, Nature 312, 76 (1984).
60. T. Yamamoto and F. Imamoto, JMB 92, 289 (1975).
61. M. Ono and M. Kuwano, JMB 129, 343 (1979).
62. C. C. Case, E. L. Simons and R. W. Simons, EMBO J. 9, 1259 (1990).
63. M. Blundell and D. Kennell, JMB 83, 143 (1974).
64. G. A. Mackie, J. Bact. 171, 4112 (1989).
65. D. P. Nierlich, JMB 72, 751 (1972).
66. D. P. Nierlich, JMB 72, 765 (1972).
67. D. P. Nierlich, Science 158, 1186 (1967).
68. J. Neuhard and P. Nygaard, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 29. American Society for Microbiology, Washington, D.C., 1987.
69. D. P. Nierlich, PNAS 60, 1345 (1968).

DECAY OF BACTERIAL MESSENGER RNA

211

70. H. Bremer and P. P. Dennis, in “Escherichia coli and Salmonelh typhimurium, Cellular

and Molecular Biology (F. C. Neidhardt, ed.), Chap. 96. American Society for Microbiology, Washington, D.C., 1987. 71. M. P. Deutscher and N. B. Reuven, PNAS 88, 3277 (1991). 72. G . Contesse, M. Crepin and F. Gros, in “The Lactose Operon” (J. R. Beckwith and D. Zipser, eds.), Chap. VI. CSHLab, Cold Spring Harbor, New York, 1970. 73. P. S . Cohen, K. R. Lynch, M. L. Wansh, J. M . Hill and H. L. Ennis,JMB 114,569 (1977). 74. A. C. Walker, M. L. Walsh, D. Pennica, P. S. Cohen and H. L. Ennis, PNAS 75, 1126 (1976). 75. D. Court, in “Control of Messenger RNA Stability” (J. 6. Belasco and G. Brawerman, eds.), Chap. 5. Academic Press, San Diego, 1993. 76. D. Kennell and H. Riezman, J M B 114, 1 (1977). 77. L. W. Lim and D. Kennell, J M B 135, 369 (1979). 78. G. J. Murakawa, Ph.D. Disertation. University of California, Los Angeles, 1988. 79. V. J. Cannistrao, M. N. Subharao and D. Kennell, J M B 192, 257 (1986). 80. J. R. McConnick, J. M. Zengel and L. Lindahl, NARes 19, 2767 (1991). 81. V. J. Cannistraro and D. Kennell, J M B 182, 241 (1985). 82. C. Petersen, in “Control of Messenger RNA Stability” (J. G. Belasco and G . Brawerman, eds.), Chap. 6. Academic Press, San Diego, 1993. 83. J. R. McCorniick, J. M. Zengel and L. Lindilhl, J M B 239, 608 (1994). 84. J. E. Toivonen and D. P. Nierlich, Nature 232, 74 (1974). 85. V. J. Cannistraro and D. Kennell, J. Bact. 161, 820 (1985). 86. C. J. Decker and R. Parker, Trends Biochetn. Sci. 19, 336 (1994). 87. V. J. Cannistraro and D. Kennell, Nature 277, 407 (1979). 88. P. Stanssens, E. Remaut and W. Fiers, Cell 44, 711 (1986). 89. L. A. Wagner, R. F. Gesteland, T. J. Dayhuff and R. B. Weiss, J. Bact. 176, 1863 (1994). 90. 0. Yarchuk, N. Jacques, J. Guillerez and M. Dreyfus, J M B 226, 581 (1992). 91. M. J. Hansen, L.-H. Chen, H. L. S. Fejzo and J. G. Belasco, Mol. Microbiol. 12, 707 (1994). 92. C. P. Ehretsniann, A. J. Carpousis and H. M. Krisch, Genes Den 6, 149 (1992). 93. D. H. Bechhofer and D. Dubnau, PNAS 84, 498 (1987). 94. 
D. P. Nierlich, 0. Fattal and T. Tiiohy, FP 7, A1090 (1993). 95. J. Forchhammer, E. N. Jackson and C. Yanofsky, J M B 71, 687 (1972). 96. Y. Kano, H. Nakamura, R. L. Somerville and F . Imamoto, MGG 176, 379 (1979). 97. R. D. Mosteller, J. K. Rose and C. Yanokky, C S H S Q B 35, 461 (1970). 98. R. S . Gupta and D. Schlessinger, J M B 92, 311 (1974). 99. C. Yanofsky and J. Ito, J M B 24, 313 (1968). 100. P. Ziemke and J. E. 6. McCarthy, BBA 1130, 297 (1992). 101. J. E. G. McCarthy, B. Gerstel, B. Surin, U. Wiedemann and P. Ziemke, Mot. Microhiol. 5, 2447 (1991). 102. J. B . Owolabi and B. P . Rosen, J . Bact. 172, 2367 (1990). 103. S. F. Newbury, N. H. Smith, E. C. Robinson, I. D. Hiles and C. F. Higgins, Cell 48, 297 (1987). 104. S. F. Newbury, N. H. Smith and C. F. Higgins, Cell 51, 1131 (1987). 105. B. J . A. M. Jordi, I. E. L. op den Camp, L. A. M. de Haan, B. A. M. van der Zeijst and W. Gaastra, J. B a t . 175, 7976 (1993). 106. M. Baga, M. Goransson, S. Normark and B. E . Uhlin, Cell 52, 197 (1988). 107. D. Schumperli, K. McKenney, 13. 4. Sobieski and M . Rosenberg, Cell 30, 865 (1982). 108. D. Georgellis, S. Arvidson and A . von Gabain, J . Bact. 174, 5382 (1992). 109. J. G . Belasco, 6. Nilsson, A. von Gabain and S. N . Cohen, Cell 46, 245 (1986).

212

DONALD P. NIERLICH AND GEORGE J. MURAKAWA

110. G. Nilsson, J. G. Belasco, S. N. Cohen and A. von Gabain, PNAS 84, 4890 (1987). 111. D. Bechhofer, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman,

eds.), Chap. 3. Academic Press, San Diego, 1993. 112. L. Chen, S. A. Emory, A. L. Bricker, P. Bouvet and J. G. Belasco, J. Bact. 173, 4578 (1991). 113. V. Rosenbaum, T. Klahm, U. Lundberg, E. Homgren and A. von Gabain and D. Riesner, ] M B 229, 656 (1993). 114. P. Bouvet and J. G. Belasco, Nature 360, 488 (1992). 115. S. A. Emory and J. G. Belasco, ]. Bact. 172, 4472 (1990). 116. S. A. Emory, P. Bouvet and J. G. Belasco, Genes Deu. 6, 135 (1992). 117. C. F. Higgins, H. C. Causton, G . S . C. Dance and E. A. Mudd, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 2. Academic Press, San Diego, 1993. 118. M. Nomura, R. Course and G. Baughman, ARB 53, 75 (1984). 119. J. M. Zengel and L. Lindahl, This Series 47, 332 (1994). 120. L. Mattheakis, L. Vu, F. Sor and M. Nomura, PNAS 86, 448 (1989). 121. A. M. Fallon and C. S. Jinks, G . D. Strycharz and M. Nomura, PNAS 76, 3411 (1979). 122. A. M. Fallon, C. S. Jinks, M. Yamamoto and M. Nomura, J . Bact. 138, 383 (1979). 123. M. 0. Olsson and K. Gausing, Nature 283, 599 (1980). 124. P. Singer and M. Nomura, MGG 199, 543 (1985). 125. C. A. Mackie, j . Bact. 173, 2488 (1991). 126. P. Sandler and B. Weisblum, J M B 203, 905 (1988). 127. J. F. Dimari and D. H. Bechhofer, Mol. Microbiol. 7, 705 (1993). 128. M. Mayford and B. Weisblum, E M B O J . 9, 4307 (1989). 129. M. Mayford and B. Weisblum, J M B 206, 69 (1989). 130. K. K. Hue and D. H. Bechhofer, 1. Bact. 173, 3732 (1991). 131. D. H . Bechhofer and K. H. Zen, J. Bact. 171, 5803 (1989). 132. P. Sandler and B. Weisblum, J. Bact. 171, 6680 (1989). 133. J. Dreher and H. Matzura, Mol. Microbiol. 5, 3025 (1991). 134. A. G. Shivakumar, J. Hahn, G. Grandi, Y. Kozlov and D. Dubnau, PNAS 77, 3903 (1980). 135. N. H. Albertson and T. Nystrom, Fed. Euro. Microbid. Soc. Lett. 117, 181 (1994). 136. D. Georgellis, T. Barlow, S. Arvidson and A. von Gabain, Mol. Microbid. 9, 375 (1993). 137. M. D. Henry, S. D. Yancey and S. R. Kushner, J. Bact. 
174, 743 (1992). 138. 0. R. Lagoni, K. von Meyenburg and 0. Michelsen, J. B a t . 175, 5791 (1993). 139. E. A. Mudd and C. F. Higgins, M d . Microbiol. 9, 557 (1993). 140. C. Jain and J. G. Belasco, Genes Deu. 9, 84 (1995). 141. J. C. A. Bardwell, P. Regnier, S. Chen, Y. Nakamura, M. Grunberg-Manago and D. L. Court, EMBO J . 8, 3401 (1989). 142. C. Portier, L. Dondon, M. Grunberg-Manago and P. Regnier, EMBO]. 6, 2165 (1987). 143. E. Hajnsdorf, A. J. Carpousis and P. Regnier, JMB 239, 439 (1994). 144. G. Barry, C. Squires and C. L. Squires, PNAS 77, 3331 (1980). 145. N. P. Ambulos, Jr., E. J. Duvall and P. S. Lovett, Gene 51, 281 (1987). 146. C. M. Arraiano, S. D. Yancy and S. R. Kushner, J. Bact. 175, 1043 (1993). 147. G. Nilsson, U . Lundberg and A. von Gabain, EMBO]. 7, 2269 (1988). 148. E. B. O’Hara, J. A. Chekanova, C. A. Ingle and Z. R. Kushner, PNAS 92, 1807 (1995). 149. K. Gorski, J. M. Rocho, P. Prentki and H. M. Krisch, Cell43, 461 (1985). 149a. D. S. McPheeters, G . D. Stormo and L. Gold, J M B 201, 517 (1988). 150. L. Melin, H. Friden, E. Dehlin, L. Rutberg and A. von Gabain, Mol. Microbiol. 4, 1881 (1990). 151. J. C. Belasco, J. T. Beatty, C. W. Adam, A. von Gabain and S. N. Cohen, Cell 40,171 (1985).

DECAY OF BACTERIAL MESSENGER RNA

213

152. 6 . J. Murakawa and D. P. Nierlich, Bchem 28, 8067 (1989). 153. C. D. Bieger and D. P. Nierlich, J . Buct. 171, 141 (1989). 153a. F. Xu and S. N. Cohen, Nature 374, 180 (1995). 154. M. A. Hediger, D. F. Johnson, D. P. Nierlich and I. Zabin, PNAS 82, 6414 (1985). 155. D. P. Nierlich, C. Kwan, G. J. Murakawa, P. A. Mahoney, A. W. Ung and D. Caprioglio, in “The Molecular Biology of Bacterial Growth (M. Schaechter, F. C. Neidhardt, J. L. Ingraham and N. 0. Kjeldgaard, eds.), p. 185. Jones and Bartlett, Boston, 1985. 156. H. Aiba, A . Hanamura and H. Yamano, JBC 266, 1721 (1991). 157. J. E . Mott, J. L. Galloway and T. Platt, EMBO J. 4, 1887 (1985). 158. M. J. Stern, G. F. Ames, N. H. Smith, E. C. Robinson and C. F. Higgins, Cell37, 1015 (1984). 159. M. Gilson, J.-M. Clement, D. Brutlag and M. Hofnung, EMBOJ. 3, 1417 (1984). 160. B. J. Meyer and J. L. Schottel, Mol. Microhiol. 6, 1095 (1992). 161. M. D. Plamaiin and G. V. Stauffer, MGG 220, 301 (1990). 162. 6. Guarneros, C. Montana, T. Hernandez and D. Court, PNAS 79, 238 (1982). 163. G. Guarneros, L. Kameyama, L. Orozco and F. VelBzquez, Gene 72, 129 (1988). 164. G. Plunker, I11 and H. Echols, J . Bact. 171, 588 (1989). 165. N. Panayotatos and K. Truong, NARes 13, 2227 (1985). 166. P. Alifano, C . Piscitelli, V. Blasi, F. Rivellini, A. G. Nappo, C. B. Bruno and M. S . Carlomagno, Mol. Microbid. 6, 787 (1992). 167. G. Klug, C. W. Adams, J. Belasco, B. Doerge and S. N. Cohen, EMBOJ. 6, 3515 (1987). 168. C.-Y. A. Chen, J. T. Beatty, S. N. Cohen and J. G. Belasco, Cell 52, 609 (1988). 169. G. Klug and S. N. Cohen, J. Bact. 173, 1478 (1991). 170. H. C. Wong and S. Chang, PNAS 83, 3233 (1986). 171. J. M. Romeo and D. R. Zusman, Mol. Microbiol. 6, 2975 (1992). 172. M. N. Hayashi and M. Hayashi, NARes 13, 5937 (1985). 173. H. Cduston, B. Py, R. S. McLaren and C. Higgins, Mol. Microbid. 14, 731 (1994). 174. 6 . Brawerman, in “Control of Messenger RNA Stability” (J. G . Belasco and G. Brawerman, eds.), Chap. 7. 
Academic Press, San Diego, 1993. 175. N. Ohta, M. Sanders and A. Newton, PNAS 72, 2343 (1975). 176. H. Nakazato, S . Venkatesan and M. Edmonds, Nature 256, 144 (1975). 177. J. W. Brown and J. N. Reeve, J. Buct. 166, 686 (1986). 178. C. W. Kim, P. Markiewicz, J. J. Lee, C. F. Schierle and J. H. Miller, JMB 231,960 (1993). 179. Y. Gopalakrishna, D. Langley and N. Sarkar, NARes 9, 3545 (1981). 180. J. Taljanidisz, P. Karnik and N. Sarkar, J M B 193, 507 (1987). 181. P. Karnik, J. Taljanidisz, M. S.-Szekely and N. Sarkar, JMB 196, 347 (1987). 182. G.-J. Cao and N. Sarkar, PNAS 89, 7546 (1992). 183. G.-J. Cao and N. Sarkar, Fed. Euro. Microhiol. SOC.Lett. 108, 281 (1993). 184. J. W. Brown and J. N . Reeve, J . Bact. 162, 909 (1985). 185. R. Hanschke and M. Hecker, J . Basic Microbiol. 26, 317 (1986). 186. Y. Copalakrishna and N. Sarkar, Bchem 21, 2724 (1982). 187. Y. Gopalakrishna and N. Sarkar, ABB 224, 196 (1983). 188. F. Xu, S. L.-Chao and S. N. Cohen, PNAS 90, 6756 (1993). 189. 6.-J. Cao and N. Sarkar, PNAS 89, 10380 (1992). 190. M. Masters, M. D. Colloms, I. R. Oliver, L. He, E. J. Macnaugliton and Y. Charters, J. B a t . 175, 4405 (1993). 192. T. Tomcsanyi and D. Apirion, J M B 185, 713 (1985). 193. M. P. Kalapos, G.-J. Cao, S. R. Kushner and N. Sarkar, BBRC 198, 459 (1994). 194. B. K. Ray and D. Apirion, J M B 149, 599 (1981). 195. B. Pragai and D. Apirion, J M B 154, 465 (1982).

214

DONALD P. NIERLICH AND GEORGE J. MURAKAWA

196. C. M. Arraiano, S. D. Yancy and S. R. Kushner, J. Bact. 170, 4625 (1988). 197. M. N. Subbarao and D. Kennell, J. B a t . 170, 2860 (1988). 198. P. Alifano, F. Rivellini, C. Piscitelli, C. M. Arraiano, C. B. Bruni and M. S. Carlomagno, Genes Deu. 8 , 3021 (1994). 199. R. S. Cormack, J. L. Generaux and G. A. Mackie, PNAS 90, 9006 (1993). 200. E. A. Mudd, P. Prentki, D. Belin and H. M. Krisch, EMBOJ. 7, 3601, (1988). 201. 0. Melefors, U. Lundberg and A. von Gabain, in “Control of Messenger RNA Stability (J. 6. Belasco and G . Brawerman, eds.), Chap. 4. Academic Press, San Diego, 1993. 202. S. K. Srivastava, V. J. Cannistraro and D. Kennell, J. B a t . 174, 56 (1992). 203. D. P. Nierlich and W. Vielmetter, J M B 32, 135 (1968). 204. J. J. Duffy, S. G . Chaney and P. D. Boyer, J M B 64, 565 (1972). 205. S. G. Chaney and P. D. Boyer, JMB 64, 581 (1972). 206. V. J. Cannistraro and D. Kennell, EJB 213, 285 (1993). 207. D. P. Fan, A. Higa, and C. Levinthal, J M B 8, 210 (1963). 208. F. Imamoto, J M B 74, 113 (1973). 209. R. S. Gupta and D. Schlessinger, J. B a t . 125, 84 (1976). 210. M. Y. Graham, M. Tal and D. Schlessinger, J. B a t . 151, 251 (1982). 211. J. Guillerez, M. Gazeau and M. Dreyfus, NARes 19, 6743 (1991). 212. 0. Yarchuk, I. Iost and M. Dreyfus, Biochimie 73, 1533 (1991). 212a. L. R. Rapaport and G. A. Mackie, J . Bact. 176, 992 (1994). 213. N. Jacques, J. Guillerez and M. Dreyfus, JMB 226, 597 (1992). 214. P. J. Lopez, I. Iost and M. Dreyfus, NARes 22, 1186 (1994). 215. I. Iost, J. Guillerez and M. Dreyfus, J. Baet. 174, 619 (1992). 216. M. Chevrier-Miller, N. Jacques, 0 . Raibaud and M. Dreyfus, NARes 18,5787 (1990). 217. M. P. Deutscher, Cell 40, 731 (1985). 218. M. P. Deutscher, Trends Biochem. Sci. 13, 13 (1988). 219. I. Iost and M. Dreyfus, Nature 372, 193 (1994). 220. K. 0. Kelly and M. P. Deutscher, J. B a t . 174, 6682 (1992). 221. K. 0 . Kelly, N. B. Reuven, Z. Li and M. P. Deutscher, JBC 267, 16015 (1992). 222. T. G. Kinscherf and D. 
Apirion, MGG 139, 357 (1975). 223. W. P. Donovan and S. R. Kushner, NARes 11, 265 (1983). 224. P. Donovan and S. E. Kushner, PNAS 83, 120 (1986). 225. E. Hajnsdorf, 0. Steir, L. Coscoy, L. Teysset and P. Regnier. EMBO J. 13, 3368 (1994). 226. Z. Li and M. P. Deutscher, JBC 269, 6064 (1994). 227. V. J. Cannistraro and D. Kennell, J M B 243, 930 (1994). 228. M. Kuwano, C. N. Kwan, D. Apirion and D. Schlessinger, PNAS 64, 693 (1969). 229. G. Guarneros and C. Portier, Biochirnie 72, 771 (1990). 230. R. S. McLaren, S. F. Newbury, G. S. C. Dance, H. C. Causton and C. F. Higgins, JMB 221, 81 (1991). 231. A. M. Reiner, J. Bact. 97, 1437 (1969). 232. S. D. Yancey and S. R. Kushner, Biochimie 72, 835 (1990). 233. D. Apirion, Genetics 90, 659 (1978). 234. B. K. Ghora and D. Apirion, Cell 15, 1055 (1978). 235. T. K. Misra and D. Apirion, JBC 254, 11154 (1979). 236. T K. Misra and D. Apirion, J . Bact. 142, 359 (1980). 237. E. A. Mudd, A. J. Carpousis and H. M. Krisch, Genes Deu. 4, 873 (1990). 238. M. Kuwano, M. Ono, H. Endo, K. Hori, K. Nakamura, Y. Hirota and Y. Ohnishi, MGG 154, 279 (1977). 239. M. Ono and M. Kuwano, J M B 129, 3-43 (1979).

DECAY OF BACTERIAL MESSENGER RNA

215

240. M. Kuwdno and M . Ono, in “Microbiology-1983’’ (D. Schlessinger, ed.), p. 86. American Society for Microbiology, Washington, D.C., 1983. 241. E. A. Mudd, H. M . Krisch and C. F. Higgins, Mol. Microbiol. 4, 2127 (1990). 241a . L. Taraseviciene, A. Miczak and D. Apirion, Mol. Microbiol. 5, 851 (1991). 242. 0. Melefors and A. von Gabain, Mol. Microbiol. 5, 857 (1991). 243. P. Babitzke and S. R. Kushner, PNAS 88, 1 (1991). 244. K. J. McDowall, R. G. Hernandez, S. Lin-Chao and S. N. Cohen, J . Bact. 175, 4245 (1993). 245. S. Casarkgola, A. Jacq, D. Laoudj, G . McGurk, S. Margarson, M. Tempete, V. Norris and I. B. Holland, J M B 228, 30 (1992). 246. A. K. Chauhan, A. Miczak, L. Taraseviciene and D. Apirion, NARes 19, 125 (1991). 247. F. Claverie-Martin, M. R. Diaz-Torres, S. I). Yancey and S. R. Kushner, J B C 266, 2843 (1991). 247a. L. Taraseviciene, S. Naureckiene and B. E. Uhlin, J B C 269, 12167 (1994). 248. A. J. Carpousis, 6. Van Houwe, C. Ehretsmann and H. M. Krisch, CeZl 76, 889 (1994). 249. S. K. Jain, B. Pragai and D. Apirion, BBRC 106, 768 (1982). 250. A. Miczak, R. A. K. Srivastava and I). Apirion, Mol. Microhiol. 5, 1801 (1991). 251. R. A. K. Srivastava, N. Srivastava and D. Apirion, Biochem. Znt. 25, 57 (1991). 252. M. K. Roy and D. Apirion, BBA 747, 200 (1983). 253. B. Sohlberg, U. Lundberg, F. Hartl and A. von Gabain, PNAS 90, 277 (1993). 2!54. P. K. Chanda, M. Ono, M. Kuwano and H. Kung, J. Bact. 161, 446 (1985). 255. D. Georgellis, B. Sohlberg, F. U . Hartl and A. von Gabain, MoZ. Microbiol. 16, 1259 (1995). 256. B. Py, H. Causton, E. A . Mudd and C. F. Higgins, Mol. Microbiol. 14, 717 (1994). 257. G . A. Mackie, J B C 267, 1054 (1992). 257a. G . A. Mackie and J. L. Genereaiix, J M B 234, 998 (1993). 258. S. Lin-Chm, T. Wong, K. J. McDowall and S. N. Cohen, J B C 269, 10797 (1994). 259. K. J. McDowall, S. Lin-Chao and S. N. Cohen, JBC 269, 10790 (1994). 260. K. J. McDowall, V. R. Kaberdin, S.-W. Wu, S. N . Cohen and S. 
Lin-Chao, Nature 374, 287 (1995). 261. V. Yajnik and G. N. Godson, JBC 268, 13253 (1993). 262. P. Nilsson and B. E. Uhlin, Mol. Microbiol. 5, 1791 (1991). 263. J. Chow and P. P. Dennis, Mol. Microbiol. 11, 919 (1994). 264 M . Faubladier, K. Cam and J. Bouche, J M B 212, 461 (1990). 265. P. Regnier and E. Hajnsdorf, J M B 217, 283 (1991). 266. A. M. Patel and S. D. Dunn, J. Buct. 174, 3541 (1992). 267. R. Baunieister, P. Flache, 0. Melefors, A. von Gabain and W. Hillen, NARes 19, 4595 (1991). 268. G . Gross, JBC 266, 17880 (1991). 269. C . Jain and N. Kleckner, MoZ. Microbid. 9, 233 (1993). 270. 6. Klug, S . Jock and R. Rothfuchs, Gene 121, 95 (1992). 271. R. J. Kokoska, K. J. Blumer and D. A. Steege, Biochimie 72, 803 (1990). 272. D. R. Gitelnian and D. Apirion, BBRC 96, 1063 (1980). 273. A. Wennborg, B. Sohlberg, D. Angerer, 6. Klein and A. von Gabain, PNAS. 92, 7322 (1995). 274. M. Wang and S. N. Cohen, PNAS 91, 10591 (1994). 275. U . Lundberg, A. von Gabain and 0. Melefors, E M B O J. 9, 2731 (1990). 276. H. D. Robertson, R. E. Webster and N. D. Zinder, J B C 243, 82 (1968). 277. H. D. Robertson, Cell 30, 669 (1982).

216 278. 279. 280. 281. 282. 283. 284. 285. 286. 287. 288. 289. 290. 291.

292. 293. 294. 295. 296. 297. 298.

DONALD P. NIERLICH AND GEORGE J. MURAKAWA

A. M. Nicholson, This Series. 52, l(1995). P. Babitzke, L. Granger, J. Olszewski and S. R. Kushner, J. Bact. 175, 229 (1993). P. Regnier and M. G. Manago, JMB 210, 293 (1989). U. Schmeissner, K. McKenney, M. Rosenberg and D. Court, J M B 176, 39 (1984). K. Gerdes and A. Nielsen, JMB 226, 637 (1992). V. Shen, M. Cynamon, B. Daugherty, H. Kung and D. Schlessinger, JBC 256, 1896 (1981). V. Shen, F. Imamoto and D. Schlessinger, J. Bact. 150, 1489 (1982). V. Talkad, D. Achord and D. Kennell, J. Bact. 135,528 (1978). M. Robert-Le M e w and C. Portier, EMBO J. 11, 2633 (1992). J. Meador 111, B. Cannon, V. J. Cannistraro and D. Kennell, EJB 187, 549 (1990). V. J. Cannistraro and D. Kennell, J . Bact. 173, 4653 (1991). V. J. Cannistraro and D. Kennell, EJB 181, 363 (1989). R. Kaplan and D. Apirion, JBC 249, 149 (1974). T. D. Yager and P. H. yon Hippel, in “Escherichia coli and Salmonella typhimurilam, Cellular and Molecular Biology (F. C. Neidhardt, ed.), Chap. 76. American Society for Microbiology, Washington, D.C., 1987. C. G. Burd and G. Dreyfuss, Science 265, 615 (1994). D. E. Morse, R. D. Mosteller and C. Yanofsky, C S H S Q B 34, 725 (1969). G. Mangiarotti, D. Schlessinger and M. Kuwano, J M B 60, 441 (1971). R. Holnies and M. F. Singer, BBRC 44, 837 (1971). A. L. M. Bothwell and D. Apirion, BBRC 44, 844 (1971). D. Apirion, MGG 122, 313 (1973). J. G. Belasco and C. F. Higgins, Gene 72, 15 (1988).

The Linker Histones and Chromatin Structure: New Twists

JORDANKA ZLATANOVA
Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, and Institute of Genetics, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria

KENSAL VAN HOLDE¹
Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331

I. Linker Histones: Properties and Interactions with Other Chromatin Components
   A. Properties of the Linker Histones
   B. Interactions of Linker Histones with DNA
   C. Interactions with High-mobility-group Proteins
II. The Importance of Linker Histones in Chromatin Fiber Structure
   A. Location of Linker Histones in the Nucleosome
   B. Linker Histones and the Chromatin Fiber at Low Ionic Strength
   C. Linker Histones and the Condensed Chromatin Fiber
III. What Do We Know? What Do We Need to Learn?

¹To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

The problem of how the fibers of chromatin are folded in the eukaryotic nucleus has interested biologists and biochemists for decades. It has long been recognized that the histones play a major part in this folding (see Ref. 1 for the earlier history). However, the distinctly different roles of the histones H2A, H2B, H3, and H4 on one hand, and the lysine-rich histones such as H1 and its cognates on the other, were not understood until after the discovery of nucleosomes in the early 1970s. It then became clear that the first four were constituents of the nucleosome core particle, whereas the lysine-rich histones were somehow associated with the “linker” DNA between core particles. Accordingly, the former have come to be called “core” histones and the latter “linker” histones.

Wrapping of 146 bp of DNA about the histone octamer to form the core particle provides one level of folding (a compaction ratio of about 5:1), but this cannot account for the many thousandfold condensation afforded the DNA in the eukaryotic nucleus. The “string-of-beads” structure observed in early electron microscopic studies (2-4) obviously could not satisfy the compaction requirement. It soon became evident that there must exist some level or levels of higher order folding of the chromatin fiber. Accordingly, extensive research in the late 1970s was directed toward a search for the details of such structure. In a seminal paper, Finch and Klug (5) showed that the extended nucleosomal filaments condense into irregular fibers of about 30 nm diameter in the presence of even low concentrations (0.2 mM) of Mg²⁺. Based on earlier X-ray diffraction studies of chromatin fibers as well as their appearance in the electron microscope, these authors proposed a “solenoid” model, in which nucleosomes were wrapped into a regular helix with a pitch of about 11 nm. Later studies (6, 7) provided much more information. It was shown that increasing concentrations of either monovalent or divalent cations resulted in a progressive condensation of the fiber (Fig. 1). At very low ionic strength, the fibers appear to lie on the grid in a flattened zig-zag conformation, and contract to the condensed fiber at about 60 mM Na⁺ or 0.3 mM Mg²⁺. The formation of well-defined fibers requires the presence of lysine-rich histones such as H1. A kind of condensation could occur in the absence of such proteins, but this led only to irregular aggregates.

FIG. 1. Electron micrographs of rat liver chromatin fibers under three different ionic conditions: (A) 0 mM NaCl, (B) 20 mM NaCl, and (C) 75 mM NaCl. All samples were fixed in 0.1% glutaraldehyde, 5 mM triethanolamine, 0.2 mM EDTA, pH 7.4, at the corresponding ionic strength. Reproduced with permission from Ref. 7.

These observations provided the incentive for the generation of a series of models for the condensed fiber in the years that followed. These have often been described (see, for example, Refs. 1, 8, and 9). We believe that it is not important to describe such models here, for there exists no substantial evidence to suggest that any significant fraction of chromatin exists in any regular helical structure. A detailed argument is given elsewhere (10), but the main points can be summarized as follows. First, in none of the many electron-microscope studies of chromatin that have been published do we observe more than minute patches of what might be considered a regular helix. Second, the low-angle scattering patterns (X-ray or neutron) from chromatin fibers do not provide conclusive evidence for any significant amount of regular helix. The major features of the scattering patterns can be accounted for by a random arrangement of nucleosomes on a cylinder (11). Recent scanning-force microscope (SFM) studies (12-14) of chromatin fibers at low ionic strength reveal irregular, helixlike structures whose conformation is explicable in terms of a distribution of linker lengths (see Section II,B,1) (Fig. 2). It seems unlikely that such an irregular fiber should condense into a regular helix as the salt concentration is raised. In fact, recent electron-microscope studies on chromatin fibers in frozen-hydrated or low-temperature embedded sections of nuclei also show irregular structures (15, 16) (Fig. 3). Finally, studies of chromatin in aqueous solution, using the new technique of X-ray contact microscopy, also show primarily irregular structures (17).

FIG. 2. Model of the chromatin fiber, its simulated scanning-force microscope image, and an actual SFM image. (A) Model of a chromatin fiber used in these simulations. The DNA wraps in a left-handed fashion 1.75 turns around a histone octamer. The octamer is simulated by a disk, 5.5 nm high and 11 nm in diameter. The radius of curvature of the DNA wrapped around the core is 5.5 nm, and the pitch of the DNA is 2.86 nm. The DNA has an average of 10.15 bp/turn around the histone octamer and 10.4 bp/turn in the linker portion. The exit angle of the DNA is determined by the tangent at the point it leaves the nucleosome. The length of the linker DNA is determined using a uniform deviate random-number algorithm that generates linker lengths between 60 and 64 bp. The linker DNA is assumed to adopt a straight configuration between nucleosomes. The model generates three-dimensional, randomly organized fibers with an average diameter of 30 nm. (B) The model in A as it appears on a plane after partial flattening to simulate the process of deposition on the mica. (C) Simulated SFM image of the model in A, obtained by convoluting the plane projection in B, assuming a parabolic tip with a radius of curvature of 10 nm. (D) Image of C viewed from a 30° inclination angle. (E) A 30° view of an experimental SFM image of a fixed chromatin fiber deposited on mica from 5 mM triethanolamine-HCl, pH 7.0. Image sizes 400 nm × 400 nm (B-E). Reproduced with permission from Ref. 13.

FIG. 3. Single slices (a-d) from tomographic reconstructed volumes of electron-microscope images of nuclei embedded at low temperature and stained with a nucleic-acid-specific stain. Different views of nucleosome-associated and linker DNA (arrowheads) within in situ chromatin fibers from starfish sperm. Bars, 30 nm. Reproduced from The Journal of Cell Biology, 1994, Vol. 125, p. 1 by copyright permission of the Rockefeller University Press.

Whether the fiber has a regular structure or not, much experimental evidence shows that at least some portions of chromatin exist in a rodlike condensed form under physiological conditions. It is also clear that lysine-rich (linker) histones are essential for the proper condensation into this fiber. It is our purpose here to examine the following questions: How are these proteins arranged in chromatin? How do they aid in its condensation? What role do they play in regulation of dynamic chromatin processes like transcription? Although we cannot discuss the last issue in detail (for recent reviews, see Refs. 18 and 19), we will, during the course of the presentation of the structural issues, touch on the consequences of H1 binding for the process of transcription and its regulation. To approach these questions, we begin with a description of the various linker histones.
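The numbers quoted in the introduction and in the legend to Fig. 2 can be turned into a small numerical sketch. The Python fragment below is an illustration only, not the simulation used in Ref. 13. It assumes an axial rise of 0.34 nm per base pair for B-form DNA (a value not given in the text), treats linker lengths as integers drawn uniformly from 60 to 64 bp, and reduces the geometry to a single quantity: the rotation that a straight linker imposes on the next nucleosome. The function names are hypothetical.

```python
import random

# Values taken from the text and from the legend to Fig. 2.
CORE_DNA_BP = 146            # DNA wrapped about the histone octamer
CORE_DIAMETER_NM = 11.0      # diameter of the disk representing the octamer
LINKER_BP_PER_TURN = 10.4    # helical repeat of the linker DNA
LINKER_RANGE_BP = (60, 64)   # range of linker lengths used in the model

# Assumption (not from the text): axial rise of B-form DNA per base pair.
RISE_NM_PER_BP = 0.34


def core_compaction_ratio():
    """Contour length of the core DNA divided by the particle diameter."""
    return CORE_DNA_BP * RISE_NM_PER_BP / CORE_DIAMETER_NM


def linker_twist_deg(linker_bp):
    """Net rotation (0 to 360 degrees) accumulated along a straight linker."""
    return (linker_bp / LINKER_BP_PER_TURN * 360.0) % 360.0


def sample_linkers(n_linkers, seed=0):
    """Draw integer linker lengths uniformly from LINKER_RANGE_BP.

    Hypothetical helper: integer lengths and a fixed seed are
    simplifications made for this sketch.
    """
    rng = random.Random(seed)
    low, high = LINKER_RANGE_BP
    return [rng.randint(low, high) for _ in range(n_linkers)]


if __name__ == "__main__":
    print(f"core compaction ratio: {core_compaction_ratio():.1f}")
    for bp in sample_linkers(5):
        print(f"linker {bp} bp -> twist {linker_twist_deg(bp):.0f} deg")
```

Under these assumptions the core-particle ratio comes out near 4.5, in line with the "about 5:1" quoted above, and each additional base pair of linker rotates the following nucleosome by roughly 35° (360/10.4). A spread of only a few base pairs in linker length is therefore enough to randomize the relative orientation of neighboring nucleosomes, which is the intuition behind the irregular fibers of Fig. 2.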

I. Linker Histones: Properties and Interactions with Other Chromatin Components

A. Properties of the Linker Histones

1. PRIMARY STRUCTURE AND VARIANTS

The lysine-rich histones are characterized by a high lysine-to-arginine ratio (on the average, about 15),after which they are named. The amino-acid composition is dominated by basic amino acids: the net charge of the molecule is usually between +50 and +60. The different types of amino acids are distributed quite nonuniformly along the polypeptide chain. The N- and C-terminal regions contain many lysine, arginine, and proline residues, whereas the central part of the molecule is considerably less basic and contains the bulk of the hydrophobic amino-acid residues. The asymmetric distribution of these residues along the chain is reflected in the secondary and tertiary structures typical of these histones (see Section I,A,2). Representative primary structures are shown in Fig. 4. In each cell, the lysine-rich histones are represented by several molecular types (subfractions, isohistones), which differ in molecular mass, aminoacid composition and sequence, and physico-chemical and immunochemical properties (for a review, see Ref. 20). The microheterogeneity is species and tissue specific. In general, the tissue specificity is expressed more as a difference in the relative content of various subfractions than as presence or absence of some of them. However, there are some examples of tissue-specific linker histones. The most well studied is histone H5, which accumulates during the process of terminal differentiation of some nucleated erythrocytes and is believed to be involved in the strong chromatin compaction and transcriptional inactivation

222 H1-CHICK H1O-HUMAN H5SCHICK

JORDANKA ZLATANOVA AND KENSAL VAN HOLDE

1 setapvaapavSAPgakaA AKkpkkaaggAkprkpagPsvtelItkAvsAsKeRkGlSla Ill I l l I I I I l l l l l 1 TEnstSAP A AK PKRaKASkkStdHPkYSdMIvAAIqAEKnRaGSSRQ /I I I l l I I I I I I I l l l l l l l l l l l l l l l l l l 1 TEslvlsP ApAK PKRvKASrrSasHPtYSeMIaAAIrAEKsRgGSSRQ

H1-CHICK

61 alkKalaaggydvEknnSrIKLglKsLVskGtLvQTKGtGASGSFklnkKpgEtKakatkK

H10-HUMAN

48 SIQKYIKSHYKVGENADSQIKLSIKRLVttGVLKQTKGVGASGSF RLAKSDEpKKsvafK l I l l l l l / l l l l l I l l l l l l l l I1 1111111I11l1111 I I I I I I I I 49 SIQKYIKSHYKVGhNADlQIKLSIrRLlaaGVLKQTKGVGASGSF RLAKSD kaKrspgK

HSSCHICK H1-CH ICK H10-HUMAN H5SCHICK H1-CHICK H10-HUMAN H5SCHICK

I

I

I l l 1

I l l

I I I I I I I I I I I I

I

I 1

I

122 KpaakpKKpaakkpaaaAkKPKKAAavkkspKkakkspKkakkPaaaAtKKaAksP~tkagrpkkt I I 1 / 1 1 Ill1 I /I I l / / l l / 108 KtKKeiKK vatPKKAsKPKKAAskaPtKKF’KATPvkKAkKKlAAtPKKA KKPKTVK I II IIII I I I I IIIIII I I II I IIII IIIIIII 108 K KK avrrstsPKKAarPrKA rsPaKKPKAT arKArKKsrAsPKKA KKPKTVK 183 AKsPaKAkavKPKaAKskaaKPKAakakKaAtKKK l l l l l l l l l l /Ill I I I I I

164

AK

II

PvKA

II 160 AK srKA

SKPKKAK PvKPKAKSsAKrAqKKK I I I I I I I I I I I Ill SKaKKvK rsKPrAKSgArkspKKX

FIG.4. Alignment of representative primary structures of individual members of the lysine-rich histone family: chicken HI (187),human HI" (18H),and chicken H5 (189).Alignment was carried out using Intelligenetics software. Capital letters denote amino-acid residues conserved in at least two of the three sequences. Courtesy of Dr. S. H. Leuba, University of Oregon, Eugene.

in this cell type (21, 22). Other examples of tissue-specific variants are the subfractions present in gametes or in somatic cells that differentiate into such cells. A well-studied representative of these proteins is H l t (23),which appears only in the mammalian testis during a specific stage of development of the spermatocyte. The sperm-specific proteins of a number of invertebrate species also belong to the tissue-specific members of the histone H1 class. Some H1 subtypes are observed only during embryonic development. Typical examples of these are Hlcs and H l a , which are expressed only during the earliest stages of embryonic development of the sea urchin Strongylocentrotus purpurutus (24).Similarly, the maternally inherited variant H1M (protein B4) is only available until the midblastula stage o f development ofXenopus laevis, at which point it is replaced by somatic-type variants (25, 26). Qualitative and quantitative differences in H1 subtypes have also been observed between normal and malignantly transformed tissues. Especially intriguing are the changes in the amount of histone Hl". This histone, first identified as characteristically present in mammalian tissues with little cell


division, has been subsequently implicated in the establishment and maintenance of the terminally differentiated state (27).

2. TERTIARY STRUCTURE AND ASSOCIATION

In aqueous solution at physiological values of pH and ionic strength, the lysine-rich histones consist of three structurally distinct domains: a strongly basic, unstructured fragment at the N terminus (“nose”), a nonpolar central globular domain (“head”), and, again, a strongly basic unfolded domain at the C terminus (“tail”) (28) (Fig. 5A). The overall fold of the globular domains of histones H5 and H1 (GH5, GH1) has been determined using two-dimensional NMR (29, 30). The NMR results agree well with the recent crystal structure of GH5 solved to 2.5 Å resolution (31) (Fig. 5B). The domain consists of a three-helix bundle, with a β hairpin at the C terminus. The structure explains the strong conservation of certain residues at defined positions. Thus, for instance, Gly-79 is involved in making a sharp bend in the polypeptide chain, in going from the end of helix III into the β hairpin. Amazingly, the structure is very similar to that of the bacterial DNA-binding protein CAP (catabolite gene activator protein) and also to the structure of the DNA-recognition motif of the Drosophila transcription factor HNF-3 (32) (Fig. 5B), although the sequences of these proteins show little similarity. The nose-head-tail structure of the members of the H1 family is so strongly conserved that it is often used as a criterion for identification of H1 proteins. It should be noted, however, that some proteins identified as H1 histones by a number of other criteria do not contain a typical globular domain. Examples are the Tetrahymena H1 (33), the H1 from the hypotrichous ciliate Euplotes (34), and also one form specific to a terminally differentiated cell type (35). The linker histones constitute the most evolutionarily variable histone class. The various portions of these molecules are characterized by different degrees of variation, both in evolution and among subtypes from one and the same tissue or organism.
The globular region is relatively well-preserved during evolution and is almost identical in individual representatives of the H1 complement from a given source. The sequence differences are mainly located in the polycationic termini, both among subtypes and in evolution. It is still not certain to what extent individual linker histone molecules interact with one another in solution. Chemical cross-linking can lead to the formation of H1 oligomers in solution (1). However, control experiments indicate that the oligomers could have arisen as a result of collision of molecules in solution with some contribution from short-lived complexes (36), in accordance with conclusions from earlier work (37, 38). On the other hand, H1 molecules have been shown to aggregate in solution at NaCl concentrations over 50 mM, mainly as a result of interactions among the globular


FIG. 5. Tertiary structure of histone H5. (A) Schematic presentation of the “nose-head-tail” overall structure of the linker histones. (B) Schematic diagram of the globular domain of histone H5 built on the basis of crystallographic data: comparison with the structures of the DNA-binding domains of the bacterial protein CAP and the Drosophila transcription factor HNF-3. Fig. 5B kindly supplied by Dr. V. Ramakrishnan, Brookhaven National Laboratory, Upton, NY. Reprinted with permission from Nature (Ref. 31). Copyright 1993 Macmillan Magazines Limited.

domains (39). These results suggest that the action of H1 in compaction of the fiber (see Section II,C,3) could be due to specific interactions among globular domains. In apparent support of this idea, it has recently been reported that the globular domain of H5 may self-associate in a specific way in solution, whereas similar self-association was not observed for the globular domain of histone H1 (40). However, others report that the globular domains of H1 and H5 show little, if any, tendency to associate, even under a wide variety of ionic conditions and protein concentrations (41, 42). How these


seeming discrepancies between results from different laboratories are to be resolved is unclear.

3. SYNTHESIS AND MODIFICATION

Like the other histones, H1 is synthesized most intensely during DNA replication. However, a considerable amount of H1 can also be synthesized outside the S phase, especially in G, (43). It is important to note that the synthesis of different linker histone variants may be subject to different types of regulation. For example, transcription of the H5 gene is not cell-cycle regulated (44). Similarly, the synthesis of H1° shows a complex pattern of regulation, depending on the cell type, the physiological state, and the status of differentiation (27). The molecular mechanisms governing the metabolism of the linker histones have recently been reviewed (45). Histone H1 undergoes two major postsynthetic modifications: phosphorylation and poly(ADP-ribosyl)ation. Despite the intense effort that has gone into the study of these modifications, their role still remains obscure. Bradbury and co-workers (46) suggested that H1 phosphorylation might trigger the condensation of chromosomes during mitosis. However, exceptions to the long-held correlation between H1 phosphorylation and chromatin condensation in a number of systems have been reported, in which the modification correlates best with decondensation of chromatin (for a review, see Ref. 47). These exceptions call for a reconsideration of how the role of H1 phosphorylation might depend on the specific requirements of different systems. The role of the other major type of modification, poly(ADP-ribosyl)ation, is also far from being clear. The prevailing view is that it is mainly connected to processes involved in DNA repair (1).

B. Interactions of Linker Histones with DNA

1. INTERACTION WITH LINEAR DNA

The belief that the linker histones interact primarily with the DNA in chromatin has led to numerous studies of complexes between pure DNA and these histones (48). Several lines of evidence suggest that artificial H1/DNA complexes can be used as appropriate model systems for studying the role of the histone in chromatin. First, the location of the histone molecules in the chromatin fiber is such as to allow chemical cross-linking, with the formation of H1 homopolymers (41, 49). Similar homopolymers can be derived from cooperatively formed complexes of H1 with linear DNA (50). Second, the saturation ratio of bound H1 to eukaryotic DNA in 0.5 M salt is about the same for naked and chromatin DNA (51), corresponding to one strong H1-binding site per 150 bp. This packing density corresponds to occupancy of only the strongest binding sites, because 0.5 M salt represents the borderline for dissociation of H1 from DNA. At lower salt concentrations, H1 molecules will bind nonspecifically to DNA with a density of one H1 for about 35-40 bp. The average nonspecific packing densities of different H1 subtypes in their complexes with DNA correlate well with the average linker DNA length of the chromatins with which the respective histone subtypes are associated in vivo (52). Finally, there is a correspondence between the salt level at which compaction of the nucleosomal fiber occurs and the salt concentration required for formation of aggregated H1/DNA complexes. Depending on the ionic strength and the histone/DNA ratio, the appearance of the complexes may differ significantly: thin filaments of two DNA molecules (or possibly DNA duplex hairpins) bridged by histone molecules forming soluble complexes at low salt concentration, rodlike or cablelike structures (consisting of thin filaments packed side by side) at higher salt concentrations, circles, and doughnut-shaped structures have all been observed (48, 53-55). The more condensed structures appear to form at higher ionic strength and higher DNA concentrations. The formation of “doughnuts” takes place under conditions of extensive neutralization of the negative charges of the DNA molecule (physiological ionic strength and high H1/DNA ratio). With the use of scanning-force microscopy (SFM) we have recently observed globular complexes between 146-bp core particle DNA fragments and chicken erythrocyte H1 at low ionic strength (S. Leuba and J. Zlatanova, unpublished). In these complexes the path of the DNA is not resolved, but from their general appearance it seems that it must be severely bent, perhaps wound about a core of histone H1 molecules.
The average diameter of these globules is around 50 nm; if the DNA is on the outside, this would correspond to a radius of curvature of the double helix of around 25 nm, which is of the order of the value reported for spontaneously bent DNA molecules whose negative charges had been neutralized by either divalent cations or polyamines (56-58). Inner radii of similar values have been observed in toroids formed by the interaction of isolated C-terminal domains of H1 with DNA (55). Under the conditions of the latter study, however, the intact protein complexed DNA in particles of much greater radius (about 60 nm). It is perhaps physiologically relevant that linker histone binding can cause the DNA to bend so significantly, forming doughnut or toroid structures of curvature close to the curvature of the DNA in the condensed chromatin fiber. The interaction of the linker DNA with the linker histones may be the major determinant of the minimal radius of the chromatin fiber, and hence of the average number of nucleosomes per turn in the irregular helical structure (see Section II,C,1) (see Refs. 59-63 for theoretical analysis of the role of charge neutralization of the DNA phosphates by histone H1 in DNA bending and higher order folding of polynucleosomes).
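
The packing densities and the curvature estimate quoted above reduce to simple arithmetic, sketched here in Python. The function names and the 0.34 nm rise per base pair (the usual B-DNA value) are assumptions made for this illustration, not values taken from the text.

```python
import math

BP_RISE_NM = 0.34  # assumed axial rise per base pair of B-form DNA (not from the text)


def h1_per_fragment(length_bp: float, bp_per_h1: float) -> float:
    """Number of H1 molecules bound to a fragment at a given packing density."""
    return length_bp / bp_per_h1


def radius_of_curvature_nm(globule_diameter_nm: float) -> float:
    """Radius of curvature if the DNA runs along the outside of a globule."""
    return globule_diameter_nm / 2.0


def bp_per_full_circle(radius_nm: float) -> float:
    """Base pairs needed to close one full circle of the given radius."""
    return 2.0 * math.pi * radius_nm / BP_RISE_NM


if __name__ == "__main__":
    # 146-bp core-particle DNA at the two packing densities quoted in the text
    print(h1_per_fragment(146, 150))    # strong sites only (0.5 M salt)
    print(h1_per_fragment(146, 37.5))   # nonspecific binding, midpoint of 35-40 bp
    r = radius_of_curvature_nm(50.0)    # 50-nm globules seen by SFM
    print(r, bp_per_full_circle(r))
```

Under these assumptions a single 146-bp fragment is far too short to wrap once around a 50-nm globule, which is consistent with the suggestion that several DNA fragments and histone molecules take part in each complex.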


2. COOPERATIVITY OF INTERACTION

Depending on ionic strength, DNA concentration, and histone/DNA ratio, the interaction of histone H1 with DNA will show varying degrees of cooperativity (48). Watanabe (64), using very dilute solutions, has demonstrated cooperative binding on single DNA molecules, with a cooperativity parameter (ratio of cooperative binding constant to nucleation binding constant) varying from about 3 × 10² at 20 mM salt to about 10³ at 200 mM. At higher DNA concentrations, the behavior becomes more complex, with the formation of multistranded complexes beginning in a range of NaCl concentration between 20 and 50 mM. Under these conditions, the cooperativity becomes so extreme that the histone is distributed so as to produce free DNA and saturated multistrand complexes. Interestingly, and in contrast to the behavior of histone H1, at these high DNA concentrations histone H5 interacts with DNA cooperatively at all ionic strengths (52). Similarly, the globular domains of both H1 and H5 interact strongly cooperatively at all ionic strengths (42, 65). Whether these aspects of binding behavior of the linker histones to pure DNA also hold true for the binding to chromatin is still unclear (48).
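
One way to put the quoted cooperativity parameters on an energy scale is to convert them to a free energy of interaction between neighboring bound histones, ΔG = −RT ln ω. The Python sketch below does this; the temperature of 25 °C and the function name are assumptions for illustration, since the text does not state the conditions of the measurement.

```python
import math

R_J_PER_MOL_K = 8.314  # gas constant in J/(mol K)


def cooperativity_free_energy_kj(omega: float, temp_k: float = 298.15) -> float:
    """Free energy (kJ/mol) equivalent to a cooperativity parameter omega.

    omega is the ratio of the cooperative to the nucleation binding constant;
    temp_k is an assumed temperature, not a value given in the text.
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    return -R_J_PER_MOL_K * temp_k * math.log(omega) / 1000.0


if __name__ == "__main__":
    # Cooperativity parameters quoted in the text for 20 mM and 200 mM salt
    for salt_mm, omega in [(20, 3e2), (200, 1e3)]:
        dg = cooperativity_free_energy_kj(omega)
        print(f"{salt_mm} mM salt: omega = {omega:g}, dG = {dg:.1f} kJ/mol")
```

On this scale the roughly threefold rise in the cooperativity parameter between 20 and 200 mM salt corresponds to only a few kJ/mol of additional favorable interaction per contact.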

3. PREFERENTIAL BINDING TO CROSSOVERS IN DNA

As early as 1975 it was shown that histone H1 prefers binding to superhelical over linear or relaxed circular DNA (67, 68). Later, this preference was confirmed in direct competition experiments (69, 70). A simple demonstration of this preference is shown in Fig. 6. Here, a mixture of supercoiled, linear, and relaxed circular DNA has been titrated with H1. The data demonstrate that the supercoiled DNA is shifted rapidly in mobility, whereas the linear and the relaxed DNA forms are unaffected, even though present in excess over the supercoiled one. The H1 concentrations used are such that there are many more histone molecules than DNA writhes, yet all of the histone appears to bind to the superhelical forms. This is consistent with the notion that H1 binds to supercoil crossovers with high affinity, triggering the binding of additional H1 molecules to the same DNA molecules in a cooperative fashion. However, in all these experiments plasmid or viral DNA preparations were used that were incompletely characterized with respect to their topological state. In such preparations the superhelical density was not precisely known and it was unclear how much torsional deformation of the DNA accompanied supercoiling. More importantly, the high superhelical tension would be expected to create alternative non-B-DNA conformations such as cruciforms or Z-DNA at specific nucleotide sequences. If the histone binds


FIG. 6. Preference of histone H1 for supercoiled DNA forms. A mixture of supercoiled, linear, and relaxed DNA, obtained by treatment of supercoiled pBR322 (lane 1) with the single-strand-specific endonuclease P1, was titrated with increasing amounts of histone H1, as designated above the figure (lanes 2-7). M, Marker lane, containing BstEII-digested λ DNA. Note that although the linear and relaxed DNA bands show no change in electrophoretic mobility, the superhelical form is more and more retarded with increasing amounts of H1 present in the incubation mixture. Courtesy of Dr. M. Ivanchenko, Oregon State University, Corvallis.

preferentially to some such structures, as turns out to be the case (see below), the preference for supercoiled DNA per se might be illusory. This issue has recently been addressed by using supercoiled plasmid DNA partially prerelaxed with topoisomerase I, so that a population with a narrow distribution of topoisomers of low linking-number difference was obtained (71). None of these topoisomers would contain alternative non-B-DNA structures. The order of disappearance of individual topoisomers from electrophoretic gels as a result of H1 binding again indicated a clear preference for initial histone binding to molecules containing crossovers of double-helical DNA, followed by cooperative binding to these same molecules. The same effect was observed with the isolated globular domain of H5; this suggested that the preference for binding to crossovers is determined by the existence of two specific DNA-binding sites in the linker histone globular domains (31). Crossovers of two double-stranded DNA regions are structurally similar to four-way junction DNA (72). Four-way junction DNA is a biologically


relevant structure in that it is present during genetic recombination (Holliday junctions) and in cruciforms extruding from palindromic sequences in supercoiled DNA. A number of proteins that specifically recognize and bind to these structures have been identified (73), including high-mobility-group protein 1 (HMG1), a member of the other class of major linker DNA-binding proteins (74). Studies on the binding of histone H1 to synthetic four-way junctions have shown that the histone forms a defined complex with these structures even in the presence of a vast excess of linear nonspecific competitor DNA (75) (Fig. 7). The four-way junction also competes efficiently against linear DNA molecules that have the same sequence information as the four-way junction, and against “incomplete” junctions (Fig. 8). The difference in affinity between a four-way junction and linear DNA is so great that we can think of the former as a kind of “specific” binding, the latter “nonspecific.” As in the case of binding to crossovers of DNA in supercoiled plasmids, the globular domain by itself can bind strongly; however, whereas the intact histone forms a single complex, multiple copies of the globular domain can bind to


FIG. 7. Titration of binding of linker histones H1 and H5 to four-way junction DNA in the presence of competitor DNA. Indicated concentrations of H1 (A) or H5 (B) were incubated with 2.1 nM of four-way junction DNA in the presence of 50 µg/ml competitor (salmon testis) DNA and analyzed by mobility-shift assay. The concentration of polyacrylamide was 5%. Reproduced with permission from Ref. 76.


FIG. 8. Competition between four-way junction DNA, linear control duplexes, and incomplete junctions for binding of H1. Appropriate single-stranded oligonucleotides were annealed in 10 mM Tris-HCl (pH 7.45), 100 mM NaCl, 10 mM MgCl2, 1 mM EDTA by heating to 70°C for 3 minutes and cooling down to 0°C over a period of 4 hours. H1 (25 nM) was incubated with labeled four-way junction DNA (0.5 nM) and the indicated amounts of unlabeled four-way junction, unlabeled incomplete junctions, or control duplexes, having the same sequence information as the four-way junction molecule. The products of the interaction were analyzed by mobility-shift assay on polyacrylamide gels. Reproduced with permission from Ref. 75.

the same four-way junction (76). Finally, the affinity of H5 for a four-way junction is higher than that of H1. Binding of intact H1 to four-way junctions is inhibited by cations, Mg²⁺ and spermidine being much more effective inhibitors than Na⁺ (76). This inhibition is not likely to be a general ion-competition effect, for Mg²⁺ shows much less inhibition of the nonspecific binding of H1 to linear DNA. Instead, the inhibition of binding to the four-way junction may be due to ion-dependent changes in the conformation of the junction. In the absence of specific cations the four-way junction exists in a square planar configuration; in the presence of sufficient concentrations of these ions, the junction folds into an X-shaped structure with two quasicontinuous, coaxially stacked helices (73) (Fig. 9). The transition between these two conformations results in a considerable change in the angles between the four arms of the junction. The results from such experiments suggest that the linker histones may prefer the square planar conformation in which the angle between arms is 90°. It is worth noting that 1.75 superhelical turns of the DNA around the octamer of histones would produce an angle of 90° between the DNA strands entering and exiting the nucleosome. It seems likely from the above results that the linker histones bind strongly to such structures, thereby fixing the entry-exit angle and contributing to the formation of the three-dimensional chromatin fiber (see Section II,B,2). Using planar four-way junctions as a model for crossovers of linker DNA


FIG. 9. Schematic of four-way junction folding. In the absence of metal ions, the junction is maximally extended in a planar configuration. Binding of metal ions reduces the electrostatic repulsion of the phosphates in the DNA backbone to the point at which helix-helix stacking may occur. The parallel alignment of the quasicontinuous helices is unstable, again due to electrostatic repulsion along the length of the helices, resulting in a rotation into the X-structure. The angles between the arms in the unfolded state are around 90° and those in the stacked structure, 60° and 120°, respectively. (Reproduced with permission from Ref. 73.)

at the entry and exit to nucleosomes, competition experiments have been performed between linker histones H1 or H5 and HMG1 (77). The interest in such studies was generated by the fact that the two major groups of linker DNA-binding proteins, the linker histones and HMG1/2, have opposite roles in transcription: histone H1 and its variants seem to act as repressors (18, 19), whereas HMG1 and HMG2 are reported to act as general transcriptional activators (see Ref. 77, and references therein). In competition experiments in which the two types of proteins were added either simultaneously or successively to the incubation mixture, it was shown that HMG1 can compete efficiently with H1 for binding to four-way junctions (Fig. 10A). In contrast, histone H5 seemed refractory to displacement by HMG1 (Fig. 10B). The difference between histones H1 and H5 correlates well with known effects on transcription: whereas H1-containing chromatin can be transcriptionally activated, the presence of H5 seems irreversibly to preclude transcription in the silent genome of nucleated erythrocytes. These observations suggest that direct displacement of histone H1 by HMG1 on the nucleosome might be part of the mechanism of gene activation by HMG1 (see also Section I,C).
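
The molar ratios labeled on Fig. 10 follow directly from the protein concentrations used in the competition experiments. The short Python sketch below recomputes them; the concentration lists are as read from the figure labels, and the function name is a hypothetical helper introduced only for this illustration.

```python
def molar_ratios(competitor_nm: float, histone_nm: list[float]) -> list[float]:
    """Molar ratio of competitor (HMG1) to linker histone for each lane."""
    return [competitor_nm / h for h in histone_nm]


HMG1_NM = 2010.0                      # HMG1 concentration as read from Fig. 10
H1_NM = [18, 35, 70, 140, 281, 562]   # H1 titration, panel A
H5_NM = [17, 34, 68, 136, 272, 544]   # H5 titration, panel B

if __name__ == "__main__":
    for name, series in (("H1", H1_NM), ("H5", H5_NM)):
        ratios = molar_ratios(HMG1_NM, series)
        print(name, [round(r, 1) for r in ratios])
```

Even at the highest histone concentrations HMG1 remains in severalfold molar excess, which is worth keeping in mind when comparing the displacement of H1 with the resistance of H5.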

4. SEQUENCE-SPECIFIC PREFERENCE OF BINDING

In addition to showing a preference for certain kinds of DNA structures, linker histones may also prefer certain sequences for binding. These preferences fall into two classes. First, histone H1 exhibits a general preference for binding to (A + T)-rich DNA regions (48, 78). The compact β-turn structure


[Figure 10, lane labels. (A) H1 at 18, 35, 70, 140, 281, and 562 nM; HMG1 at 2010 nM; molar ratio HMG1/H1: 112, 57, 29, 14, 7.2, 3.6. (B) H5 at 17, 34, 68, 136, 272, and 544 nM; HMG1 at 2010 nM; molar ratio HMG1/H5: 118, 59, 30, 15, 7.4, 3.7. Bands marked: four-way junction, incomplete junction.]

FIG. 10. Competition between histone H1 (A) or H5 (B) with HMG1 for binding to four-way junction DNA. The indicated concentrations of the linker histones and HMG1 were incubated with 3.8 nM labeled four-way junction DNA and 50 µg/ml of competitor (salmon testis) DNA. The binding was monitored by DNA mobility-shift analysis. Note that in no instance is a ternary complex of DNA, HMG1, and linker histone observed. Reproduced with permission from Ref. 77.


of the tetrapeptide SPKK, present in the histone tails (79), has been implicated in this preference (80). This is in apparent contradiction to earlier work, which identified the globular domain as the portion of the molecule responsible for this property (81). The issue seems to be further complicated in view of the fact that polylysine exhibits the same (A + T)-rich preference (82). In addition to the general A + T preference, there is mounting evidence that there may exist, particularly in eukaryotic DNA, certain specific sequences of very high affinity for linker histones. A first indication of this can be found in the early filter-binding experiments of Renz (83). When histone H1 was incubated in a mixture of differentially radiolabeled calf lymphocyte and E. coli DNA, the histone bound preferentially to the eukaryotic DNA. Later, Diez-Caballero et al. (51) found that at 0.5 M salt, where most nonspecific H1/DNA interactions are very weak, eukaryotic DNA (but not prokaryotic DNA) exhibited a class of strong H1 binding sites. In recent years, more specific information concerning a few such sites has appeared (84-87). Such “H1-hypersites” are not necessarily (A + T)-rich. Indeed, in the composite binding site identified in the rat albumin gene by DNase-I footprinting (85), the two binding regions themselves are not particularly (A + T)-rich, in contrast to the segment between them, which contains more than 80% A + T. An intriguing set of experiments, pointing to the existence of “sequence”-specific binding, has been recently performed in a study of the interaction of histone H1 with populations of restriction fragments of plasmids pBR322 and pUC19 (87a). Certain fragments exhibited unusually high affinity for the histone, forming large complexes at H1/DNA ratios at which the other fragments present in the same incubation mixture showed no apparent binding (Fig. 11). 
The highly preferred fragments turned out to be intrinsically curved, as judged by their anomalously slow electrophoretic mobility in polyacrylamide gels, by computer modeling analysis, and by scanning-force microscopy. However, the presence of curvature alone was not sufficient for the preferential binding, because highly curved kinetoplast DNA fragments of similar length were not selectively bound. Using various restriction fragments centered around the highly preferred molecule, it was found that the high-affinity binding required the simultaneous presence of sequences on both sides of the region of curvature. At present there is not yet enough evidence to attempt description of a “consensus” site, or even to conclude that such sites exist. More study of strong H1 binding sites is needed. Of course, it could be that the “sequence”-specific preference identified in the above cases may only reflect peculiar, albeit still unrecognized, sequence-dependent DNA structures. We would like to stress, as we have previously (18), that the major linker


FIG. 11. Titration of DraI-BstNI digests from pBR322 (A) and pUC19 (B) with increasing amounts of histone H1 on agarose electrophoretic gels. The protein:DNA ratios (w/w) are designated below the lanes. The arrows denote the specific fragments from both plasmids that are preferentially bound to histone H1. Courtesy of Dr. J. Yaneva, Oregon State University, Corvallis.

histone-binding sites in chromatin must be the crossovers of double-stranded DNA at the entry and exit of the nucleosome: if sequence-specific sites exist, they may be involved in some kind of specific regulatory mechanisms, involving only specific genes (such, for instance, may be the case of the differential regulation of transcription in the somatic and oocyte types of 5-S RNA gene sequences; see below). In this respect, it is interesting to note that the eukaryotic hypersites identified to date are located in 5' flanking regions or in the beginning of the coding sequences of the respective genes, a location expected to be of importance in regulating transcription from these genes. Perhaps binding of the linker histones to such H1 hypersites will help fix a particular nucleosome or even a chain of nucleosomes over specific DNA sequences (see Section II,B,3). A particularly clear-cut example of how sequence-preferring (albeit somewhat promiscuous) binding of the linker histones to DNA may affect transcription is presented by the two types of 5-S RNA gene families in Xenopus (88). The large oocyte gene family consists of about 20,000 copies per haploid genome organized in clusters scattered among most of the chromosomes. The smaller somatic gene family consists of 400 copies per haploid genome grouped in a single cluster. The two gene families are differentially


regulated during early Xenopus development: the oocyte genes become repressed at the midblastula transition so that the only 5-S genes transcribed in somatic cells are those belonging to the somatic gene family. This differential gene expression requires the presence of histone H1 both in vivo and in vitro (see Ref. 88, and references therein). It seems likely that the differential effect of histone H1 on the transcription of the two types of genes depends on the preferential binding of the histone to oocyte chromatin, which contains an (A + T)-rich spacer between the genes compared to a (G + C)-rich spacer between the somatic genes (89). However, this hypothesis has not, to our knowledge, been tested directly. Neither is it clear whether the effect depends on binding of H1 per se, or on H1-directed binding of other entities, possibly nucleosomes (see Section II,B,3).

C. Interactions with High-mobility-group Proteins

The only chromatin proteins whose interaction with linker histones has been studied are members of the HMG class. Isolated HMG1 binds relatively tightly to histone H1 (90, 91), but there has been concern that the interaction might be merely nonspecific, deriving solely from the electrostatic interaction between the negatively charged tail of HMG1 and the positively charged tails of H1 (92). However, the fact that oxidized and reduced HMG1, which differ only in the presence or absence of a disulfide bond, interact differently with H1 suggests that the interaction may be more specific and significant (93). Whether this type of interaction is physiologically relevant in chromatin remains to be seen: both H1 and HMG1 are known to bind to linker DNA, but whether they can do so simultaneously, or only by displacing each other, has not been determined rigorously as yet. The isolation of mononucleosomes containing HMG1 and HMG2 but lacking histone H1 favors the replacement hypothesis (94). Also, in direct competition experiments for binding to four-way-junction DNA the two proteins were never observed in ternary complexes involving both H1 and HMG1, strongly indicating that they occupy the same, or overlapping, sites on the junction, and, by inference, perhaps on the nucleosome (77). However, conflicting data have also been reported: the presence of HMG1 and HMG2 was found to be restricted to H1-containing mononucleosomal particles isolated from native chromatin (95). Cross-linking experiments with chromatin reconstituted with exogenous HMG1 and HMG2 demonstrate that at least some HMGs are sufficiently close to histone H1 to allow cross-linking to occur (96). As in all reconstitution experiments, it is difficult to relate these results to the in situ situation in view of uncertainties about the fidelity of reconstitution and of the fact that HMGs were added to bulk chromatin already containing endogenous HMGs.


In this context, it is relevant to mention experimental data concerning the effect of HMG1 on the structure of highly aggregated H1/DNA complexes. The nonhistone protein destroys the complex or prevents its formation (97). Electron-microscope observations demonstrate that HMG1 destroys the double DNA fibers and leads to the formation of beaded structures. Native HMG1 interacts with histone H1 in such a way as to modulate the ability of H1 to condense pure DNA in vitro (98). It is believed that the binding of H1 to HMG involves the portion of the nonhistone protein that does not participate in interaction with DNA (99). In this way, HMG1, binding to H1 and DNA with different domains of the molecule, could modulate the interaction of H1 with DNA, or, conversely, the interaction of H1 with HMG1 could change the affinity of these proteins for DNA. The interaction of histone H1 with HMG14 and HMG17 is much less well-studied. Chemical cross-linking in solution led to the formation of H1/HMG14 heterodimers, whereas no interaction between H1 and HMG17 was observed (100). The interaction of HMG14 with H1 was strongly affected by phosphorylation of the nonhistone protein (101). Recent hydroxyl radical footprinting experiments have addressed the location of HMG14 and HMG17 in nucleosome cores and in chromatosomes lacking linker histone (102). These proteins occupy DNA sites near the end of the chromatosome but distinct from those occupied by the linker histones; in the region of the dyad axis the binding sites of HMGs overlap those protected by the linker histones. The placement of HMG14 and HMG17 near the dyad suggests that interactions between these nonhistone proteins and histone H1 may affect the transcriptional potential of chromatin. 
Neutron scattering studies aimed at elucidating the effect of HMG14 binding on chromatin structure showed that HMG14 binding results in a considerable reduction in the mass per unit length of the fibers, which probably reflects larger spacing between neighboring nucleosomes along the DNA (103). This general loosening of the higher order structure might be a necessary condition for the transcriptional activation attributed to HMG14 (see Ref. 103).

II. The Importance of Linker Histones in Chromatin Fiber Structure

The early electron-microscope observations of the salt-dependent conformational transitions of soluble chromatin fragments pointed to the importance of the linker histones in the formation and maintenance of the higher order structures (5, 6). The H1-depleted chromatin was also observed to condense with increasing ionic strength; however, no definite structures with a well-defined fiber direction could be obtained. The structure of the low-ionic-strength extended fiber was also dependent on the presence of histone H1. In chromatin containing H1, the DNA entered and left the nucleosome on the same side, giving rise to the zig-zag appearance of the extended fiber, whereas in H1-depleted chromatin, the entry and exit points of the DNA were much more random, creating the extended “beads-on-a-string” conformation (6). Recent scanning-force microscopy of native and H1-depleted fibers beautifully supports the conclusions based on electron microscopy (13, 14) (see also Sections II,B,1 and II,B,2).

A. Location of Linker Histones in the Nucleosome

1. THE POSITION OF THE LINKER HISTONES WITH RESPECT TO THE NUCLEOSOME

It has long been known that H1 binds to the linker DNA, and there is much evidence that this is at the point where the DNA double helix enters and exits from the core particle (1). Binding of the linker histones to linker DNA protects an additional 20 bp of linker DNA, immediately contiguous to the 146 bp of DNA in the nucleosome core, against nuclease digestion; the particle containing around 168 bp of DNA and one linker histone molecule has been called a chromatosome (104). That H1 lies close to the nucleosomal core has been directly demonstrated by protein/DNA cross-linking (105) and by protein/protein cross-linking (106-110). In some studies the major contacts identified were with histone H2A (e.g., 109), whereas other experiments demonstrated cross-linking to all core histones (107, 108). A number of these studies were carried out in chromatin, or even in intact nuclei; this indicates that the propinquity reported is not simply a feature of the isolated chromatosome. A recent study characterized the products of digestion of chromatin by the peptidase clostripain and showed that the N termini of both H4 molecules lay in close proximity to the globular domain of H1, each H4 terminus pointing toward it (111). However, acetylation of the tails of the core histones does not block the binding of histone H5 to reconstituted mononucleosomal cores (112). Most researchers believe that the binding to the linker is symmetrical, involving 10 bp extending from each end of the core DNA (reviewed in Ref. 1). However, in one case, a nucleosome reconstituted on a 5-S RNA gene fragment from Xenopus borealis, an asymmetric protection of the ends has been reported (113). This observation is unique and may depend on particular features of the DNA sequence used or on the reconstitution procedure.
Reconstitution experiments using either intact H1 or isolated fragments thereof have suggested that the globular part of the histone is necessary and sufficient for the protection of the 168-bp chromatosomal DNA (114). The participation of the globular domain of the linker histones in binding near the ends of nucleosomal DNA has also been observed in protein/DNA cross-linking experiments that identified His-25 within the globular domain of H5 as a major site of cross-linking in isolated particles, extended chromatin, and nuclei (115). In the case of reconstituted X. borealis mononucleosomes, in which the protection of the linker was found to be asymmetrical (113), the globular domain of H5 was also found to associate with the core asymmetrically, cross-linking to a single site on one side of the dyad axis (116). Because of the small size of GH5 (2.9 nm) (117) and because of its location on this site far away from the dyad, it would not be capable of also interacting with the entry/exit DNA of this nucleosome. Again, these results may reflect some peculiar specific feature of the X. borealis 5-S sequence. Also, faithful reconstitution of the globular domain of H5 onto single nucleosomes may be a problem, despite the fact that the temporary pausing of micrococcal nuclease digestion, typical of the chromosomal structure (1), has been observed in these experiments.

How can a single linker histone molecule interact with the nucleosome core, and both entering and exiting DNA? In an early suggestion, the globular domain was placed directly over the twofold axis of the nucleosome; its size would fill up the gap between the DNA strands entering and leaving the particle (114) (Fig. 12A). Such a placement implies three contacts with the chromatosomal DNA: one with the entering DNA, one with the exiting DNA, and one with the core DNA near the dyad axis. Indeed, DNase I is denied access to the DNA at the dyad in H1-containing dinucleosomes (118). The distance in the chromatosome between the center of mass of the linker histone and the center of mass of the histone octamer measured by neutron scattering is 5.5 nm (119). This estimation places the linker histones very close to the central turn of the core particle DNA, implying an additional interaction with the core DNA. However, such a triple interaction would require three binding sites in the globular domain, and only two have been suggested on the basis of biochemical (e.g., 50, 52) and recent crystallographic data (31). In the light of these considerations, the protection of the dyad site seen in the DNase-I digestion studies (118) may simply reflect steric hindrance to the enzyme by the globular domain, which may not actually bind the DNA at this point. On the other hand, the results from neutron scattering may reflect the fact that the measurements were done on isolated chromatosomes, in which the histone may artifactually approach the dyad of the core particle more closely than it does in full-length nucleosomes or in chromatin fibers.

FIG. 12. A schematic representation of the possible locations of the globular domain of histone H1 (or H5) with respect to the core particle. The histone octamer is presented as a cylinder, the DNA as a tube, and the globular domain of histone H1 as a black ball. (A) The globular domain makes contacts with the DNA at three points: the DNA entering and exiting the nucleosome and the DNA at the dyad axis. (B) The globular domain is situated at a distance from the DNA at the dyad, making contacts only with the crossover of the entering and exiting DNA strands.

Because the issue of the exact location of the linker histones at the surface of the nucleosome in vivo cannot be considered solved, an alternative view remains tenable; this places the globular domain away from the dyad, exactly at the point where the entry and exit DNAs cross each other. If the DNA makes 1.75 turns around the histone octamer, then the entry and exit DNAs would cross each other at 90° at some distance (about 2 nm) from the surface of the DNA at the dyad (Fig. 12B). The strong preference of the linker histones and their isolated globular domains for crossovers of DNA (see Section I,B,3) might reflect an evolutionary design to bind to the crossover of the nucleosomal DNA. Whether the linker histone binds close to the nucleosome or at the crossover, it will be able to interact with both entering and exiting strands; the consequences of this are discussed in Sections II,B,1 and II,B,2. In the above context, a recent study on the DNA sequence organization in the chromatosome (120) may be of interest.
Analysis of a large set of DNA sequences cloned from chicken erythrocyte chromatosomes reveals an asymmetry of di- and trinucleotide steps along the DNA at the chromatosome ends. In particular, two dinucleotides, ApG and GpG, seem to be conserved at one of the termini, while the other terminus is matched by the preferential occurrence of their complements, CpT and CpC. It is suggested that such an asymmetry at this location could be used to orient the asymmetric linker histone, targeting the binding of helix III of the globular domain of H1 or H5 (see Fig. 5B) to one end of the chromatosome. The degree of constraint imposed on the path of the superhelix would then depend on where the second, more diffuse DNA-binding region, situated on the opposite side of the globular domain, binds to a second site in the chromatosomal DNA (for more detailed discussion of this intriguing possibility, see Ref. 120). These results may help explain the finding of an asymmetric binding of the globular domain of H5 in the 5-S DNA-reconstituted chromatosome (116) as a consequence of particular features of the underlying DNA sequence.

2. DISPOSITION OF THE LINKER HISTONE TAILS

The way in which the globular domain of the linker histones binds to the chromatosome may be of structural consequence to the way the “unstructured” tails are located. Although the carboxy-terminal domain of the linker histones is known to exist as a random coil when the proteins are free in aqueous solution, secondary structure predictions and CD measurements under various conditions suggested that the C-terminal domain might assume a segmented α-helical conformation on binding to DNA (121). These rigid helical segments might track the phosphate backbone of the linker DNA and help determine the conformation of the linker between nucleosomes. Important factors for this role would be the length, stability, and charge density of the helical segments. Because the termini are, in addition, highly positively charged, their binding to the linker DNA is expected to at least partially neutralize its charge (122), which might facilitate bending of the linker. Alternatively, if the histone C-tail binds to only one side of the duplex DNA, this asymmetric charge neutralization should produce DNA bending (123, 124). In either case, closer approach between nucleosomes may be promoted by such bending (for a further discussion of these points see Section II,C,2). In this respect, it is significant that reconstitution onto linker histone-depleted chromatin of a peptide containing only the globular and C-terminal domains is sufficient to induce salt-dependent chromatin folding, whereas the globular domain in itself is not sufficient (122). The role of the N-terminal domain of the linker histone is less clear; it may act as an anchor for the rest of the molecule to be positioned properly in the fiber (122).

3. RELATIVE ORIENTATION OF ADJACENT LINKER HISTONES

Asymmetrical binding of the globular domain to the chromatosomal DNA would determine the mutual disposition of linker histone molecules sitting on neighboring linkers, i.e., whether the successive histone molecules are situated with respect to each other in a head-to-tail, head-to-head, or tail-to-tail orientation. Analysis of isolated H1/H1 dimers, obtained from chemical cross-linking in chromatin or nuclei, has been carried out by several groups of researchers, using enzymatic or chemical degradation. Contacts in all possible combinations (between two C termini, between two N termini, and between C and N termini) have been reported (125, 126). In another study, contacts mainly between C-fragments, and some binding between the N terminus of one molecule and the C terminus of another, have been observed, whereas contacts between two N termini have not been found (49). The C-terminal tails have also been identified as major sites of histone-histone contacts in purified H5/H5 dimers (127). In contrast, a predominantly polar, head-to-tail arrangement of histone H5 was suggested on the basis of similar analysis in extended chicken erythrocyte chromatin (128). This arrangement persists on compaction, which also brings C-terminal domains in closer juxtaposition than in the extended state, accounting for the increase in C-terminal to C-terminal cross-linking observed in high salt concentrations. The seemingly contradictory results obtained by various workers make it obvious that further studies are needed.

4. DO LINKER HISTONES FIX LINKER LENGTHS?

The fact that H1 and its cognates bind to the linker DNA has led to the suggestion that the species- and tissue-specific differences in linker lengths could be explained by differences in the structure of different linker histone variants. However, experiments designed to address this issue have failed to show a direct relation between changes in the nature of the H1 complement and changes in linker lengths. Thus, comparison of linker lengths in immature and mature chicken erythroid cells and in liver with the composition of the linker histones in these sources revealed no meaningful correlations (129). Also, the increase in nucleosomal repeat length and the increase of H5 seen during erythrocyte maturation are not related in a proportional way, suggesting that H5 is not the major determinant of the corresponding repeat lengths (130). Similarly, the changes in linker length observed during development of rat brain neurons are not accompanied by significant changes in the H1 complement (131). The most striking evidence comes from in vivo studies using mouse/human inter- and intraspecies somatic-cell hybrids (132). All hybrids expressed the H1 complement of only one of the parent cells. Still, in some of them, the length of the repeat was inherited from the same parent, whereas in others this was inherited from the parent whose H1 was not expressed. This implies that some other nuclear factor or factors dictate linker length. Finally, in transfection experiments, the gene for histone H5 was expressed in cells that normally contain only H1 (130). Replacement of H1 by H5 did not lead to a change in the repeat length, but only increased the stability of the chromatin fiber. A potential caveat with respect to such experiments may be that the exogenous linker histone may not be properly incorporated in the chromatin of these cells, which have never encountered histone H5 in the course of evolution.
The cells used complete a round of DNA replication and this would allow a window of opportunity for new nucleosome spacing to be imposed in the presence of H5. However, because linker histones are believed to be deposited onto replicating chromatin last (133, 134), it is still possible that they cannot change the already established nucleosomal density. If histone H5 affects linker length in conjunction with other nuclear factors, which coevolved with H5, these other factors may not be present in the types of cells used for transfection. These in vivo results do not always accord with in vitro studies. For example, histone H5 restores the native spacing of about 200 bp of nucleosomes in “randomized” chromatin in vitro (135, 136). The nucleosomal repeat length of nucleosomal arrays reconstituted by introducing naked DNA in cell-free extracts can change on addition of histone H1 (137); however, this could only be achieved by nonphysiologically high amounts of the linker histone. A cloned 6.2-kb chicken β-globin DNA fragment, assembled into chromatin with chicken core histones and histone H5 as the only cellular components, showed a regular 180-bp repeat, similar to the one observed in this region in erythroid cells where the gene is active; a specific region downstream from the gene was required for this nucleosomal arrangement (138). The same gene in chick oviduct, where it is inactive, has a 196-bp repeat (139), and histone H5 is not present in this tissue. Thus, although histone H5 was somehow involved in determining specific repeat lengths over the active β-globin gene, the actual length induced in the H5-containing chromatin was in fact smaller than in the H5-lacking chromatin. This is contrary to the general expectation that H5 should promote longer linker lengths, comparable to the 208 bp in chicken erythrocytes. It is evident that the possible participation of linker histones in determining linker lengths still remains to be understood.
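
The repeat lengths quoted in this section can be converted into linker lengths by simple subtraction. The following Python sketch assumes a convention that is not spelled out here: the linker is taken as the repeat length minus the 146 bp of core DNA mentioned in Section II,A,1, and the 208-bp figure for chicken erythrocytes is treated as a repeat length. The function name and dictionary labels are illustrative only.

```python
# Illustrative arithmetic only: linker length taken as repeat length minus
# the 146 bp of core-particle DNA (an assumed convention; subtracting the
# ~168 bp of the chromatosome instead would give shorter linkers).
CORE_DNA_BP = 146

def linker_length_bp(repeat_bp, core_bp=CORE_DNA_BP):
    """Return the linker length implied by a nucleosomal repeat length."""
    if repeat_bp < core_bp:
        raise ValueError("repeat length cannot be shorter than the core DNA")
    return repeat_bp - core_bp

# Repeat lengths quoted in the text (bp).
repeats = {
    "beta-globin fragment reconstituted with H5": 180,
    "beta-globin gene in chick oviduct": 196,
    "chicken erythrocyte chromatin": 208,
}

linkers = {name: linker_length_bp(bp) for name, bp in repeats.items()}
for name, bp in linkers.items():
    print(f"{name}: {bp} bp linker")
```

On this convention the H5-containing reconstitute has the shortest linker of the three, which is the discrepancy noted above.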

B. Linker Histones and the Chromatin Fiber at Low Ionic Strength

1. STRUCTURE OF THE LOW-IONIC-STRENGTH FIBER

Valuable information concerning the role played by linker histones in determining chromatin fiber structure can be obtained from studies at ionic strengths below 10 mM. Under these conditions, the fiber is expanded, and it becomes possible to resolve individual nucleosomes by microscopic techniques and to study their disposition along the fiber. The pioneering work of Thoma et al. (6) gave to many the impression that the fiber at low ionic strength was a flattened zig-zag. However, it was pointed out (6) that this appearance could well be the consequence of flattening of some kind of extended helix on the grid surface. Indeed, the results of theoretical analysis and solution studies (11, 13, 15, 140-148) are more compatible with a three-dimensional, helixlike structure. The theoretical argument is straightforward: if the linker DNA is straight under low salt concentration conditions, and the entry/exit angle of DNA into and out of the nucleosome is fixed by the linker histones (see Section II,A,1), the conformation of the fiber will be dictated by these factors, plus the length of the linker. The length plays a crucial role under these circumstances, for it dictates the rotation of each nucleosome with respect to the preceding one (see Fig. 2A). If all linkers are of the same length, some kind of regular helix will necessarily be generated (e.g., 15, 149). However, we know that, in native chromatin, linker lengths are heterogeneous, probably even in a local sense (e.g., 150, 151; for further references see Ref. 1). Because of this, the structure that is generated will be an irregular, helixlike fiber (13-15). In modeling such fibers, we have assumed that the DNA wraps 1.75 turns about the octamer; this corresponds to an exit/entry angle of 90° (Fig. 2A). The structures predicted on these assumptions strongly resemble native conformations, as revealed by the relatively nondisturbing technique of scanning-force microscopy (see Fig. 2E). The fibers observed at low salt concentration by scanning-force microscopy are clearly not planar zig-zags, but are irregular, helixlike structures consistent with the results of solution studies. The diameters of the fibers average about 30 nm.
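
The dependence of nucleosome rotation on linker length can be sketched numerically. The Python fragment below is a minimal illustration, not the model used for Fig. 2: it assumes a straight linker whose twist follows a B-DNA helical repeat of 10.5 bp per turn (a value not given in the text), so that each base pair of linker rotates the next nucleosome by about 34°. The function name and the sample linker lengths are hypothetical.

```python
# Minimal sketch, assuming straight linker DNA with a helical repeat of
# 10.5 bp per turn (an assumed value, not taken from the text).
HELICAL_REPEAT_BP = 10.5

def rotation_deg(linker_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Rotation of a nucleosome about the linker axis relative to the
    preceding nucleosome, in degrees within [0, 360)."""
    return (linker_bp / helical_repeat * 360.0) % 360.0

# Uniform linkers: every step has the same rotation, so a regular helix results.
uniform = [rotation_deg(bp) for bp in (62, 62, 62, 62)]

# Heterogeneous linkers (hypothetical values): the rotation changes from
# step to step, so the fiber can only be an irregular, helixlike structure.
mixed = [rotation_deg(bp) for bp in (55, 62, 66, 71)]

print("uniform:", [round(a, 1) for a in uniform])
print("mixed:  ", [round(a, 1) for a in mixed])
```

Under this assumption a change of only a few base pairs in linker length swings the next nucleosome through a large angle, which is why local heterogeneity in linker length is enough to destroy helical regularity.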

2. STRUCTURAL EFFECTS OF LINKER HISTONE REMOVAL

The ability to observe structural details of chromatin fibers at low ionic strength allows critical tests of the effect of removal of linker histones. Figure 13 shows SFM images of chicken erythrocyte chromatin depleted of histones H1 and H5, again compared to computer-simulated images. In this case, to obtain simulations comparable to the observed structures, it was necessary to relax the restriction on the entry/exit angle. In fact, the number of turns of DNA about the histone core was allowed to vary from 1.0 to 2.0, each limit corresponding to an entry/exit angle of 180°. As can be seen in Fig. 13, the result of removal of linker histones is, as predicted, the production of an extended, “beads-on-a-string” structure. The helical zig-zag structure found in the presence of linker histones has been lost. Furthermore, the distribution of nucleosome center-to-center distances (Fig. 13D and E) now shows a broad tail toward longer lengths, which can only result if DNA has been “peeled” from the nucleosome core. The maximum center-to-center distance observed (about 50 nm) corresponds to addition of as much as 80 bp to the linker between two nucleosomes, equivalent to one whole turn of DNA, distributed between contributions from two adjacent nucleosome cores. That such peeling off may be facile follows from topological analysis (152). It turns out that the observed DNA winding pattern around the histone octamer, with 1.75 turns, is singled out from all other geometrically feasible winding patterns: it allows partial unwrapping of the DNA from the octamer without compensatory writhing or twisting of the unwrapped DNA. In other words, with 1.75 turns of DNA in the core, unwrapping of up to one turn may proceed with topological impunity.
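
The numbers in this section can be checked with a short calculation. The Python sketch below makes two idealizations that are assumptions rather than statements from the text: the DNA path around the octamer is treated as planar, so that the entry/exit angle depends only on the fractional number of turns, and released DNA is taken to be straight B-form with a rise of 0.34 nm per bp. The 22-nm native mean distance is the value quoted in the legend to Fig. 13; the function names are illustrative.

```python
import math

RISE_NM_PER_BP = 0.34   # assumed B-DNA rise per base pair
CORE_DNA_BP = 146       # core-particle DNA length quoted in the text
NATIVE_MEAN_NM = 22.0   # native center-to-center mean (Fig. 13 legend)

def entry_exit_angle_deg(turns):
    """Angle between entering and exiting DNA arms for a planar wrap.

    Whole turns leave the DNA direction unchanged, so the arms point in
    opposite directions (180 degrees); 1.5 turns would fold them back
    parallel to each other (0 degrees).
    """
    c = -math.cos(2.0 * math.pi * turns)
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(c))

def extra_length_nm(peeled_bp):
    """Contour length gained when peeled_bp of DNA leave the core."""
    return peeled_bp * RISE_NM_PER_BP

for turns in (1.0, 1.75, 2.0):
    print(f"{turns} turns -> {entry_exit_angle_deg(turns):.0f} degrees")

bp_per_turn = CORE_DNA_BP / 1.75
longest = NATIVE_MEAN_NM + extra_length_nm(80)
print(f"DNA per superhelical turn: {bp_per_turn:.0f} bp")
print(f"22 nm + 80 bp of released DNA: {longest:.1f} nm")
```

With these assumptions 1.75 turns gives 90° and both 1.0 and 2.0 turns give 180°, as stated above; one superhelical turn holds a little over 80 bp, and adding 80 bp of straight DNA to a 22-nm spacing gives roughly the 50-nm maximum observed.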


Some earlier biochemical and biophysical data support the idea that in the absence of H1 the DNA at the ends of the core particle is less tightly constrained by the histone octamer than in the presence of the histone. Removal of H1 from intact nuclei increases the susceptibility to micrococcal nuclease at the ends of the particle (153). A large increase in the negative linear dichroism of dinucleosomes on H1 removal has been observed and interpreted as resulting from unwinding of the DNA tails (154, 155). A similar interpretation was offered to explain circular dichroism and thermal denaturation data (156). Thus, it is clear that linker histones play a major role in determining chromatin fiber structure at low ionic strength.

3. IMPLICATIONS FOR CHROMATIN TRANSCRIPTION

In vivo, the transition into extended structures is expected to occur at sites of active transcription, to allow access by regulatory factors and enzymes to the underlying DNA template. This transition may involve, among other things, partial or complete removal of histone H1, or some weakening of its interaction with linker DNA (reviewed in Ref. 19). Such changes will have a double structural effect: allowing the relaxation of the condensed higher order structure and destabilizing the nucleosome itself, allowing considerable unwrapping of the DNA from around the histone octamer. These structural transitions should facilitate the passage of transcribing polymerase molecules.

Another important but less well-studied aspect of the involvement of linker histones in regulation of transcription is the possible participation of these histones in nucleosome positioning. The occupancy by a nucleosome of specific DNA sequences containing cis-regulatory elements and elements involved in transcription initiation may have profound effects on transcription (157, 158). The positioning of nucleosomes is governed by multiple factors, specific DNA sequences, DNA curvature, and boundary effects being among the most important of them (159, 160). In a study of nucleosome positioning choices, linker histones did not override the positioning signals in the underlying DNA template but did change the relative distribution of nucleosome positions among possible 10-bp-spaced alternatives (161). Recent studies from the same group (162) demonstrated that linker histone binding suppresses the general mobility of nucleosomes over 10-bp DNA intervals, which is observed under conditions of relatively low ionic strength. Immobilization of nucleosome cores over specific sequences may be an important mechanism by which H1 could affect transcription outside the context of chromatin condensation. An important study of chromatin organization over the oocyte- and somatic-type 5-S RNA genes in vivo (163) suggests that H1 is instrumental in promoting formation of nucleosomes over the repressed oocyte-type genes. Conversely, removal of H1 promoted disruption of this specific nucleosomal arrangement, concomitant with promotion of transcription. That linker histones can order nucleosomes in a sequence-specific manner has been demonstrated in model studies of nucleosome positioning on plasmid pBR327 (164). The H5-mediated formation of positioned arrays of nucleosomes depends on the presence of a specific sequence in the plasmid; however, H5 did not bind to this signal sequence. Later, these studies were extended to include eukaryotic gene regions: H1-induced spreading of nucleosome alignment depended on specific DNA positioning signals in the cases of chicken ovalbumin gene (165) and chicken β-globin gene (138). The exact mechanism for the observed effect of linker histones on creation of positioned arrays of nucleosomes remains to be determined.

FIG. 13. Model of linker histone-depleted chromatin fiber at low salt concentration, its simulated SFM image, and an actual SFM image. (A) A computer-generated model of a linker histone-depleted chromatin fiber. The simulation is the same as in Fig. 2, except that the number of turns of DNA around the octamer is allowed to vary randomly between 1 and 2, and the length of the linker between 51 and 73 bp. (B) Simulated SFM image of the fiber in A, assumed to have been scanned and partially flattened by a parabolic tip with a radius of curvature of 10 nm. (C) Experimental SFM image of an H1/H5-depleted chicken erythrocyte chromatin fiber after fixation with glutaraldehyde and imaging on mica. Images are 400 nm x 400 nm in size. (D and E) Distribution of center-to-center distances of adjacent nucleosomes along the DNA path for fibers of the kind shown in B and C, respectively. About 700 measurements were made for each histogram. The simulated fibers have a slightly shorter mean internucleosome distance than do the actual ones, probably as a result of the simple projection method used to simulate the deposition of the fiber to the surface of the mica. The distributions show that linker histone removal leads to a release of the DNA from the histone cores and the formation of longer length linkers (the mean value for native chicken erythrocyte chromatin is about 22 nm). Reprinted with permission from Nature (Ref. 14). Copyright 1994 Macmillan Magazines Limited.

C. Linker Histones and the Condensed Chromatin Fiber

As briefly discussed in the Introduction, the mechanisms of chromatin condensation into the higher order structure and the details of that structure remain largely unknown despite numerous physical and microscopic studies. We believe that there is no substantial evidence for a regular helical structure of any kind, at least in significant regions of the chromatin fiber (detailed argumentation is given elsewhere) (10). A similar view is shared by C. Woodcock and R. Horowitz (personal communication) based on a careful analysis of published electron-microscope observations, and is supported by the most recent microscopic techniques (17). Because it seems that a major role of linker histones is in establishing and stabilizing this structure, it is important to examine critically the evidence concerning its conformation.


1. MAIN STRUCTURAL CHARACTERISTICS OF THE CONDENSED FIBER

What are the “firm facts” and what are the main controversies surrounding the issue of the higher order structure of chromatin?

a. Fiber Diameter. The average diameter of the fiber seems to be around 30 nm, although the reported values vary from around 25 nm to around 45 nm, depending on the source of chromatin, the method for its isolation and preparation of the sample, and the method of investigation. This average value has, in fact, given the condensed fiber the name “30-nm” fiber. Our recent SFM measurements have indicated that even the extended low-ionic-strength fibers have diameters of about 30-35 nm (13); similar values were estimated using physical methods (140, 141, 143, 144, 147). To avoid further confusion, we have recently proposed that the term 30-nm fiber be no longer used to denominate the higher order structure formed in high salt concentration; this can be termed instead the condensed or compacted fiber (10). A major unresolved issue is whether or not the diameter of the fiber depends on the linker length: experimental results supporting both possibilities have been reported in the past. Recent measurements of fiber diameters in cryosections of different types of nuclei seem to shed some light on this issue: the values determined in situ were very similar, in striking contrast to those measured on chromatin isolated from these cell types, where a strong positive correlation between diameter and repeat length had been established (166). Thus, it seems possible that the reported dependence of the diameter on linker length may be just an artifact encountered in isolated chromatin fibers.

b. Mass per Unit Length. The density of the fiber is usually expressed as number of nucleosomes per defined fiber length. The reported values show a transition from one nucleosome per 10 nm at zero salt concentration to six to eight nucleosomes at 70-100 mM or at millimolar concentrations of Mg2+. Higher values of about twelve nucleosomes per 10 nm have also been reported, but it has been suggested that at least some of them may originate from aggregated fibers (1).

c. Orientation of the Nucleosomes. Several kinds of evidence (X-ray diffraction from oriented fibers, linear dichroism) have been cited to suggest that the flat faces of the nucleosomes are approximately parallel to the fiber axis, but may exhibit considerable variability. However, the evidence is difficult to evaluate, for several reasons. Insofar as X-ray diffraction is concerned, patterns from semioriented fibers do show some orientation of the 11-nm reflection (corresponding to nucleosome diameter) parallel to the fiber axis (e.g., 167). But in the absence of accompanying measurement of fiber orientation, it is not possible to evaluate these data quantitatively. Linear dichroism measurements (flow dichroism and electric dichroism) suffer from other complications. A major source of difficulty arises from the fact that the orientation of the linker DNA (which comprises about 25% of the whole) is unknown. In one instance an attempt has been made to correct for this, using photochemical dichroism to determine linker orientation (168); however, the results remain ambiguous. A more fundamental problem with dichroism studies for such systems can be seen by careful examination of the data. The maximum dichroism values observed are invariably low, with both positive and negative values being reported, but most results clustering around ΔA/A = -0.1. This is a very small value when compared, for example, to the maximum electric dichroism of naked DNA, which approximates -1.3. This has been interpreted as indicating that the average angle between the DNA base planes and the fiber axis is around 60°, close to the magic angle of 54.7° at which the dichroism passes through zero and changes sign. This would correspond to a tilt of about 30° of the long axis of the nucleosome away from the fiber axis, a result consistent with a number of models. However, to interpret the data in this way assumes that all of the nucleosomes make approximately the same angle with respect to the fiber axis. There is, in fact, an alternative explanation for a very low value for the dichroism: that the nucleosomes are nearly randomly oriented with respect to the fiber axis, so that even complete alignment of the fibers still results in nearly random alignment of DNA chromophores.
This possibility does not seem to have been considered; this means that we must be very cautious in using linear dichroism data as evidence for particular fiber models. In our opinion, there is no strong evidence to support any specific helical model for the condensed fiber; indeed, a largely irregular structure seems more consistent with the existing data (10).
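
The ambiguity described above can be made concrete with the orientation factor that scales linear dichroism in a simple uniaxial model. The following Python sketch is an illustration under that assumed model, not a reanalysis of any data: it evaluates (3cos²θ − 1)/2 for a chromophore axis at angle θ to the fiber axis and compares single fixed angles with a random (isotropic) distribution of angles. All names are hypothetical.

```python
import math

def orientation_factor(theta_deg):
    """Second-order orientation factor (3*cos^2(theta) - 1) / 2."""
    c = math.cos(math.radians(theta_deg))
    return 0.5 * (3.0 * c * c - 1.0)

def isotropic_average(n=20000):
    """Average of the orientation factor over randomly oriented axes,
    by midpoint integration with the sin(theta) solid-angle weight."""
    total = 0.0
    weight = 0.0
    step = 180.0 / n
    for i in range(n):
        theta = (i + 0.5) * step
        w = math.sin(math.radians(theta))
        total += orientation_factor(theta) * w
        weight += w
    return total / weight

# Angle at which the factor vanishes: cos^2(theta) = 1/3.
magic_angle = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

print(f"magic angle: {magic_angle:.1f} degrees")
print(f"factor at magic angle: {orientation_factor(magic_angle):.3f}")
print(f"factor at 60 degrees:  {orientation_factor(60.0):.3f}")
print(f"factor at 90 degrees:  {orientation_factor(90.0):.3f}")
print(f"isotropic average:     {isotropic_average():.3f}")
```

In this model a uniform tilt near the magic angle and a random distribution of tilts both drive the factor toward zero, which is why a small dichroism value alone cannot discriminate between them.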

2. LOCATION AND ORGANIZATION OF THE LINKER DNA

This aspect of the fiber structure has been a matter of considerable controversy. It is still not clear what the path of the linker DNA is. In principle, without invoking any specific helical regularity in the structure, there are three major ways in which the linker DNA might be organized: (1) coiled in some way inside a condensed structure with peripherally situated nucleosomes, (2) continuing the superhelical path of the DNA of the core particle, and (3) remaining straight and rigid, as in the low-ionic-strength extended fibers.


Reliable data on the location of the linker DNA remain scarce. The contribution of the linker DNA to the low-resolution X-ray scattering pattern is negligible; therefore, these patterns reflect mainly the intrinsic features of the core particles and their mutual arrangements in the fiber. Some information about the location and structure of the linker DNA has been acquired by biochemical methods (1). The results, based on analysis of the kinetics and the products of digestion with nucleases (e.g., 169, 170), were interpreted as showing that the linker DNA is organized in a manner very similar to the organization of the core DNA and thus follows, together with the latter, a continuously supercoiled path. Such a model has been proposed (171), mainly on the basis of electric dichroism data. Digestion of chromatin fibers with micrococcal nuclease under extremely mild conditions, using either membrane-immobilized or free enzyme, has permitted subtle effects of fiber structure on the digestion parameters to be revealed (12, 172). Although the linker DNA is readily accessible to nucleolytic attack in the extended low-salt conformations, it is almost completely protected against digestion in the condensed high-salt conformation; fibers of intermediate degrees of compaction are digested to intermediate degrees. These results suggest that access to the linkers is most probably limited by high compaction rather than by internalization to the center of the fibers. This interpretation is supported by the observation that cleavage of the linker DNA by a small molecule, methidium-propyl-EDTA-Fe(II), proceeded at similar rates for all types of conformations (12). Although recent studies of chromatin fiber structure at low ionic strength (13-15) reveal linker conformation in these circumstances, they say nothing about linker organization in the condensed fiber.
The linkers that appear to be straight in the extended fibers (13-15) may bend or coil on condensation or may remain straight. Investigators tend to think that condensation is brought about by bending of the linker DNA, which is known to become more flexible as the salt concentration is raised (173), thus allowing interactions among neighboring nucleosomes to take place. Evidence that linker DNA may bend or fold in isolated dinucleosomes as conditions for chromatin condensation are approached has been presented by Yao et al. (174, 175). However, examination of the sedimentation data of Butler and Thomas (176) leads to the opposite conclusion. No change in s20,w for dinucleosomes was observed in the salt range from 5 to 140 mM, although a 25% increase would be expected if the nucleosomes approached so close as to contact one another. Whatever the in vitro studies may show, whether such bending can occur in vivo in intact chromatin remains to be seen. It is also far from clear whether nucleosomes, at least those that are successive in the linear array of nucleosomes, actually interact in vivo. The early electron-microscope observations (177), showing the formation of arcs and helices from closely face-to-face packed isolated core particles, suggested that such interactions may be of significance in the process of condensation. Consequently, interaction between adjacent nucleosomes has been made an intrinsic feature of most models of higher order structure. In fact, we are not aware of any data pointing in this direction. Recent electron-microscope imaging of chromatin in sections of low-temperature-embedded starfish sperm nuclei (15, 16) seems to suggest that successive nucleosomes may actually not be in contact with each other, because the linker DNA between them is extended even under physiological conditions. If nucleosome-nucleosome interactions occur in such a structure, they must involve nucleosomes that are not adjacent on the linear fiber. An independent approach to the question of the linker DNA path in nuclei has been developed (178) using photo-induced thymine dimer formation as a structural probe. This method depends on the fact that the rate of thymine dimer formation is affected by the direction and degree of DNA bending. To obtain information about the structure of linkers, DNA from dinucleosomes isolated from irradiated nuclei was examined for distribution of thymine dimers. The results were interpreted to mean that the linker contained very little bending, at least for the subset of dinucleosomes studied. That the linker DNA may not continue the superhelix of the core particle was also inferred from photochemical dichroism studies (179). In summary, the disposition of the linker DNA within the condensed chromatin fiber remains a major puzzle that impedes our effort to understand the structure of the chromatin fiber.

JORDANKA ZLATANOVA AND KENSAL VAN HOLDE

3. LOCATION OF THE LINKER HISTONES IN THE CONDENSED FIBER

The location of the linker histones in the condensed fiber is also unresolved. Most studies have used immunochemical approaches, with conflicting results and interpretations (180). The general approach has been to determine whether and to what extent the accessibility of the histones, or certain domains thereof, to antibodies changes on salt-induced compaction of soluble fibers. Some authors interpret their data as indicating no change in the location of the linker histones on condensation of the fiber, whereas others assert that the histones are internalized. A paper that is often cited as supporting the internalization view is that of Dimitrov et al. (181), which takes an ingenious approach. Antibodies against the globular domain of histone H5 were attached to bulky ferritin molecules (the size of a compact dinucleosome) so as to create a probe too large to penetrate into a condensed fiber. The loss of immunological reactivity of the fiber on increasing the salt concentration was seen as an indication of internal location of the globular domain in the compacted structure. However, it should be noted that the immunological reaction, being low even in the fully extended fiber, was already lost at intermediate ionic strength, long before the fiber could attain its condensed state. Moreover, the same gradual decrease in the intensity of the reaction was evident with the free nonconjugated antibody. Thus, it may be that these results reflect the general steric hindrance to bulky probes that develops on fiber compaction, similar to the steric hindrance observed in the case of micrococcal nuclease (12, 172). Recently, immobilized proteases such as trypsin and chymotrypsin have been used as an alternative approach to this problem (182). The data obtained concerned the location of the N- and C-terminal portions of the linker histones. The tails of histone H1 remained accessible on fiber condensation; therefore, it was concluded that they did not change their location. On the other hand, the tails of H5 became significantly inaccessible in the condensed fiber. Why the two linker histones should differ in this way is not clear; perhaps the difference in their location in the fiber may be among the mechanisms by which they exert their differential effect on functions such as transcription and replication. The use of chymotrypsin as a probe to the globular domain turned out to be inappropriate, because phenylalanine, the site of preferential cleavage, was hidden even in the mononucleosomal particle. Alternative proteases, selectively cleaving peptide bonds in the globular domain, should be applied to further elucidate this issue. A potentially very powerful approach to locating the linker histones is neutron scattering, using deuterated protein. Recent studies (183) give an unambiguous answer for the average distance of H1 molecules from the fiber axis: 6-6.5 nm (Fig. 14).
This is smaller than the average distance of nucleosome centers from the axis (11.5 nm) and implies that the linker histones lie preferentially inside. However, such an average value does not mean that all linker histones are precisely arranged at this radius, as implied by the figure; some may be closer to the periphery, and some closer to the center, as might be expected in a less regular fiber. The result does show unequivocally that the linker histones are confined neither to the periphery nor to the very center of the fiber.

FIG. 14. A schematic drawing of a cross-sectional view of a condensed chromatin fiber based on results from neutron scattering analysis using deuterated H1 (hatched circles). The nucleosome is represented by a box of dimensions 110 × 57 Å. The inner face of the nucleosome falls at approximately the same radial location as the H1 center of mass, which may be interpreted as indicating that H1 binds to the face of the nucleosome that is presented to the interior of the filament (for a discussion of this view, see text). Reprinted with permission from Nature (Ref. 183). Copyright 1994 Macmillan Magazines Limited.

It has been known for years that application of bifunctional cross-linking agents to chromatin and nuclei results in the formation of H1 or H5 homopolymers. This observation has often been interpreted as indicating that the linker histones are located in the center of the compacted fiber. In fact, all it shows is that linker histone molecules are situated closely enough in the chromatin fiber to allow cross-linking to occur. This is clearly demonstrated by data (41) showing that similar cross-linked products are formed both at low ionic strength, when the fiber is in an extended conformation, and at high ionic strength, when the fiber is highly compacted. In either case, very few cross-linked products larger than hexamers are found. Similar experiments (184) revealed discrete H1 homopolymers, in this case integer multiples of 12 H1 molecules. Again, the structure that determines this cross-linking pattern exists not only in nuclei, but in the extended nucleosomal chain. Thus, the organization of the nucleosomal chain may include structural features sufficient to determine the higher order structure of chromatin. The cross-linking data suggest that the main contacts between H1 molecules in the condensed fiber may be between molecules sitting on adjacent linkers rather than across successive turns of the fiber.

Finally, it is important to realize that the actual location of the molecules of the linker histones with respect to the nucleosome may differ in the extended and the condensed fibers. Using covalent protein/DNA cross-linking procedures, Mirzabekov and co-workers (170) have shown that H1 protects both ends of the chromatosomal DNA only in isolated nucleosomes and in unfolded chromatin. Within nuclei, however, the histone interacts with the linker DNA at just one side of the nucleosome, so that its globular domain interacts with the central portion of the linker, with additional contacts on the linker DNA of neighboring nucleosomes, or on more distant positions in the higher order structure. On decondensation of chromatin, H1 is redistributed in such a manner that its globular domain becomes bound to the linkers on both sides of the same nucleosome, and this leads to the appearance of a chromatosome. In accordance with this view, the interaction of H5 with DNA in nuclear chromatin of chicken erythrocytes differs from that in isolated mononucleosomal particles and in unfolded chromatin (115). Interestingly, a similar idea was forwarded earlier by Krueger (185), who suggested that H1 binds to the linker DNA only at low ionic strength, whereas at physiological conditions it interacts with other sites determined by packaging of chromatin into higher order structures. If such a “switch” in linker histone interactions occurs, it would play a major role in condensation. If, for example, one or more of the contacts to the entry/exit DNA were lost, a relaxation of constraints on the entry/exit angle would occur, which might aid in the folding of the extended fiber into the condensed form.

Can chromatin fibers condense in the absence of linker histones? There is ample evidence that, in some sense, they do. Thoma et al. (6) describe the formation of “clumps” of nucleosomes when H1-stripped chromatin is subjected to 100 mM salt. Schwarz and Hansen (186) have shown that the sedimentation coefficient of reconstituted dodecameric oligonucleosomes (minus linker histones) increases markedly on increase of salt concentration. The distribution of sedimentation coefficients exhibits a limit at about 55 S, before aggregation begins (although more slowly sedimenting molecular species are also present). It is suggestive that 55 S is also the approximate value expected for a two-turn helical coil. However, it is very difficult to distinguish such a structure from a globular aggregate of twelve nucleosomes, which would have a similar sedimentation coefficient.
Proof of helix formation would require using much longer oligonucleosome chains. It is also possible that formation of a regular helical structure will be much easier if nucleosomes are regularly spaced, as they are in such constructs. At the moment, we can state that the high salt condensation of chromatin does not require H1, but there is much evidence that the formation of physiologically relevant condensed fiber does need these histones (5, 6).

III. What Do We Know? What Do We Need to Learn?

Our knowledge of the linker histones is remarkably complete in many ways, and yet surprisingly limited in others. We now know a great deal about the proteins, including structures of at least their globular regions. We are beginning to understand quite a bit about how they interact with DNA in its various topological forms, and have some insight into the location of linker histone in the chromatosome. Although we know much more than we did a decade ago, two key questions still await resolution: (1) precisely where are the linker histones located (with respect to nucleosomes) in the condensed fiber, and (2) just what role do they play in that condensation? Both are difficult questions to approach, and both would be easier if we knew more about the structure of the condensed fiber. It seems likely that some of this structural information will be forthcoming from studies using new microscopic techniques, such as X-ray contact microscopy and scanning-force microscopy, both of which can be carried out in aqueous solution. These may also allow us to follow the condensation of the chromatin fiber in a “dynamic” way, as the ionic environment is continuously changed. Nevertheless, this kind of study will not in itself answer the above questions. A useful complementary approach will be to make use of reconstituted polynucleosome fibers, formed on tandem repeats of nucleosome “positioning” sequences. Here we may hope to obtain the regularity of structure that will allow high-resolution scattering studies, and the flexibility to adjust the nature and amount of the linker histones added. A study of such fibers using neutron scattering, especially if deuterated linker histones were employed, might yield important information. Cross-linking studies, using the Mirzabekov technique (170), might provide much more definitive results with such model systems than with heterogeneous native chromatin. It must always be kept in mind, however, that such artificial constructs can be misleading because of special properties of regular systems. Nevertheless, they seem to offer hope of clarity in a confused field.
Another aspect of linker histone location that has not been sufficiently explored has to do with the nature and significance of strong binding sites on DNA. How frequent are they, and what do they do? In this area, there is simply the need for more exploratory research. Most important of all, and clearly tied to the above questions, is the problem of how linker histones relate to transcription. We now know that linker histones are present in actively transcribed chromatin, albeit probably in reduced amounts. What features determine the content of linker histones in specific chromatin regions? How can this content be regulated in accordance with the needs of the cell for expression of specific genes? Do other modifications associated with active chromatin modify linker histone/DNA interactions so as to weaken higher order structure? Advances on these questions may yield a key to the more general problem of selective gene expression as the basis for differentiation and development.


ACKNOWLEDGMENTS

This work was supported in part by National Institutes of Health Grant GM50276 to K. V. H. and J. Z.

REFERENCES

1. K. van Holde, “Chromatin.” Springer-Verlag, Berlin and New York, 1988.
2. A. L. Olins and D. E. Olins, J. Cell Biol. 59, 252a (1973).
3. A. L. Olins and D. E. Olins, Science 183, 330 (1974).
4. C. L. F. Woodcock, J. Cell Biol. 59, 368a (1973).
5. J. T. Finch and A. Klug, PNAS 73, 1897 (1976).
6. F. Thoma, Th. Koller and A. Klug, J. Cell Biol. 83, 403 (1979).
7. G. De Murcia and T. Koller, Biol. Cell 40, 165 (1981).
8. R. Tsanev, G. Russev, I. Pashev and J. Zlatanova, “Replication and Transcription of Chromatin.” CRC Press, Boca Raton, FL, 1992.
9. L. A. Freeman and W. T. Garrard, Crit. Rev. Euk. Gene Expr. 2, 165 (1992).
10. K. van Holde and J. Zlatanova, JBC 270, 8373 (1995).
11. M. H. J. Koch, in “Protein-Nucleic Acid Interactions” (W. Saenger and V. Heineman, eds.), p. 163. MacMillan, London, 1989.
12. J. Zlatanova, S. H. Leuba, G. Yang, C. Bustamante and K. van Holde, PNAS 91, 5277 (1994).
13. S. H. Leuba, G. Yang, C. Robert, B. Samori, K. van Holde, J. Zlatanova and C. Bustamante, PNAS 91, 11621 (1994).
14. G. Yang, S. H. Leuba, C. Bustamante, J. Zlatanova and K. van Holde, Nature Struct. Biol. 1, 761 (1994).
15. C. L. Woodcock, S. A. Grigoryev, R. A. Horowitz and N. Whitaker, PNAS 90, 9021 (1993).
16. R. A. Horowitz, D. A. Agard, J. W. Sedat and C. L. Woodcock, J. Cell Biol. 125, 1 (1994).
17. Y. Kinjo, K. Shinohara, A. Ito, H. Nakano, M. Watanabe, Y. Horiike, Y. Kikuchi, M. C. Richardson and K. A. Tanaka, J. Microsc. 176, 63 (1994).
18. J. Zlatanova, Trends Biochem. Sci. 15, 273 (1990).
19. J. Zlatanova and K. van Holde, J. Cell Sci. 103, 889 (1992).
20. R. D. Cole, Int. J. Peptide Prot. Res. 39, 433 (1987).
21. J. M. Neelin, P. X. Callahan, D. C. Lamb and K. Murray, Can. J. Biochem. 42, 1743 (1964).
22. B. L. A. Miki and J. M. Neelin, Can. J. Biochem. 53, 1158 (1975).
23. S. M. Seyedin and W. S. Kistler, JBC 255, 5949 (1980).
24. K. M. Newrock, C. R. Alfageme, R. V. Nardi and L. H. Cohen, CSHSQB 42, 421 (1978).
25. R. C. Smith, E. Dworkin-Rastl and M. B. Dworkin, Genes Dev. 2, 1284 (1988).
26. S. Dimitrov, G. Almouzni, M. Dasso and A. P. Wolffe, Dev. Biol. 160, 214 (1993).
27. J. Zlatanova and D. Doenecke, FASEB J. 8, 1260 (1994).
28. P. G. Hartman, G. E. Chapman, T. Moss and E. M. Bradbury, EJB 77, 45 (1977).
29. G. M. Clore, A. M. Gronenborn, M. Nilges, D. K. Sukumaran and J. Zarbock, EMBO J. 6, 1833 (1987).
30. C. Cerf, G. Lippens, S. Muyldermans, A. Segers, V. Ramakrishnan, S. J. Wodak, K. Hallenga and L. Wyns, Bchem 32, 11345 (1993).
31. V. Ramakrishnan, J. T. Finch, V. Graziano, P. L. Lee and R. M. Sweet, Nature 362, 219 (1993).

32. K. L. Clark, E. D. Halay, E. Lai and S. K. Burley, Nature 364, 412 (1993).
33. M. Wu, C. D. Allis, R. Richman, R. G. Cook and M. A. Gorovsky, PNAS 83, 8674 (1986).
34. L. J. Hauser, M. L. Treat and D. E. Olins, NARes 21, 3586 (1993).
35. F. Azorin, C. Olivarez, A. Jordan, L. Perez-Grau, L. Cornudella and J. A. Subirana, Exp. Cell Res. 148, 331 (1983).
36. B. O. Glotov, L. G. Nikolaev, V. K. Dashkevich and S. F. Barbashov, BBA 824, 185 (1985).
37. M. J. Smerdon and I. Isenberg, Bchem 15, 4233 (1976).
38. E. Russo, V. Giancotti, C. Crane-Robinson and G. Geraci, Int. J. Bchem. 15, 487 (1983).
39. A. F. Protas, S. N. Kharpunov and G. D. Berdishev, Ukr. Biokhim. (Russian) 58, 22 (1986).
40. J. D. Maman, T. D. Yager and J. Allan, Bchem 33, 1300 (1994).
41. J. O. Thomas and A. J. A. Khabaza, EJB 112, 501 (1980).
42. P. H. Draves, P. T. Lowary and J. Widom, JMB 225, 1105 (1992).
43. M. A. Tarnowka, C. Baglioni and C. Basilico, Cell 15, 163 (1978).
44. S. Dalton, J. R. Coleman and J. R. E. Wells, MCBiol. 6, 601 (1986).
45. D. Doenecke, W. Albig, H. Bouterfa and B. Drabent, J. Cell. Biochem. 54, 423 (1994).
46. E. M. Bradbury, R. J. Inglis and H. R. Matthews, Nature 247, 257 (1974).
47. S. Y. Roth and C. D. Allis, Trends Biochem. Sci. 17, 93 (1992).
48. J. Zlatanova and J. Yaneva, DNA Cell Biol. 10, 239 (1991).
49. D. Ring and R. D. Cole, JBC 258, 15361 (1983).
50. D. J. Clark and J. O. Thomas, JMB 187, 569 (1986).
51. T. Diez-Caballero, F. X. Aviles and A. Albert, NARes 9, 1383 (1981).
52. D. J. Clark and J. O. Thomas, EJB 178, 225 (1988).
53. D. E. Olins and A. L. Olins, JMB 57, 437 (1971).
54. R. D. Cole, G. M. Lawson and M. W. Hsiang, CSHSQB 42, 253 (1977).
55. A. T. Rodriguez, L. Perez, F. Moran, F. Montero and P. Suau, Biophys. Chem. 39, 145 (1991).
56. G. S. Manning, Q. Rev. Biophys. 11, 178 (1978).
57. R. W. Wilson and V. A. Bloomfield, Bchem 18, 2192 (1979).
58. V. A. Bloomfield, Biopolymers 31, 1471 (1991).
59. A. Belmont and C. Nicolini, J. Theor. Biol. 90, 169 (1981).
60. J. A. Subirana, FEBS Lett. 302, 105 (1992).
61. F. Watanabe, BBRC 172, 1129 (1990).
62. D. J. Clark and T. Kimura, JMB 211, 883 (1990).
63. G. S. Manning, Biopolymers 18, 2929 (1979).
64. M. Watanabe, NARes 14, 3573 (1986).
65. J. O. Thomas, C. Rees and J. T. Finch, NARes 20, 187 (1992).
67. T. Vogel and M. F. Singer, PNAS 72, 2597 (1975).
68. T. Vogel and M. F. Singer, JBC 251, 2334 (1976).
69. C. Iovcheva and G. N. Dessev, Mol. Biol. Rep. 6, 21 (1980).
70. J. Yaneva, J. Zlatanova, E. Paneva, L. Srebreva and R. Tsanev, FEBS Lett. 263, 225 (1990).
71. D. Krylov, S. Leuba, K. van Holde and J. Zlatanova, PNAS 90, 5052 (1993).
72. D. M. J. Lilley, Nature 357, 282 (1992).
73. D. R. Duckett, A. I. H. Murchie, A. Bhattacharyya, R. M. Clegg, S. Diekmann, E. von Kitzing and D. M. J. Lilley, EJB 207, 285 (1992).
74. M. E. Bianchi, M. Beltrame and G. Paonessa, Science 243, 1056 (1989).
75. P. Varga-Weisz, K. van Holde and J. Zlatanova, JBC 268, 20699 (1993).
76. P. Varga-Weisz, J. Zlatanova, S. H. Leuba, G. P. Schroth and K. van Holde, PNAS 91, 3525 (1994).
77. P. Varga-Weisz, K. van Holde and J. Zlatanova, BBRC 203, 1904 (1994).
78. J. Zlatanova and J. Yaneva, Mol. Biol. Rep. 15, 53 (1991).

79. M. Suzuki, EMBO J. 8, 797 (1989).
80. M. E. A. Churchill and M. Suzuki, EMBO J. 8, 4189 (1989).
81. L. N. Marekov, G. Angelov and B. Beltchev, Biochimie 60, 1347 (1978).
82. M. Leng and G. Felsenfeld, PNAS 56, 1325 (1966).
83. M. Renz, PNAS 72, 733 (1975).
84. S. L. Berent and J. S. Sevall, Bchem 23, 2977 (1984).
85. J. S. Sevall, Bchem 27, 5038 (1988).
86. U. Pauli, J. F. Chiu, P. Ditullio, P. Kroeger, V. Shalhoub, T. Rowe, G. Stein and J. Stein, J. Cell. Physiol. 139, 320 (1989).
87. J. Yaneva and J. Zlatanova, DNA Cell Biol. 11, 91 (1992).
87a. J. Yaneva, G. P. Schroth, K. E. van Holde and J. Zlatanova, PNAS 92, 7060 (1995).
88. A. P. Wolffe, J. Cell Sci. 107, 2055 (1994).
89. A. Jerzmanowski and R. D. Cole, JBC 265, 10726 (1990).
90. M. J. Smerdon and I. Isenberg, Bchem 15, 4242 (1976).
91. S. S. Yu and T. G. Spring, BBA 492, 20 (1977).
92. P. D. Cary, K. V. Shooter, G. H. Goodwin, E. W. Johns, J. Y. Olayemi, P. G. Hartman and E. M. Bradbury, BJ 183, 657 (1979).
93. L. A. Kohlstaedt and R. D. Cole, Bchem 33, 570 (1994).
94. J. B. Jackson, J. M. Pollock, Jr. and R. L. Rill, Bchem 18, 3739 (1979).
95. S. C. Albright, J. M. Wiseman, R. A. Lange and W. T. Garrard, JBC 255, 3673 (1980).
96. J. Bernués and E. Querol, BBA 1008, 52 (1989).
97. K. Grade, C.-U. von Mickwitz, R. Messelwitz and R. Lindigkeit, Studia Biophys. 89, 1 (1982).
98. L. A. Kohlstaedt, E. C. Sung, A. Fujishige and R. D. Cole, JBC 262, 524 (1987).
99. M. Carballo, P. Puigdomènech and J. Palau, EMBO J. 2, 1759 (1983).
100. E. Espel, J. Bernués, E. Querol, P. Martinez, A. Barris and J. Lloberas, BBRC 117, 817 (1983).
101. J. Palvimo and P. H. Maenpa, BBA 952, 172 (1988).
102. P. J. Alfonso, M. P. Crippa, J. J. Hayes and M. Bustin, JMB 236, 189 (1994).
103. V. Graziano and V. Ramakrishnan, JMB 214, 897 (1990).
104. R. T. Simpson, Bchem 17, 5524 (1978).
105. A. V. Belyavsky, S. G. Bavykin, E. G. Goguadze and A. D. Mirzabekov, JMB 139, 519 (1980).
106. W. M. Bonner, NARes 5, 71 (1978).
107. B. O. Glotov, A. V. Itkes, L. G. Nikolaev and E. S. Severin, FEBS Lett. 91, 149 (1978).
108. D. Ring and R. D. Cole, JBC 254, 11688 (1979).
109. T. Boulikas, J. M. Wiseman and W. T. Garrard, PNAS 77, 127 (1980).
110. E. Espel, J. Bernués, J. A. Pérez-Pons and E. Querol, BBRC 132, 1031 (1985).
111. J.-L. Banères, L. Essalouh, I. Jariel-Encontre, D. Mesnier, S. Garrod and J. Parello, JMB 243, 48 (1994).
112. K. Ura, A. P. Wolffe and J. J. Hayes, JBC 269, 27171 (1994).
113. J. J. Hayes and A. P. Wolffe, PNAS 90, 6415 (1993).
114. J. Allan, P. J. Hartman, C. Crane-Robinson and F. X. Aviles, Nature 288, 675 (1980).
115. A. D. Mirzabekov, D. V. Pruss and K. K. Ebralidse, JMB 211, 479 (1989).
116. J. J. Hayes, D. Pruss and A. P. Wolffe, PNAS 91, 7817 (1994).
117. F. J. Aviles, G. E. Chapman, G. G. Kneale, C. Crane-Robinson and E. M. Bradbury, EJB 88, 363 (1978).
118. D. Z. Staynov and C. Crane-Robinson, EMBO J. 7, 3685 (1988).
119. S. Lambert, S. Muyldermans, J. Baldwin, J. Kilner, K. Ibel and L. Wijns, BBRC 179, 810 (1991).

120. S. Muyldermans and A. A. Travers, JMB 235, 855 (1994).
121. D. J. Clark, C. S. Hill, S. R. Martin and J. O. Thomas, EMBO J. 7, 69 (1988).
122. J. Allan, T. Mitchell, N. Harborne, L. Bohm and C. Crane-Robinson, JMB 187, 591 (1986).
123. G. S. Manning, K. K. Ebralidse, A. D. Mirzabekov and A. Rich, J. Biomol. Struct. Dynam. 6, 877 (1989).
124. J. K. Strauss and L. J. Maher, Science 266, 1829 (1994).
125. L. G. Nikolaev, B. O. Glotov, A. V. Itkes and E. S. Severin, FEBS Lett. 125, 20 (1981).
126. L. G. Nikolaev, B. O. Glotov, V. K. Dashkevich, S. F. Barbashov and E. S. Severin, FEBS Lett. 163, 66 (1983).
127. E. Kotthaus and W. H. Stratling, JBC 259, 13640 (1984).
128. A. C. Lennard and J. O. Thomas, EMBO J. 4, 3455 (1985).
129. M. L. Wilhelm, A. Mazen and F. X. Wilhelm, FEBS Lett. 79, 404 (1977).
130. J.-M. Sun, Z. Ali, R. Lurz and A. Ruiz-Carrillo, EMBO J. 9, 1651 (1990).
131. P. D. Greenwood, J. J. Heikkila and I. R. Brown, Neurochem. Res. 7, 525 (1982).
132. N. Hsiung and R. Kucherlapati, J. Cell Biol. 87, 227 (1980).
133. A. Worcel, S. Han and M. L. Wong, Cell 15, 969 (1978).
134. S. Bavykin, L. Srebreva, T. Banchev, R. Tsanev, J. Zlatanova and A. Mirzabekov, PNAS 90, 3918 (1993).
135. A. Stein and P. Kunzler, Nature 302, 549 (1983).
136. P. Kunzler and A. Stein, Bchem 22, 1783 (1983).
137. A. Rodriguez-Campos, A. Shimamura and A. Worcel, JMB 209, 135 (1989).
138. K. Liu, J. D. Lauderdale and A. Stein, MCBiol. 13, 7596 (1993).
139. M. Bellard, G. Dretzen, F. Bellard, P. Oudet and P. Chambon, EMBO J. 1, 223 (1982).
140. A. M. Campbell, R. I. Cotter and J. F. Pardon, NARes 5, 1571 (1978).
141. C. Marion, P. Bezot, C. Hesse-Bezot, B. Roux and J.-C. Bernengo, EJB 120, 169 (1981).
142. J. A. Subirana, S. Muñoz-Guerra, M. Radermacher and J. Frank, J. Biomol. Struct. Dynam. 1, 705 (1983).
143. C. Marion, J. Biomol. Struct. Dynam. 2, 303 (1984).
144. L. Perez-Grau, J. Bordas and M. H. J. Koch, NARes 12, 2987 (1984).
145. J. Bordas, L. Perez-Grau, M. H. J. Koch, M. C. Vega and C. Nave, Eur. Biophys. J. 13, 157 (1986).
146. J. Bordas, L. Perez-Grau, M. H. J. Koch, M. C. Vega and C. Nave, Eur. Biophys. J. 13, 175 (1986).
147. S. P. Williams, B. D. Athey, L. J. Muglia, R. S. Schappe, A. H. Gough and J. P. Langmore, Biophys. J. 49, 233 (1986).
148. S. E. Gerchman and V. Ramakrishnan, PNAS 84, 7802 (1987).
149. A. D. Gruzdev and G. P. Kishchenko, Biofizika 26, 249 (1981).
150. D. Z. Martin, R. D. Todd, D. Lang, P. N. Pei and W. T. Garrard, JBC 252, 8269 (1977).
151. F. Strauss and A. Prunell, NARes 10, 2275 (1982).
152. S. Strogatz, J. Theor. Biol. 103, 601 (1983).
153. M. J. Smerdon and M. W. Lieberman, JBC 256, 2480 (1981).
154. C. Houssier, I. Lasters, S. Muyldermans and L. Wyns, Int. J. Biol. Macromol. 3, 370 (1981).
155. C. Houssier, I. Lasters, S. Muyldermans and L. Wyns, NARes 9, 5763 (1981).
156. M. K. Cowman and G. D. Fasman, Bchem 19, 532 (1980).
157. G. Felsenfeld, Nature 355, 219 (1992).
158. D. S. Gilmour, A. R. Buchman and J. L. Workman, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 515. Raven Press, New York, 1994.

159. F. Thoma, BBA 1130, 1 (1992).
160. Q. Lu, L. L. Wallrath and S. C. R. Elgin, J. Cell. Biochem. 55, 83 (1994).
161. G. Meersseman, S. Pennings and E. M. Bradbury, JMB 220, 89 (1991).
162. S. Pennings, G. Meersseman and E. M. Bradbury, PNAS 91, 10275 (1994).
163. C. C. Chipev and A. P. Wolffe, MCBiol. 12, 45 (1992).
164. S. Jeong, J. D. Lauderdale and A. Stein, JMB 222, 1131 (1991).
165. J. D. Lauderdale and A. Stein, NARes 20, 6589 (1992).
166. C. L. Woodcock, J. Cell Biol. 125, 11 (1994).
167. J. Widom and A. Klug, Cell 43, 207 (1985).
168. D. Sen, S. Mitra and D. M. Crothers, Bchem 25, 3441 (1986).
169. V. L. Karpov, S. G. Bavykin, O. V. Preobrazhenskaya, A. V. Belyavsky and A. D. Mirzabekov, NARes 10, 4321 (1982).
170. S. G. Bavykin, S. I. Usachenko, A. O. Zalensky and A. D. Mirzabekov, JMB 212, 495 (1990).
171. J. D. McGhee, J. M. Nickol, G. Felsenfeld and D. C. Rau, Cell 33, 831 (1983).
172. S. H. Leuba, J. Zlatanova and K. van Holde, JMB 235, 871 (1994).
173. P. J. Hagerman, Annu. Rev. Biophys. Biophys. Chem. 17, 265 (1988).
174. J. Yao, P. T. Lowary and J. Widom, PNAS 87, 7603 (1990).
175. J. Yao, P. T. Lowary and J. Widom, Bchem 30, 8408 (1991).
176. P. J. G. Butler and J. O. Thomas, JMB 140, 505 (1980).
177. J. Dubochet and M. Noll, Science 202, 280 (1978).
178. J. R. Pehrson, PNAS 86, 9149 (1989).
179. S. Mitra, D. Sen and D. M. Crothers, Nature 308, 247 (1984).
180. J. S. Zlatanova, MCBchem 92, 1 (1990).
181. S. I. Dimitrov, V. R. Russanova and I. G. Pashev, EMBO J. 6, 2387 (1987).
182. S. H. Leuba, J. Zlatanova and K. van Holde, JMB 229, 917 (1993).
183. V. Graziano, S. E. Gerchman, D. K. Schneider and V. Ramakrishnan, Nature 368, 351 (1994).
184. A. V. Itkes, B. O. Glotov, L. G. Nikolaev, S. R. Preem and E. S. Severin, NARes 8, 507 (1980).
185. R. C. Krueger, Mol. Biol. Rep. 11, 189 (1986).
186. P. M. Schwarz and J. C. Hansen, JBC 269, 16284 (1994).
187. B. J. Sugarman, J. B. Dodgson and J. D. Engel, JBC 258, 9005 (1983).
188. D. Doenecke and R. Tonjes, JMB 187, 461 (1986).
189. G. Briand, D. Kmiecik, P. Sautiere, D. Wouters, O. Borie-Loy, G. Biserte, A. Mazen and M. Champagne, FEBS Lett. 112, 147 (1980).

Development of Antisense and Antigene Oligonucleotide Analogs¹

PAUL S. MILLER

Department of Biochemistry
School of Hygiene and Public Health
The Johns Hopkins University
Baltimore, Maryland

I. Nuclease-resistant Antisense Oligonucleotide Analogs; Oligonucleoside Methylphosphonates
   A. Structure and Chemical Synthesis
   B. Hybridization with Complementary Oligonucleotides
   C. Psoralen-conjugated Oligonu
   D. Uptake by Cells in Culture
   E. Biological Activity
II. Antigene Oligonucleotide Analogs
   A. Targeting G·C Base-pairs in Duplex DNA with 8-Oxoadenine
   B. Targeting C·G and T·A Interruptions in Homopurine Tracts
   C. Triplex Formation by Oligonucleoside Methylphosphonates
III. Conclusion
References

¹ Abbreviations: d-OMP, oligodeoxyribonucleoside methylphosphonate; mr-OMP, oligo-2'-O-methylribonucleoside methylphosphonate.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

In recent years, there has been considerable interest in exploring the possibility of using oligonucleotides to control gene expression. Besides the prospect of developing agents that can be used to study gene function in cells in culture, it is also clear that such reagents might have considerable therapeutic potential and could provide a means of rationally designing drugs of high selectivity. Two types of oligonucleotides, antisense oligonucleotides and antigene oligonucleotides, have been developed for this purpose. Antisense oligonucleotides are designed to interact with cytoplasmic messenger RNA or with precursor mRNA in the nucleus. The oligomers are complementary to and designed to bind to a functional region of the mRNA, such as the initiation codon region or a splice junction of precursor mRNA. As a consequence of binding, translation or splicing of the mRNA is prevented. There are a number of mechanisms by which this might occur, including prevention of assembly of ribosomes or degradation of the mRNA by nucleases. Although the precise mode of action of antisense oligonucleotides is often not well understood, it does appear to depend on the nature of the oligonucleotide and the site on the mRNA to which the oligonucleotide is targeted.

Antigene oligonucleotides are designed to bind to double-stranded DNA via the formation of triple-stranded complexes (triplexes). Thus antigene oligonucleotides directly target the gene at the DNA level. In its current state of development, the targets for antigene oligonucleotides are homopurine tracts in double-stranded DNA. Such tracts, which consist of runs of As and Gs, are frequently found in the promoter regions of eukaryotic DNA. Thus, in principle, it should be possible to use antigene oligonucleotides to prevent binding of RNA polymerase and/or transcription factors to the gene promoter region and thereby inhibit gene transcription.

Antisense oligonucleotides depend on Watson-Crick hydrogen-bonding interactions between the oligomer and its target, whereas antigene oligonucleotides make use of Hoogsteen or reversed-Hoogsteen hydrogen-bonding schemes. In either case, the binding interactions are well understood and thus, in principle at least, it should be possible to design an antisense or antigene oligonucleotide using only the sequence information of the targeted gene. With the advent of modern sequencing techniques, such information is often available or can readily be obtained. However, in practice, many other factors intervene in determining the effectiveness of a particular oligomer. These include the ability of the oligomer to be taken up by the cell, the stability of the oligomer in the intracellular environment, and the ability of the oligomer to reach and interact selectively with the chosen target. These requirements have led to the development of a wide variety of oligonucleotide analogs.
Many excellent reviews and monographs have been written describing the chemistry and biological activity of various antisense and antigene oligonucleotides (1-17). This report focuses on two lines of research from my laboratory that are aimed at the development of antisense and antigene oligonucleotide analogs. The first line of research involves the development of nuclease-resistant antisense oligonucleotide analogs, the oligonucleoside methylphosphonates. The second line of research involves the development of nucleoside and oligonucleotide analogs that can be used to recognize sequences in double-stranded DNA.

ANTISENSE AND ANTIGENE OLIGONUCLEOTIDES

I. Nuclease-resistant Antisense Oligonucleotide Analogs: Oligonucleoside Methylphosphonates

Although the pioneering work of Zamecnik and Stephenson (18) showed that oligodeoxyribonucleotides could be used as antisense agents in cell cultures, it was clear that the phosphodiester linkages of these oligomers are susceptible to nuclease hydrolysis, thus rendering them unsuitable for therapeutic applications. Such oligonucleotides are rapidly degraded when injected into animals (19). For this reason, many workers have expended considerable effort to design and synthesize oligonucleotide analogs with nuclease-resistant backbones. Such analogs include oligonucleotide phosphorothioates (20) and phosphorodithioates (21-23), α-anomeric oligonucleotides (24, 25), and peptide nucleic-acid analogs (26). My efforts have focused on the oligonucleoside methylphosphonates.

A. Structure and Chemical Synthesis

As shown in Fig. 1, oligonucleoside methylphosphonates contain a modified 3′-5′ internucleotide phosphodiester bond in which one of the negatively charged, nonbridging oxygens has been replaced with a methyl group. As a consequence, the methylphosphonate linkage is nonionic. In addition, the phosphorus atom is now chiral and each methylphosphonate linkage can exist in an Rp or Sp configuration. The methylphosphonate group is tetrahedral and, as shown by X-ray diffraction (27), the bond lengths and bond angles are very similar to those found in the natural phosphodiester linkage.

FIG. 1. Structures of an oligonucleoside methylphosphonate, 1, and a nucleoside methylphosphonamidite synthon, 2 (a, R = H; b, R = -OCH3).


Oligonucleoside methylphosphonates with two different types of sugar-phosphonate backbone have been prepared. The oligodeoxyribonucleoside methylphosphonates (d-OMP), 1a, contain 2′-deoxyribose sugars and are similar in structure to oligodeoxyribonucleotides (28, 29). The oligo-2′-O-methylribonucleoside methylphosphonates (mr-OMP), 1b, contain 2′-O-methylribose and are analogs of oligoribonucleotides (30). Oligoribonucleoside methylphosphonates containing 3′-5′ methylphosphonate linkages are not stable. This instability is presumably due to intramolecular attack by the 2′-hydroxyl on the neutral methylphosphonate linkage with subsequent cleavage of the 3′ O-P or 5′ O-P bond (P. S. Miller, unpublished results). Interestingly, the synthesis and isolation of 2′-5′-linked oligoribonucleoside methylphosphonates has been achieved (31).

Although d-OMPs and mr-OMPs are nonionic, they are quite soluble in water. The solubility depends on the base composition and to some extent on the sequence of the oligomer. Oligomers that contain high percentages of G tend to be less soluble than oligomers with less G. Nonetheless, solubilities up to 1.7 mM have been reported for 18-mers in water (32).

Both d-OMPs and mr-OMPs are stable in neutral aqueous solutions for prolonged periods of time. We have stored oligomers in water or in 50% aqueous acetonitrile for months at 4°C with no apparent degradation. The methylphosphonate linkages of these oligomers are cleaved under basic conditions and in the presence of certain amines such as piperidine. Complete hydrolysis by piperidine results in mixtures of nucleosides and nucleoside 3′ or 5′ O-methylphosphonates. Treatment of d-OMPs with 1 N HCl at 37°C results in depurination. In contrast, the N-glycosyl bond of purines in mr-OMPs is completely resistant to hydrolysis by 1 N HCl, even on prolonged treatment (32). Unlike abasic sites in oligodeoxyribonucleotides, an abasic site in a d-OMP undergoes spontaneous cleavage, even at pH 7.
The sugar residue at the abasic site is completely removed, leaving two oligomers, one of which terminates in a 5′ O-methylphosphonate group and the other in a 3′ O-methylphosphonate group (33). This facile cleavage presumably results from intramolecular attack by the 4′-hydroxyl group at the abasic site on the adjacent methylphosphonate linkages.

Oligonucleoside methylphosphonates can be prepared by a variety of methods. Early work employed coupling of protected 5′-O-dimethoxytritylnucleoside-3′-O-methylphosphonates using dicyclohexylcarbodiimide or arenesulfonyl chlorides or tetrazolides as activating agents (28, 33). Later studies used protected nucleoside 3′-O-methylphosphonic chlorides or imidazolides as synthons for solid-phase synthesis on silica or polystyrene supports (34, 35). The advent of phosphoramidite chemistry resulted in the development of
protected 5′-O-dimethoxytritylnucleoside-3′-O-N,N-bis-diisopropylaminomethylphosphonamidites (structure 2 in Fig. 1) (36-38). These synthons are commercially available. The base-protecting groups include benzoyl for A and isobutyryl for G and C. In the presence of a catalyst (tetrazole), these synthons allow rapid, high-yield syntheses to be carried out in automated DNA synthesizers on controlled-pore glass supports (39). It is most convenient to synthesize the oligomer in such a way that the 5′-terminal internucleotide bond is a phosphodiester linkage. This is readily accomplished by using a standard protected nucleoside 3′-O-β-cyanoethyl-N,N-bis-diisopropylaminophosphoramidite as the last synthon in the synthesis. The protected oligomers are cleaved from the support and the base-protecting groups are removed by a brief treatment with ammonium hydroxide followed by incubation with ethylenediamine for 6 hours at room temperature (40).

Oligomers that contain a 5′-terminal internucleotide phosphodiester linkage are readily purified by ion-exchange chromatography on a weak anion-exchanger such as DEAE-cellulose. Short, “failure” sequences are uncharged and are therefore not adsorbed by the column. The singly charged, full-length oligomer is eluted from the column with dilute sodium phosphate buffer and can be further purified by C18 reversed-phase HPLC.

Methylphosphonate oligomers can be characterized by a variety of methods. Their molecular weights can be confirmed by electrospray mass spectrometry. Oligomers containing a 5′-terminal phosphodiester linkage can be phosphorylated using polynucleotide kinase and ATP. 5′-Phosphorylated oligomers migrate according to chain length on polyacrylamide gels, and the sequence of the oligomers can be characterized by chemical sequencing procedures similar to those used to characterize oligodeoxyribonucleotides (41).

B. Hybridization with Complementary Oligonucleotides

Oligonucleoside methylphosphonates form stable duplexes with complementary single-stranded nucleic acids. Initially, studies were carried out on the interactions between d-OMPs and synthetic complementary oligodeoxyribonucleotides. The melting temperatures in 0.1 M sodium chloride of duplexes formed by d-OMPs are similar to those formed by unmodified oligodeoxyribonucleotides (42). In the absence of salt, the melting temperature of the d-OMP·DNA duplex remains essentially unchanged, whereas that of the DNA·DNA duplex decreases dramatically. This difference in behavior is attributed to the lack of charge repulsion between the nonionic sugar-methylphosphonate backbone of the d-OMP and the negatively charged sugar-phosphodiester backbone of the target.

The stability of duplexes formed between oligonucleoside methylphosphonates and their complementary targets depends on the configuration of
the methylphosphonate linkage. This was originally shown to be the case for oligothymidylates that contain alternating methylphosphonate/phosphodiester linkages (43). Oligomers containing Rp methylphosphonate linkages form more stable duplexes than those containing Sp linkages. Studies with oligomers containing single methylphosphonate linkages show that the Sp linkage lowers the duplex melting temperature by 2-3°C. Similar results have been observed by others for oligomers having mixed sequences of bases (44-48).

The stabilities of duplexes formed between d-OMPs and RNA targets are lower than those of d-OMP·DNA duplexes of comparable sequence (42, 43). This may reflect the inherent lower stability observed for DNA·RNA vs DNA·DNA oligonucleotide duplexes (49). The observation that oligo-2′-O-methylribonucleotides form exceedingly stable duplexes with complementary RNA suggests that the corresponding methylphosphonate analogs should behave in a similar manner. This is indeed the case. For example, the melting temperature of an mr-OMP·RNA 12-mer duplex is at least 15°C higher than that of the corresponding d-OMP·RNA duplex (42).

As is the case for d-OMP·DNA duplexes, the configuration of the methylphosphonate linkage determines oligonucleoside methylphosphonate·RNA duplex stability. Oligo-2′-O-methylribonucleotides that contain a single Rp or Sp methylphosphonate linkage were prepared. This linkage was placed in the middle or at the 5′ end of the dodecamer. Although the oligomer·RNA duplex containing the Rp linkage had the same melting temperature as the corresponding RNA·RNA duplex, the melting temperature of the duplex containing the Sp linkage was reduced 2 to 4°C (42).

When oligonucleoside methylphosphonates are synthesized on controlled-pore glass supports, each methylphosphonate linkage consists of both Rp and Sp isomers and consequently an oligomer containing n linkages is a mixture of 2^n diastereoisomers.
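The growth of the diastereoisomer mixture with the number of methylphosphonate linkages can be made concrete with a small sketch. The function names are hypothetical, and the example value of 10 linkages is chosen only for illustration (for instance, a 12-mer whose 5′-terminal linkage is a phosphodiester); it is not a figure reported in the studies cited here.

```python
from itertools import product


def count_diastereoisomers(n_linkages: int) -> int:
    """Number of Rp/Sp diastereoisomers for n chiral methylphosphonate linkages."""
    return 2 ** n_linkages


def enumerate_configurations(n_linkages: int) -> list[tuple[str, ...]]:
    """List every assignment of Rp or Sp to each linkage (practical for small n)."""
    return list(product(("Rp", "Sp"), repeat=n_linkages))


# Illustrative value only: 10 methylphosphonate linkages.
n = 10
print(count_diastereoisomers(n))

# For a small oligomer the individual configurations can be written out.
for configuration in enumerate_configurations(2):
    print("-".join(configuration))
```

If each linkage formed Rp and Sp isomers in equal amounts (an assumption, not a result from the text), the all-Rp species would make up only 1 part in 2^n of the preparation.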
As described above, the configuration of the methylphosphonate linkage plays a role in determining duplex stability; thus each diastereoisomer can be expected to have its own unique binding affinity for its complementary target. Because UV melting experiments are carried out under conditions in which the oligomer and the target are present in stoichiometric amounts, it is perhaps not surprising that the melting transitions observed for mr-OMP·RNA duplexes are somewhat broader than those observed for oligo-2′-O-methylribonucleotide·RNA duplexes.

The apparent isothermal binding constants of oligonucleoside methylphosphonates for their complementary targets can be determined by gel electrophoresis under nondenaturing conditions. Under ordinary conditions, the similarly charged oligomers comprising an RNA·RNA duplex can reequilibrate as the mixture migrates down the gel. However, dissociation of an oligonucleoside methylphosphonate·RNA duplex results in the formation of two oligomers of quite different electrophoretic mobility. The RNA strand
will migrate rapidly away from the essentially neutral methylphosphonate oligomer, and the two strands cannot reassociate. Thus only duplexes that have very long half-lives can be detected by this method. Long-lived mr-OMP·RNA duplexes have in fact been observed on denaturing polyacrylamide gels (42).

A new technique called constant activity gel electrophoresis (CAGE) has been developed to circumvent this problem (50). In this approach, individual gel lanes are cast with increasing concentrations of the oligonucleoside methylphosphonate. The radioactively labeled target is then electrophoresed through this “field” of methylphosphonate oligomer, which, because it is electroneutral, does not migrate. The mobility of the target strand decreases as the concentration of the methylphosphonate oligomer increases, and the apparent dissociation constant can be determined from the methylphosphonate oligomer concentration that causes the target to migrate half the distance between the free and completely bound forms. Using this procedure, we have shown that mr-OMP·RNA duplexes have dissociation constants that are only a fraction of those of the corresponding d-OMP·RNA duplexes at 22°C (P. S. Miller, unpublished results). The magnitude of the difference appears to depend upon the G + C content of the duplex and the sequence of the oligonucleoside methylphosphonate.
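The way an apparent dissociation constant is read from a CAGE experiment can be sketched as follows. The sketch assumes simple 1:1 binding with the methylphosphonate oligomer in large excess over the labeled target, so that the fractional mobility shift in a lane equals C / (Kd + C); this model, the function names, and the lane values are assumptions for illustration and are not taken from the published procedure (50).

```python
from statistics import median


def fraction_shifted(mobility: float, mobility_free: float, mobility_bound: float) -> float:
    """Fractional shift of the target between its free and fully bound mobilities."""
    return (mobility_free - mobility) / (mobility_free - mobility_bound)


def apparent_kd(concentrations, mobilities, mobility_free, mobility_bound) -> float:
    """Estimate Kd from per-lane oligomer concentrations and target mobilities.

    Under the assumed 1:1 isotherm f = C / (Kd + C), each lane gives
    Kd = C * (1 - f) / f. Lanes near the plateaus (f <= 0.1 or f >= 0.9)
    are skipped because small mobility errors dominate there.
    """
    estimates = []
    for conc, mobility in zip(concentrations, mobilities):
        f = fraction_shifted(mobility, mobility_free, mobility_bound)
        if 0.1 < f < 0.9:
            estimates.append(conc * (1.0 - f) / f)
    if not estimates:
        raise ValueError("no lane falls in the usable part of the binding curve")
    return median(estimates)


# Synthetic lanes generated from the same model (hypothetical numbers, in µM and cm).
true_kd = 2.0
lane_conc = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
m_free, m_bound = 10.0, 4.0
lane_mobility = [m_free - (m_free - m_bound) * c / (true_kd + c) for c in lane_conc]

print(round(apparent_kd(lane_conc, lane_mobility, m_free, m_bound), 3))
```

At f = 0.5 the per-lane expression reduces to Kd = C, which is the half-distance criterion described above; using several lanes simply averages out noise in real gels.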

C. Psoralen-conjugated Oligonucleoside Methylphosphonates

As demonstrated in the preceding section, association between antisense oligonucleoside methylphosphonates and RNA targets can be enhanced by using 2′-O-methylribonucleosides in the sugar-methylphosphonate backbone. It is also apparent that synthesis of oligomers with Rp methylphosphonate linkages should lead to enhanced binding, although present synthetic methodology does not permit stereoselective syntheses to be carried out on DNA synthesizers.

Another strategy to enhance binding interaction involves formation of a covalent linkage between the oligomer and the target RNA. In this strategy, the oligomer first finds the correct complementary binding site on the target and, as a consequence of binding, positions an appended functional group such that it can form a covalent bond with some substituent of the target molecule. We have found that 4,5′,8-trimethylpsoralen, a well known photoreactive cross-linker, is a suitable functional group for this process (51).

The structure of a psoralen-conjugated oligodeoxyribonucleoside methylphosphonate is shown in structure 3 of Fig. 2. The psoralen is attached via a phosphoramidate linkage to the 5′ end of the oligomer. Linkage of psoralen to internal positions in the oligomer has also been described (52). Target sequences are selected such that when the oligomer binds, the 3,4 double
bond of the pyrone ring of the psoralen group is positioned to react with the 5,6 double bond of a pyrimidine in the target sequence (as shown in 4 of Fig. 2). On irradiation with long-wavelength UV light (365 nm), a 2 + 2 cycloaddition reaction occurs, forming a cyclobutane bridge between the pyrone ring and the pyrimidine target base. Because psoralen reacts most efficiently with thymine or uracil, cross-linking is essentially base specific and the psoralen acts in effect like an extra base in the oligomer.

Psoralen-conjugated d-OMPs are readily synthesized by a two-step procedure (53). The 5′-phosphorylated oligomer is first converted to an imidazole phosphoramidate derivative by reacting the oligomer with imidazole in the presence of a water-soluble carbodiimide. The imidazolide is then reacted with 4′-[(N-aminoethyl)amino]methyl-4,5′,8-trimethylpsoralen to give 3. The derivatized oligomer, which can also be prepared in the 32P-labeled form, is purified either by gel electrophoresis or by C18 reversed-phase HPLC. Psoralen-conjugated d-OMPs have also been further derivatized with tetramethylrhodamine (54). Such derivatization provides a convenient fluorescent tag, which can be used in various biological experiments.

The interaction of psoralen-conjugated d-OMPs with DNA targets has been studied extensively in vitro (51, 55-57). In general, the cross-linking reaction is complete within 10 minutes; the reaction appears to plateau after this time. Up to 95% cross-linking has been observed in some systems. Quantitative cross-linking is not observed because the psoralen also undergoes photochemical degradation, which renders it inactive. The extent of cross-linking depends on the sequence and concentration of the oligomer and on the temperature of the reaction. Plots of cross-linking versus temperature mimic the UV melting curves of the duplexes formed between the oligomer and its target.
This suggests that interaction between the oligomer and the target is driven by Watson-Crick base-pairing interactions and not by interaction between the psoralen group, which is a weak intercalator, and the target. Consistent with this hypothesis is the observation that base-pair mismatches between the oligomer and the target result in little or no cross-linking.

Target structure also plays an important role in determining the extent of cross-linking by psoralen-conjugated d-OMPs (57). The extent of cross-linking between psoralen-conjugated d-OMPs and a linear target, that is, an oligo-DNA target with no secondary structure, was compared to that with a hairpin structure (57). The sequences of the oligomer binding sites were the same in each target. For a d-OMP targeted to a 12-nucleotide region located in a single-stranded loop of the hairpin, the extent of cross-linking, 76%, was similar to that of the linear target, 83%, over the temperature range 4-20°C. Cross-linking to the hairpin rapidly diminished above 20°C, whereas cross-linking to the linear target began to diminish above 35°C. Because the stem
region of the hairpin target must dissociate to some extent to allow binding by the d-OMP, cross-linking to the linear target would be expected to be more effective than cross-linking to the hairpin target. This notion received further support from the observation that a psoralen-conjugated d-OMP 16 nucleotides long, which was complementary to the loop region as well as to nucleotides in the stem, cross-linked to the same extent to both the linear and the hairpin target over the temperature range 4-55°C.

The site of cross-linking can be detected by treating the photoadduct with 1 M aqueous piperidine at 90°C. This treatment produces a strand break whose location can be analyzed by polyacrylamide gel electrophoresis (58). When a psoralen-conjugated d-OMP was reacted with a target that contained three contiguous thymidine residues 3′ to the binding site of the d-OMP, various levels of cross-linking were observed to each of the thymidines. This result suggested that intervening bases can “loop out” to accommodate binding by the psoralen group.

Psoralen-conjugated d-OMPs also cross-link to oligo-RNA targets (57). In this case, the extent of cross-linking decreased more rapidly with increasing temperature than was the case for the oligo-DNA targets. This greater sensitivity to temperature is consistent with the reduced binding affinity of d-OMPs for single-stranded RNA versus single-stranded DNA targets. A rhodamine-tagged, psoralen-conjugated d-OMP cross-links with the same efficiency to the oligo-RNA target as does the psoralen-conjugated d-OMP (59). Thus the presence of the fluorescent tag does not interfere with either oligomer binding or photoadduct formation.

Cross-linking between psoralen-conjugated d-OMPs and cellular or viral mRNAs in vitro has been studied. Oligomers were targeted to various regions of rabbit α- or β-globin mRNA (60) or to the coding regions of vesicular stomatitis virus M protein or N protein mRNA (61).
These oligomers cross-link specifically with their targeted mRNA. Cross-linking depends on the sequence of the oligomer, the temperature at which the reaction is carried out, and the concentration of the oligomer. For example, the extent of cross-linking of a 16-mer complementary to nucleotides 387-402 in the coding region of vesicular stomatitis virus N protein mRNA varied from 67 to 36% over the temperature range 0-37°C at an oligomer concentration of 5 µM. In the case of rabbit globin-specific oligomers, the extent of cross-linking was greatest for oligomers whose binding sites were in known nuclease-sensitive regions of the mRNA. Thus it appears that mRNA secondary and tertiary structures play an important role in determining the efficiency of cross-linking.

Translation of mRNA in a cell-free rabbit reticulocyte system is specifically inhibited as a consequence of cross-linking to psoralen-conjugated d-OMPs. Thus an oligomer targeted to the coding region of α-globin mRNA
specifically inhibits α-globin synthesis by 43%, whereas an oligomer targeted to the coding region of β-globin mRNA specifically inhibits β-globin synthesis by 67% (60). The amount of inhibition observed is similar to the extent of cross-linking observed for the two oligomers to their respective mRNAs.

D. Uptake by Cells in Culture

The methylphosphonate linkages of oligonucleoside methylphosphonates are completely resistant to hydrolysis by exo- and endonucleases. Oligomers incubated with mammalian cells in culture in a serum-containing medium are recovered intact, even after prolonged periods of incubation. These results suggest that oligonucleoside methylphosphonates should have long half-lives in cells and could therefore be useful as antisense agents in cell culture and in animals.

Oligonucleoside methylphosphonates are taken up by mammalian cells in culture. This was first demonstrated by using d-OMPs containing a thymidine that was tritium labeled in the 5-methyl position of the thymine ring (62). When carried out on transformed Syrian hamster fibroblasts growing in monolayer, oligomer uptake as measured by radioactivity associated with the lysate appeared to be linear for approximately 1 hour and then continued to increase but at a reduced rate over the next 3 hours. The same rates of uptake were observed for oligomers 2, 5, and 9 nucleotides in length. Most of the radioactivity, 94%, was recovered from the cell lysate in the form of intact oligomer. Radioactivity was also recovered in the form of [3H]dTTP and as [3H]dpT and [3H]dT after digestion of cellular DNA with a combination of deoxyribonuclease and snake venom phosphodiesterase. This latter result suggests the oligomers could be degraded in the cells, possibly by a glycosylase activity, and the resulting thymine base converted to the nucleoside triphosphate, which was subsequently incorporated into DNA.

Psoralen-conjugated oligonucleoside methylphosphonates of the types shown in Fig. 2 are also taken up by mouse L cells (P. S. Miller, unpublished results). The kinetics of uptake seem similar to those of the unmodified oligomer.
In the case of oligomers that contain only deoxyribonucleosides (3a), intact oligomer as well as degradation products resulting from cleavage of the phosphodiester linkage are recovered from the cell lysate. When cells are incubated with 3b, only intact oligomer is recovered, even after 24 hours. This difference in oligomer stability reflects the increased resistance of the phosphodiester linkage of 3b to hydrolysis by endonucleases such as S1 nuclease.

Careful examination of the uptake of 32P-labeled and rhodamine-tagged oligonucleoside methylphosphonates showed that the oligomers are taken up by a temperature-dependent mechanism (63). Experiments carried out in a Chinese hamster ovary tumor cell line showed that rhodamine-tagged d-OMPs are sequestered in perinuclear vacuoles consistent with endosomal localization. Similar experiments in mouse L929 cells showed similar intracellular distribution for both rhodamine-tagged d-OMPs and rhodamine-tagged, psoralen-conjugated d-OMPs (59). In these latter experiments, some fluorescence was also observed in the cytoplasm and nucleus of the cell, suggesting that the oligomer can escape from the endosomal compartment.

FIG. 2. Structure of an oligonucleoside methylphosphonate conjugated with 4′-[(N-aminoethyl)amino]methyl-4,5′,8-trimethylpsoralen, 3 (a, R = H; b, R = -OCH3). The interaction of a psoralen-conjugated oligomer with an RNA target, 4, is shown schematically at the bottom of the figure. The psoralen group is represented by the open rectangle.


Oligonucleoside methylphosphonates microinjected directly into the cytoplasm migrate immediately to the nucleus (64). Similar behavior is observed with oligonucleotide phosphodiesters and phosphorothioates.

Uptake of normal oligodeoxyribonucleotides and oligodeoxyribonucleotide phosphorothioates appears to involve cell-surface receptors (65, 66). In contrast, uptake of oligonucleoside methylphosphonates is apparently not mediated by receptors. Thus, while uptake of oligodeoxyribonucleotides shows saturation at higher oligomer concentrations and can be inhibited by addition of nonlabeled oligonucleotides, saturation is not observed for uptake of oligonucleoside methylphosphonates and uptake is not inhibited by the presence of either exogenous oligodeoxyribonucleotides or oligonucleoside methylphosphonates (63).

The results of uptake experiments suggest that oligonucleoside methylphosphonates enter mammalian cells in culture primarily by nonreceptor-mediated endocytosis. It appears that leakage from the endosome must occur for the oligomer to gain access to the cytoplasm or nucleus of the cell. This transport mechanism may represent a “bottleneck” for the efficient biodistribution of the oligomers.

Although d-OMPs 2 to 20 nucleotides long are taken up by mammalian cells, only short oligomers, up to 4 nucleotides, are taken up by bacterial cells, such as E. coli (67). It appears that the bacterial cell wall acts as a sieve and excludes oligomers above a certain size. When uptake experiments were carried out on mutants of E. coli that lack an intact cell wall, 7-mers were found to be taken up readily. This size-exclusion phenomenon is likely to restrict the use of antisense oligonucleoside methylphosphonates, and most likely other oligonucleotide analogs, in these organisms.

The distribution and metabolism of a 3H-labeled 12-mer, dTp[3H]TCCTCCTGCGG, where the underline indicates the positions of methylphosphonate linkages, has been studied in mice (68).
The oligomer was administered by injection into the tail vein. Plasma levels of the oligomer declined rapidly in a biexponential manner with half-lives of 6 and 17 minutes corresponding to the distribution and elimination phases, respectively. Approximately 70% of the total radioactivity injected was found in the urine after 120 minutes. The remaining radioactivity was found in various organs and tissues of the animal, with the highest levels found in the kidney and very little oligomer found in the brain. Two forms of the oligomer were recovered from the plasma, urine, and various organs as assayed by C18 reversed-phase HPLC. Intact oligomer was observed along with d-[3H]TCCTCCTGCGG. The latter oligomer most likely was formed as a consequence of nuclease hydrolysis of the 5′-terminal phosphodiester. This linkage is known to be susceptible to cleavage by calf spleen phosphodiesterase, a 5′-exonuclease. Importantly, there was no evidence for cleavage of the methylphosphonate linkages of the oligomer in these studies.
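The biexponential decline in plasma levels can be written as C(t)/C(0) = a·exp(-k1·t) + (1 - a)·exp(-k2·t), with each rate constant equal to ln 2 divided by the corresponding half-life. The sketch below uses the two half-lives quoted above (6 and 17 minutes); the relative weight `a` of the two phases is not given in the text, so the default of 0.5 is purely a placeholder assumption, and the function names are hypothetical.

```python
import math

T_HALF_DISTRIBUTION_MIN = 6.0   # distribution-phase half-life quoted in the text
T_HALF_ELIMINATION_MIN = 17.0   # elimination-phase half-life quoted in the text


def rate_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2.0) / half_life_min


def plasma_fraction(t_min: float, weight_distribution: float = 0.5) -> float:
    """Fraction of the initial plasma level remaining at time t_min.

    weight_distribution is the share of the initial level assigned to the
    fast (distribution) phase; 0.5 is a placeholder, not a measured value.
    """
    a = weight_distribution
    b = 1.0 - a
    fast = a * math.exp(-rate_constant(T_HALF_DISTRIBUTION_MIN) * t_min)
    slow = b * math.exp(-rate_constant(T_HALF_ELIMINATION_MIN) * t_min)
    return fast + slow


for t in (0, 6, 17, 60, 120):
    print(f"{t:>4} min: {plasma_fraction(t):.4f}")
```

Whatever weighting is chosen, both terms have decayed by more than seven half-lives of the slower phase at 120 minutes, which is in line with most of the radioactivity having left the plasma by that time.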

E. Biological Activity

The biological activity of antisense oligonucleoside methylphosphonates in cell culture has been studied for oligomers targeted against a number of viral and cellular genes, including those of human immunodeficiency virus (HIV) (69-71), herpes simplex virus (HSV) (72-75), and the human ras oncogene (76).

Studies in the HSV system showed that d-OMPs 8 and 12 nucleotides in length, d-TpCCTCCTG and d-TpTCCTCCTGCGG, could inhibit virus replication in HSV-1-infected Vero cells when added to the culture medium. These oligomers are complementary to nucleotide sequences in the acceptor splice junction of HSV-1 immediate-early mRNAs 4 and 5, which are identical in both of these precursor mRNAs. Inhibition of virus replication depends on the oligomer concentration and reaches levels of greater than 90% at oligomer concentrations of 100 µM. When the two central bases of d-TpCCTCCTG were exchanged, the resulting “mutant,” d-TpCCCTCTG, was no longer inhibitory (72). This suggested that the oligomer indeed interacted with its target. The octamer did not inhibit cellular DNA or protein synthesis.

The 12-mer, d-TpTCCTCCTGCGG, is complementary to six nucleotides in the intron and six nucleotides at the splice junction (73). This oligomer specifically inhibits replication of HSV-1, but not HSV-2, in virus-infected Vero cells. The sequences of the two viruses differ in this region of the pre-mRNA. The 12-mer gave 98% inhibition at a concentration of 100 µM. Other 12-mers complementary predominantly to nucleotides in the intron or the exon of the splice acceptor region showed little or no inhibitory effect on the replication of either HSV-1 or HSV-2. This result suggested that the secondary structure in the splice acceptor region may affect oligomer binding and thus the efficacy of the oligomer. The 12-mer was designed to inhibit mRNA splicing.
Examination of RNA extracted from virus-infected cells showed the presence of increased levels of unspliced immediate-early mRNA 4 in those cells treated with d-TpTCCTCCTGCGG. Thus it appeared that the oligomer had successfully interacted with its intended target in the cells.

The psoralen conjugate of d-TpCCTCCTGCGG was also tested for its ability to inhibit HSV-1 replication. In these experiments, virus-infected cells were treated with 5 µM oligomer at the time of infection. The cells were then irradiated for 10 minutes with 365-nm light at various times postinfection, and the virus titers were measured after 24 hours. Greater than 90% inhibition of replication was observed when HSV-1-infected cells were irradiated 1 to 3 hours postinfection. Irradiation after 3 hours resulted
in a decrease in inhibition. Thus, for example, when the cells were irradiated at 6 hours, approximately 60% inhibition was observed, and when cells were irradiated at 12 hours, only 20% inhibition was observed. Irradiation itself gave only a slight (5%) inhibition of virus replication.

The results of these experiments are consistent with the proposed mechanism of action of the oligomer. Thus, the effect of the oligomer should be greatest shortly after infection when the immediate-early genes are expressed. Inhibition at this time results in a shutoff of subsequent early and late gene expression, and thus of virus replication. Irradiation at times subsequent to immediate-early gene expression is less effective because the proteins required for activation of early and late genes have already been synthesized.

These experiments also demonstrate two important features of psoralen-conjugated d-OMPs. First, these oligomers can be “triggered” to inactivate their target in a controlled manner. This feature may prove useful in studying temporal expression of genes in cells. Second, the oligomers function at significantly lower concentrations than do the underivatized oligomers. This enhanced activity is most likely a consequence of the ability of the oligomer to form covalent adducts with its target.

The ability of antisense oligonucleoside methylphosphonates to inhibit mRNA expression should be a direct function of their ability to bind to their target. This idea was tested by examining the ability of the 2′-O-methylribonucleoside version of the 12-mer to inhibit HSV-1 replication in cell culture (P. S. Miller, unpublished). The apparent dissociation constant of mr-UpUCCUCCUGCGG is approximately a fifth of that of d-TpTCCTCCTGCGG as determined by the CAGE technique described in Section I,B. A series of dose-response experiments showed that the IC50 value, the oligomer concentration at which virus replication is inhibited 50%, is 26 µM for d-TpTCCTCCTGCGG.
In contrast, the IC50 value for mr-UpUCCUCCUGCGG was only 6 µM. This decrease in the IC50 to approximately 22% is consistent with the lower dissociation constant of the 2′-O-methylribo oligomer versus the 2′-deoxyribo oligomer.

Oligodeoxyribonucleoside methylphosphonates have also been targeted to other HSV-1 immediate-early mRNAs (74, 75). For example, a d-OMP, d-GpCGGGGCTCCAT, was prepared (75). This oligomer is complementary to the initiation codon and the following three codons of HSV-1 immediate-early mRNA 1. The oligomer inhibited virus replication with an IC50 value of approximately 17 µM. “Mutated” versions of this oligomer, in which two or four of the bases in the middle portion of the sequence were rearranged, were not inhibitory. When this oligomer was combined with d-TpTCCTCCTGCGG, the d-OMP complementary to the immediate-early mRNA 4 and 5 splice junction, a synergistic inhibitory effect was observed. Thus the IC50
value of each oligomer in the combination was approximately 4 µM, a reduction to one-fourth to one-sixth of that of the individual oligomers.

In order to be effective antisense agents, oligonucleoside methylphosphonates must be able to interact selectively with their targets and inhibit target expression. The high degree of specificity possible was demonstrated in the ras oncogene system (76). In these experiments, two 11-mer d-OMPs, d-TpCCTCCTGGCC (ras-T) and d-TpCCTCCAGGCC (ras-A), were synthesized. The oligomers are complementary to a sequence including the codon for amino acid 61 of normal human c-Ha-ras mRNA, ras-T, or to the same region of a human lung carcinoma c-Ha-ras mRNA, which contains a single mutation in the middle base of codon 61, ras-A. Two cell lines that contain multiple copies of the normal gene or the gene with the point mutation were used to test the selectivity of the two oligomers. Each cell line produces a ras-p21 protein and these proteins can be distinguished by their electrophoretic mobilities on polyacrylamide gels. Mixed cultures of the two cell lines were treated with various concentrations of either ras-T or ras-A and the relative amounts of the p21 proteins were analyzed by gel electrophoresis after immunoprecipitation. A dose-dependent decrease of “normal” p21 synthesis was observed for cells treated with ras-T. The extent of inhibition was 89% in the presence of 150 µM ras-T. No inhibition of “normal” p21 was observed when the cells were treated with ras-A. Conversely, increasing concentrations of ras-A resulted in decreasing amounts of “mutated” p21 and little or no inhibition of “normal” p21 was observed in the presence of this oligomer. Thus, 150 µM ras-A inhibited “mutated” p21 synthesis 97% whereas “normal” p21 synthesis was reduced only 21%. The experiments were also carried out with the psoralen-conjugated versions of ras-T and ras-A. Again, selective inhibition of p21 synthesis was observed.
However, the concentration of oligomer required to give levels of inhibition comparable to those observed with the underivatized oligomers was reduced by approximately a factor of 10. Thus 15 µM ras-T inhibited “normal” p21 synthesis 96% whereas 15 µM ras-A inhibited “mutated” p21 synthesis almost 100%. These results suggest that under certain conditions, antisense oligonucleoside methylphosphonates can distinguish point mutations in mRNA. It appears that this level of discrimination is maintained by the psoralen-conjugated d-OMPs as well. This latter observation is consistent with the previously observed ability of psoralen-conjugated d-OMPs to discriminate mispaired bases in in vitro cross-linking experiments and again suggests that binding is a function of the oligomer portion of the psoralen-d-OMP conjugate.

Antisense oligodeoxyribonucleoside methylphosphonates and their
psoralen conjugates also have activity against HSV infection in animals (74). A mouse ear-pinna model system was developed to assess the effects of d-TpTCCTCCTGCGG on HSV-1 replication. The ear pinna was injected with a solution containing HSV-1 and 100 to 500 µM oligomer. The oligomer was then applied topically as a suspension in 50% aqueous polyethylene glycol on subsequent days postinfection. Reductions to Q and 3 in virus titer were observed 1 to 5 days postinfection. The psoralen conjugate of d-TpTCCTCCTGCGG inhibited HSV-1 replication 86-91% after irradiation with 365-nm light, but at a tenth of the concentration required for the underivatized oligomer. This oligomer specifically inhibited HSV-1 replication. When mice infected with HSV-2 were similarly treated, only a 27% inhibition of virus titer was observed. These experiments suggest that antisense oligonucleoside methylphosphonates and their psoralen conjugates have therapeutic potential, at least in situations where topical application is possible. More extensive studies of the pharmacokinetics of the molecules in animals are obviously required before the full potential of the oligomers can be realized.
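The single-base discrimination described earlier in this section for ras-T and ras-A can be made concrete with a short sketch. The Python snippet below is illustrative only: it derives the mRNA sequence complementary to each 11-mer and lists the positions at which the two targets differ. The helper names are hypothetical, the methylphosphonate backbone and the 5'-terminal linkage symbol are ignored, and only Watson-Crick pairing is assumed.

```python
# Illustrative sketch: Watson-Crick targets of the two ras 11-mers.
# Sequences are taken from the text; helper names are hypothetical.

DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def target_mrna(oligo_5to3: str) -> str:
    """Return the mRNA sequence (5'->3') complementary to a DNA oligomer."""
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(oligo_5to3))

def mismatch_positions(a: str, b: str) -> list[int]:
    """Return the 0-based positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

RAS_T = "TCCTCCTGGCC"  # d-TpCCTCCTGGCC, written without the linkage symbol
RAS_A = "TCCTCCAGGCC"  # d-TpCCTCCAGGCC

target_t = target_mrna(RAS_T)
target_a = target_mrna(RAS_A)
differences = mismatch_positions(target_t, target_a)

print(target_t, target_a, differences)
```

The two derived targets differ at a single position, the middle base of a CAG/CUG triplet, which agrees with the description of a mutation in the middle base of codon 61. Each oligomer therefore forms exactly one mismatch with the other oligomer's target.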

II. Antigene Oligonucleotide Analogs

Antigene oligonucleotides are designed to interact with double-stranded DNA (dsDNA) through the formation of triple-stranded complexes. Although formation of triple-stranded complexes at the polynucleotide level had been recognized for some time (77), the demonstration that shorter oligonucleotides could also participate in such interactions sparked interest in using oligonucleotides to target genes specifically (78).

Antigene oligonucleotides have been targeted to homopurine sequences in dsDNA. The oligomers can interact with these homopurine sequences in essentially two different ways. Third-strand oligopyrimidines can bind in the major groove and contact the purine strand through the formation of Hoogsteen-type hydrogen bonds. This is the so-called pyrimidine-purine-pyrimidine, Y.(R.Y), motif. The third strand is written first and its association with the duplex is indicated by a centered dot. Two types of base triads, T.(A.T) and C+.(G.C), are formed. Their structures are shown in the top half of Fig. 3. The T.(A.T) triad involves formation of two hydrogen bonds between the T of the third strand and the A of the target. Formation of the C+.(G.C) triad also involves two hydrogen bonds between a protonated form of C in the third strand and a G of the duplex. Because protonation is required, the stability of these Y.(R.Y) triplexes decreases as the pH increases. The polarity of the third strand is parallel to that of the purine target strand of the duplex, and the bases in the triads are isomorphous.
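The pairing rules of the Y.(R.Y) motif lend themselves to a brief sketch. The Python functions below are hypothetical helpers, not an established tool. The first writes the oligopyrimidine third strand for a homopurine target by placing T opposite each A.T pair and C (to be protonated) opposite each G.C pair, reading both strands 5'→3' because the third strand runs parallel to the purine strand. The example target is the 15-nucleotide purine tract d-GAAGAAAAAAGAAAA used in the triplexes discussed later in this section.

```python
# Illustrative sketch of the Y.(R.Y) (pyrimidine-purine-pyrimidine) rules.
# Function and constant names are hypothetical.

THIRD_STRAND_BASE = {"A": "T", "G": "C"}  # T.(A.T) and C+.(G.C) triads

def pyrimidine_third_strand(purine_strand_5to3: str) -> str:
    """Third strand (5'->3') that binds parallel to a homopurine strand."""
    strand = purine_strand_5to3.upper()
    interruptions = sorted(set(strand) - set(THIRD_STRAND_BASE))
    if interruptions:
        raise ValueError(f"not a homopurine tract; found {interruptions}")
    return "".join(THIRD_STRAND_BASE[base] for base in strand)

def watson_crick_strand(purine_strand_5to3: str) -> str:
    """Complementary duplex strand, written 5'->3' (antiparallel)."""
    pairs = {"A": "T", "G": "C", "C": "G", "T": "A"}
    return "".join(pairs[b] for b in reversed(purine_strand_5to3.upper()))

purine = "GAAGAAAAAAGAAAA"
third = pyrimidine_third_strand(purine)
print("third strand  5'-" + third + "-3'")
print("purine strand 5'-" + purine + "-3'")
print("pyrimidine    3'-" + watson_crick_strand(purine)[::-1] + "-5'")
```

In this motif the third strand has the same base sequence as the Watson-Crick pyrimidine strand but the opposite polarity.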

FIG. 3. Hydrogen bonding interactions involved in base triad formation.

The purine-purine-pyrimidine, R.(R.Y), motif involves formation of A.(A.T) or T.(A.T) and G.(G.C) triads. The structures of the latter two triads, which have been confirmed by NMR studies (79), are shown at the bottom half of Fig. 3. Here, reversed-Hoogsteen hydrogen bonds are formed between the bases of the third strand and the purine target, and the polarity of the third strand is oriented antiparallel to that of the purine target tract. Bases comprising the Y.(R.Y) triads are isomorphous, whereas bases comprising R.(R.Y) triads are not.

Most studies involving triplex-forming oligonucleotides have centered around understanding the structure and physical chemistry of triple-stranded complexes (80-87) and exploring various strategies to extend the scope of triple-strand recognition (88-93). There are encouraging results
suggesting that triplex-forming oligonucleotides have biological activity in cell culture (94-97).

Our efforts in this area have focused on developing oligonucleotide analogs that can be used in the Y.(R.Y) motif. As described in the following sections, we have developed a base analog, 8-oxoadenine (8-oxo-A), capable of interacting with G.C base-pairs in a pH-independent manner. We have also designed and synthesized base analogs that can interact with pyrimidine “interruptions” in the homopurine target site. The ability of oligodeoxyribopyrimidine methylphosphonates to participate in triplex formation has also been explored.
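The pH dependence of the C+.(G.C) triad noted above can be illustrated with the Henderson-Hasselbalch relation, under which the protonated fraction of a base is 1/(1 + 10^(pH - pKa)). The sketch below is only a rough guide. The pKa values are assumptions chosen for illustration and are not taken from this chapter, and the apparent pKa of a cytosine within a triplex depends on its sequence context.

```python
# Illustrative sketch: fraction of third-strand cytosine that is protonated.
# The pKa values below are assumptions chosen for illustration only.

def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: [BH+] / ([B] + [BH+]) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

ASSUMED_PKA = {"lower apparent pKa": 4.5, "higher apparent pKa": 6.5}

for label, pka in ASSUMED_PKA.items():
    for ph in (6.0, 7.0, 8.0):
        print(f"{label} ({pka}): pH {ph} -> {fraction_protonated(ph, pka):.3f}")
```

Well above the pKa, each unit increase in pH lowers the protonated fraction roughly tenfold, which is consistent with the loss of triplex stability at higher pH.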

A. Targeting GC Base-pairs in Duplex DNA with 8-Oxoadenine

As described above, G.C base-pairs can interact with C in third-strand oligopyrimidines provided that the pH allows protonation of the C. For oligopyrimidines that contain only a few unclustered Cs, stable triplex formation can be observed even at pH 7. However, oligodeoxyribopyrimidines that contain a high percentage of Cs and/or blocks of contiguous Cs usually form stable triplexes at pH 6.5 and below. This pH dependence makes the use of such antigene oligomers problematic in biological studies. A number of solutions have been developed to overcome this problem. One of the simplest is to substitute 5-methyldeoxycytidine (structure 5, Fig. 4) for deoxycytidine in the oligodeoxyribopyrimidine (98). 5-Methyldeoxycytidine is protonated at a higher pH than is deoxycytidine when incorporated into an oligodeoxyribopyrimidine that is undergoing triplex formation. The reason for this increased apparent pK is not clear. Nevertheless, for oligomers that contain few 5-methyldeoxycytidine residues, relatively stable triplex formation can be observed even at pH 8.0. Thus, for example, the melting temperature of 6A in triplex 6 is 18°C at pH 8.0 (99).

d-CTTCTTTTTTCTTTT      A
d-GAAGAAAAAAGAAAA      B
  CTTCTTTTTTCTTTT-d    C
         6

FIG. 4. Triplex containing 5-methyl-2'-deoxycytidine, C.

Although oligomers containing low percentages of 5-methyldeoxycytidine can form stable triplexes at physiological pH, the stability decreases significantly for oligomers containing multiple, contiguous 5-methyldeoxycytidines. To circumvent this problem, base analogs have been prepared that have two available hydrogen-bond donor sites and that therefore do not require protonation in order to interact with G. For example, the C nucleoside, 2'-O-methylpseudoisocytidine (structure 7, Fig. 5), can be substituted for deoxycytidine in oligodeoxyribopyrimidines. Oligomers containing this base analog form triplexes whose stability is independent of pH (100). A similar pyrimidine analog, 4-amino-1-(β-2-deoxy-D-erythropentofuranosyl)-5-methyl-2,6-[1H,3H]-pyrimidione (structure 8, Fig. 5), also forms stable triplexes over the pH range 6.4 to 8.5 (101).

FIG. 5. Structures of 2'-O-methylpseudoisocytidine, 7, and 4-amino-1-(β-2-deoxy-D-erythropentofuranosyl)-5-methyl-2,6-[1H,3H]-pyrimidione, 8. The interaction of each analog with a G.C base-pair is shown to the right of the structure of each nucleoside.

Analogs of purines have also been explored as substitutes for deoxycytidine (102). We and others have found that the purine analogs, 8-oxo-2'-deoxyadenosine (99, 103-106) or 6-methyl-8-oxo-2'-deoxyadenosine (107) (structures 9a,b, respectively, Fig. 6), can substitute for deoxycytidine in triplex-forming oligomers. As shown in Fig. 6, this nucleoside analog has two unique structural features that allow it to interact with G.C base-pairs. The base of the nucleoside exists in the keto tautomeric form and adopts the syn conformation. This orientation places the hydrogen-bond donors at the 6-amino and N-7 positions of 8-oxo-A in proper orientation to interact with the O-6 and N-7 positions, respectively, of the target guanine. Oligomers containing 8-oxo-A are readily synthesized using a now commercially available phosphoramidite synthon. It appears that 8-oxo-A interacts selectively with G.C base-pairs in the target strand. This was demonstrated by the melting behavior of triplex 10, shown in Fig. 6. The melting temperature of the third strand, 10A, of triplex 10 is 29°C at pH 7.0. No triplex formation is observed when the G.C target base-pair opposite the 8-oxo-A is changed to A.T or T.A and a very low-melting-point triplex, Tm 9°C, is observed when this base-pair is C.G. This result suggests that oligomers containing 8-oxo-A should interact selectively with their designated targets.

Oligomers containing multiple substitutions of 8-oxo-A also form stable triplexes. Oligomer 11A, which contains three 8-oxo-As, forms a triplex with 11B.11C (Fig. 6), whose third-strand melting temperature is 22°C at both pH 7.0 and pH 8.0 (99). The circular dichroism spectrum of this triplex is virtually identical to that of triplex 6 shown in Fig. 4, which contains 5-methyldeoxycytidine instead of 8-oxo-A. Thus, it appears that the overall structures of triplexes formed by 8-oxo-A are very similar to those of “normal,” cytosine-containing triplexes. The psoralen conjugate of oligomer 12A forms a stable triplex with duplex 12B.12C, whose melting temperature is 30°C at pH 7.5 (P. S. Miller, unpublished). This 15-mer contains a total of seven 8-oxo-As. The duplex corresponds to a nucleotide sequence found in the promoter region of a gene that codes for the 92-kDa precursor of human type-IV collagenase. When this triplex is irradiated with 365-nm light, a covalent adduct is formed between 12A and the purine-rich strand, 12B, of the target duplex. Very little adduct formation is observed with 12C.

9a, R = H; 9b, R = -CH3

d-CTTCTTTTTTATTTT      A
d-GAAGAAAAAAGAAAA      B
  CTTCTTTTTTCTTTT-d    C
         10

d-ATTATTTTTTATTTT      A
d-GAAGAAAAAAGAAAA      B
  CTTCTTTTTTCTTTT-d    C
         11

d-TAAATAAATTTTTAT                   A
d-GGCTGTCAGGGAGGGAAAAAGAGGACAG      B
  CCGACAGTCCCTCCCTTTTTCTCCTGTC-d    C
         12

FIG. 6. Structure of 8-oxo-2'-deoxyadenosine and triplexes containing 8-oxo-A, A. The interaction of 8-oxo-A is shown to the right of the structure of the nucleoside.

Preliminary results suggest that triplex formation by oligomers containing 8-oxo-A may prove useful in regulating gene expression. Thus oligomers that contain 6-methyl-8-oxo-2'-deoxyadenosine reduce the rate of transcription in an in vitro system (108). Transcription was completely inhibited when a base capable of alkylating the target and thus forming a covalent triplex was incorporated into the oligomer. These results suggest that covalent attachment of the third strand may be necessary in order to achieve satisfactory control of the gene, at least at the transcription elongation level.

B. Targeting C-G and T-A Interruptions in Homopurine Tracts

Oligodeoxyribopyrimidines can be used to target homopurine tracts in double-stranded DNA. To achieve reasonable stability, oligomers containing 15 to 20 nucleotides are usually employed. It is often difficult to find tracts of purines of this length in the target gene. Although eukaryotic genes frequently contain purine-rich sequences, particularly in the promoter region of the gene, these tracts are usually interrupted by an occasional pyrimidine nucleoside. These interruptions provide a challenge to designing triplex-forming oligonucleotides. Serious efforts have been made to identify or design bases or base analogs that recognize and interact with these interruptions.

Naturally occurring bases will interact with T.A or C.G base-pair interruptions under certain conditions. Thus, G can interact with a single T.A interruption through the formation of a G.(T.A) triad in intermolecular triplexes involving three separate strands (88, 91, 109, 110). For example, at pH 7.0, the Tm of 13A in triplex 13 shown in Fig. 7, where X.(Y.Z) is G.(T.A), is 15°C (110). Triplexes of this type, which contain more than one G.(T.A) triad, do not appear to be stable. The G.(T.A) triad has also been detected in a single-stranded oligomer that can form an intramolecular triplex structure characterized by proton NMR (111, 112). In this structure a single hydrogen bond is formed between the N2-amino group of G and the O4 of T. It also appears that additional hydrogen bonding interactions take place between the N2-amino group and the neighboring bases. It is interesting to note that triplex formation is not observed when X.(Y.Z) in 13 is G.(U.A) (110). Molecular models suggest that hydrogen-bond formation between the N2-amino of G and the O4 carbonyl of U should be possible.
However, it appears that the availability of a suitable hydrogen-bond-acceptor group is not sufficient to form the G.(U.A) triad and that additional interactions, possibly between the neighboring bases, are required for a stable triplex to form.

d-CTTCTTTTTTXTTTT      A
d-GAAGAAAAAAYAAAA      B
  CTTCTTTTTTZTTTT-d    C
         13

FIG. 7. Triplexes used to study interactions between base analogs, X, and target base-pairs, Y.Z, in a DNA duplex.
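The search for target sites described at the start of this section can be sketched in a few lines. The Python functions below are hypothetical helpers: one lists the uninterrupted purine runs in one strand of a duplex, and the other lists windows of a chosen length that contain no more than a given number of pyrimidine interruptions. The example sequence is strand 12B of Fig. 6. The 15-nucleotide window and the limit of one interruption are assumptions chosen for illustration.

```python
import re

# Illustrative sketch: locating purine-rich tracts in one strand of a duplex.
# Function names, window size, and thresholds are hypothetical.

def purine_runs(strand: str, min_length: int = 1) -> list[tuple[int, str]]:
    """Return (0-based start, sequence) for each uninterrupted purine run."""
    return [(m.start(), m.group())
            for m in re.finditer(r"[AG]+", strand.upper())
            if len(m.group()) >= min_length]

def purine_rich_windows(strand: str, window: int,
                        max_interruptions: int) -> list[tuple[int, int]]:
    """Return (start, pyrimidine count) for windows that meet the limit."""
    strand = strand.upper()
    hits = []
    for start in range(len(strand) - window + 1):
        count = sum(base in "CT" for base in strand[start:start + window])
        if count <= max_interruptions:
            hits.append((start, count))
    return hits

STRAND_12B = "GGCTGTCAGGGAGGGAAAAAGAGGACAG"

longest = max(purine_runs(STRAND_12B), key=lambda run: len(run[1]))
windows = purine_rich_windows(STRAND_12B, window=15, max_interruptions=1)
print(longest)
print(windows)
```

In this example the longest uninterrupted run begins at index 7 and contains the sequence AGGGAGGGAAAAAGA, which, as far as can be judged from Fig. 6, matches oligomer 12A under the T-for-A and 8-oxo-A-for-G pairing rules described above.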

Interruptions involving C.G base-pairs can also be targeted by unmodified bases. Thus triplex formation was observed in 13, where Y.Z is C.G, when X is T or U (110). The Tm values of the third strands in these triplexes are 12°C, when X is T, and 14°C, when X is U, at pH 7.0. Triplex formation was not observed when C, A, or G was substituted at X. Similar results have been observed in other intermolecular triplex systems (91, 109). The formation of T.(C.G) and U.(C.G) triads may involve formation of a single hydrogen bond between the O4 group of T or U and the N4-amino group of C of the target C.G base-pair (109, 110). Recent proton NMR studies on an intramolecular triplex containing a single T.(C.G) triad gave results consistent with hydrogen bonding between the O2 of T and the N4-amino group of C (113).

Base analogs have also been devised to interact with the C of the C.G base-pair. The purine analog, 2'-deoxynebularine, is capable of forming a single hydrogen bond between the N-1 of the purine base and the C-4 of the C.G base-pair (114). This base analog binds to both C.G and A.T base-pairs when incorporated into third-strand oligodeoxyribopyrimidines. A completely synthetic base, 4-(3-benzamidophenyl)imidazole, interacts selectively with C.G and T.A base-pairs (115, 116). However, it appears that this base analog does not participate in hydrogen bonding. Proton NMR studies on an intramolecular triplex containing 4-(3-benzamidophenyl)imidazole show that the base analog intercalates between base-pairs in the target duplex (117).

Another approach to recognition of pyrimidine-purine interruptions has been to insert an abasic site in the third-strand oligomer opposite the interruption. This strategy results in significant destabilization of the triplex (118) or in no triplex formation at all. Thus 13A, which contains an abasic site at position X, does not form a triplex with 13B.13C when Y.Z is C.G, T.A, G.C, or A.T (P. S. Miller, unpublished).
Inclusion of the abasic site may disrupt stacking interactions along the third strand. Such stacking interactions appear to be an important contributor to overall triplex stability.

Unlike the purine of a purine-pyrimidine base-pair, the pyrimidine base of a pyrimidine-purine interruption provides only a single site for hydrogen bonding interactions, the N4-amino group of C or the O4 of T. The inability to form two hydrogen bonds may account for the reduced stabilities of triplexes that contain these interruptions. To circumvent this problem, we have attempted to design base analogs that can span the major groove of the DNA target and interact with both bases of the pyrimidine-purine interruption. We first investigated analogs of deoxycytidine that contain aliphatic chains attached to the C-4 amino group of the base (119). Oligomers containing various functionalized alkyl chains were prepared by selective bisulfite-catalyzed transamination of a cytosine residue at position X in oligomer 13A (120). This reaction selectively modifies the cytosine but not the two 5-methylcytosines of 13A because bisulfite does not form an adduct with 5-methylcytosine. N4-(3-Acetamidopropyl)cytosine (structure 14, Fig. 8) selectively interacts with a C.G interruption in triplex 13 where X is the base analog and Y.Z is C.G. The melting temperature of 13A of this triplex is 20°C at pH 7.0. Triplex formation is not observed when Y.Z is T.A or A.T and only weak triplex formation is seen when Y.Z is G.C. The Tm of the third strand of the latter triplex is 8°C. Triplex formation depends on the presence of a hydrogen-bond donor group at the end of the alkyl chain. Thus replacement of the acetamido group with an amino group also leads to triplex formation, whereas substitution of the acetamido group with a methyl or a carboxyl group results in loss of triplex formation (119). Examination of molecular models suggests that the alkyl chain can reach across the major groove of the duplex and, as a consequence, position the acetamido group to form a hydrogen bond with the O6 of guanine of the C.G base-pair. As shown in Fig. 8, an additional hydrogen bond could also form between the N4 of 14 and the C-4 amino group of the cytosine of the C.G base-pair if 14 exists in the imino tautomeric form.

FIG. 8. Structures of N4-(3-acetamidopropyl)cytosine, 14, and N4-(6-aminopyridinyl)cytosine, 15. Possible hydrogen-bonding schemes for the interaction of each analog with a C.G base-pair are shown to the right of the structure of each nucleoside.

The formation of a stable triplex by oligomer 13A where X is N4-(3-acetamidopropyl)cytosine is somewhat surprising given the inherent flexibility of the alkyl chain. It seemed reasonable that incorporation of a more rigid chain could result in the formation of more stable triplexes. To test this possibility, oligomer 13A containing N4-(6-aminopyridinyl)cytosine (structure 15, Fig. 8) at position X was synthesized (121). This was accomplished using the fully protected phosphoramidite synthon of N4-(6-aminopyridinyl)-2'-deoxycytidine, which was prepared by conventional chemical procedures. Oligomer 13A with N4-(6-aminopyridinyl)cytosine at position X forms a stable triplex with 13B.13C, where Y.Z is C.G. The melting transition of the third strand is approximately 36°C at pH 7.0 and appeared to be biphasic, suggesting that the base analog may interact with the duplex in two distinct ways. N4-(6-Aminopyridinyl)cytosine could interact with C.G as shown schematically in Fig. 8. This hydrogen-bonding scheme invokes the participation of the imino tautomeric form of the base. The existence of the imino tautomer was supported by proton NMR and UV spectroscopic data obtained for the nucleoside. Triplex formation between oligomer 13A, where X is 15, and duplex 13B.13C was also observed when Y.Z was A.T, G.C, and T.A. The melting temperature of 13A in the A.T triplex is 32°C, whereas the melting temperatures for the G.C and T.A triplexes are 18 and 20°C, respectively. Thus the selectivity of triplex formation by N4-(6-aminopyridinyl)cytosine appears to be less than that observed for N4-(3-acetamidopropyl)cytosine. This reduced selectivity may arise in part from hydrophobic interactions between the pyridinyl ring of 15 and neighboring bases that serve to stabilize the triplex. Oligonucleotides containing multiple N4-(6-aminopyridinyl)cytosine substitutions have also been prepared. These oligomers form stable triplexes with their appropriate DNA targets when the target base-pairs are C.G. These results suggest that base analogs of this type may be used to extend the range of sequence recognition by triplex-forming oligodeoxyribopyrimidines.
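The comparison of selectivity drawn above can be summarized numerically. The sketch below tabulates the third-strand melting temperatures quoted in this section for triplex 13 at pH 7.0 and reports, for each base analog at position X, the gap between its best and second-best target base-pair. The data structure and function are hypothetical. A target for which no triplex was observed is recorded as None, and the gap is left undefined when fewer than two melting temperatures are available.

```python
from typing import Optional

# Third-strand melting temperatures (deg C, pH 7.0) quoted in the text for
# triplex 13; None means no triplex formation was observed.
TM_BY_ANALOG: dict[str, dict[str, Optional[float]]] = {
    "N4-(3-acetamidopropyl)cytosine (14)":
        {"C.G": 20.0, "G.C": 8.0, "T.A": None, "A.T": None},
    "N4-(6-aminopyridinyl)cytosine (15)":
        {"C.G": 36.0, "A.T": 32.0, "T.A": 20.0, "G.C": 18.0},
}

def selectivity_gap(tm_by_target: dict[str, Optional[float]]):
    """Return (best target, Tm gap to the runner-up); gap is None if undefined."""
    observed = sorted(((tm, target) for target, tm in tm_by_target.items()
                       if tm is not None), reverse=True)
    if not observed:
        return None, None
    best_tm, best_target = observed[0]
    gap = best_tm - observed[1][0] if len(observed) > 1 else None
    return best_target, gap

for analog, data in TM_BY_ANALOG.items():
    print(analog, selectivity_gap(data))
```

On these figures the gap is 12°C for analog 14 and 4°C for analog 15, in line with the lower selectivity noted for the pyridinyl analog.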

C. Triplex Formation by Oligonucleoside Methylphosphonates

Most studies on triplex-forming oligonucleotide analogs have focused on modifications involving the bases. Less is known about the effects of modifying the internucleotide linkages.

Oligonucleotides containing phosphorothioate linkages appear to form less stable Y.(R.Y)-type triplexes than the corresponding oligonucleotides, which contain phosphodiester linkages (122). The stability of the triplex appears to depend on the configuration of the phosphorothioate linkage.

Theoretical studies indicate that oligodeoxyribonucleoside methylphosphonates should be able to form stable triplexes with double-stranded DNA (123). These studies suggest that the methylphosphonate oligomer can be accommodated in the major groove of the target and should adopt a conformation different from that of the corresponding phosphodiester triplex-forming oligomer. Experimentally, triplex formation has been observed between the oligodeoxyribonucleotide d(A-G)n and the oligodeoxyribonucleoside methylphosphonate d(C-T)n (124). At pH 7.0, d(A-G)n and d(C-T)n form a duplex whose structure appears to be quite different from that of the all-phosphodiester duplex d(A-G)n.d(C-T)n as determined by circular dichroism spectroscopy. At pH 4.2, a triplex is formed, d(C-T)n.[d(A-G)n.d(C-T)n], whose circular dichroism spectrum is quite similar to that of the all-phosphodiester triplex, d(C-T)n.[d(A-G)n.d(C-T)n]. In addition, circular dichroism studies and ethidium bromide fluorescence assays suggest that the triplex d(C-T)n.[d(A-G)n.d(C-T)n] can also form at pH 5.5. This latter triplex serves as a model for the interaction of a methylphosphonate oligomer with a double-stranded DNA target.

Recent studies have shown that the psoralen conjugate of oligonucleoside methylphosphonate, d-TpT?TC'I??TCTTC'IT, where C represents 5-methyldeoxycytidine, can form a triplex with a 27-mer duplex (P. S. Miller, unpublished). Triplex formation was observed at pH 6.2 by UV melting experiments. In addition, irradiation of the triplex with 365-nm light resulted in photoadduct formation between the psoralen and thymidine residues in both strands of the target duplex.
This result suggests it may be possible to design photoreactive methylphosphonate oligomers that target homopurine sequences in genomic DNA.

Oligodeoxyribonucleoside methylphosphonates can also target single-stranded DNA and RNA through the formation of triple-stranded complexes. For example, a second-strand methylphosphonate oligomer, dAWAWAAWAAAqAA, where W is deoxypseudouridine, and a third-strand methylphosphonate oligomer, d-TCTCTTCTITC'IT, were shown by UV melting experiments to form a stable triplex with the target strand, d-'ITATTTA'ITATTAT, at pH 7.0 (125). In this triplex, the polarities of the methylphosphonate oligomers are antiparallel to that of the target phosphodiester strand. Because the C nucleoside, deoxypseudouridine, contains hydrogen bonding groups on both sides of the heterocyclic ring, it serves as a unique binding site in the second strand of the triplex. Thus, substitution of deoxyuridine for deoxypseudouridine in the second-strand oligomer resulted in loss of triplex formation.

Single, oligodeoxyribopurine methylphosphonates 16 nucleotides in length form triple-stranded complexes with complementary single-stranded oligopyrimidine RNA targets as shown by UV melting, circular dichroism spectroscopy, and gel mobility-shift assays (126). When the pyrimidine target sequence was inserted into the 3' side of the initiation codon of chloramphenicol acetyltransferase mRNA, the methylphosphonate oligomers specifically inhibited translation of this mRNA in a cell-free system in a dose-dependent manner. This result suggests that under certain circumstances triplex-forming oligonucleoside methylphosphonates might be used as selective inhibitors of protein synthesis in cells by targeting cellular mRNA.

III. Conclusion

Antisense and antigene oligonucleotide analogs hold great promise for the design of reagents that can be used to inhibit gene expression selectively at either the mRNA or DNA level. Efforts continue to develop analogs that have increased nuclease resistance, improved binding capabilities, and enhanced uptake and biodistribution properties. Success in this area can be expected to lead eventually to the development of highly selective therapeutic agents.

REFERENCES

1. J. S. Cohen (ed.), “Oligonucleotides. Antisense Inhibitors of Gene Expression.” Macmillan, London, 1989.
2. E. Wickstrom (ed.), “Prospects for Antisense Nucleic Acid Therapy of Cancer and AIDS.” Wiley-Liss, New York, 1991.
3. R. P. Erickson and J. G. Izant (eds.), “Gene Regulation. Biology of Antisense RNA and DNA.” Raven Press, New York, 1992.
4. J. A. H. Murray (ed.), “Antisense RNA and DNA.” Wiley-Liss, New York, 1992.
5. S. T. Crooke and B. Lebleu (eds.), “Antisense Research and Applications.” CRC Press, Boca Raton, FL, 1993.
6. P. S. Miller and P. O. P. Ts’o, Annu. Rep. Med. Chem. 23, 295 (1988).
7. E. Uhlmann and A. Peyman, Chem. Rev. 90, 543 (1990).
8. B. J. Dolnick, Biochem. Pharm. 40, 671 (1990).
9. D. M. Tidd, Anticancer Res. 10, 1169 (1990).
10. C. Helene and J.-J. Toulme, BBA 1049, 99 (1990).
11. P. D. Cook, Anti-Cancer Drug Design 6, 585 (1991).
12. A. D. B. Malcolm, J. Coulson, N. Blake and L. C. Archard, Biochem. Soc. Trans. 20, 762 (1992).
13. S. T. Crooke, Antisense Res. Dev. 3, 1 (1993).
14. J. F. Milligan, M. D. Matteucci and J. C. Martin, J. Med. Chem. 36, 1923 (1993).
15. C. A. Stein and Y.-C. Cheng, Science 261, 1004 (1993).
16. J. P. Leonetti, G. Degols, J. P. Clarenc, N. Mechti and B. Lebleu, This Series 44, 143 (1993).
17. J. S. Kiely, Annu. Rep. Med. Chem. 29, 297 (1994).
18. P. C. Zamecnik and M. L. Stephenson, PNAS 75, 280 (1978).
19. S. Agrawal, J. Temsamani, W. Galbraith and J. Y. Tang, Clin. Pharm. 28, 7 (1995).
20. W. J. Stec, G. Zon, W. Egan and B. Stec, JACS 108, 6077 (1984).
21. W. Brill, J.-Y. Tang, Y.-X. Ma and M. H. Caruthers, JACS 111, 2321 (1989).
22. K. Bjergarde and O. Dahl, NARes 19, 5843 (1991).
23. M. E. Piotto, J. N. Granger, Y. Cho, N. Farschtschi and D. G. Gorenstein, Tetrahedron Lett. 47, 2449 (1991).
24. F. Morvan, B. Rayner, J. P. Leonetti and J.-L. Imbach, NARes 16, 833 (1988).
25. F. Debart, B. Rayner, G. Degols and J.-L. Imbach, NARes 20, 1193 (1992).
26. P. E. Nielsen, M. Egholm, R. H. Berg and O. Buchardt, Science 254, 1497 (1991).
27. K. Chacko, R. Linder, W. Saenger and P. S. Miller, NARes 11, 2801 (1983).
28. P. S. Miller, J. Yano, E. Yano, C. Carroll, K. Jayaraman and P. O. P. Ts’o, Bchem 18, 5134 (1979).
29. K. L. Agarwal and F. Riftina, NARes 6, 3009 (1979).
30. P. S. Miller, P. Bhan, C. D. Cushman, J. M. Kean and J. T. Levis, Nucleosides Nucleotides 10, 37-46 (1991).
31. C. G. Wang, L. X. Wang, X. B. Yang, T. Y. Jiang and L. H. Zhang, NARes 21, 3245 (1993).
32. P. S. Miller, P. O. P. Ts’o, R. I. Hogrefe, M. A. Reynolds and L. J. Arnold, Jr., in “Antisense Research and Applications” (S. T. Crooke and B. Lebleu, eds.), p. 189. CRC Press, Boca Raton, FL, 1993.
33. P. S. Miller, C. H. Agris, A. Murakami, M. P. Reddy, S. A. Spitz and P. O. P. Ts’o, NARes 11, 6225 (1983).
34. P. S. Miller, C. H. Agris, M. Blandin, A. Murakami, M. P. Reddy and P. O. P. Ts’o, NARes 11, 5189 (1983).
35. P. S. Miller, M. P. Reddy, A. Murakami, K. R. Blake, C. H. Agris and S. B. Lin, Bchem 25, 6268 (1986).
36. M. A. Dorman, S. A. Noble, L. J. McBride and M. H. Caruthers, Tetrahedron 49, 95 (1984).
37. A. Jager and J. Engels, Tetrahedron Lett. 25, 1437 (1984).
38. S. Agrawal and J. Goodchild, Tetrahedron Lett. 28, 3539 (1987).
39. P. S. Miller, C. D. Cushman and J. T. Levis, in “Oligonucleotides and Analogues. A Practical Approach” (F. Eckstein, ed.), p. 137. IRL Press, Oxford, 1991.
40. R. I. Hogrefe, M. M. Vaghefi, M. A. Reynolds, K. M. Young and L. J. Arnold, Jr., NARes 21, 2031 (1993).
41. A. Murakami, K. R. Blake and P. S. Miller, Bchem 24, 4041 (1985).
42. J. M. Kean, C. D. Cushman, H. Kang, T. E. Leonard and P. S. Miller, NARes 22, 4497 (1994).
43. P. S. Miller, N. Dreon, S. M. Pulford and K. B. McParland, JBC 255, 9659 (1980).
44. M. Bower, M. F. Summers, C. Powell, K. Shinozuka, J. B. Regan, G. Zon and W. D. Wilson, NARes 15, 4915 (1987).
45. Z. J. Lesnikowski, M. Jaworska and W. J. Stec, NARes 16, 11675 (1988).
46. M. Durand, J. C. Maurizot, U. Asseline, C. Barbier, N. T. Thuong and C. Helene, NARes 17, 1823 (1989).
47. Z. J. Lesnikowski, M. Jaworska and W. J. Stec, NARes 18, 2109 (1990).
48. E. V. Vyazovkina, E. V. Savchenko, S. G. Lokhov, J. W. Engels, E. Wickstrom and A. V. Lebedev, NARes 22, 2404 (1994).
49. H. Inoue, Y. Hayase, A. Imura, S. Iwai, K. Miura and E. Ohtsuka, NARes 15, 6131 (1987).
50. T. L. Trapane and P. O. P. Ts’o, Biophys. Soc., 43rd Annu. Mtg., Abstr. #W-POS514 (1993).
51. B. L. Lee, A. Murakami, K. R. Blake, S. B. Lin and P. S. Miller, Bchem 27, 3197 (1988).
52. M. A. Reynolds, T. A. Beck, R. I. Hogrefe, A. McCaffrey, L. J. Arnold, Jr. and M. M. Vaghefi, Bioconj. Chem. 3, 366 (1992).
53. P. S. Miller, in “Methods in Enzymology,” Vol. 211, p. 54. Academic Press, New York, 1992.
54. J. Thaden and P. S. Miller, Bioconj. Chem. 4, 395 (1993).
55. P. Bhan and P. S. Miller, Bioconj. Chem. 1, 82 (1990).
56. B. L. Lee, K. R. Blake, and P. S. Miller, NARes 16, 10681 (1988).
57. J. M. Kean and P. S. Miller, Bchem 33, 9178 (1994).
58. J. M. Kean and P. S. Miller, Bioconj. Chem. 4, 184 (1993).
59. J. Thaden and P. S. Miller, Bioconj. Chem. 4, 386 (1993).
60. J. M. Kean, A. Murakami, K. R. Blake, C. D. Cushman and P. S. Miller, Bchem 27, 9113 (1988).
61. J. T. Levis and P. S. Miller, Antisense Res. Dev. 4, 223 (1994).
62. P. S. Miller, K. B. McParland, K. Jayaraman and P. O. P. Ts’o, Bchem 20, 1874 (1981).
63. Y. Shoji, S. Akhtar, A. Periasamy, B. Herman and R. L. Juliano, NARes 19, 5543 (1991).
64. D. J. Chin, G. A. Green, G. Zon, F. C. Szoka, Jr. and R. M. Straubinger, New Biol. 2, 1091 (1990).
65. S. L. Loke, C. A. Stein, X. H. Zhang, K. More, M. Nakanishi, C. Subasinghe, J. S. Cohen and L. M. Neckers, PNAS 86, 3474 (1989).
66. L. A. Yakubov, E. A. Deeva, V. F. Zarytova, E. M. Ivanova, A. S. Ryte, L. V. Yurchenko and V. V. Vlassov, PNAS 86, 6454 (1989).
67. K. Jayaraman, K. B. McParland, P. S. Miller and P. O. P. Ts’o, PNAS 77, 1537 (1981).
68. T.-L. Chen, P. S. Miller, P. O. P. Ts’o and O. M. Colvin, Drug Metab. Disposition 18, 815 (1990).
69. P. S. Sarin, S. Agrawal, M. P. Civeira, J. Goodchild, T. Ikeuchi and P. C. Zamecnik, PNAS 85, 7448 (1988).
70. J. A. Zaia, J. J. Rossi, G. J. Murakawa, P. A. Spallone, D. A. Stephens, B. E. Kaplan, R. Eritja, R. B. Wallace and E. M. Cantin, J. Virol. 62, 3914 (1988).
71. J. Laurence, S. K. Sikder, J. Kulkosky, P. Miller and P. O. P. Ts’o, J. Virol. 65, 213 (1991).
72. C. C. Smith, L. Aurelian, M. P. Reddy, P. S. Miller and P. O. P. Ts’o, PNAS 83, 2787 (1986).
73. M. Kulka, C. C. Smith, L. Aurelian, R. Fishelevich, K. Meade, P. Miller and P. O. P. Ts’o, PNAS 86, 6868 (1989).
74. M. Kulka, M. Wachsman, S. Miura, R. Fishelevich, P. S. Miller, P. O. P. Ts’o and L. Aurelian, Antiviral Res. 20, 115 (1993).
75. M. Kulka, C. C. Smith, J. Levis, R. Fishelevich, J. C. R. Hunter, C. D. Cushman, P. S. Miller, P. O. P. Ts’o and L. Aurelian, Antimicrobial Agents Chemotherapy 38, 675 (1994).
76. E. H. Chang, P. S. Miller, C. Cushman, K. Devadas, K. F. Pirollo, P. O. P. Ts’o and Z. P. Yu, Bchem 30, 8283 (1991).
77. G. Felsenfeld, D. Davies and A. Rich, JACS 79, 2023 (1957).
78. H. E. Moser and P. B. Dervan, Science 238, 645 (1987).
79. I. Radhakrishnan and D. J. Patel, Structure 1, 135 (1993).
80. P. Rajagopal and J. Feigon, Bchem 28, 7859 (1989).


81. E. Plum, Y.-W. Park, S. Singleton, P. B. Dervan and K. Breslauer, PNAS 87, 9436 (1990).
82. I. Radhakrishnan, C. de los Santos and D. J. Patel, JMB 221, 1403 (1991).
83. R. W. Roberts and D. M. Crothers, PNAS 88, 9397 (1991).
84. S. F. Singleton and P. B. Dervan, JACS 114, 6957 (1992).
85. M. Rougee and C. Helene, Bchem 31, 9269 (1992).
86. R. Macaya, E. Wang, P. Schultze, V. Sklenar and J. Feigon, JMB 225, 755 (1992).
87. R. W. Roberts and D. M. Crothers, Science 258, 1463 (1992).
88. L. C. Griffin and P. B. Dervan, Science 245, 967 (1989).

89. P. A. Beal and P. B. Dervan, Science 251, 1360 (1991).
90. J. L. Mergny, J.-S. Sun, M. Rougee, T. Montenay-Garestier, F. Barcelo, J. Chomilier and C. Helene, Bchem 30, 9791 (1991).
91. K. Yoon, C. A. Hobbs, J. Koch, M. Sardaro, R. Kutny and A. L. Weis, PNAS 89, 3840 (1992).
92. P. A. Beal and P. B. Dervan, JACS 114, 4976 (1992).
93. N. T. Thuong and C. Helene, Angew. Chem. 32, 666 (1993).
94. M. Cooney, G. Czernuszewicz, E. H. Postel, S. J. Flint and M. E. Hogan, Science 241, 456 (1988).
95. E. H. Postel, S. J. Flint, D. J. Kessler and M. E. Hogan, PNAS 88, 8227 (1991).
96. F. M. Orson, D. W. Thomas, W. M. McShan, D. J. Kessler and M. E. Hogan, NARes 19, 3435 (1991).
97. W. M. McShan, R. D. Rossen, A. H. Laughter, J. Trial, D. J. Kessler, J. G. Zendegui, M. E. Hogan and F. M. Orson, JBC 267, 5712 (1992).
98. T. J. Povsic and P. B. Dervan, JACS 111, 3059 (1989).
99. P. S. Miller, P. Bhan, C. D. Cushman and T. L. Trapane, Bchem 31, 6788 (1992).
100. A. Ono, P. O. P. Ts’o and L.-S. Kan, J. Org. Chem. 57, 3225 (1992).
101. G. Xiang, W. Soussou and L. W. McLaughlin, JACS 116, 11155 (1994).
102. J. S. Koh and P. B. Dervan, JACS 114, 1470 (1992).
103. M. C. Jetter and F. W. Hobbs, Bchem 32, 3249 (1993).
104. E. C. Davison and K. Johnson, Nucleosides Nucleotides 12, 237 (1993).
105. D. M. Noll, J. L. O’Rear, C. D. Cushman and P. S. Miller, Nucleosides Nucleotides 13, 997 (1994).
106. Q. Wang, S. Tsukahara, H. Yamakawa, K. Takai and H. Takaku, FEBS Lett. 255, 11 (1994).
107. S. H. Krawczyk, J. F. Milligan, S. Wadwani, C. Moulds, B. C. Froehler and M. D. Matteucci, PNAS 89, 3761 (1992).
108. S. L. Young, S. H. Krawczyk, M. D. Matteucci and J. J. Toole, PNAS 88, 10023 (1991).
109. J.-L. Mergny, J.-S. Sun, M. Rougee, T. Montenay-Garestier, F. Barcelo, J. Chomilier and C. Helene, Bchem 30, 9791 (1991).
110. P. S. Miller and C. D. Cushman, Bchem 32, 2999 (1993).
111. I. Radhakrishnan, X. Gao, C. de los Santos, D. Live and D. J. Patel, Bchem 30, 9022 (1991).
112. E. Wang, S. Malek and J. Feigon, Bchem 31, 4838 (1992).
113. I. Radhakrishnan and D. J. Patel, JMB 241, 600 (1994).
114. H. U. Stilz and P. B. Dervan, Bchem 32, 2177 (1993).
115. L. C. Griffin, L. L. Kiessling, P. A. Beal, P. Gillespie and P. B. Dervan, JACS 114, 7976 (1992).
116. L. L. Kiessling, L. C. Griffin and P. B. Dervan, Bchem 31, 2829 (1992).
117. K. M. Koshlap, P. Gillespie, P. B. Dervan and J. Feigon, JACS 115, 7908 (1993).
118. D. A. Horne and P. B. Dervan, NARes 19, 4963 (1991).
119. C.-Y. Huang, C. D. Cushman and P. S. Miller, J. Org. Chem. 58, 5048 (1993).
120. P. S. Miller and C. D. Cushman, Bioconj. Chem. 3, 74 (1992).


121. C.-Y. Huang and P. S. Miller, JACS 115, 10456 (1993).
122. J. G. Hacia, B. J. Wold and P. B. Dervan, Bchem 33, 5367 (1994).
123. F. H. Hausheer, U. C. Singh, J. D. Saxe, O. M. Colvin and P. O. P. Ts’o, Anti-Cancer Drug Design 5, 159 (1990).
124. D. E. Callahan, T. L. Trapane, P. S. Miller, P. O. P. Ts’o and L.-S. Kan, Bchem 30, 1650 (1991).
125. T. L. Trapane, M. S. Christopherson, C. D. Robby, P. O. P. Ts’o and D. Wang, JACS 116, 8412 (1994).
126. M. A. Reynolds, L. J. Arnold, Jr., M. T. Almazan, T. A. Beck, R. I. Hogrefe, M. D. Metzler, S. R. Stoughton, B. Y. Tseng, T. L. Trapane, P. O. P. Ts’o and T. M. Woolf, PNAS 91, 12433 (1994).

Hidden Infidelities of the Translational Stop Signal¹

WARREN P. TATE,² ELIZABETH S. POOLE AND SALLY A. MANNERING

Department of Biochemistry and Center for Gene Research University of Otago Dunedin, New Zealand

I. What Is Unique about a Codon That Specifies Stop?
II. What Influences the Decoding of a Translational Stop Signal?
III. What Really Decodes the Stop Signal?
IV. What Is the Mechanism of Decoding Tr...
V. Recoding at a Stop Codon
VI. Does the Termination Signal Extend beyond the Stop Codon?
VII. Physiological Advantages of Stop Signals Decoded with Varying Efficiencies
VIII. Conclusion
References

The solving of the genetic code in the 1960s provided an enormous impetus to our understanding of protein synthesis and the coding problem. Displaying the code in a simple table revealed the profound knowledge it encompassed. This display was a master stroke for the sense codons because they were decoded by three bases on the tRNA and so rested easily within this triplet-coding framework. However, three of the potential triplet codes were not assigned to amino acids, and were eventually assigned to specify translational stop. Assigning the translational stop codons to the unfilled spaces in a table of triplets had a drawback in that it focused attention away from the possibility that the signal might be greater than the three bases specified. It soon became clear that stop codons are unlikely to be decoded by a simple triplet:triplet decoding scheme. The involvement of protein-decoding molecules meant the signal could be larger than a triplet. The wide variation of the suppression of nonsense codons introduced into different contexts as premature stop codons, for example, hinted that the actual stop signal was not precisely defined by just considering a triplet.

Twenty-five years later, the discovery of physiologically significant recoding events at translational stop signals operating outside the constraints of the genetic code, with the stop codon as part of the recoding signal, has refocused attention on the true nature of this signal and how it is decoded. Events such as suppression have become recognized as more important than simply an interesting laboratory phenomenon. Do the recoding events just override a simple triplet stop codon, or is there a hierarchy of signals of varying strengths, with bases outside the triplet codon contributing to whether the signal specifies stop or the recoding event? This has been the recent challenge for those interested in the termination mechanism of protein synthesis, and in how the translational stop codons function. Evidence is now accumulating that there is a subtle layer of cellular regulation in which translational stop signals play a part, and for which the relative strength of the signals is critical.

¹ Abbreviations: RF, prokaryotic release factor; eRF, eukaryotic release factor; GP, glutathione peroxidase; 5'-DI, type I iodothyronine 5'-deiodinase; rRNA, ribosomal RNA; mRNA, messenger RNA; tRNA, transfer RNA.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52

Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. What Is Unique about a Codon That Specifies Stop?

A. How Were UAA, UAG, and UGA Originally Deduced to Be Translational Stop Signals?

Specific signals for the termination of protein synthesis were predicted from studies of nonsense mutations (reviewed in Ref. 1). Such mutations were associated with the appearance of amino-terminal fragments of the affected gene products, indicating that protein synthesis had been prematurely terminated (2). Studies of nonsense mutations in the Escherichia coli alkaline-phosphatase gene (3) revealed two classes of mutants: the amber (N1) class (4) and the ochre (N2) class (5). The amber codon was identified as UAG (6) and the ochre codon as UAA (7). A simultaneous investigation with the bacteriophage T4 rII gene also identified UAA and UAG as translational stop codons (8). Using this second genetic system, an additional class of UGA nonsense mutants was defined (9). The determination of the complete nucleotide sequence of bacteriophage MS2 confirmed that all three nonsense codons are indeed used as natural translational stop signals (10).

B. Stop Codons Can Specify Amino Acids

It was widely assumed that the genetic code is universal, because organisms as diverse as E. coli, Xenopus laevis, and guinea pig use the same code for amino acids and for stopping protein synthesis (11). Indeed, it was proposed in the “frozen accident hypothesis” that the genetic code may be universal, because a change would be lethal or strongly selected against (12). Translational stop codons were no exception to this assumption of universality and it was reinforced as more and more data accumulated. It was a complete surprise when it became apparent that vertebrate mitochondria use a code that has some differences from the universal code. For example, UGA specifies tryptophan incorporation instead of stop (13). In fact, with the exception of plant mitochondria, all mitochondria reveal changes in their codes for translational stop (Table I) as well as some changes in the sense codons (14). The small size of these genomes and the fact they operate as isolated genetic systems provided a rationale for why such deviations might be tolerated (15).

However, exceptions to the universal genetic code go beyond such isolated systems. Mycoplasma species and Euplotes use UGA for tryptophan and cysteine, respectively. Ciliated protozoans use UAA and UAG for glutamine, and platyhelminths use UAG for tyrosine. Conversely, some mammalian mitochondria use the universal arginine codons, AGG and AGA, to specify translational stop (16) (Table I). This implied that there is nothing sacrosanct about the three universal stop codons with respect to the mechanism of their decoding by ribosomes. Clearly, other codons could substitute and the universal stop codons could specify other functions. The first infidelities of the “group of three” had come to light.

TABLE I
DEVIATIONS FROM THE “UNIVERSAL” GENETIC CODE RELATING TO TRANSLATIONAL TERMINATION

Codon(a)   Decodes as       Organism
UGA        Tryptophan       Mycoplasma, mitochondria (except plants)
UGA        Cysteine         Euplotes
UGA        Selenocysteine   Many organisms
UAG        Glutamine        Ciliated protozoans
UAG        Tyrosine         Platyhelminths
UAA        Glutamine        Ciliated protozoans
AGA        Stop             Human/bovine mitochondria
AGG        Stop             Human mitochondria
CGG        “Nonsense”       Mycoplasma

(a) Codons that signal stop in the universal code may specify an amino acid in particular organelles and organisms. Conversely, a codon specifying an amino acid in the universal code can signal stop in mitochondria. In a few instances, codons remain unassigned and therefore represent “nonsense” (16).
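The reassignments listed in Table I can be read as a small set of overrides on top of the universal stop codons. The following Python sketch illustrates that reading only; the system labels (`"mycoplasma"`, `"human_mitochondria"`, and so on) and the function name are hypothetical names introduced here, and sense codons that Table I does not mention are reported simply as `"sense"` rather than as a specific amino acid.

```python
# Universal stop codons and the deviations transcribed from Table I.
# The dictionary keys naming each genetic system are hypothetical labels
# chosen for this sketch; they do not come from the chapter.
UNIVERSAL_STOPS = {"UAA", "UAG", "UGA"}

DEVIATIONS = {
    "mycoplasma": {"UGA": "Trp", "CGG": "nonsense"},
    "euplotes": {"UGA": "Cys"},
    "ciliated_protozoa": {"UAA": "Gln", "UAG": "Gln"},
    "platyhelminths": {"UAG": "Tyr"},
    "human_mitochondria": {"UGA": "Trp", "AGA": "stop", "AGG": "stop"},
}


def decode_stop_related(codon, system="universal"):
    """Return the Table I meaning of a codon in the given genetic system.

    Codons not covered by Table I fall back to the universal code:
    "stop" for UAA/UAG/UGA and the generic label "sense" otherwise.
    """
    codon = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    overrides = DEVIATIONS.get(system, {})
    if codon in overrides:
        return overrides[codon]
    if codon in UNIVERSAL_STOPS:
        return "stop"
    return "sense"
```

For instance, `decode_stop_related("UGA")` gives `"stop"` under the universal code, whereas `decode_stop_related("UGA", "euplotes")` gives `"Cys"`. Selenocysteine incorporation at UGA is left out of the lookup because, as the text notes, it depends on the gene and not only on the organism.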

C. How Does a Codon Become the Code for Translational Stop?

An interesting theory of “codon takeover” has been developed by Lehman and Jukes (17). It suggests that all codons were originally nonsense until a tRNA evolved to recognize them. Here the term “nonsense” refers to the lack of a translational mechanism recognizing the codon, and such codons may have specified translational arrest as a default condition. An example is the codon CGG found in (A + T)-rich mycoplasma (18). This organism lacks a tRNA that recognizes CGG, and although CGG causes a stop in translation, it does so poorly. This designation of a nonsense codon is distinct from modern use of the term, where an active decoding mechanism is present to process nonsense codons as translational stop. According to the “codon takeover” model, those nonsense codons unassigned to amino acids have the potential to be captured by a mechanism recognizing them as signals for translational stop, and then further captured by mechanisms specifying them as amino acids. One might visualize an early stage in the evolution of the termination of protein synthesis as being a simple dissociation of the protein from the protoribosome when a nonsense codon appears in the appropriate site. In the further evolution of triplet codes, certain codons could have evolved as specific stop signals, but clearly they had the capacity, along with sense codons, to further change their roles (Fig. 1). The existence of alternative genetic codes provides compelling evidence for such a process. For example, UGA codes for the incorporation of tryptophan, cysteine, or selenocysteine (19), and UAA and UAG code for glutamine (Table I). The assignment of a universal stop codon to code for an amino acid in a specific organism or cell organelle might have been influenced by genomic G·C or A·T pressure, as seen with the replacement of UGG tryptophan codons with UGA, perhaps as the A·T content of the genome increased (20). In genetic systems using the universal genetic code, the use of UGA in some genes to specify insertion of selenocysteine while signaling stop in other genes may be an example of “stop codon takeover” in progress.

[Figure 1 diagram: an unassigned codon can become stop (e.g., UGA) or sense; a stop codon can become sense; a sense codon (e.g., AGA, AGG) can become stop.]

FIG. 1. Assignment of codons for sense or stop during evolution. A true “nonsense” codon previously unassigned can be taken over for sense or stop but may have been reassigned later in evolution for a different function.

The historical development of ideas and discoveries of how the genetic code functions has molded our thinking of the translational stop signal as a triplet. However, experiments investigating the effects of the surrounding nucleotide sequence (the codon context) and the suppression of stop signals by suppressor or noncognate tRNAs suggested hidden infidelities in the signal (reviewed in Ref. 21). These studies were germinal in preparing the way for a reexamination of the true nature of the signal. However, it was the discovery of signals for translational frameshifting, translational hopping, and selenocysteine incorporation in the latter half of the 1980s (reviewed in Ref. 22), where the infidelities of the translational stop signal were more spectacular and unquestionably of physiological relevance, that led to a heightened and renewed interest in translational termination. Phenomena such as readthrough of stop codons and translational frameshifting, once thought by many to be esoteric quirks of in vitro experiments and of little physiological consequence, came to be viewed as a layer of cellular regulation not previously appreciated. Understanding the mechanism of decoding a translational termination signal has become more compelling because it promises to explain why translational stop signals are decoded with high fidelity in most circumstances but with poor fidelity in others.

II. What Influences the Decoding of a Translational Stop Signal?

A. Context Effects in the mRNA

There is much evidence suggesting that the efficiency of translational stop signals can be influenced in cis by the surrounding sequence of the mRNA. What has been less clear is whether this is physiologically significant to the organism, or whether it is simply the noise above which the stop codon functions. The efficiency of suppression of nonsense codons in different contexts has received wide attention. In this case, there are two competing events, and it is hard to separate the effect of context on the termination event from the effect on the tRNA-mediated event. Depending on the stop codon context, the outcome of the competition between a suppressor tRNA and release factors (RFs) varies up to 50-fold (23-25). However, it was difficult to distinguish the effects on translational termination from those on elongation, or indeed secondary effects of transcription. For example, studies of SupE, a suppressor of UAG codons, suggested that the nucleotide immediately downstream of the codon is important, whereas a further study of UAG and UGA suppression suggested that the two following bases are important, and yet another study identified significant elements outside the immediate downstream nucleotide (reviewed in Ref. 21). These studies looked for common elements, but overlooked possible site-specific contexts that might be important, so a systematic study of changes introduced into a particular nonsense site was undertaken. The conclusions were that the nucleotides immediately 5’ and 3’ to the codon are significant, as is the wobble nucleotide of the adjacent downstream codon (26). The effects of sequence elements common to all sites (for example, the 3’ nucleotide) might be mediated through termination, whereas the effects of sequence elements specific for a particular site might be mediated through the activity of the suppressor tRNA (27).

There are some examples of suppression of translational stop signals by naturally occurring tRNAs that may be physiologically significant. It has been shown that the UGA at the end of the coat protein gene of bacteriophage Qβ is an inefficient stop signal and gives rise to a readthrough product that is also a structural component of the mature virion (28). Another example of readthrough of a UGA codon occurs in Sindbis virus genomic RNA, where suppression of the UGA allows regulated production of the putative viral RNA polymerase (29). Similarly, readthrough of UAG codons occurs. A UAG codon in the synthetase gene of the RNA bacteriophage MS2 is prone to readthrough to a new termination codon seven codons later (30). The UAG codon is also an inefficient termination signal in several plant viruses, and in this case it is believed that a normal Tyr-tRNA reads the codon through the context influence of two downstream triplets (31). Apart from these specific examples, each of the three translational stop codons (UAA, UAG, and UGA) can be misread in various contexts by natural tRNAs; UGA is misread by Trp-tRNA at a rate of 1-3% and UAG and UAA are misread at a rate as low as <0.01% by Gln-tRNA (reviewed in Ref. 32).

Studies such as these indicate that the termination signal might be longer than just the triplet defined by the genetic code. There are notable questions unanswered by the model of a triplet stop codon: Why are strong suppressor tRNAs tolerated in E. coli? Why is it impossible to set up an in vitro assay for eukaryotic termination using synthetic triplet codons (33)? Recent studies suggest that it is possible to have termination signals of different strengths, and that such signals may have significance in gene regulation (34-36). How efficiently specific signals are decoded by the ribosome can influence whether they are faithfully decoded as stop. The structure of the ribosome itself is clearly critical for the decoding process.

B. The Ribosome and the Decoding of Translational Stop Signals

A number of mutations in genes other than those for tRNA affect the ability of translational stop codons to signal termination of protein synthesis. Among these are genes for the ribosomal RNA (rRNA) and ribosomal proteins. For example, a mutation from C1054 to A1054 (originally reported as a deletion of C1054) in helix 34 of the 16-S rRNA of E. coli suppresses predominantly UGA stop codons (37). This led to the proposal of an attractive model for termination codon recognition involving C1054 in helix 34 and base-pairing between the codon and tandem UCA triplets in the rRNA (discussed in Section III,B,2). This mutant was later found also to suppress termination at the other two stop codons under certain conditions, suggesting that its effect on termination was more indirect than was initially thought (38). Indeed, a recent detailed site-directed mutagenesis study of helix 34 provides evidence of nonspecific readthrough of all three stop codons, as well as enhancement of +1 and -1 frameshifting (39). Mutations in other places in the small or large subunit rRNA also cause readthrough of translational stop codons, such as a mutation at C726 (40) and a mutation at U2555 of the large rRNA (41). A mutation in the yeast mitochondrial rRNA, equivalent to position G517 in the “530 loop” of the E. coli rRNA, causes UAA suppression (42). Mutations in several ribosomal proteins can also cause suppression of nonsense mutations, for example, mutations in S4 (the ram A mutation) (43), S5 (44), S12 (45), S17 (46), and L7/L12 (47).

Because the ribosome provides the three-dimensional platform where the code of the mRNA is interpreted, it is not surprising that if the structure of this particle is perturbed, either through its rRNA or one of its proteins, the accuracy of translation is likely to be affected. Mutations affecting translational accuracy would influence the decoding of both sense and stop codons, if the conformation of some feature of the ribosome common to both processes were altered. Indeed, the detailed studies of helix 34 of the rRNA provide a good example of this. Helix 34 is now implicated as a ribosomal structure forming an essential part of the decoding site, such that mutations in this feature can affect ribosomal events prior to termination, as well as the termination event itself (39).


C. Protein Factors Mediate the Decoding of Translational Stop Signals

Special protein factors are involved in the recognition of translational stop signals as part of the termination machinery (Table II). Their existence was initially suggested by in vitro studies indicating a factor in cell-free extracts apart from tRNA, ribosomes, or elongation factors that influenced peptide release (48). A protein factor required for release of the peptides at translational stop signals was later identified in bacterial extracts (49). A simple model assay (50) showed that bacteria contain two release-factor (RF) proteins with different codon specificity (RF-1 recognizes UAG and UAA; RF-2 recognizes UGA and UAA) (51). An additional protein factor (RF-3) was also discovered; it had no effect on decoding of the translational stop signals alone but could profoundly influence the activity of the other two factors (52, 53).

In 1970, a single RF species, eukaryotic release factor (eRF), was isolated from rabbit reticulocytes that, unlike the two E. coli factors, functions with all three codons (54). Several years later, S, an RF-3-like protein that stimulates the eukaryotic termination event, was detected during eRF purification (55).

Most of the evidence that these factors participate in the decoding of stop signals came from detailed in vitro studies. However, their physiological importance is supported by in vivo genetic evidence arising from the isolation of specific classes of mutants that allow increased readthrough at specific translational stop signals. For example, the prfA mutant (originally named uar) showed increased suppression of UAA and UAG stop codons (56); this locus was subsequently shown to be the gene for RF-1 (57). The prfB mutant (originally called supK) that caused increased suppression of UGA was shown to lie at the chromosomal locus and within the coding sequence of the gene for RF-2 (58). Hence, both biochemical and genetic studies firmly established that the translational stop codons are decoded by a protein-mediated mechanism, and that this mechanism is sensitive to changes in the structures of the protein factors.

TABLE II
RF PROTEINS INVOLVED IN DECODING OF STOP SIGNALS

RF(a)    Organism       Codon specificity                  Ref.
RF-1     Prokaryotes    UAG, UAA                           53, 154
RF-2     Prokaryotes    UGA, UAA                           53, 154
RF-3     Prokaryotes    Assists release by RF-1 and RF-2   53, 59, 59a, 154
mRF-1    Mitochondria   UAG, UAA                           150, 155
eRF-1    Eukaryotes     UAG, UGA, UAA                      55, 61, 156, 157
S        Eukaryotes     Assists release by eRF             55

(a) RF proteins from eubacteria, from eukaryotes, and from mitochondria have been characterized. One class, including prokaryotic RF-1, RF-2, mitochondrial mRF-1, and eukaryotic eRF-1, decodes the stop signals, whereas a second class, including prokaryotic RF-3 and eukaryotic S, acts as stimulatory molecules for the other RFs.

The prfC gene that encodes the poorly characterized RF-3, about which little had been published since 1970, was isolated recently using a deliberate strategy that detected mutants suppressing all three stop codons. Initial selections were for UGA suppression; secondary screens detected suppression of all three nonsense codons without suppression of missense alleles. One mutant with the required phenotype had a disrupted gene for a G protein with significant homology to EF-G. The protein expressed from the wild-type gene had all of the functional characteristics of the originally isolated RF-3 (59, 59a). It is uncertain whether prfA, prfB, and prfC are the full complement of genes encoding protein factors that influence the decoding of translational stop signals. We have some evidence that other proteins, as yet poorly characterized, influence the efficiency of the event (W. Tate and K. McCaughan, unpublished).

The genes for the corresponding eukaryotic proteins were elusive until very recently. Although the yeast mutants, Sup45 and Sup35, had many characteristics of mutations in components of the termination machinery (60), it was not until the peptide sequence for the rabbit reticulocyte eRF became available that Sup45, and homologs from other species, were implicated in a highly conserved family of eRF proteins, termed eRF-1 (61). Mutations in the yeast eRF-1 support the concept that structural alterations in the proteins have a profound effect on the accuracy of reading stop codons.

What is not clear from these experiments is whether the proteins have a direct role in decoding the translational stop signals, or whether their role is more indirect and perhaps involves an interaction with a ribosomal motif. It is critical to understand how the proteins participate in the decoding event if we are to understand the true nature of the translational stop signal. A direct RF:mRNA interaction could indicate that the signal may be larger than the triplet specified by the genetic code, because in this event the recognition determinant need not be restricted simply to the triplet codon. In contrast, an indirect effect operating through interaction of the RF with a ribosomal component would support a triplet stop signal, perhaps decoded by the rRNA. What is our current knowledge of the involvement of the rRNA, and what is the nature of the interaction of the RF with the termination complex?

D. The Structure of the Stop Signal

The primary structural moieties crucial to the identity of a stop signal have been partially analyzed for E. coli, mainly through studies in vitro. We have classified the recognition determinants into four elements (I, II, III, and IV, Fig. 2), based largely on the work of Smrt et al. (62). Element I is in position one of the stop signal, and both the N-3 proton and the C-4 carbonyl (sulfonyl is also allowed) groups are essential, whereas C-5 may be substituted (62-66). A 2'-hydroxyl on the sugar moiety is also essential, because dU and dT (along with A, G, and C) are not functional, in contrast to U or m5U. Other modified bases (m3U, hU, and isoC) are not recognized. The second recognition element (element II) has been classified as tight because there is only one possibility. The base A is allowed but U, C, G, and I are not recognized. Element II is in the second position of the signal when the decoding protein is RF-1, and the third position when the decoding protein is RF-2. Certainly, either the second or the third position must be A, because UGGN is not recognized as a stop in vivo (J. F. Curran, E. S. Poole, W. P. Tate and B. L. Gross, unpublished). In contrast to element II, element III is more relaxed, because the three purines (G, A, and I) are recognized whereas the pyrimidines (U and C) are not. This suggests that the common N-7 or N-3 positions might be important. As with element II, the position in the signal varies depending on which protein decoding molecule is involved. In the fourth position (element IV) U or G function better than A or C as a stop signal in E. coli (discussed in Section VI,B). G and U can base-pair with the common keto hydrogen bond acceptor and the imino bond donor groups in similar orientations (shown by arrows in Fig. 2). C and A also have similar orientations in these positions, and different hydrogen bond conformations are possible (67).

[Figure 2 panels (chemical structures not reproduced):
Element I: First position. Recognized: U, m5U, b5U and s4U. Not recognized: A, G, C, dT, dU, m3U, hU and isoC.
Element II: Tight position (RF-1 position 2, RF-2 position 3). Recognized: A. Not recognized: U, C, G and I.
Element III: Loose position (RF-1 position 3, RF-2 position 2). Recognized: G, A and I. Not recognized: U and C.
Element IV: Fourth position.]

FIG. 2. Structural features of the stop signal essential for its function in translational termination. Four essential elements are proposed. Element I contains structural features of the first nucleotide. Element II is a constrained recognition element found in either the second or third nucleotide position of the signal, dependent on the particular signal. Element III is a less constrained recognition element also found in either the second or third nucleotide position. Element IV is represented by structural features in the base immediately following the stop codon. The arrows indicate the possible common bonding potential of bases U and G, and bases C and A, respectively.
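The positional rules for elements I to IV can be restated as a short lookup, which makes it easy to see how they reproduce the known codon specificities of RF-1 and RF-2. The Python sketch below is an illustration under simplifying assumptions, not a model from the chapter: modified bases other than inosine (I) are ignored, element I is reduced to "the first base is U", and the function names and the labels `"stronger"` and `"weaker"` are hypothetical.

```python
# Zero-based index of the tight (element II) and loose (element III)
# positions for each release factor, following the text:
# RF-1: tight = position 2, loose = position 3
# RF-2: tight = position 3, loose = position 2
TIGHT_INDEX = {"RF-1": 1, "RF-2": 2}
LOOSE_INDEX = {"RF-1": 2, "RF-2": 1}
LOOSE_ALLOWED = {"G", "A", "I"}  # element III accepts the three purines


def recognized_by(signal, rf):
    """Apply elements I-III to the first three bases of a signal."""
    s = signal.upper()
    if len(s) < 3:
        raise ValueError("signal must contain at least three bases")
    if s[0] != "U":                 # element I (simplified to plain U)
        return False
    if s[TIGHT_INDEX[rf]] != "A":   # element II: only A is accepted
        return False
    return s[LOOSE_INDEX[rf]] in LOOSE_ALLOWED  # element III


def decoding_factors(signal):
    """List the prokaryotic factors whose rules accept this signal."""
    return [rf for rf in ("RF-1", "RF-2") if recognized_by(signal, rf)]


def fourth_base_class(signal):
    """Element IV: U or G work better than A or C in E. coli.

    Returns None when no fourth base is given or it is not U, G, A, or C.
    """
    s = signal.upper()
    if len(s) < 4:
        return None
    if s[3] in "UG":
        return "stronger"
    if s[3] in "AC":
        return "weaker"
    return None
```

Under these rules UAG is accepted only by RF-1, UGA only by RF-2, UAA by both, and UGG by neither, which matches the specificities given in Table II and the observation that UGGN is not read as stop.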

III. What Really Decodes the Stop Signal?

A. Models of Decoding

Although it had been widely expected and assumed from the early studies of the termination event that there would be a direct interaction of the stop signal with the RFs, this idea has been rightfully challenged. Alternative ideas of decoding through base pairing with elements of the rRNA have been considered (37, 68-71). As a result, three basic models for decoding the stop signal have been implied, if not directly stated: (i) that specific recognition is between the RF and the stop signal (49), (ii) that decoding is a specific base-pairing between the stop signal and the rRNA without the necessity for direct involvement of the RF (37), or (iii) that the stop signal may be base-paired to rRNA with this complex in contact with the RF, and specificity is provided by the rRNA, the RF, or both (63) (Fig. 3).

B. A Role for the rRNA

1. rRNA AND RIBOSOME FUNCTION

The rise in popularity of RNA, with the discovery that RNA can have enzyme activity and that it may have been the first macromolecule of life, rekindled the hypothesis that rRNA may be the catalytic center of the ribosome (72). Curiously, after a long period when proteins were regarded as the key elements of the ribosome and the rRNA merely a “chicken wire” framework for the correct orientation of these proteins, there was a complete turnaround with the discovery of ribozymes in the 1980s. It now seemed intuitively obvious that it must be the rRNA that was functionally active,


WARREN P. TATE ET AL.

FIG. 3. Three models of stop signal recognition. (A) The stop signal in the mRNA is recognized directly by RF. (B) The RF recognizes an altered conformation of the rRNA, induced by its interaction with the stop signal in the mRNA, rather than the signal itself. (C) The RF recognizes a double-stranded template of mRNA base-paired with rRNA, and other determinants of the rRNA.

with the proteins keeping the rRNA in its correct conformation, and not the other way around. The Shine-Dalgarno interaction has provided the only directly documented functional role for the rRNA. This interaction of a specific sequence in the small subunit rRNA with the mRNA provides an initiation signal in prokaryotes (73). RNA has been implicated in a number of the other functional interactions of the ribosome (74). A bold attempt to show that rRNA alone can catalyze peptide-bond formation found that an “almost protein-free” ribosome retains peptidyl transferase activity (75). Within this climate, it was not surprising that the generally accepted tenet of termination, that decoding is probably mediated directly by the protein factors, has been examined more rigorously. Could the rRNA be involved in this process? Ironically, over 10 years earlier it had been suggested from the sequence that the 3' terminus of the small subunit rRNA might be involved in an RNA:RNA interaction with the translational stop signal (73). The terminus of the eukaryotic 18-S rRNA was complementary to UAA and UGA, and the prokaryotic 16-S rRNA to UAA, and, with wobble base-pairing, the terminus was also complementary to UAG. This suggested that decoding of stop signals might involve RNA:RNA interactions like that of sense codons, and raised the possibility that the role of the RFs might be simply to stabilize a conformation caused by the interaction, perhaps mediated from a remote position. The concept of a Shine-Dalgarno-like interaction for termination was tested with both an antibody raised against a hapten attached to the terminal base of the 16-S rRNA, and with complementary oligonucleotides, but neither of these strategies inhibited RF-mediated termination in vitro (76). On the other hand, cleavage of the 16-S rRNA in the intact ribosome with cloacin DF13, which nicks the rRNA 49 nucleotides from the 3' terminus, did render the ribosome inactive for RF binding, dependent on the stop codon (77). This treatment also blocks a number of other functions involving sense codons and so is unlikely to be specific for termination. Moreover, as more rRNA sequences became available, it became clear that few had potential complementary sequences to the stop codons (78, 79).

2. SITE-SPECIFIC rRNA DECODING OF STOP CODONS

An intriguing model for UGA decoding, involving helix 34 of the E. coli 16-S rRNA (80), has been proposed (37) (mentioned in Section II,B). Nucleotides 1199-1204 of the rRNA are tandem UCA triplets that have the potential to base-pair with UGA translational stop codons, and it was the presence of these triplets that provoked the model suggested for the decoding of UGA. The triplets were altered by site-directed mutagenesis, and it was found that deletion of both was lethal, whereas deletion of one enhanced UGA suppression (81). Phylogenetically the UCA triplets are well conserved in rRNA, although most archaebacteria, which use UGA as a stop signal, lack them, and organisms such as Mycoplasma, which does not use UGA for termination, have retained at least one of the UCA triplets in their rRNA (78, 82). Now it appears that mutations in helix 34 have multiple effects on ribosome function, supporting the concept that helix 34 must be an essential part of the decoding center (39). Clearly, these studies show that this region of helix 34 of the small subunit rRNA is important in termination, but there have been some puzzling features. On the models current at the time of the original experiments, helix 34 was placed distant from the decoding site (80, 83). If this helix is important in decoding stop signals, the event had to occur at a site on the ribosome different from that used for decoding sense codons. The decoding site is believed to be located near the 1400 region of the 16-S rRNA (84-86). The specific model involving the UCA triplets could account for the decoding of UGA stop codons, but there was no basis for defining where UAA or UAG codons might be decoded. More recently, this paradox was resolved when it was shown that the 1054 alteration in the 16-S rRNA can also affect suppression of UAA and UAG codons, which suggested that the effect of the mutation on termination may not be directly on the decoding of just one codon (38, 39). The studies supporting a role for this region of the rRNA in termination were consistent with the mapping of the RF-2 binding site by immunoelectron microscopy to the area of the small subunit of the E. coli ribosome where helix 34 had been placed (87).

3. WHERE ON THE RIBOSOME DOES DECODING OF THE STOP SIGNALS OCCUR?

This question was approached using small designed mRNAs with a single thioU residue providing the first nucleotide of the stop signal. The thioU base functions very similarly to U with respect to base-pairing and acts as a zero-length cross-linking reagent to identify those parts of the ribosome in close contact with the codon during decoding. The strategy involved the formation of a termination complex that included ribosome, designed mRNA, and RF. The complex was cross-linked by illumination at a wavelength specific for the thioU residue and then dissected to identify the position on the ribosome where the thioU had cross-linked, or whether cross-links had occurred with the RF or tRNA in the complex (Fig. 4). The cross-linked rRNA residue was in the 1400 region (tentatively identified as

FIG. 4. Cross-linking from thioU in the stop signal of the mRNA to components of the translational termination complex. On photoactivation, the thioU residue (s4U) cross-links to a component of the termination complex in close proximity. This could include the rRNA, ribosomal proteins, RF, or the adjoining tRNA in the ribosomal P-site.


FIG. 5. Regions of the 16-S rRNA implicated in translational termination. The approximate locations of the regions of the rRNA implicated in translational termination are indicated on the background of a schematic 30-S ribosomal subunit. Note that the evidence for the anti-Shine-Dalgarno sequence being involved is now marginal.

C1407) and was obtained whether the RF was present or not (63). However, the RF suppressed the yield of the cross-link by about half. Sense codons containing thioU residues have been found to cross-link to this region (1398-1419), with the exact nucleotide thought to be either C1402 or C1407 (88). A1408 could also be protected by an A-site-bound tRNA, and this base is part of the decoding site of the small subunit rRNA (86). A diagram illustrating the regions of the rRNA implicated in translational termination is shown in Fig. 5. Biochemical studies with antibiotic probes and the partial reactions of the termination event strongly support the involvement of two regions of the rRNA, helix 34 and the 1400 region at the base of helix 44, in the same termination steps (89). Therefore, the apparent spatial separation of these regions in the ribosome was puzzling. These studies and others, in addition to the recent work of Moine and Dahlberg (39), implied that helix 34 must contact the area of the 30-S subunit where decoding occurs. There is now evidence that helix 34 is part of the decoding region and not, as previously defined, in a topographically distant region of the subunit (90). Using site-directed cross-linking from a reactive thioU


residue placed at different sites in specifically designed mRNAs, informative cross-links to different residues of the rRNA were observed. Of particular significance was that when the thioU was in the first position of the A-site (+4, the same position as the thioU in the stop codon of our mRNA:termination complex), the cross-linked position was to the 1400 region of the decoding site (identified as C1402). However, when the thioU was in the third A-site position (+6), only two nucleotides away, the cross-linked site was to helix 34 (U1052) (91). This finding influenced the production of a refined model of the 30-S subunit, in which previously unconnected topographical regions, helix 34 and the 1400 region of the decoding site at the base of helix 44, were brought together (90). The paradox of how two apparently disparate regions of the small subunit rRNA could be important for decoding of the translational stop signals now seems to be resolved. In fact, the regions are closely positioned.

C. A Role for the Release Factor

Experiments using site-directed cross-linking from the first residue of an mRNA containing UAA gave the first direct evidence that the RF may physically recognize the translational stop codon. There were two key observations that led to this conclusion. First, low yields of an RF:RNA cross-link were isolated, and second, the RF depressed the cross-link from UAA to the ribosomal components (63). With UGA-containing mRNA templates (see Fig. 4), a much higher yield of RF cross-linked to the stop codon in the A-site was obtained. Moreover, this yield increased to approximately 20% of the ribosome-bound mRNA as the RF was titrated to a 20-fold molar excess (92). These results suggest that, whatever the template for decoding, the RF is an active decoding molecule making intimate contact with the RNA when the translational stop signal enters the ribosomal A-site. These studies complement earlier evidence showing that RFs affect the decoding of stop signals on the ribosome in a codon-specific manner in vitro. For example, RF-2 mediates the release of a model peptide in response to UGA and not UAG, and RF-1 mediates the converse (51). The bacterial RFs increase the affinity of the specific stop codons for the ribosome by up to 10-fold in a codon-specific manner (54). Significantly, when labeled UAA was held in this complex with RF-2, either UGA or UAA (but not UAG) could compete it out, and when it was held in the complex with RF-1, either UAG or UAA (but not UGA) could compete with the labeled UAA (93). A weak association between bacterial RFs and tetramers containing stop codons, but not the codons themselves, had been observed using equilibrium dialysis (Ka ≈ 1.6 × 10^4). However, the association was not RF or stop-codon specific (53).
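The codon specificity of the two bacterial factors described above can be written down as a small lookup table. The following is only an illustrative sketch; the names `RF_SPECIFICITY`, `factors_for`, and `can_compete` are hypothetical, and the competition rule (a competitor displaces labeled UAA only if the factor recognizes that competitor) is a simplifying assumption drawn from the competition results quoted in the text.

```python
# Codon specificity of the bacterial release factors, as stated in the text:
# RF-1 responds to UAA and UAG, RF-2 responds to UAA and UGA.
RF_SPECIFICITY = {
    "RF-1": frozenset({"UAA", "UAG"}),
    "RF-2": frozenset({"UAA", "UGA"}),
}

STOP_CODONS = ("UAA", "UAG", "UGA")


def factors_for(codon: str) -> list[str]:
    """Return the release factors that recognize the given stop codon."""
    return sorted(rf for rf, codons in RF_SPECIFICITY.items() if codon in codons)


def can_compete(factor: str, competitor: str) -> bool:
    """True if an unlabeled competitor codon could displace labeled UAA
    held in a ribosomal complex with `factor`.

    Assumption (not a measured rule): competition only requires that the
    factor recognizes the competitor codon.
    """
    return competitor in RF_SPECIFICITY[factor]


if __name__ == "__main__":
    for codon in STOP_CODONS:
        print(codon, "is recognized by", factors_for(codon))
```

Under this rule, UGA and UAA compete in the RF-2 complex while UAG does not, and the converse holds for RF-1, which is the pattern reported for the labeled-UAA experiments.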


IV. What Is the Mechanism of Decoding Translational Stop Signals?

A. The Template for the Decoding Event

The simplest model for the template for stop signal decoding by the RF would be that the factor binds directly to the single-stranded mRNA. The rRNA and proteins would provide binding determinants for the factor so it had just the right orientation to the codon in the ribosomal A-site. However, there is a provocative alternative, namely, that decoding is actually an RNA:RNA event involving the rRNA and the single strand of the mRNA, with the RF recognizing this complex (see Fig. 3C). In this model the RF, when bound to this complex, would trigger the subsequent events of polypeptide chain termination, and in particular the hydrolysis of the completed polypeptide from the terminal tRNA. This RNA:RNA decoding scheme for stop codons is attractive if the modern ribosome evolved from an RNA protoribosome, and if a specific mechanism for termination had originated before proteins became part of the ribosome structure. There is circumstantial evidence that supports this idea. A thioU cross-link from the first position of the stop codon, when positioned in the A-site, was to C1407 in the single-stranded region linking helix 44 to helix 28 of the 16-S rRNA. The unpaired nucleotides in positions 1406-1408, U-m5C-A, have the potential to form two base-pairs with UAA and UAG and three base-pairs with UGA, and could thus form a double-stranded template that stacks on helix 44 that is recognized by the RF (63). Additionally, a third base-pair is possible with UAA and UAG through an interaction of m5C with the A in the second position of these codons. Such an m5C·A interaction occurs during the decoding of UGA as selenocysteine by a specific tRNA, where m5C is able to base-pair with A in the wobble position (94). An mRNA containing UAAU, with thioU in the first and fourth positions (s4U-A-A-s4U), cross-links to both C1407 and C1395 of the rRNA (92). This could be explained if the rRNA forms a single-stranded A helix, thereby placing C1395 near the base following the stop codon. The picture that emerges is that the RF probably recognizes a number of nucleotides of the rRNA in and around the decoding site, perhaps some in a base-paired structure with the mRNA, and perhaps some unpaired. Therefore, codon recognition by RF seems to be controlled by contacts with the termination signal in the mRNA and with adjacent decoding site rRNA (1400 region, 530 loop, and helix 34), and might involve contacts with a number of ribosomal proteins (S3, S4, S5, S7, S10, L7/L12, and L11), which have been shown, in a wide range of studies, to influence the termination event (reviewed in Ref. 95) (Fig. 6).
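The base-pair counts quoted for the 1406-1408 triplet can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the function name `count_pairs` is hypothetical, m5C is treated as an ordinary C, the G·U wobble pair is counted alongside Watson-Crick pairs, and the proposed m5C·A interaction is deliberately not counted.

```python
# Watson-Crick pairs plus the G.U wobble pair (RNA alphabet).
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def count_pairs(codon: str, rrna_triplet: str) -> int:
    """Count antiparallel base pairs between a codon and an rRNA triplet.

    Both strings are written 5'->3', so position i of the codon faces
    position (n - 1 - i) of the rRNA triplet.
    """
    if len(codon) != len(rrna_triplet):
        raise ValueError("codon and rRNA triplet must have the same length")
    return sum((c, r) in PAIRS for c, r in zip(codon, reversed(rrna_triplet)))


# Nucleotides 1406-1408 of the 16-S rRNA are U-m5C-A; m5C is treated as C
# here, which is an assumption made only for this illustration.
RRNA_1406_1408 = "UCA"

if __name__ == "__main__":
    for codon in ("UAA", "UAG", "UGA"):
        print(codon, count_pairs(codon, RRNA_1406_1408))
```

With these rules the sketch gives two pairs for UAA and UAG and three for UGA, in line with the counts given in the text. The same function applies unchanged to the UCA triplets of helix 34.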


FIG. 6. Ribosomal proteins and rRNA features in possible contact with the release factor. Ribosomal components implicated in interactions with the release factor are illustrated in the schematic diagram with their approximate positions indicated. The dotted line represents the 50-S ribosomal subunit and the solid line represents the 30-S subunit of the E. coli ribosome.

B. Coupling of Codon Recognition to Polypeptide Release

It has been proposed that the RF can span the two active centers on the ribosome from the decoding site to the peptidyltransferase center, rather like a tRNA (96) (Fig. 7). The E. coli RFs have a protease-sensitive site in what appears to be a highly exposed loop in the region of the protein structure most highly conserved between the two RFs, and also among the RFs from different prokaryotic organisms or organelles. This loop is within domain III of a proposed theoretical model for the structure of the RFs (97), and this domain is a promising candidate for interaction with the highly conserved region of the rRNA, encompassing the peptidyltransferase activity (96, 97). Proteolytic cleavage at the sensitive site in the RF loop abolished peptidyl-tRNA hydrolysis activity at the peptidyltransferase center. At the same time, proteolytic cleavage potentiated codon recognition at the decoding site. These contrasting effects could be explained by a conformational coupling mechanism between two RF domains; that is, between the domain involved in codon-dependent ribosome binding at the decoding site and the domain involved in an interaction with the rRNA at the peptidyltransferase center (96).


FIG. 7. A tRNA analog model for the organization of release factor domains. A pictorial representation of the in vitro assay for release factor with the factor bound in the ribosomal A-site. The factor spans the two active centers of the ribosome, the decoding site and the peptidyltransferase site. In this way it is analogous to the two arms of a tRNA (shown here in the ribosomal P-site), which also spans the two active centers of the ribosome. Labels in the figure: peptidyltransferase, hydrolysis domain, codon/ribosome binding domain, decoding site.

C. Can the Termination Mechanism Explain Inefficiencies in Decoding Stop Signals?

Current understanding is that decoding of stop signals involves a careful positioning of the RF with respect to the stop codon in the ribosomal A-site. That this involves a number of interactions with the rRNA and ribosomal proteins seems highly likely. With this scenario, any feature of the mRNA that distorts the normal position of the stop codon in the A-site, or allows it to form interactions with the rRNA, could perturb the delicate balance of the termination event. If such a balance were disturbed, then it is not too difficult to imagine that the specific decoding mechanism, mediated through the RFs, might compete less well with other potential events. Such competing events include noncognate decoding by tRNAs, or recoding mediated by specific features of the mRNA that have evolved for a physiologically important alternative translational event at the site of the stop codon.
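The idea of a balance between termination and competing events can be pictured as a simple kinetic partition. The sketch below is an assumption-laden illustration, not a model taken from the studies cited: it supposes that termination proceeds at a rate proportional to RF concentration, that the competing event (readthrough, frameshifting, or another recoding event) has a fixed rate, and that both are first-order. The function name and the rate constants are hypothetical and in arbitrary units.

```python
def termination_fraction(rf_conc: float, k_term: float, k_alt: float) -> float:
    """Fraction of ribosomes that terminate at a stop codon.

    Termination (rate k_term * rf_conc) competes with a single alternative
    event (rate k_alt). All quantities are in arbitrary units and are
    assumptions of this sketch.
    """
    if rf_conc < 0 or k_term < 0 or k_alt < 0:
        raise ValueError("concentration and rate constants must be non-negative")
    rate_term = k_term * rf_conc
    total = rate_term + k_alt
    if total == 0:
        raise ValueError("at least one of the competing rates must be positive")
    return rate_term / total


if __name__ == "__main__":
    # Raising the RF concentration shifts the balance toward termination;
    # raising k_alt (for example, a stimulatory mRNA element) shifts it away.
    for rf in (0.5, 1.0, 3.0):
        print(rf, round(termination_fraction(rf, k_term=1.0, k_alt=1.0), 3))
```

In this picture, anything that slows RF-mediated decoding or speeds the alternative event lowers the fraction of ribosomes that stop, without either pathway needing to be abolished.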

V. Recoding at a Stop Codon

A. Frameshifting

In 1985, an in-phase UGA stop codon was discovered at codon position 26 in the gene for E. coli RF-2. The full-length protein is obtained through a +1 frameshift around that site (98) (Fig. 8). At the time it seemed likely that this was a unique example carefully tailored as an autoregulation circuit for


FIG. 8. Recoding sites in mRNAs of genes whose expression depends on unusual translational events. (A) The prokaryotic RF-2 mRNA and (B) the eukaryotic ornithine-decarboxylase antizyme mRNA promote +1 frameshifting. (C) The Rous sarcoma virus RNA promotes -1 frameshifting on eukaryotic ribosomes. (D) The UGA codon in prokaryotic formate-dehydrogenase mRNA is used as a site for selenocysteine incorporation. In addition to the stop signal, stimulators for the recoding events are indicated by the underline and overline in A, and by the secondary structures in B, C, and D.

fine-tuning control of RF-2 cellular concentration. RF-2 recognizes UGA specifically, and so could feed back to decode the internal UGA in its own mRNA, hence controlling whether an active molecule is synthesized or whether premature termination occurs. The implication of this is that there is competition at the stop codon site between termination of protein synthesis and frameshifting, which would allow translation to continue with successful synthesis of the RF-2 molecule. Measurement of the relative efficiencies of the stop and frameshift events both in vivo and in vitro revealed a surprisingly high failure rate of the stop codon. About three out of ten ribosomes frameshifted instead of terminating synthesis (99-101). However, the stop-signal efficiency increased to 100% by elevating the concentration of RF-2 in vitro (although still maintaining it within the ranges of cellular concentration of the factor). Conversely, a defective RF-2 with an altered amino-terminus increased in vivo frameshifting to 100% at the site (101). This strongly suggests that the regulation circuit can operate in vivo, with the potential to block RF-2 synthesis completely as


cellular concentration of the factor rises, and to release the control as cellular concentration falls.

A gene where there is an autoregulatory circuit involving a frameshift at a stop codon has also been discovered in mammals. Ornithine decarboxylase antizyme, an enzyme responsible for destabilizing the decarboxylase for subsequent rapid degradation, requires a +1 frameshift at the codon preceding the stop codon in the sequence UCC-UGA-U (102, 103) (Fig. 8). The stimulators for this frameshift event are the stop codon itself, a downstream pseudoknot, and an upstream sequence. The reason for the importance of the upstream primary sequence is not yet understood. Polyamines, the products of the ornithine decarboxylase-catalyzed reaction, stimulate the synthesis of antizyme, which in turn marks the decarboxylase for degradation, limiting the further synthesis of polyamines.

Why does the stop codon perform so poorly in these circumstances? There are three possibilities. First, there are stimulatory elements for frameshifting that simply override a stop codon that would normally signal stop with 100% efficiency. Second, the codon itself is part of a signal that is particularly poor and is linked with a slippery motif. Third, there is a combination of these two possibilities. Evidence to date suggests that a combination of a poor stop signal and positive stimulatory elements for frameshifting is important. For example, stimulatory elements are present in the E. coli RF-2 gene frameshift site. A Shine-Dalgarno element in the mRNA six bases upstream from the stop codon interacts with the rRNA. In addition, a leucine codon (CUU) immediately 5' to the UGA stop contributes to a run of Us that, together with the Shine-Dalgarno element, are critical determinants for +1 frameshifting (99, 104). However, it should be emphasized that the stop codon here is also a stimulator for frameshifting; without it, the efficiency falls significantly. The stop codon may induce a decoding pause at the site, with subsequent frameshifting efficiency relating to the nature of the stop signal and how quickly it is decoded.

The synthesis of E. coli RF-2 and ornithine decarboxylase antizyme are examples of a +1 frameshift event. However, many retroviruses use a -1 frameshifting event to regulate the balanced production of proteins (Gag and Gag-Pol) that are needed in quite different amounts. The primary sequence signal here is D-DDW-WWM (where D is A, U, or G; W is A or U; and M is C or A) with a stimulator, either a stem-loop or a pseudoknot, as a downstream secondary structure (22). An interesting feature is that some retroviral sites contain stop codons as the next codon after the frameshifting or slippery sequence signal. For example, Rous sarcoma virus has the sequence A-AAU-UUA-UAG (Fig. 8). The generally accepted mechanism for the frameshifting event is a -1 simultaneous slippage at the AAU-UUA codons


in the ribosomal P- and A-sites, respectively (105). In this mechanism, the stop codon would be outside the ribosomal sites when slippage occurs and should have no more effect than a simple context influence. Why then is it present at a significant number of the sites? A stop codon immediately downstream of the slippery sequence could prevent frameshifting at sequences downstream of the primary site when frameshifting fails to occur at this position. Translation should then be continuing in the zero frame. Without a stop codon, decoding through the homopolymeric slippery sequence might significantly enhance slippage above the natural low level of frameshifting downstream from the primary site and result in the synthesis of undesirable protein products. Indeed, a stop codon two codons downstream from the shift site in HIV-1 does reduce the background signal (106). Alternatively, the stop codon may influence the frameshifting event directly. If the simultaneous slippage of the two tRNAs is from the ribosomal E/P-site rather than from the P/A-site, the stop codon would be in the ribosomal A-site and have the potential to influence the frameshift event (Fig. 9). Simultaneous P/A-site slippage should be independent of changes

FIG. 9. Two models of simultaneous slippage for -1 frameshifting. In model A (105), the codon immediately downstream from the heptanucleotide motif is not associated with the ribosome and therefore should not directly influence frameshifting. In model B (107), the codon downstream from the motif can influence frameshifting because it is in the ribosomal A-site when slippage occurs. Experimental evidence supports the proposal that P- and E-site slippage, as described by model B, can occur on prokaryotic ribosomes. Panel titles: A, Simultaneous Slippage; B, P-E Slippage Post-Translocation.


in the concentrations of the RF because it represents “normal termination” at a presumably strong stop signal. However, simultaneous E/P-site slippage is like the RF-2 +1 frameshift event described above, where there would be competition between slippage and decoding of the stop codon in the zero frame, and frameshifting should be influenced by modulating levels of the RF, just as it is in the RF-2 frameshift window. The HIV-1 Gag-Pol frameshift site was used to test these ideas (107). Unlike some other retroviral sites, this site does not have a stop codon following the slippery heptanucleotide sequence but rather a GGG glycine codon. This codon was changed to each of the three stop codons and, because it was possible to manipulate RFs in E. coli and not in eukaryotic systems to date, the experiments were carried out using prokaryotic ribosomes. Fortunately, the -1 frameshifting event occurs with this heptanucleotide motif on either class of ribosome. The presence of any of the three stop codons decreased frameshifting at the site to 10-20% of the value obtained with the natural sequence, and this could be further depressed by overexpressing RFs from exogenous plasmids. In contrast, a defective RF increased the frameshifting efficiency at the site (107). The implication of these findings is that the frameshifting efficiency can be influenced by an RF-mediated mechanism, and this has led to the proposal that simultaneous slippage occurs from the E- and P-sites with the next codon in the A-site influencing the event, rather than from the P- and A-sites. This provides an explanation as to why stop codons are often found at the -1 frameshifting motifs, although the question is still open as to whether these recoding events have evolved around a natural stop codon, or whether the stop codon has been a later addition to the motif as a means of fine-tuning the regulation of the two events.
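The heptanucleotide signal and the codon that follows it lend themselves to a simple sequence scan. The sketch below takes the degenerate pattern exactly as written in this section (D-DDW-WWM, with D = A, U, or G; W = A or U; M = C or A) and reports whether the next codon is a stop. The function name `find_slippery_sites` is hypothetical, the pattern is permissive and says nothing about frameshifting efficiency, and downstream stimulators such as pseudoknots are not considered.

```python
import re

# Degenerate heptanucleotide D-DDW-WWM as given in the text.
# The lookahead lets overlapping windows be reported.
SLIPPERY = re.compile(r"(?=([AUG]{3}[AU]{3}[CA]))")
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_slippery_sites(mrna: str):
    """Yield (start, heptamer, next_codon, next_is_stop) for each match.

    Windows too close to the 3' end to have a complete downstream codon
    are skipped.
    """
    mrna = mrna.upper().replace("T", "U")
    for match in SLIPPERY.finditer(mrna):
        start = match.start(1)
        heptamer = match.group(1)
        next_codon = mrna[start + 7:start + 10]
        if len(next_codon) == 3:
            yield start, heptamer, next_codon, next_codon in STOP_CODONS


if __name__ == "__main__":
    # Rous sarcoma virus site quoted in the text: A-AAU-UUA-UAG.
    for hit in find_slippery_sites("AAAUUUAUAG"):
        print(hit)
```

Applied to the Rous sarcoma virus sequence, the scan flags the heptamer and marks the following UAG as a stop codon; a site like the HIV-1 one, where the next codon is GGG, would be reported with the stop flag unset.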

B. Hopping over the Stop Codon

Slippage or frameshifting can occur over a considerable distance. This has been called translational hopping, and the now classic example is that found with the phage T4 topoisomerase subunit gene 60. Fifty nucleotides of mRNA are avoided during translation of this protein (108). In this case, there is an identical codon at the take-off and landing sites. A stop codon at the 5' junction is avoided by the hop, and a number of other determinants are also important, such as an interregional hairpin. A ribosomal mutation in L9 substitutes in part for the function of the hairpin, enhancing hopping either in its absence or when the hairpin is altered (109). In this example the stop codon again is part of a series of stimulators, although with the carA gene of Pseudomonas aeruginosa, a hop occurs during mRNA translation without a stop codon being present at the 5' take-off junction (110).
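Frameshifting and hopping can both be pictured as a jump in the reading position after a particular codon has been decoded. The following sketch is only a bookkeeping aid under that assumption; the function name `read_codons` and the `jumps` argument are hypothetical, and the sequences used are the short fragments quoted in this section rather than full mRNAs.

```python
from typing import Optional


def read_codons(mrna: str, jumps: Optional[dict[int, int]] = None) -> list[str]:
    """Read triplets 5'->3', applying a positional jump after selected codons.

    `jumps` maps the 0-based index of a decoded codon to the number of
    nucleotides skipped (+1), re-read (-1), or bypassed (for example +50
    for a hop) before the next codon is read.
    """
    jumps = jumps or {}
    codons: list[str] = []
    pos = 0
    while 0 <= pos and pos + 3 <= len(mrna):
        codons.append(mrna[pos:pos + 3])
        pos += 3 + jumps.get(len(codons) - 1, 0)
    return codons


if __name__ == "__main__":
    antizyme = "UCCUGAU"    # UCC-UGA-U from the text
    rsv = "AAUUUAUAG"       # AAU-UUA-UAG, the zero-frame codons at the RSV site
    print(read_codons(antizyme))           # zero frame meets the UGA stop
    print(read_codons(antizyme, {0: 1}))   # +1 shift after UCC
    print(read_codons(rsv))                # zero frame meets the UAG stop
    print(read_codons(rsv, {1: -1}))       # -1 shift after UUA
```

In both fragments the shifted reading no longer presents the stop codon as a triplet: the +1 shift after UCC reads GAU, and the -1 shift after UUA reads AUA. A hop corresponds to the same bookkeeping with a much larger jump.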


C. Stop Codon Takeover: Selenocysteine Incorporation into Proteins

The termination codon UGA has been known since 1979 to specify tryptophan, and not stop, in certain genetic systems. It is now known to also act as a codon specifying selenocysteine, the twenty-first amino acid, in genetic systems using the universal genetic code where UGA also specifies stop. This is a unique situation where this one codon has two functions. A small number of proteins from a wide range of organisms incorporate selenocysteine into the growing polypeptide chain in response to an in-frame UGA. Examples of these proteins are formate dehydrogenase in bacteria, and glutathione peroxidase and type-I iodothyronine 5'-deiodinase (5'-DI) in mammals. Four genes are essential for selenoprotein synthesis in prokaryotes (111). The product of one of these genes, SelC, is a unique tRNASer that cannot be recognized by elongation factor Tu, but is recognized by a specific elongation factor that is the product of the selB gene. A sophisticated mechanism has evolved for the incorporation of selenocysteine at internal UGA codons. Immediately following the UGA codon in E. coli formate dehydrogenase is a stem-loop structure that binds the SelB elongation factor carrying the specific Sec-tRNA (see Fig. 8). This proximity allows the amino acid to be delivered for incorporation into the growing polypeptide when the UGA codon enters the ribosomal A-site. It is not known yet whether this mechanism simply overrides the termination event by excluding the RF from binding in a termination complex with the UGA, or whether there is active competition between the two processes for use of the codon.

Eukaryotic selenoprotein synthesis has some similarities to the prokaryotic pathway. There is a unique tRNA, and the codon used to specify selenocysteine is UGA. However, a major difference is that there is no stem-loop immediately downstream of the UGA to provide a binding site for a specific elongation factor. Instead, essential stem-loop structures are present in the 3' untranslated region hundreds of nucleotides away from the UGA selenocysteine incorporation site (112-114) (Fig. 10). The exact structural motifs in the 3' untranslated region necessary for successful translation have not been identified, but there are some structural similarities between these regions of the different selenoprotein mRNAs. An intriguing example is selenoprotein-P, which contains 10 internal UGA codons but only two secondary structural elements in the 3' untranslated region that apparently serve all of the selenocysteine incorporation sites (115).
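Because UGA is ambiguous, a first step in examining a candidate selenoprotein mRNA is simply to locate the in-frame UGA codons that lie upstream of the final stop. The sketch below does only that; the function name `in_frame_uga_positions` and the example sequence are hypothetical, and the sketch makes no attempt to decide whether a given UGA is read as selenocysteine, which depends on the stem-loop elements discussed above.

```python
def in_frame_uga_positions(orf: str) -> list[int]:
    """Return 0-based codon indices of in-frame UGA codons, excluding the
    final codon of the open reading frame.

    The input is expected to start at the first codon and to have a length
    that is a multiple of three.
    """
    orf = orf.upper().replace("T", "U")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon == "UGA"]


if __name__ == "__main__":
    # Hypothetical toy ORF: AUG UGA CCC UGA GGG UAA
    print(in_frame_uga_positions("AUGUGACCCUGAGGGUAA"))
    # An out-of-frame UGA (inside AUGAAA) is not reported.
    print(in_frame_uga_positions("AUGAAAUAA"))
```

For an mRNA such as selenoprotein-P, a scan of this kind would list each internal UGA as a candidate incorporation site, leaving the question of how the 3' elements serve them to experiment.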

D. Readthrough of Stop Codons: Failure to Stop

It has been suggested that translational termination might be thought of as a programmed recoding event (116). There is competition between two


FIG. 10. Secondary structural elements in the 3' untranslated region of mammalian selenoprotein mRNAs. Elements in the 3' untranslated regions of mammalian selenoprotein mRNAs (SECIS elements), which are essential for the incorporation of selenium into the proteins, are shown. Selenoprotein-P has two elements (I and II) and glutathione peroxidase and 5'-deiodinase have a single element (I).

processes: normal decoding by near-cognate tRNAs (or cognate tRNAs where suppressors are present), and the decoding of stop codons by RFs. Occasionally, a mutation in the ribosome may change the balance between these competing events to favor one of them. Examples of such mutations favoring readthrough are the mutations in rRNA described in Section II,B. From this concept it is possible to imagine how some genes might have evolved a mechanism of controlled readthrough of stop codons for the specific purpose of producing two proteins from the same piece of nucleic acid. There are increasing numbers of examples of this mechanism of dual coding. One early study of bacteriophage Qβ RNA described readthrough of a UGA stop codon where tryptophan was inserted at nearly 3% efficiency (28). This low-efficiency readthrough produced a much longer product that is essential for infectivity (117). A similar situation exists in tobacco mosaic virus, where 10% readthrough of a UAG codon results in a protein also essential for


infectivity (118). Several other unrelated plant viruses have been found that use this dual-coding UAG-readthrough mechanism. They fall into two groups, each with a different conserved motif, CAA-UAG-CAR-YYA (31) or AAA-UAG-G (119). Conservation of these elements suggests that they are significant for the maintenance of readthrough efficiency. Some members of the alphavirus family, such as the Sindbis virus, use controlled readthrough of a stop codon to synthesize a replicase subunit. However, in contrast, other members of this family use readthrough of a sense codon and not a stop codon to translate this protein (120, 121). Several retroviruses use readthrough, as opposed to frameshifting, to produce the Gag-Pol fusion protein. For example, the murine leukemia virus reads through a UAG with 10% efficiency in vivo, with a major stimulator being a downstream pseudoknot. All three termination codons are inefficiently read as stop at the site (122). This seems to represent the classic example of where the pseudoknot has perturbed the overwhelming advantage that the RF normally has for decoding a stop codon. Decoding by noncognate tRNAs, although still the minority event, is more favored.
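The two conserved plant-virus contexts are written with degenerate symbols (R for purine, Y for pyrimidine), so matching them against a sequence requires expanding those symbols. The sketch below shows one way to do this with regular expressions; the names `motif_to_regex` and `classify_context` are hypothetical, only the degenerate symbols needed here are included, and the example sequences are illustrative strings rather than verified viral sequences.

```python
import re

# Minimal table of nucleotide symbols used by the two motifs in the text.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",   # purine
    "Y": "[CU]",   # pyrimidine
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif written with degenerate symbols (hyphens ignored)."""
    return re.compile("".join(SYMBOLS[ch] for ch in motif.replace("-", "")))


READTHROUGH_MOTIFS = {
    name: motif_to_regex(name)
    for name in ("CAA-UAG-CAR-YYA", "AAA-UAG-G")
}


def classify_context(mrna: str) -> list[str]:
    """Return the names of the readthrough motifs found in the sequence."""
    mrna = mrna.upper().replace("T", "U")
    return [name for name, rx in READTHROUGH_MOTIFS.items() if rx.search(mrna)]


if __name__ == "__main__":
    print(classify_context("GGCAAUAGCAAUUACC"))  # illustrative sequence
    print(classify_context("CCAAAUAGGGCC"))      # illustrative sequence
```

A match identifies only the primary-sequence context; it does not predict readthrough efficiency, which the text attributes to the conserved elements without quantifying their contribution.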

E. Stop Codons: Faithful Codons Most of the Time?

For many years, we have thought of stop codons and sense codons as faithfully participating in a major translational event with a high degree of efficiency. The picture that has now emerged is that, although in most cases the stop codon is faithfully decoded with very high efficiency, it is subject to a multitude of “temptations” to which the sense codon is not. There are hidden infidelities in the function of the stop codon that are not at all obvious from simple examination of the primary sequence of the mRNA.

VI. Does the Termination Signal Extend beyond the Stop Codon?

A. Natural Stop Codons

Because healthy E. coli mutants containing high-efficiency suppressors of UAG and UGA could be isolated, it was thought that UAG and UGA might be rare termination signals. Over 2500 E. coli genes available in the database have now been examined; these represent about half of the likely genes in this genome. UGA occurs at the end of about 25% of the E. coli genes analyzed, and UAG at the end of about 5% (65). Furthermore, the double stop codons (UAGUGA), found in some bacteriophage genes, that would protect against suppressors are uncommon in bacterial chromosomal genes


(65). A large proportion of wild-type coliform populations also contain UAG suppressor tRNAs (123), as do many commonly used laboratory strains of E. coli, for example, HB101 and JM105 (124). These mutations are quite stable, and although they can be reverted with a single point mutation, this does not usually confer a selective advantage. Therefore, it is puzzling why these suppressors are tolerated. In contrast to this situation, UAA suppression is generally very inefficient and cells carrying these suppressors grow slowly, suggesting that suppressor tRNAs are indeed interfering deleteriously with the termination of protein synthesis at these codons. There are two RFs to compete with the suppressor at the UAA codon, and this may contribute to the low levels of suppression observed (125). Studies in eukaryotic systems suggest that UAA is also particularly resistant to suppression. UAA suppressor mutants have been difficult to isolate, perhaps due to toxicity, whereas many UAG and UGA suppressors have been described (94). However, the resistance of natural stop codons to suppression perhaps indicates that an additional sequence element, or the general codon context, protects these codons.

B. An Analysis of the Contexts of the Stop Codon

The discovery of the subtle ways in which stop codons can function in cells, and of their resistance to suppression, warranted a further examination of what actually constitutes a termination signal. Is it just the termination codon as defined by the genetic code, or does the codon represent the core of the signal with important elements extending beyond? If the signal for translational termination were larger than the triplet codon, then statistical evidence might be found in the sequence surrounding the codon within genes, because additional features of the signal might be constrained during evolution. Indeed, statistical analyses of the sequences around stop codons have shown localized nonrandomness (126, 127). In particular, there is a significant bias in the nucleotide immediately following the stop codon (Fig. 11). For each organism where there were sufficient gene sequences in the databases for a statistical analysis, the same pattern emerged. The pattern was particularly marked in E. coli, Bacillus subtilis, Saccharomyces cerevisiae, and Drosophila melanogaster, and also obvious in genes from a group of mammals. However, it was not the same nucleotide that contributed to the fourth base bias in each organism. Although there was something significant about the fourth base in general terms, the specificity varied from one organism to another. An analysis of the sequences of the 2500 E. coli genes available in the database reveals a strong bias in the frequency of occurrence of particular four-base signals. Of the twelve possible four-base signals (UAAN, UAGN, and UGAN), UAAU and UAAG were much more frequent as stop signals

FIG. 11. Statistical analysis of the context of E. coli and mammalian termination codons. The sequences around the termination codons of 2513 E. coli genes and 5208 mammalian genes were analyzed. The nonrandomness (χ²) observed at each position was determined (36, 126). [Panel A, E. coli; panel B, mammals. Horizontal axis: relative position, from -20 to +20.]

than expected from their occurrence in noncoding regions, whereas UAAC occurs at the expected frequency, and apparently the other nine signals have been selected against. It should be noted that the occurrence of these tetranucleotides in the noncoding regions of genes is also nonrandom, reflecting nonrandom distribution of di-, tri-, and tetranucleotides in the whole genome. Highly expressed genes in E. coli have a strong bias for sense codons, and it is possible to select this subset of genes from the main set for analysis. This set tends to be genes that encode proteins involved in information flow in the cell, such as the transcriptional and translational components. The rationale for considering highly expressed genes was that the codon bias in some of the genes indicates an evolutionary pressure for the optimization of their translational elongation. Therefore, they might be expected to use optimal termination signals so that the translational rate is not compromised. In this case the results are quite dramatic; most of the subset (~80%) use UAAU and UAAG, with the hierarchy of usage in this subset being UAAU > UAAG > UGAU > UAAA > UAAC. The other six signals are used either rarely or not at all (34). Several studies of eukaryotic stop codon contexts suggest that purines are the most common 3' nucleotide following the stop codon (127-130). The sequences around the stop codons of over 5200 mammalian genes have now been compiled as a database representing six species: human, mouse, rat, cow, pig, and rabbit. In addition to the obvious bias in the fourth base, there was also nonrandomness in the fifth base, and indeed in the first eight


positions following the stop codon as well as in the three positions preceding the codon (36). This nonrandomness in the fifth base and beyond was not seen in the analysis of the E. coli genes. In this case, of the bases following the stop codon, the fourth base alone showed striking bias, although there was also some nonrandomness in the preceding codons (126). The occurrence of the four-base signals in the mammalian genes reflects the frequency of these sequences in the noncoding regions. However, G in the fourth position is more abundant, and U is less abundant, than expected (P < 0.001). The frequencies of the four-base signals in a subset of genes such as globins, immunoglobulins, histones, actins, and albumins have been examined and found to be quite different from the complete dataset. These are putative highly expressed genes, but the definition of such a set of highly expressed genes in the mammalian genome is more arbitrary than that in E. coli. Despite this limitation, certain signals (UAAG and UAGG) are overrepresented, whereas other signals (UGAC, UGAU, UAGC, and UAGU) are not used at all. Individual analyses with each particular mammalian species gave essentially the same information as that derived from combining the species (36).
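The per-position nonrandomness described in this section can be sketched in a few lines. This is a minimal illustration only, assuming a Pearson chi-square statistic computed against uniform expected base frequencies; the cited analyses (36, 126) may well have used other expected values, such as base frequencies from noncoding regions. The function name `position_chi_square` and the input format are hypothetical.

```python
from collections import Counter

BASES = "ACGU"


def position_chi_square(contexts, expected=None):
    """Return one chi-square value per aligned position.

    contexts: equal-length RNA strings, all aligned on the stop codon.
    expected: optional dict of background base frequencies summing to 1.
              Uniform frequencies (0.25 each) are ASSUMED when omitted.
    """
    if not contexts:
        raise ValueError("no sequences supplied")
    length = len(contexts[0])
    if any(len(seq) != length for seq in contexts):
        raise ValueError("sequences must be aligned to equal length")
    if expected is None:
        expected = {base: 0.25 for base in BASES}

    n = len(contexts)
    scores = []
    for i in range(length):
        # Count how often each base occurs at this aligned position.
        counts = Counter(seq[i] for seq in contexts)
        chi2 = sum(
            (counts.get(base, 0) - n * expected[base]) ** 2 / (n * expected[base])
            for base in BASES
        )
        scores.append(chi2)
    return scores
```

Applied to windows spanning, say, positions -20 to +20 around each stop codon, a high score at +4 relative to its neighbors would correspond to the fourth-base bias discussed above. Passing measured background frequencies as `expected` would be closer in spirit to a comparison against noncoding regions.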

C. Does the Fourth Base Affect Translational Termination?

These analyses suggested that a particular subset of four-base termination signals might be used in special circumstances at the end of highly expressed genes. However, different subsets are favored in mammals and E. coli. What do these context biases mean? The critical question is whether there is a hierarchy of termination signals including the fourth base, and perhaps the fifth and beyond, that influences how the stop codon functions in a cell. Statistical analyses of mammalian genes show a biased context for initiation codons (131) and this context profoundly influences the initiation step (132). In contrast, codon biases for the other sense codons are determined by the regional (G + C)-content of the genome and probably do not have functional significance (133). It has been tested experimentally whether the termination codon context bias had any functional significance for polypeptide chain termination. For the prokaryotic system, the strengths of each of the twelve possible four-base stop signals were tested in an in vivo assay, using the direct competition of termination with frameshifting at the RF-2 frameshifting sequence motif. For frameshifting to occur, the elements 5' to the stop codon promoting the event must compete with translational termination at the stop codon. A change in either the concentration of RF-2, which recognizes the natural UGA(C), or the specific activity of the factor can change the efficiency of termination over a 0-100% range (101). With the natural context, the two


events compete almost equally, with termination efficiencies of 50-70% (100, 101). If the stop signal is altered in this system, the degree of frameshifting should be influenced by the efficiency of the sequence to signal stop. An immunological assay has been used to measure the ratio of the termination and frameshift products, synthesized in vivo from constructs varying only in their termination codon or fourth base. The termination efficiency varied widely within the UAAN and the UGAN sets. The order of efficiency correlated with how frequently the signals were used in natural contexts (derived from the statistical analysis) and followed the hierarchy of fourth base U > G > A > C (Fig. 12A). However, from the statistical analysis alone the other set of signals, UAGN, might be predicted to be poor, but this was not the case, because UAGN signals were as efficient as the other two sets (35). TAGN occurs with very low frequency throughout the whole genome, and the reason for this could be that UAGN stop signals might be mistakenly altered to a sense codon by the mechanism for vsp- or vsr-initiated DNA mismatch repair that operates at CTAG or TAGG sequences (134, 135). The vsr gene product is a DNA mismatch endonuclease.

FIG. 12. Influence of the fourth base in tetranucleotide stop signals on the efficiency of translational termination. In E. coli the relative strength of each of the four UGAN signals was determined as a competitor of +1 frameshifting at the RF-2 frameshift site (35). In the mammalian system, the strength of each of the UGAN signals was determined in 5'-deiodinase mRNA as a competitor to selenocysteine incorporation at the site. The rat mRNA was transfected into human embryonic kidney cells for this experiment (36). [Horizontal axis in both panels: fourth base of the UGAN signal, in the order U, G, A, C.]


The rates of RF selection were calculated at each of the twelve termination signals (adapted from Ref. 136). The RF selection rate varied over a 50-fold range, with that at UAAU the fastest and at UGAC the slowest (35). The influence of the fourth base has also been tested with mammalian termination signals in two ways. First, the recoding event at the internal UGA codon of the 5’-DI mRNA was used so that the strength of the termination signal could be matched against incorporation of selenocysteine at the codon in vivo. The base following this UGA was changed from the natural C to each of the other three bases. It was found that the ratio of termination to selenocysteine incorporation at this codon varied from 1:3 (C or U in the fourth position) to 3:1 (A or G in the fourth position) (see Fig. 12B). These UGAN sequences had the same order of termination efficiency (A ≈ G > C ≈ U) as was found in an in vitro termination assay that measured the release of a model peptide from the ribosome by the eRF. In addition, the fourth base in each of the UAAN and UAGN series also affected the efficiency of the release assay in the same order, varying over a 70-fold range for the UAAN series, and an 8-fold range for the UGAN and UAGN series (36). Again, as in the prokaryotic case, there was a strong correlation between the “strength” of the signals and how frequently they occur at natural termination sites, but with eukaryotes it was the purines, A and G, rather than the pyrimidine, U, that boosted the efficiency of the signal. UGAU was determined to be a weak signal from in vitro termination studies (36), and a fourth base U is found at the frameshift site in the mammalian ornithine decarboxylase antizyme mRNA. Alteration of this fourth base showed that purines in the fourth position make a stronger competing stop signal for the undefined frameshifting mechanism (102). The data from this example provide modest support for this model of termination signal strength.
However, it should be realized that there are several stimulating elements for frameshifting at this site, and although the stop signal is essential, its relative strength may be a less dominant influence than that of the stem-loop structure that follows. Suppression studies indicate that the measurement of termination rate in vitro, and its support from the statistical analysis, predicts very well the outcome of competition between termination and other events at some sites, but not at others. A natural stop signal, UGAC (the signal also found at the selenocysteine incorporation site of the 5’-DI mRNA), permits 10% readthrough in reticulocyte lysates when there are no apparent competing events (29). Suppression of UAGN in human cells by a mutated tRNA showed UAGC is the signal most readily suppressed. However, UAGU is poorly suppressed (137), which is not what would be predicted from the statistical analysis and the in vitro analysis of stop-signal strength. Translational termination in yeast has been reviewed recently (60). Termination is most efficient at internal UAA stop codons followed by a purine, when suppression by noncognate tRNAs is excluded (138, 139). These observations correlate with the strong use of UAAR (R = purine) signals in highly expressed genes in S. cerevisiae (127), and suggest that yeast and mammals may have a similar hierarchy of termination signal strengths. Significantly, the proteins from yeast and mammals with properties of the decoding molecule eRF1 have strong homology (61), and the equivalent protein from X. laevis functions in yeast (140). Clearly, the fourth base modulates the efficiency of the termination signal quite markedly throughout the prokaryotic and eukaryotic kingdoms, and consideration must be given now as to how one defines the signal for termination. Should the fourth base be regarded as a strong context influence or part of the signal itself? Two distinct models are possible for the effect of the fourth base on termination efficiency. Either this base may influence the conformation of the stop codon and thereby influence its recognition by the RF, or it may be recognized directly by the factor along with the codon. If direct recognition can be shown, there seems to be a good case for including this base as part of the termination signal rather than thinking of the signal as only the triplet termination codon.
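The competition assays discussed in this section report product ratios, and it can help to see how such ratios translate into a termination efficiency and a relative rate. The sketch below assumes a simple race between two pathways (termination versus one competing event) in which the competing rate is constant; this is an illustrative assumption, not the rate calculation adapted from Ref. 136, whose details are not given here. The function names are hypothetical.

```python
def termination_fraction(termination, competing):
    """Fraction of ribosomes that terminate.

    Both arguments are product amounts, or the two sides of a ratio
    such as 1:3 (termination : competing event).
    """
    total = termination + competing
    if total <= 0:
        raise ValueError("amounts must sum to a positive value")
    return termination / total


def relative_rate(fraction):
    """Odds of termination, fraction / (1 - fraction).

    Under the ASSUMED two-pathway race this equals the ratio of the
    termination rate to the rate of the competing event.
    """
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in the range [0, 1)")
    return fraction / (1 - fraction)


# Ratios reported for the 5'-DI UGA site: 1:3 with C or U, 3:1 with A or G.
weak = termination_fraction(1, 3)    # fourth base C or U
strong = termination_fraction(3, 1)  # fourth base A or G
fold_difference = relative_rate(strong) / relative_rate(weak)
```

On this reading, a shift from 1:3 to 3:1 corresponds to termination efficiencies of 25% and 75%, which under the stated assumption would imply roughly a ninefold difference in relative termination rate.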

D. Recognition of the Fourth Base by the Release Factor?

When thioU was used to cross-link from stop signals in small ribosome-bound mRNAs to the bacterial RF, the yield of the RF·mRNA complex depended upon the identity of the fourth base for the UGAN series of stop signals tested (92). This suggests that the fourth base of the signal affects the interaction between the factor and the stop codon, with purines at this position promoting more cross-linking than pyrimidines. Because there is no evidence that the complex is stabilized by fourth base purines, a more likely alternative explanation may be that the conformation of the thioU in the first position is altered by the fourth base purine so as to improve cross-linking. This may imply that the stop signal is in a stacked conformation during decoding, the effect being similar to that seen with a +4 purine on sense codons (141). A thioU in the fourth base position as well as the first did not increase cross-linking to RF. This may be because of poor orientation, or the residue may have a closer orientation to rRNA than to the RF. Cross-linking to rRNA was increased when thioU was in both the first and fourth positions of the signal (92). The cross-linking of stop signals to the 1400 region of the rRNA and the RF is consistent with either of two orientations for the bases of the stop signal. They could be oriented toward the RF, which then makes direct interactions with them. For example, the common keto hydrogen-bond acceptor and the imino hydrogen-bond donor groups of the fourth base U or G (see Fig. 2) might be hydrogen bonded to an amino acid in the RF structure. Equally possible, the backbone of the mRNA could make interactions with the RF, leaving the bases to pair with the rRNA. The fact that the substitution of dU for U in the first position of a termination codon prevents in vitro termination supports some kind of backbone interaction of the mRNA with one of the other components of the termination complex (64). The important primary structural determinants in the stop signal are consistent with either RNA or protein recognition of stop codons, as either could form hydrogen bonds to the key positions in the signal. The secondary structure of the stop signal in the mRNA is unknown, but RNA is conformationally flexible. If it is modeled as a single-stranded A-helix, the cross-linking moiety of the thioU-containing signals would be located immediately over the common N7 of the second base A or G in UAAN and UGAN, the signals recognized by RF-2 (Fig. 13). Hydrogen bonding may occur from this N7 to the RF and this could explain why cross-linking from the thioU to the RF is possible, because any part of the RF molecule contacting N7 would be close to the activated ring. The challenge now is to identify the sites on RF-2 that make contact with the stop signal.

FIG. 13. The UGAG stop signal modeled as an A helix. The large arrow indicates the proximity of the photoactivated thioU to any part of RF-2 interacting with the N-7 atom of the adjacent G. Small arrows indicate potential stacking interactions. Coordinates were generated by the program MC-SYM (153).

VII. Physiological Advantages of Stop Signals Decoded with Varying Efficiencies

A. The Inefficient Stop Signal

1. GENE EXPRESSION AND SHIFTS IN PHYSIOLOGICAL CONDITIONS

If a termination signal can be multifunctional and perhaps influenced by cell physiology, then an extra layer of cellular regulation is possible (142). Examples of this are the internal TGAC in the RF-2 gene and in the formate dehydrogenase gene of E. coli. In the first case, the synthesis of RF-2 can be regulated according to the cellular concentration of the factor, because RF-2 functions at the site of the internal UGA codon in the mRNA to release a short nonfunctional premature termination product. Premature termination is in competition with translational frameshifting to avoid the stop codon and allow complete synthesis of the protein. Hence, there is an autoregulatory circuit that operates according to the physiological requirements of the cell (98, 101). In the second example, formate dehydrogenase expression has captured a niche to utilize the micronutrient selenium in an oxidation/reduction reaction with the selenium atom as part of the active center of the enzyme. However, the synthesis of three isoenzymes of formate dehydrogenase in E.


coli depends on the availability of selenium and the physiological state of the cell. The three isoenzymes are not synthesized simultaneously. In fact, there is a low carbon flux into selenocysteine biosynthesis (143).

2. SELENOPROTEIN SYNTHESIS IN MAMMALS AND SELENIUM AVAILABILITY

Mammals have developed a hierarchy of selenium distribution between and within different tissues (144, 145). The distribution is most likely controlled at the translational level. In tissue culture, 5’-DI is synthesized preferentially over glutathione peroxidase (GP) because it presumably competes better for the available selenium. 5’-DI is synthesized when selenium concentrations are below 2 nM and saturates at 10 nM, whereas GP is not synthesized below 2 nM and saturates at approximately 1 μM. It is interesting that 5’-DI has a poor stop signal, UGAC, and GP has a strong stop signal, UGAG, at the site of selenocysteine incorporation. The intriguing possibility arises that selenium utilization is controlled by the relative efficiency of the stop signal in the selenoprotein mRNAs, and the unfaithful stop codon may be playing a highly significant role in important physiological processes.

B. The Highly Efficient Stop Signals

1. GENE EXPRESSION AND GROWTH RATE IN BACTERIA

Protein synthesis becomes a more dominant activity of the cells as the growth rate of bacteria increases (146-148). Maximal growth rate is achieved by increasing the efficiency of translation and by increasing the concentrations of the components of the translational apparatus. The number of RFs per cell increases with growth rate. However, as cell volume changes, this increase is sufficient only to maintain cellular concentration, and does not account for an increase in the rate of any steps of translational termination (148). The predominant genes that are expressed at maximal growth rates are the “highly expressed” genes. Such genes almost exclusively use two four-base signals for translational stop, UAAU and UAAG (Table III). These signals are decoded at a rate many-fold higher than the average (35). Hence, the fourth base can influence the translational rate in a way that is significant for the organism, with the more efficient stop signals potentially providing an increase in the termination rate. This phase of translation would not then limit the increase in protein synthesis that is necessary for an increase in growth rate of the organism.

2. HIGHLY EXPRESSED MAMMALIAN GENES

Genes such as those for globins, histones, actins, immunoglobulins, and albumins, which can be regarded as the equivalent group to the highly

TABLE III

THE RELATIVE FREQUENCY OF OCCURRENCE OF TETRANUCLEOTIDE STOP SIGNALS IN E. coli AND MAMMALIAN GENES (a)

           Highly expressed (%)       Total (%)
Codon      E. coli    Mammals     E. coli    Mammals
UAAA         5.3       22           13.7      10.5
UAAG        21.3       20.7         12.8       7.9
UAAU        52          3.7         25.5       5.9
UAAC         4.5        4.9          9.8       5.2
UAGA         0          8.5          1.6       6.7
UAGG         0.4       12.2          1.4       6.5
UAGU         1.2        0            2.4       3.5
UAGC         0          0            2.4       5.2
UGAA         2.5        9.8          7.4      13.8
UGAG         1.2       18.3          5.9      19.4
UGAU        10.7        0           12.6       6.9
UGAC         0.8        0            4.5       8.6

(a) A total of 2455 E. coli genes were analyzed; the highly expressed subset numbered 250. Out of a total of 5208 mammalian genes analyzed, the highly expressed subset numbered 82. The signals that are used at a higher frequency in highly expressed genes are shown in bold type (35, 65).

expressed genes of bacteria, use a subset of efficiently decoded termination signals, for example, UAAA and UAAG. Signals of low efficiency are avoided, such as UGAC and UGAU (Table III). This may be significant physiologically, not only in terms of rates of synthesis, but also to avoid recoding of weak stop signals by noncognate tRNAs (36). In this regard, it is interesting that premature termination mutations in a conserved region of the gene for the transmembrane conductance regulator involved in cystic fibrosis can be less severe. This may relate to relatively high levels of noncognate tRNA recoding, and therefore readthrough at some loci. This occurs in the homologous gene in yeast, Ste6p, when a stop codon is put in a context whereby C is the fourth base of the signal (139).
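The comparison summarized in Table III can be made explicit with a small calculation. The sketch below uses the percentages as printed in the table and simply flags the signals whose frequency in the highly expressed subset exceeds their frequency in the total set. It is a crude illustration under stated assumptions: the total set includes the highly expressed genes, no significance test is applied, and the helper name `enriched` is hypothetical.

```python
SIGNALS = ["UAAA", "UAAG", "UAAU", "UAAC", "UAGA", "UAGG",
           "UAGU", "UAGC", "UGAA", "UGAG", "UGAU", "UGAC"]

# Percentages transcribed from Table III, in the same order as SIGNALS.
HIGH_ECOLI = [5.3, 21.3, 52, 4.5, 0, 0.4, 1.2, 0, 2.5, 1.2, 10.7, 0.8]
TOTAL_ECOLI = [13.7, 12.8, 25.5, 9.8, 1.6, 1.4, 2.4, 2.4, 7.4, 5.9, 12.6, 4.5]
HIGH_MAMMAL = [22, 20.7, 3.7, 4.9, 8.5, 12.2, 0, 0, 9.8, 18.3, 0, 0]
TOTAL_MAMMAL = [10.5, 7.9, 5.9, 5.2, 6.7, 6.5, 3.5, 5.2, 13.8, 19.4, 6.9, 8.6]


def enriched(high, total):
    """Signals used more often in the highly expressed subset than overall."""
    return [s for s, h, t in zip(SIGNALS, high, total) if h > t]


ecoli_enriched = enriched(HIGH_ECOLI, TOTAL_ECOLI)
mammal_enriched = enriched(HIGH_MAMMAL, TOTAL_MAMMAL)
```

With these values the E. coli list contains only UAAG and UAAU, in line with the text, while the mammalian list contains UAAA, UAAG, UAGA, and UAGG, all of which carry a purine in the fourth position.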

C. Is Translational Infidelity an Early or Late Event?

1. RECODING “EARLY”

Originally, translational stopping may have been a default mechanism when an unspecified or true “nonsense” codon entered the decoding site of the protoribosome. At this stage of evolution, the process would have been mediated independent of any protein factor. The product may have either


fallen off the ribosome, or an alternative event may have occurred during the prolonged pause at the nonsense codon. Indeed, this situation has been simulated by creating an artificial codon for a nonstandard amino acid in an mRNA for which there is no natural decoding tRNA (66). In this case, frameshifting events additional to termination are common in the absence of the decoding tRNA. This suggests that it would be possible for recoding events to arise at these true “nonsense” codons; that is, they could have been early events before a defined translational termination mechanism evolved. In most cases, the nonsense codon would have been captured subsequently for a specific translational termination mechanism. However, the more complex recoding events of frameshifting and translational hopping could also have been fine-tuned by the acquisition of a stop codon at the site. Pertinent to this argument is the status of selenocysteine as an ancient or as a relatively recent addition to the complement of amino acids found in proteins. Although this is somewhat contentious, it has been argued that a late addition of selenocysteine is more difficult to explain because all three lineages of eukaryotes, archaebacteria, and eubacteria have selenoproteins, which suggests that selenocysteine was present before these lineages separated (111). This implies that selenocysteine was one of the earliest amino acids and was encoded by UGA. Why then are there now only a few examples where UGA encodes selenocysteine? It has been suggested that the high susceptibility of this amino acid to oxidation (since the introduction of oxygen into the earth’s atmosphere) and its extreme sensitivity to metal ions have meant that the lower catalytic efficiency of cysteine became more acceptable because of its greater stability (111). Selenocysteine may have been retained only in special environments (19).
UGA would then have acquired the function of translational stopping as a “sense codon takeover.” In a few cases, the original function of UGA as a selenocysteine codon might have been maintained by mechanisms that gave a clear competitive advantage to a sense codon function for UGA. Indeed, there appears to be a reverse precedent for this in the mammalian mitochondria, where UGA is now used as a sense codon for tryptophan and there is no RF that recognizes the codon as stop. The prokaryote-like RF-2 seems to have been lost whereas the prokaryote-like RF-1 (recognizing UAA and UAG) has been retained. Because it is believed that mitochondria evolved from the subdivision of purple bacteria (149), it is assumed that at some point both the RF-2 and tRNA(Trp) would have coexisted, with the cognate tRNA having a significant selective advantage over RF-2 for decoding UGA (150).

2. RECODING “LATE”

If the coding sequence of an mRNA is translated with high fidelity, a small loss of accuracy in reading the stop signal may not compromise the cell.


Aberrant proteins could be degraded without obvious penalty, providing they are not inhibitory for any cellular process. The fact that many E. coli species harbor UAG suppressors without compromising growth, and that UGA is relatively leaky as a stop codon in this organism (151, 152), supports this premise. Sites of translational stop signals where there was a more pronounced error frequency might have been good targets for “recoding takeover,” with the use of the codon for an alternative purpose such as selenocysteine incorporation, readthrough with a noncognate tRNA, or a frameshifting event. Although there may be examples of the inaccurate decoding of translational stop signals that have no physiological significance in cells today, there are clearly some situations where the infidelity has been captured for an event of physiological importance.

VIII. Conclusion

The last decade of research has elevated the oft-forgotten translational stop codon into the realm of significant cell physiology. From a period in which “test-tube” infidelities in the decoding of the stop codon were only of esoteric interest, we have moved to an era in which the discovery of recoding events has revealed a new dimension in cellular regulation. New subtleties in the way the stop signal contributes to this are likely to emerge. There is a compelling need now to match these discoveries with a fundamental understanding of the normal mechanism of how the stop signal is decoded by the release factor on the ribosome, and why this decoding fails in the small number of cases where the signal has a dual function.

ACKNOWLEDGMENTS

Thanks to Dr. Chris Brown for helpful suggestions for the manuscript. The authors are supported by an International Scholar award of the Howard Hughes Medical Institute to W.P.T., a Human Frontier Science Program grant (awarded to Yoshikazu Nakamura and W.P.T.), and grants from The Health Research Council of New Zealand and the NZ Lotteries Board.

REFERENCES

1. A. Garen, Science 160, 149 (1968).
2. A. S. Sarabhai, A. O. W. Stretton, S. Brenner and A. Bolle, Nature 201, 13 (1964).
3. M. G. Weigert, E. Gallucci, E. Lanka and A. Garen, CSHSQB 31, 145 (1966).
4. A. Garen, S. Garen and R. C. Wilhelm, JMB 14, 167 (1965).
5. E. Gallucci and A. Garen, JMB 15, 193 (1966).
6. M. G. Weigert and A. Garen, Nature 206, 992 (1965).
7. M. G. Weigert, E. Lanka and A. Garen, JMB 23, 391 (1967).
8. S. Brenner, A. O. W. Stretton and S. Kaplan, Nature 206, 994 (1965).
9. S. Brenner, L. Barnett, E. R. Katz and F. H. C. Crick, Nature 213, 449 (1967).
10. W. Fiers, R. Contreras, F. Duerinck, G. Haegeman, D. Iserentant, J. Merregaert, W. Min Jou, F. Molemans, A. Raeymaekers, A. Van den Berghe, G. Volckaert and M. Ysebaert, Nature 260, 500 (1976).
11. R. E. Marshall, C. T. Caskey and M. Nirenberg, Science 155, 820 (1967).
12. F. H. C. Crick, JMB 38, 367 (1968).
13. B. G. Barrell, A. T. Bankier and J. Drouin, Nature 282, 189 (1979).
14. T. H. Jukes and S. Osawa, Experientia 46, 1117 (1990).
15. J. E. Heckman, J. Sarnoff, B. Alzner-DeWeerd, S. Yin and U. L. RajBhandary, PNAS 77, 3159 (1980).
16. S. Osawa, T. H. Jukes, K. Watanabe and A. Muto, Microbiol. Rev. 56, 229 (1992).
17. N. Lehman and T. H. Jukes, J. Theor. Biol. 135, 203 (1988).
18. T. Oba, Y. Andachi, A. Muto and S. Osawa, PNAS 88, 921 (1991).
19. A. Bock, K. Forchhammer, J. Heider, W. Leinfelder, G. Sawers, B. Veprek and F. Zinoni, Mol. Microbiol. 5, 515 (1991).
20. S. Osawa, A. Muto, T. H. Jukes and T. Ohama, Proc. R. Soc. Lond. B 241, 19 (1990).
21. R. H. Buckingham, Experientia 46, 1126 (1990).
22. J. F. Atkins, R. B. Weiss and R. F. Gesteland, Cell 62, 413 (1990).
23. M. M. Fluck, W. Salser and R. H. Epstein, MGG 151, 137 (1977).
24. W. Salser, MGG 105, 125 (1969).
25. W. Salser, M. Fluck and R. Epstein, CSHSQB 34, 513 (1969).
26. R. H. Buckingham, E. J. Murgola, P. Sorensen, F. T. Pagel, K. A. Hijazi, B. H. Mims, N. Figueroa, D. Brechemier-Baey and E. Coppin-Raynal, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 541. American Society for Microbiology, Washington, D.C., 1990.
27. R. H. Buckingham, P. Sorensen, F. T. Pagel, K. A. Hijazi, B. H. Mims, D. Brechemier-Baey and E. J. Murgola, BBA 1050, 259 (1990).
28. A. M. Weiner and K. Weber, JMB 80, 837 (1973).
29. G. Li and C. M. Rice, J. Virol. 67, 5062 (1993).
30. J. F. Atkins and R. F. Gesteland, EJB 137, 509 (1983).
31. J. M. Skuzeski, L. M. Nichols, R. F. Gesteland and J. F. Atkins, JMB 218, 365 (1991).
32. G. Eggertsson and D. Soll, Microbiol. Rev. 52, 354 (1988).
33. A. L. Beaudet and C. T. Caskey, PNAS 68, 619 (1971).
34. W. P. Tate and C. M. Brown, Bchem 31, 2443 (1992).
35. E. S. Poole, C. M. Brown and W. P. Tate, EMBO J. 14, 151 (1995).
36. K. K. McCaughan, C. M. Brown, M. E. Dalphin, M. J. Berry and W. P. Tate, PNAS 92, 5431 (1995).
37. E. J. Murgola, K. A. Hijazi, H. U. Goringer and A. E. Dahlberg, PNAS 85, 4162 (1988).
38. C. D. Prescott and H. C. Kornau, NARes 20, 1567 (1992).
39. H. Moine and A. E. Dahlberg, JMB 243, 402 (1994).
40. C. D. Prescott and H. U. Goringer, NARes 18, 5381 (1990).
41. M. O'Connor and A. E. Dahlberg, PNAS 90, 9214 (1993).
42. Z. Shen and T. D. Fox, NARes 17, 4535 (1989).
43. R. Rosset and L. Gorini, JMB 39, 95 (1969).
44. W. Piepersberg, A. Bock and H. G. Wittmann, MGG 140, 91 (1975).

332

WARREN P. TATE ET AL.

45. L. Gorini, in “Ribosomes” (M. Nomura, A. Tissieres and P. Lengyel, eds.), p. 791. CSHLab, Cold Spring Harbor, New York, 1974. 46. L. R. Topisirovic, M. Villarroel, M. De Wilde, A. Herzog, T. Cabezon and A. Bollen, MGG 151, 89 (1977). 47. L. A. Kirseborn and L. A. Isaksson, PNAS 82, 717 (1985). 48. M. C. Ganoza, CSHSQB 31, 273 (1966). 49. M. R. Capecchi, PNAS 58, 1144 (1967). 50. C. T. Caskey, R. Tompkins, E. Scolnick, T. Caryk and M. Nirenberg, Science 162, 135 (1968). 51. E. Scolnick, R. Tompkins, T. Caskey and M. Nirenberg, PNAS 61, 768 (1968). 52. G. Milman, J. Coldstein, E. Scolnick and T. Caskey, PNAS 63, 183 (1969). 53. M. R. Capecchi and H. A. Klein, C S H S Q B 34, 469 (1969). 54. J. Goldstein, G. Milman, E. Scolnick and T. Caskey, PNAS 65, 430 (1970). 55. D. S. Konecki, K. C. Aune, W. P. Tate and C. T. Caskey, JBC 252, 4514 (1977). 56. S. M. Ryden and L. A. Isaksson, MGG 193, 38 (1984). 57. C. C. Lee, Y. Kohara, K. Akiyama, C. L. Smith, W. J. Craigen and C. T. Caskey,]. B a t . 170, 4537 (1988). 58. K. Kawakami, Y. H. Jonsson, G. R. Bjiirk, H. Ikeda and Y. Nakamura, PNAS 85, 5620 (1988). 59. 0. Mikuni, K. Ito, J. Moffat, K. Matsumura, K. McCaughan, T. Nobukuni, W. Tate and Y. Nakarnura, PNAS 91, 5798 (1994). 59a. 6. Grentzmann, D. Brechemier-Baey, V. Heurgue, L. Mora and R. Buckingham, PNAS 91, 5848 (1994). 60. I. Stansfield and M. F. Tuite, Curr. Genet. 25, 385 (1994). 61. L. Frolova, X. Le Goff, H. H. Rasmussen, S. Cheperegin, G. Drugeon, M. Kress, I. Arman, A-L. Haenni, J. E. Celis, M. Philippe, J. Justesen and L. Kisselev, Nature 372, 701 (1994). 62. J. Smrt, W. Kernper, T. Caskey and M. Nirenberg, JBC 245, 2753 (1970). 63. W. P. Tate, B. Greuer and R. Brimacombe, NARes 18, 6537 (1990). 64. R. D. Ricker and A. Kaji, NARes 19, 6573 (1991). 65. C. M. Brown, P. A. Stockwell, M. E. Dalphin and W. P. Tate, NARes 22, 3620 (1994). 66. J. D. Bain, C. Switzer, A. R. Chamberlin and S. A. Benner, Nature 356, 537 (1992). 67. W. 
Saenger, “Principles of Nucleic Acid Structure.” Springer-Verlag, New York, 1984. 68. E. J. Murgola, A. E. Dahlberg, K. A. Hijazi and A. A. Tiedernan, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A . E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 402. American Society for Microbiology, Washington, D.C., 1990. 69. C. D. Prescott. B. Kleuvers and H. U. Goringer, Biochimie 73, 1121 (1991). 70. C. Prescott, L. Krabben and K. Nierhaus, NARes 19, 5281 (1991). 71. H. A. Raue, W. Musters, C. A. Rutgers, J. Van’t Riet and R. J. Planta, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 217. American Society for Microbiology, Washington, D.C., 1990. 72. P. B. Moore, in “The RNA World (R. F. Gesteland and J. F. Atkins, eds.), p. 119. CSHLab, Cold Spring Harbor, New York, 1993. 73. J. Shine and L. Dalgarno, PNAS 71, 1342 (1974). 74. P. Purohit and S. Stern, Nature 370, 659 (1994). 75. H. F. Noller, V. Hoffarth and L. Zimniak, Science 256, 1416 (1992). 76. W. P. Tate, C. D. Ward, C. N. A. Trotman, R. LiihrmannandG. Stoffler, Biochem. Znt. 7, 529 (1983).

INFIDELITIES OF TRANSLATIONAL STOP SIGNALS

333

77. C. T. Caskey, L. Bosch and D. S. Konecki, J B C 252, 4435 (1977). 78. J. M. Neefs, Y. Vandepeer, P. Derijk, A. Goris and R. Dewachter, NARes 19, 1987 (1991). 79. G. J. Olsen, R. Overbeek, N. Larsen, T. L. Marsh, M. J. McCaughey, M . A. Maciukenas, W. M. Kuan, T J. Macke, Y. Q. Xing and C. R. Woese, NARes 20, 2199 (1992). 80. R. Brimacombe, Bchein 27, 4207 (1988). 81. H . U . Goringer, K. A. Hijazi, E . J. Murgola and A. E. Dahlberg, PNAS 88, 6603 (1991). 82. G. J. Olsen, N. Larsen and C. R. Woese, NARes 19, 2017 (1991). 83. S. Stern, T. Powers, L.-M. Changchien and H. F. N o h , Science 244, 783 (1989). 84. P. R. Cunningham, K. Nurse, C . J. Weitzmann, D. Negre and J. Ofengand, Bchern 31, 7629 (1992). 85. P. R. Cunningham, K. Nurse, C. J. Weitzniann and J. Ofengand, Bchern 32, 7172 (1993). 86. H. F. Noller, ARB 60, 191 (1991). 87. B. Kastner, C. N. A. Trotman and W. P. Tate, J M B 212, 241 (1990). 88. J. Rinke-Appel, N . Jiinke, R. Brimacomhe, S. Dokudovskaya, 0. Dontsova and A. Bogdanov, NARes 21, 2853 (1993). 89. C. M. Brown, K. K. McCaughan and W. P. Fate, NARes 21, 2109 (1993). 90. J. E. G. McCarthy and R. Brimacombe, Trends Genet. 10, 402 (1994). 91. 0. Dontsova, S. Dokudovskaya, A. Kopylov, A. Bogdanov, J. Rinke-Appel, N . Junke and R. Brimacombe, EMBO J . 11, 3105 (1992). 92. C. M. Brown and W. P. Tate, JBC 269, 33164 (1994). 93. W. P. Tate, A. L. Beaudet and C. T. Caskey, PNAS 70, 2350 (1973). 94. D. L. Hatfield, D. W. E. Smith, B. J. Lee, P. J. Worland and S. Oroszlan, Crit. Aeu. Biochem. Mol. Biol. 25, 71 (1990). 95. W. P. Tate, F. M. Adamski, C. M . Brown, M. E. Dalphin, J. P. Gray, J. A. Horsfield, K. K. McCaughan, J. G. MoKat, R. J. Powell, K. M. Timms and C. N. A. Trotman, in “The Translational Apparatus: Structure, Function, Regulation, Evolution” (K. H. Nierhaus, F. Franceschi, A. R. Subramanian, V. A. Erdmann and 8. Wittmann-Liebold, eds.), p. 253. Plenum, New York and London, 1993. 96. J. G. Moffat and W. P. Tate, ]BC 269, 18899 (1994). 97. H. 
J. Pel, M. Rep and L. A. Grivell, NARes 20, 4423 (1992). 98. W. J. Craigen, R. G. Cook, W. P. Fate and C. T. Caskey, PNAS 82, 3616 (1985). 99. R. B. Weiss, D. M. Dunn, A. E . Dahlberg, J. F. Atkins and R. F. Gesteland, E M B O J . 7, 1503 (1988). 100. W. J. Craigen and C. T. Caskey, Natui-e 322, 273 (1986). 101. B. C. DonIy, C. D. Edgar, F. M . Adamski and W. P. Ete, NARes 18, 6517 (1990). 102. S. Matsufuji, T. Matsufuji, Y. Miyazaki, Y. Murakanii, J. F. Atkins, R. F. Gesteland and S. Hayashi, Cell 80, 51 (1995). 103. E. Rom and C. Kahana, PNAS 91, 3959 (1994). 104. R. B. Weiss, D. M. Dunn, J. F. Atkins and R. F. Gesteland, C S H S Q B 52, 687 (1987). 105. T. Jacks, H. D. Madhani, F. R . Masiarz and H. E. Varmus, Cell 55, 447 (1988). 106. R. 8. Weiss, D. M. Dunn, M. Shuh, J. F. Atkins and R. F. Gesteland, New Biologist 1, 159 (1989). 107. J. A. Horsfield, D. N. Wilson, S. A. Mannering, F. M. Adamski and W. P. Tate, NARes 23, 1487 (1995). 108. W. M. Huang, S:Z. Ao, S. Casjens, R. Orlandi, R. Zeikus, R. Weiss, D. Winge and M. Fang, Science 239, 1005 (1988). 109. K. L. Herbst, L. M. Nichols, R. F. Gesteland and R. B. Weiss, PNAS 91, 12525 (1994). 110. S . C. Wong and A. T. Abdelal, J. Bnct. 172, 630 (1990). 111. A. Bock, K. Forchharnmer, J. Heider and C. Baron, Trends Biochern. Sci. 16, 463 (1991). 112. M. J. Berry, L. Banu, J. W. Harney and P. R. Larsen, E M B O ] . 12, 3315 (1993).

334

WARREN P. TATE ET AL.

113. M. J. Berry and P. R. Larsen, Biochem.’ Soc. Trans. 21, 827 (1993). 114. M. J. Berry, L. Banu, Y. Chen, S. J. Mandel, J. D. Kieffer, J. W. HarneyandP. R. Larsen, Nature 353, 273 (1991). 115. K. E. Hill, R. S. Lloyd and R. F. Burk, PNAS 90, 537 (1993). 116. P. J. Farabaugh, Cell 74, 591 (1993). 117. H. Hofstetter, H.-J. Monstein and C. Weissman, BBA 374, 238 (1974). 118. M. Ishikawa, T. Meshi, F. Motoyoshi, N. Tdkamatsu and Y. Okada, NARes 14,8291 (1986). 119. R. C. Nutter, K. Scheets, L. C. Panganiban and S. A. Lonimel, NARes 17, 3163 (1989). 120. E. G. Strauss, C. M. Rice and J. H. Strauss, PNAS 80, 5271 (1983). 121. K. Takkinen, NARes 14, 5667 (1986). 122. N . M. Wills, R. F. Gesteland and J. F. Atkins, PNAS 88, 6991 (1991). 123. 8 . Marshall and S. B. Levy, Nature 286, 524 (1980). 124. J. Sambrook, E. F. Fritsch and T. Maniatis, “Molecular Cloning: A Laboratory Manual,” 2nd Ed. CSHLab, Cold Spring Harbor, New York, 1989. 125. P. M. Sharp and M. Bulmer, Gene 63, 141 (1988). 126. C. M. Brown, P. A. Stockwell, C. N. A. Trotman and W. P. Tate, NARes 18, 2079 (1990). 127. C. M. Brown, P. A. Stockwell, C. N. A. Trotinan and W. P. Tate, NARes 18,6339 (1990). 128. J. Kohli and H. Grosjean, MGG 182, 430 (1981). 129. D. R. Cavener and S. C. Ray, NARes 19, 3185 (1991). 130. P. M. Sharp, C. J. Burgess, E. Cowe, A. T. Lloyd and K. J. Mitchell, in “Transfer RNA in Protein Synthesis” (D. L. Hatfield, B. J. Lee and R. M. Pirtle, eds.), p. 397. CRC Press, Boca Raton, FL, 1992. 131. M. Kozak, NARes 15, 8125 (1987). 132. M. Kozak, J . Cell. B i d . 115, 887 (1992). 133. P. M. Sharp, M . Stenico, J. F. Peden and A. T. Lloyd, Biochem. SOC. Trans. 21, 835 (1993). 134. M. McClelland and A. S. Bhagwat, Nature 355, 595 (1992). 135. G. Gutierrez, J. Casadesus, J. L. Oliver and A. Marin, J. Mol. E d . 39, 340 (1994). 136. W. T. Pedersen and J. F. Curran, JMB 219, 231 (1991). 137. R. Martin, M. K. Phillips-Jones, F. J. Watson and L. S. Hill, Biochem. Soc. Trans. 21,846 (1993). 138. J. 
B. Kopczynski, A. C. Raffand J. J. Bonner, MGG 234, 369 (1992). 139. K. Fearon, V. McClendon, B. Bonetti and D. M. Bedwell, JBC 269, 17802 (1994). 140. J. P. Tassan, K. Le Guellec, M. Kress, M. Faure, J. Camonis, M. Jacquet and M. Philippe, MCBiol 13, 2815 (1993). 141. M. Yarus and J. Curran, in “Transfer RNA in Protein Synthesis” (D. L. Hatfield, B. J. Lee and R. M. Pirtle, eds.), p. 319. CRC Press, Boca Raton, FL, 1992. 142. H. Engelberg-Knlka and R. Schoulaker-Schwarz, Trends Biochem. Sci. 13, 419 (1988). 143. C. Baron and A. Biick, JBC 266, 20375 (1991). 144. D. Behne, H. Hilniert, S. Scheid, H. Gessner and W. Elger, BBA 966, 12 (1988). 145. D. Behne, S . Scheid, A. Kyriakopoulos and H. Hilmert, BBA 1033, 219 (1990). 146. M. Ehrenberg and C. G. Kurland, Q. Reu. Biophys. 17, 45 (1984). 147. H. Bremer and P. P. Dennis, in “Escherichia coli and Salmonella typhimurium” (F. C. Neidart, ed.), p. 1527. American Society for Microbiology, Washington, D.C., 1987. 148. F. M. Adamski, K. K. McCaughan, F. J~rgensen,C. 6 . Kurland and W. P. Tate, J M B 238, 302 (1994). 149. D. Yang, Y. Oyaizu, H. Oyaizu, G. J. Olsen and C. R. Woese, PNAS 82, 4443 (1985). 150. C. C. Lee, K. M . Tinims, C. N. A. Trotman and W. P. Tate, JBC 262, 3548 (1987). 151. D. Hirsh and L. Gold, JMB 58, 459 (1971). 152. P. Model, R. E. Webster and N. D. Zinder, JMB 43, 177 (1969).

INFIDELITIES OF TRANSLATIONAL STOP SIGNALS

335

153. F. Major, M. Turcotte, D. Gautheret, 6. Lapalme, E. Fillion and R. Cedergren, Science 253, 1255 (1991). 154. T. Caskey, E . Scolnick, R. Tompkins, J. Goldstein and G. Milman, CSHSQB 34, 479 (1969). 155. H . J. Pel, C . Maat, M. Rep and L. A. Grivell, NARes 20, 6339 (1992). 156. J. L. Goldstein, A. L. Beaudet and C. T. Caskey, PNAS 67, 99 (1970). 157. M. F. Tuite and I. Stansfield, Nature 372, 614 (1994).

Structure of Replicating Chromatin

CLAUDIA GRUSS AND ROLF KNIPPERS
Fakultät für Biologie, Universität Konstanz, D-78434 Konstanz, Germany

I. Structure of Chromatin: An Overview
II. Methodology
III. Assembly in Vitro: Reconstitution of Chromatin
IV. Experimental Systems for the Study of Replicative Chromatin Assembly
V. Replicating Chromatin: Basic Questions
VI. Nucleosome-free Origins
VII. Chromatin Structure on Replicated DNA Strands
VIII. Histone H1 and the Folding of Replicating Chromatin
IX. Replication-dependent Histone Modifications
X. Conclusions
References

The large genome of a eukaryotic cell is organized in the nucleus as a complex nucleoprotein structure, chromatin. The eukaryotic genome is thus condensed several orders of magnitude over B-form DNA, but in spite of this, chromatin must be available as a substrate for transcription and replication. The structure of transcribed chromatin has attracted considerable attention during the past few years and has been the subject of several recent reviews (1-8). However, the most dramatic changes in chromatin structure occur during genome replication, when advancing replication forks invade the parental chromatin and when new chromatin is assembled on emerging progeny DNA strands. In this essay, we summarize present knowledge describing structural changes in chromatin as replication-dependent events. However, we do not discuss the process of semiconservative DNA replication itself, and we do not describe the function of replication enzymes or other proteins involved in eukaryotic DNA replication. The replication of eukaryotic DNA is well described in several recent reviews (9-12). To provide an appropriate basis for the arguments elaborated below, we begin with a brief overview describing some structural features of chromatin relevant to the topic of this essay (for details, see Refs. 13 and 14).

Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Structure of Chromatin: An Overview

In higher eukaryotes, virtually all of the genome is packed into nucleosomes. A nucleosome is composed of a histone octamer, one molecule of histone H1, and 200 (±15) base-pairs (bp) of DNA. The primary chromatin subunit is the nucleosome core particle, which consists of a histone octamer and a 146-bp segment of DNA wrapped around its surface. The nucleosome cores are distributed along the DNA, separated from one another by linker DNA segments.
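The figures quoted above imply a simple piece of arithmetic for the average linker length. The following Python sketch is illustrative only: the function name and defaults are assumptions made for this example, with the default values (about 200 bp per nucleosome, 146 bp per core particle) taken from the text.

```python
def linker_length_bp(repeat_bp: float = 200.0, core_bp: float = 146.0) -> float:
    """Average linker DNA length implied by a nucleosome repeat length.

    repeat_bp: DNA per nucleosome (core plus linker), about 200 bp in the text.
    core_bp:   DNA wrapped on one core particle, 146 bp in the text.
    """
    if repeat_bp < core_bp:
        raise ValueError("repeat length cannot be shorter than the core particle DNA")
    return repeat_bp - core_bp


if __name__ == "__main__":
    # With the values quoted in the text, roughly 54 bp remain as linker DNA.
    print(linker_length_bp())
```

Because the repeat length varies between sources of chromatin, the linker length obtained this way is an average, not a fixed property of each nucleosome.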

A. Histones

Histones have at least two structural domains, a central structured globular domain and an amino-terminal flexible basic extension or arm (Fig. 1). The C-terminal domains involved in histone-histone interactions inside nucleosomal DNA are predominantly α-helical, with a long central helix bordered on each side by a loop segment and a shorter helix (15). All known sites of reversible modifications, such as acetylations and phosphorylations, are located in the flexible basic domains (Fig. 2) (16). Significantly, histone H3 and histone H4 can exist as stable tetramers, and histones H2A and H2B as stable dimers in solution. Treatment of native chromatin with increasing salt concentrations causes the initial release of H2A/H2B dimers and the subsequent release of H3/H4 tetramers (17).

FIG. 1. Structure of histones. This diagram emphasizes the general architecture of histones with their central globular domain and their flexible basic arms. N, Amino terminus; K, lysine; R, arginine; O, other amino-acid residues. The bracketed numbers refer to the total number of amino acids in mammalian histones. Histones H2A and H2B form dimers, and histones H3 and H4 form tetramers in solution.

FIG. 2. Properties of mammalian histones. We summarize the total size (in number of amino-acid residues), the lengths of the flexible arms (see Fig. 1), and the sites of reversible amino-acid modifications. Ac, Acetylation; P, phosphorylation; K, lysine; R, arginine; T, threonine; S, serine. The numbers refer to the positions of the respective amino-acid residues in the polypeptide chain. For example, Ac-K9 indicates that the side chain of the lysine residue at position 9 in the amino-acid sequence of histone H3 can be modified by acetylation.

[Fig. 2 table, partly legible. Rows: H3, H4, H2A, H2B, H1. Legible values for the approximate size of the carboxy-terminal arm: 25, 16, 23, 92. Legible modification entries, in source order: Ac-K9, P-S10, Ac-K14, Ac-K18, Ac-K23, Ac-K27; P-S1, Ac-K5, Ac-K8, Ac-K12, Ac-K16, Ac-K20; P-S1, Ac-K5, Ac-K9; P-S145, P-S173, P-S180; mitosis: P-S16, P-T17, P-T136.]

B. Core Particles

A nucleosome core particle has the form of a wedge-shaped disk, about 11 nm in diameter and 5.5 nm high. DNA forms 1.75 left-handed turns around the outside of the histone octamer with a pitch of roughly 3 nm (18) (Fig. 3). Hydroxyl-radical cleavage and other methods show that the DNA on nucleosome cores is slightly overwound by 0.25-0.35 bp/turn, resulting in an average helical pitch of about 10.2 bp/turn compared to about 10.5 bp/turn for B-form DNA in solution (19, 20). Structural analyses also indicate that the H3/H4 tetramer is in the center of the nucleosome core, flanked on each side by an H2A/H2B dimer (18).
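The geometric figures in this paragraph can be related to one another with a few lines of arithmetic. The Python sketch below is a hedged illustration, not part of the original analysis: the function names are hypothetical, and the inputs (146 bp of core DNA, 1.75 superhelical turns, 10.2 versus 10.5 bp per helical turn) are simply the values quoted in the text.

```python
CORE_DNA_BP = 146           # DNA on one core particle (value from the text)
SUPERHELICAL_TURNS = 1.75   # left-handed turns around the octamer (from the text)


def bp_per_superhelical_turn(core_bp: float = CORE_DNA_BP,
                             turns: float = SUPERHELICAL_TURNS) -> float:
    """Base-pairs of DNA needed for one full turn around the histone octamer."""
    return core_bp / turns


def extra_twist_turns(core_bp: float = CORE_DNA_BP,
                      pitch_core: float = 10.2,
                      pitch_free: float = 10.5) -> float:
    """Additional double-helical turns in core DNA relative to free B-form DNA.

    pitch_core and pitch_free are the helical repeats (bp/turn) quoted in the
    text for nucleosomal DNA and for B-form DNA in solution.
    """
    return core_bp / pitch_core - core_bp / pitch_free


if __name__ == "__main__":
    print(round(bp_per_superhelical_turn(), 1))  # bp per superhelical turn
    print(round(extra_twist_turns(), 2))         # extra helical turns over 146 bp
```

With these inputs, each superhelical turn accommodates a little over 83 bp, and the overwinding adds roughly 0.4 helical turn across the core DNA.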

FIG. 3. Nucleosome core particle. Histone H1 seals two turns of DNA (168 bp) around the histone octamer. This model describes the octamer as a short central cylinder, formed by the histone H3/H4 tetramer, laterally covered by histone H2A/H2B dimers. [Figure labels: nucleosome; histone octamer; H3/H4 tetramer; H2A/H2B dimer; 3.0 nm pitch.]

The stability of the core particle is determined by interactions between the inner globular histone domains because the flexible histone arms can be removed by mild protease treatment without much effect on the general stability of the core particle or the helical periodicity of nucleosomal DNA (20). However, the amino-terminal histone arms also appear to bind to specific sections of the DNA. For example, the amino-terminal domains of histones H3 bind to DNA sections at the entry and exit from the core particle, whereas the amino-terminal domains of histones H4 interact with the internal 90 bp of nucleosomal DNA (21).

C. Histone H1 and Higher Order Chromatin Structures

Chromatin is a dynamic structure. Experiments in vitro show an extended 10-nm filament at low ionic strength (

specific family member of mammalian cells that has close structural similarity to the histone H5 found in nucleated erythrocytes (27-30). All histone H1 subtypes possess a conserved globular central domain flanked by extended basic amino- and carboxy-terminal arms (Fig. 1). The globular domain is essential for the location of histone H1 in the nucleosome at the exit and entry point of nucleosomal DNA, where it stabilizes the two negative supercoiled turns of DNA wound around the histone octamer (Fig. 3). When nucleosome core particles are prepared by extensive digestion of chromatin with micrococcal nuclease, histone H1 is released from its binding sites as the ends of the two turns of DNA are degraded (31, 32).

The precise arrangement of nucleosomes in the 30-nm filament is still controversial. Recent neutron scattering data show that histone H1 is located in the interior of chromatin and, furthermore, that there are about six nucleosomes per cross section of the 30-nm filament (33). These data are in good agreement with the solenoid model of 30-nm chromatin as originally proposed (34, 22). However, data obtained by scanning-force microscopy suggest that the 30-nm filament exists at low ionic strength as an irregular helix. If confirmed under more physiological conditions, all elaborate models proposed over the years would not be justified (35).

To fit the 30-nm chromatin fiber inside the nucleus, a higher order folding is necessary, and evidence suggests that chromatin is divided into several thousand topologically independent loops (containing an average of 85 kb of DNA) attached to a nuclear structure (36). This level of chromatin organization is not considered in this review.

D. Nonhistone Chromatin Proteins

The overall structure of chromatin is influenced by a number of nonhistone proteins. Most abundant and best known are the high-mobility group (HMG) proteins, which fall into three subclasses, namely, HMG proteins 14/17, HMG proteins 1/2, and HMG I/Y (37).

HMG 14/17 are highly charged proteins of 10 and 12 kDa with basic amino-terminal and acidic carboxy-terminal sections that selectively bind to nucleosomal DNA in preference to protein-free DNA. Nucleosome core particles can carry two HMG 14/17 molecules that bridge two adjacent DNA strands on the surface of the particle (38). HMG 14/17 may replace histone H1 in actively transcribed genes (39, 40). Estimates suggest that about 10% of all nucleosome cores may contain HMG 14/17. It is not known whether HMG 14/17 affect the replication of chromatin.

HMG 1/2 are abundant nuclear proteins of 25-30 kDa with basic amino-terminal and acidic carboxy-terminal regions. Their specific binding to single-stranded and cruciform DNA (41, 42) is consistent with a possible function in recombination, repair, or replication. It has also been determined that HMG 1/2 facilitate nucleosome assembly in vitro (43), but the physiological significance of this finding is uncertain because many other macromolecules with high concentrations of negatively charged residues promote nucleosome assembly in vitro (see Section III,A).

HMG I/Y proteins preferentially bind to (A + T)-rich DNA. This is a property shared by histone H1, and so it has been suggested that HMG I/Y compete with histone H1 for the binding to promoter or origin sequences (44). Thus, HMG proteins may profoundly affect the local structure of chromatin, but their possible effects on chromatin replication have yet to be explored.

II. Methodology

Several methods are particularly popular among investigators studying the replication of chromatin. A technically simple and widely used method is the supercoiling assay. Assembly of nucleosomes on relaxed closed circular DNA involves the wrapping of the DNA double helix 1.75 times around each histone octamer in a left-handed, negative direction. The compensating positive superhelical turns must be relaxed by topoisomerases, which are present in the extract when the assembly reaction is performed in unfractionated cell-free systems. Alternatively, topoisomerases must be added when the assembly is performed with purified compounds. Topoisomerases relax the DNA helix around each nucleosome and in the linker region. Consequently, approximately one negative supercoil will arise when a nucleosome core is removed from the DNA (45). Thus, the extent of assembly can be measured by determining the average number of superhelical turns following deproteinization. Because more than 15 supercoils cannot normally be resolved by one-dimensional agarose gel electrophoresis, it is often useful to quantitate the number of supercoils by two-dimensional electrophoresis, running the second dimension in the presence of chloroquine (46). Intercalating chloroquine changes the topology of closed circular DNA molecules by introducing positive superhelical turns and, consequently, affects their electrophoretic mobilities depending on the number of supercoils originally present (Fig. 4).

FIG. 4. Chromatin assembly investigated by the supercoiling assay: two-dimensional gel electrophoresis. (A) Radioactively labeled, protein-free plasmid DNA as treated with DNA topoisomerase I. First dimension, standard gel electrophoresis; second dimension, agarose gel electrophoresis in the presence of 0.35 µM chloroquine. Different topoisomers of closed circular DNA are labeled by linking-number differences (ΔLk = +1; ΔLk = +4). II, Nicked circular DNA; III, linear DNA. (B) The DNA substrate, shown in A, was used for chromatin assembly in vitro: negative supercoils are introduced in topologically closed supercoiled DNA (79).

FIG. 5. Analysis of in vitro assembled chromatin. (A) SV40 DNA was used as a substrate for chromatin assembly in vitro. The assembled minichromosomes were analyzed by sucrose gradient centrifugation (47). Protein-free DNA has sedimentation coefficients of 21 S (supercoiled form I DNA) or 16 S (relaxed form II DNA). Reconstituted chromatin sediments at 45 S. Polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate confirms the presence of equimolar amounts of the four core histones (insert). (B) Reconstituted minichromosomes were treated with micrococcal nuclease for 2, 8, and 32 minutes as indicated. The DNA was extracted and investigated by agarose gel (1.5%) electrophoresis. Marker DNA fragments were separated on the same gel. Their position and size (in base-pairs) are shown. [Marker sizes legible in the figure: 1353, 1078, 872, 603, 310, 194, 118, 72 bp; panel A abscissa: fraction, from top.]

The supercoiling assay is limited to closed circular DNA. Furthermore,


although supercoiling is indicative of chromatin assembly, it must be recognized that additional direct measurements of nucleosome structure and arrangement are needed. Thus, if experimentally possible, each chromatin assembly reaction should be controlled by the determination of bound histones to make sure that the four core histones are present in stoichiometric amounts (Fig. 5A).

Important information concerning the structure of chromatin is obtained through limited digestion with micrococcal nuclease. At controlled enzyme concentrations and incubation times, micrococcal nuclease randomly cuts the linker DNA between nucleosomes, producing chromatin fragments of various lengths; each fragment carries the corresponding number of nucleosomes. An experimental evaluation usually involves the extraction of DNA fragments and their separation by agarose gel electrophoresis (47). Chromatin with regularly spaced nucleosomes is digested to give DNA fragments that are multiples of a unit length of 180-210 bp (Fig. 5B).

A more direct visualization of assembled chromatin may be achieved by electron microscopy. Chromatin, spread under standard conditions in low-ionic-strength buffers, gives pictures of assembled nucleosomes in the familiar beads-on-a-string arrangement (Fig. 6A). This method has its limitations, because nucleosomes tend to form clusters that are often difficult to analyze for a quantitative evaluation. A useful alternative is the electron-microscopy examination of chromatin treated with cross-linking derivatives of psoralen. The compound trimethylpsoralen intercalates into DNA of the linker region, but not into DNA on the nucleosomal core (48-50). For examination, cross-linked chromatin is deproteinized, and spreading of deproteinized DNA under denaturing conditions allows a visualization of well-spread individual nucleosomes as arrays of single-stranded bubbles of about 160 bases (Fig. 6B).

FIG. 6. Electron microscopy of SV40 minichromosomes. (A) Direct visualization. Minichromosomes were fixed with 0.1% glutaraldehyde and spread with BAC (22). Individual nucleosomes, separated by small pieces of linker DNA, are seen as small spheres. Bar = 0.1 µm. (B) Psoralen-cross-linked minichromosomes. SV40 minichromosomes were treated with 4,5',8-trimethylpsoralen, deproteinized, relaxed with DNase I, and spread for electron microscopy under denaturing conditions (50). The location of individual nucleosomes is seen as an array of single-stranded bubbles with a size of 140-180 nucleotides. The nucleosome-free region appears as double-stranded DNA (between arrowheads). Bar = 500 nucleotides. From Ref. 102, with permission.
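Two of the readouts described in this section lend themselves to a short numerical sketch. The Python example below is illustrative only: all function names are hypothetical, the supercoiling estimate assumes the approximate one-supercoil-per-nucleosome relation stated in the text, and the ladder functions assume idealized fragments that are exact multiples of the repeat length (real digests are trimmed by the nuclease and deviate from this).

```python
from typing import List


def nucleosomes_from_supercoils(delta_lk: int) -> int:
    """Estimate the number of assembled nucleosomes on a closed circular DNA.

    delta_lk is the linking-number difference measured after deproteinization.
    Assumes about one negative supercoil per nucleosome, as stated in the text.
    """
    return max(0, -delta_lk)


def nuclease_ladder(repeat_bp: int, n_bands: int) -> List[int]:
    """Idealized micrococcal nuclease ladder: mono-, di-, tri-nucleosome sizes."""
    return [repeat_bp * n for n in range(1, n_bands + 1)]


def estimate_repeat(fragment_sizes_bp: List[float]) -> float:
    """Estimate the repeat length from measured ladder bands.

    Fits size = repeat * n through the origin by least squares, where n is the
    band index (1 for the smallest band). Assumes no band is missing.
    """
    sizes = sorted(fragment_sizes_bp)
    numerator = sum(n * s for n, s in enumerate(sizes, start=1))
    denominator = sum(n * n for n in range(1, len(sizes) + 1))
    return numerator / denominator


if __name__ == "__main__":
    print(nucleosomes_from_supercoils(-24))
    print(nuclease_ladder(190, 4))
    print(estimate_repeat([190, 380, 570, 760]))
```

A repeat length estimated in this way can be compared with the 180-210 bp range quoted above for regularly spaced nucleosomes; as the text stresses, the supercoil count alone does not establish that the assembled particles are complete nucleosomes.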

III. Assembly in Vitro: Reconstitution of Chromatin

A brief consideration of replication-independent chromatin assembly systems may be interesting (51, 52), because some fundamental principles, first observed in these studies, recur in replication-associated assembly of nucleosomes.

A. Self-assembly of Nucleosome Cores

As already mentioned, histones exist as stable histone H3/H4 tetramers and H2A/H2B dimers in solution at high salt concentrations. As the salt concentration is lowered by gradient dialysis, histone H3/H4 tetramers first form stable interactions with DNA, followed by the binding of H2A/H2B dimers to complete the formation of nucleosome core particles; these are, in several aspects, identical to core particles prepared from native chromatin by nuclease degradation of linker DNA (53-55). However, chromatin reconstituted by salt dialysis from purified core histones usually has a random distribution of nucleosomes along the DNA, either irregularly spaced or closely packed, but it does not possess the internucleosomal packing of about 200 bp of DNA that is characteristic for native chromatin (56-58). Similar results were obtained using negatively charged carrier molecules, such as polyglutamic acids, which transiently bind histones and promote their transfer to DNA (59, 60).

Thus, salt dialysis and related in vitro methods promote the assembly of nucleosome cores from isolated histones, but not of native chromatin structures. In fact, specific nonhistone proteins are required to regenerate the native internucleosomal repeating pattern. Proteins with this function have been identified in extracts from oocytes or eggs of the frog Xenopus laevis and also in extracts from cultured mammalian cells.

B. Chromatin Assembly in Extracts from Xenopus Eggs

Xenopus eggs contain a large pool of stored histones equivalent to the histone content of more than 10,000 diploid nuclei (61). Essentially all histone H3 and histone H4 are bound to one of a pair of chaperone proteins, N1 and N2, single acidic polypeptides of 105 and 110 kDa (62, 63). Histones H2A and H2B are either free (64) or in complexes with another acidic chaperone protein, nucleoplasmin, a pentamer of 29-kDa subunit molecular mass (65, 66).

Supernatants, prepared by differential centrifugation at 150,000 g from Xenopus oocytes or eggs, have been most successfully used to reconstitute physiologically spaced nucleosomes in vitro (51). Addition of double-stranded DNA to the unfractionated egg extract leads to nucleosome assembly in a complex process that requires ATP and long incubation times. Other variables, such as temperature, pH, magnesium ions, and salt concentration, also affect the reaction, which proceeds in two consecutive steps. In a first step, histone H3/H4 tetramers are deposited on DNA. This reaction involves a transfer of histones from proteins N1 and N2 to DNA. In a second step, histones H2A and H2B are added to complete the formation of regularly spaced nucleosomes (67, 68).

Purified complexes of protein N1-(histones H3/H4) and nucleoplasmin-(histones H2A/H2B) also promote the assembly of nucleosomes, but these nucleosomes are not periodically spaced as in native chromatin (64, 69, 70). This indicates that an ATP-requiring function in unfractionated egg extracts is required for the regular spacing of assembled nucleosomes (71, 72). This function could be a protein kinase that specifically phosphorylates one of the histone H2A subtypes (histone H2A.X) during the initial phases of chromatin assembly in Xenopus egg extracts. Phosphorylation of this histone could affect the nucleosome packing density by steric hindrance or charge reduction. The later removal of phosphate from histone H2A.X parallels the gradual maturation of chromatin in egg extracts, and the dephosphorylation reaction could influence nucleosome sliding and alignment (73).
Another reaction accompanying chromatin assembly in unfractionated extracts is deacetylation of histone H4. Virtually all histone H4 in Xenopus eggs exists in a diacetylated form, and diacetylated histone H4 is gradually demodified to the unacetylated form as chromatin is assembled. But inhibition of deacetylation by treatment with sodium butyrate does not affect the formation of regularly spaced nucleosomes. This seems to exclude a major function of acetylation and deacetylation for chromatin assembly in Xenopus egg extracts (74). Physiologically spaced nucleosomes, assembled in Xenopus egg extracts, do not contain histone H1.

C. Chromatin Assembly in Extracts from Somatic Mammalian Cells

In contrast to oocytes and eggs, somatic cells contain only limited amounts of free histones, whose synthesis is precisely regulated and mostly coupled to the S phase of the cell cycle (75, 76). However, even in the absence of exogenous histones, addition of DNA to whole extracts from human HeLa or 293 cells induces a replication-independent assembly of chromatin when the reaction is performed under carefully controlled conditions with respect to magnesium ions, monovalent salts, and ATP (77, 78). The addition of exogenous histones nevertheless greatly promotes chromatin formation in these extracts (77, 79). When carried out in the presence of ATP, the assembly reaction leads to regularly spaced nucleosomes with a periodicity of about 185 bp, as in native chromatin. The reconstitution of native chromatin in somatic cell extracts is correlated with an initial phosphorylation of core histones H2A and H3. Analyses suggest that the phosphorylation of histone H2A is required for the alignment of regularly spaced nucleosomes, much like H2A phosphorylation in Xenopus egg extracts. The phosphorylation of histone H3 appears to be connected with the stabilization of assembled nucleosomes (80). HeLa cell extracts contain an acidic nucleosome assembly protein, termed NAP-1 (nucleosome assembly protein-1) (81, 82). NAP-1 has been identified in the nuclei of a variety of human and mouse cells and also of other organisms, including Drosophila and yeast (83). The active protein is probably a dimer that binds all four core histone proteins in vitro. Nucleosomes are formed by a transfer of histones from the loaded NAP-1 factor to DNA (84). Thus, NAP-1 may be the somatic equivalent of frog egg proteins N1 and nucleoplasmin. The physiological function of NAP-1 has yet to be determined, but it could be involved in an exchange of histones H2A and H2B that actively occurs during transcription at all stages of the cell cycle (85). The amount of NAP-1 increases during the progression of resting cells to S phase, and antisense oligonucleotides inhibit the proliferation of lymphocytes (86).

IV. Experimental Systems for the Study of Replicative Chromatin Assembly

We describe a few experimental systems that are widely used for the study of chromatin replication in vitro and in vivo.

A. Structural Changes in Newly Replicated Chromatin

Numerous reports in the decade between 1975 and 1985 described the structure of newly replicated chromatin in various eukaryotic cells. The standard experimental approach used was a brief labeling of proliferating cells with [3H]thymidine, followed by the preparation of nuclei or chromatin and a treatment with micrococcal nuclease to compare newly synthesized (pulse-labeled) chromatin with bulk chromatin (87-93). The solubilized chromatin fragments were separated either directly by sucrose gradient centrifugation or after deproteinization by gel electrophoresis of DNA.

B. Nucleosome Assembly during Complementary Strand Synthesis

Single-stranded phage M13 DNA, added to a high-speed supernatant of a Xenopus egg extract (Section III,B), is rapidly converted to a double-stranded form when incubated with nucleotides under appropriate conditions (94, 95). The necessary RNA primers are synthesized by the DNA primase activity of DNA polymerase α, and elongated in an ATP-dependent reaction. The deproteinized double-stranded reaction products are supercoiled, and digestion with micrococcal nuclease confirms that complementary strand synthesis is accompanied by an assembly of nucleosomes aligned at the native spacing of about 200 bp (96, 97). Double-stranded DNA is not replicated under these conditions, but, as described in Section III,B, nucleosome assembly also occurs on double-stranded DNA in these extracts. However, the two assembly reactions differ in at least one important way, because chromatin assembly during complementary-strand synthesis is completed within 20 minutes whereas an incubation of several hours is required for nucleosome assembly on nonreplicating DNA. As shown more recently (98), unfractionated extracts from mammalian cells also promote the synthesis of complementary strands on single-stranded DNA templates. When performed in the presence of a nuclear chromatin assembly factor (see Section VII,D), the DNA synthesis reaction is accompanied by nucleosome assembly just as in the Xenopus system (98).

C. Viral Minichromosomes

Mammalian DNA viruses are a model system for numerous studies in transcription and replication. A particularly useful system is simian virus 40 (SV40) because the viral genome is composed of only 5243 bp and is therefore small enough to be handled in standard biochemical procedures without much danger of damage and breakage. SV40 DNA is organized in infected cells as chromatin that has many structural features in common with cellular chromatin. It is therefore frequently referred to as the “viral minichromosome” (99). When extracted from nuclei of infected cells under low-salt conditions, SV40 minichromosomes are densely packed nucleoprotein particles containing the four core histones as well as the linker histone H1 and many nonhistone proteins. Treatment with 0.5 M NaCl removes histone H1 and most of the nonhistone proteins and converts the dense nucleoprotein particle into an extended beads-on-a-string structure (Fig. 6) (25). Biochemical and electron-microscopy examinations show 26-28 evenly spaced nucleosomes in most minichromosomes. This value is consistent with the number of superhelical turns in deproteinized minichromosomal DNA, as expected because the release of one nucleosome core causes the induction of one superhelical turn in topologically fixed DNA (45). With 26-28 nucleosomes per SV40 DNA, the nucleosomal repeat length is 185-200 bp. However, about a quarter of all minichromosomes have an average of only about 25 nucleosomes (100-102). The nucleosomes in these minichromosomes are not evenly distributed, but leave a nucleosome-free region of about 400 bp (Fig. 6B), including the genomic control region with two divergent promoter/enhancers and the origin of replication. Investigations of SV40 minichromosomes, replicating in vivo and in vitro, provide the most detailed information about the structure of replicative chromatin.
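
The repeat length quoted for SV40 minichromosomes follows from dividing the genome size by the nucleosome count. The short Python sketch below reproduces that arithmetic; the function name and the constant are our own illustrative choices, and the calculation assumes that nucleosomes are spread evenly over the whole circular genome.

```python
# Illustrative arithmetic only: the average nucleosomal repeat length is the
# DNA length divided by the number of nucleosomes it carries.
SV40_GENOME_BP = 5243  # genome size quoted in the text


def repeat_length_bp(dna_bp: int, nucleosomes: int) -> float:
    """Average DNA length per nucleosome (core plus linker), in base pairs."""
    if nucleosomes <= 0:
        raise ValueError("nucleosome count must be positive")
    return dna_bp / nucleosomes


if __name__ == "__main__":
    # Evenly covered minichromosomes with 26-28 nucleosomes.
    for n in (26, 27, 28):
        print(n, round(repeat_length_bp(SV40_GENOME_BP, n)))
```

The values obtained this way (about 187 to 202 bp) agree with the range of 185-200 bp given above. For the minichromosomes with a nucleosome-free region, the same function could be applied to the remaining DNA (5243 minus about 400 bp) under the assumption that the other nucleosomes stay evenly spaced.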

V. Replicating Chromatin: Basic Questions

Chromatin in the vicinity of replication forks may have an extended conformation, but it contains a dense array of nucleosomes similar to the array of nucleosomes in bulk chromatin. This was first demonstrated by electron microscopy of chromatin from Drosophila blastoderm embryos (103). Early Drosophila embryos are particularly useful for this purpose because replication is initiated at many sites along a chromatin fiber, thereby facilitating the visualization of replicons. In these experiments, chromatin was prepared for electron microscopy by hypotonic dispersion to visualize nucleosomes in an extended 10-nm filament. Importantly, no difference was found in nucleosomal packing on replicated and unreplicated chromatin. This result is interesting for two reasons: first, it excludes an extended disassembly of nucleosomes ahead of replication forks; second, it demonstrates that new nucleosomes are soon assembled on the newly replicated DNA branches (103). These early results were later confirmed and refined by electron microscopy of replicating SV40 minichromosomes (102) (see Section VII,A). The description of replicating chromatin provides a useful basis for several pertinent questions: Do nucleosomes on unreplicated DNA affect the establishment and progression of replication forks? Are prefork nucleosomes released and dissociated when invaded by replication forks, or are they transmitted as units to one or both replicated DNA branches? Are new nucleosomes assembled from newly synthesized histones in a replication-dependent process, or later on mature DNA at some distance from the replication fork? What role, if any, do post-translational modifications of amino acid side chains play in the process of replicative chromatin assembly?

VI. Nucleosome-free Origins

As we shall see in Section VII,A, nucleosomes on the unreplicated chromatin stem do not impede the progression of replication forks. In fact, large-T antigen, the replicative DNA helicase of the SV40 replicon, unwinds DNA on prefork nucleosomes as efficiently as protein-free DNA (103a). It has also been shown that bacterial recBCD DNA helicase is capable of unwinding nucleosomal DNA and displacing the nucleosomes (104). This result does not imply that the structure of chromatin has no effect on replication. For example, it has been known for some time that densely packed heterochromatin replicates later during S phase than does euchromatin. This has been interpreted to mean that initiation of replication may be influenced by chromatin structure. Initiation of DNA synthesis requires the binding of protein factors to origin sequences, the localized unwinding of the DNA, and the synthesis of RNA primers. Experiments show that one important factor regulating the binding of initiation factors is the underlying chromatin structure of the origin sequences. This was first shown by studies with the high-copy-number TRP1-ARS1 plasmid of Saccharomyces cerevisiae, where the ARS origin region is located near the edge of a positioned nucleosome (105). By specific deletions the core origin sequence was translocated into the center of the nucleosome core particle. A nucleosome, positioned on the ARS origin, results in a marked decrease of replication efficiency. The result indicates that the ARS origin region must be free of nucleosomes to allow an establishment of DNA replication forks. This conclusion is supported by experiments with SV40 minichromosomes. As mentioned above, about 25% of intracellular SV40 minichromosomes have a nucleosome-free regulatory region.
This region includes the two divergent promoters of SV40 that partially overlap with the origin of replication, and it has long been suspected that a nucleosome-free region is not only required for promoter activity, but also for the initiation of replication. This point was investigated in in vitro systems. For this purpose, chromatin was assembled on SV40 DNA in Xenopus oocyte extracts (106).


Reconstituted minichromosomes carrying the maximal number of regularly aligned nucleosomes were inactive as a template for in vitro replication. However, prior binding of transcription factors prevents an assembly of nucleosomes on the promoter/origin region and leads to reconstituted minichromosomes that are competent for replication. The conclusion is that origin sequences must be free of bound nucleosomes to serve as a site for replication initiation (107). Experiments in vivo with polyoma virus DNA are entirely consistent with this conclusion (108). Reconstitution of SV40 minichromosomes was also performed in the presence of large-T antigen, the SV40 initiator protein, known to interact specifically with elements of the viral origin. However, prebound T-antigen alone did not create a replication-competent origin (107, 109). Instead, replication has to be initiated on protein-free DNA (using three of the four deoxynucleotides) to keep the origin free of nucleosomes during in vitro chromatin assembly in extracts from Xenopus oocytes (109). In contrast, when the assembly reaction is performed in mammalian cell extracts, prebound T-antigen keeps the origin free of nucleosomes and competent for replication (110). Thus, subtle differences in the structure of assembled chromatin may be important determinants of replication efficiencies. The question arises of how origin sequences are cleared of nucleosomes in vivo. One possibility is that transcription factors may play a role in this process. It has been suggested (110a) that transcription factors may compete successfully with histones for DNA binding during replication. They could then activate SV40 DNA replication either by direct interaction with replication proteins (111, 112) or by an indirect mechanism involving a change in chromatin structure.
Some evidence for the competition model was obtained in in vivo studies showing that minichromosomal replicative intermediates possess a nucleosome-free promoter/origin region (102). It remains an open question whether these studies with viral model systems have implications for the organization of chromatin on cellular origins. Recent data obtained through studies with S. cerevisiae show that ORC initiator proteins remain associated with ARS-1 origin sequences throughout most of the cell cycle (113). It is thus possible that nucleosomes never have a chance to bind because initiator proteins are always on target. However, other origins may require a remodeling of chromatin structure to prepare the DNA binding sites for an interaction with replication initiation factors. This could be achieved, for example, through binding of transcription factors in the proximity of origin sequences. Recent experiments with S. cerevisiae show that the active chromatin structure of ribosomal RNA genes is not directly inherited by the daughter strands, indicating that the regeneration of the active chromatin structure is a post-replicative process involving disruption of preformed nucleosomes (114).


VII. Chromatin Structure on Replicated DNA Strands

The formation of chromatin on replicated DNA depends on two different processes, namely the transmission of "old" prefork nucleosomes from unreplicated to replicated DNA, and the assembly of new nucleosomes from newly synthesized histones. We first summarize present knowledge of the fate of prefork nucleosomes.

A. Prefork Nucleosomes are Transferred to Newly Replicated DNA

Electron microscopy of cross-linked replicating SV40 minichromosomes (Fig. 6) offered the unique possibility to map precisely the position of nucleosomes on in vivo replicated chromatin (102). An analysis of more than 200 replicative intermediates showed that nucleosomes are densely and regularly aligned on unreplicated and on replicated sections of replicating minichromosomes. This result excluded a release of nucleosomes ahead of the advancing replication fork, and showed furthermore that replication forks can move up to and even into prefork nucleosomes. On the replicated DNA branches, intact nucleosomes were found at some distances behind the fork, with averages of 285 (±120) bases between the branch point and the first nucleosome on the lagging and of 225 (±145) bases on the leading strand (Fig. 7). On the basis of this observation, it was suggested that prefork nucleosomes dissociate when invaded by replication forks and reassociate again at some distance behind the fork (102). This question was reinvestigated in several in vitro replication systems. In the first of a series of reports, chromatin was assembled in vitro on phage M13 DNA and used as a template for replication with bacteriophage T4 replication enzymes (115). It was shown that nucleosomes are not displaced by replication forks. Because the phage T4 enzymes never encounter chromatin in nature, it was important to repeat these experiments with SV40 minichromosomes as templates for mammalian replication functions (46, 109, 116, 117). SV40 minichromosomes replicate well under these in vitro conditions and give rise to replication products with nucleosomes that are almost exclusively inherited from the parental chromatin template. The crucial point here is whether prefork histone octamers are directly transferred to replicated DNA or whether they are released and transiently appear free in solution.
This was investigated by studying minichromosome replication in the presence of competing protein-free nonreplicating or replicating DNA. If nucleosomes transiently dissociate, competitor DNA should act as a “trap” catching the dissociated histones. It is important to note though that the local concentration of the replicated daughter strands greatly exceeds the concentrations of competitors used in most experiments. Estimates suggest that a competitor concentration of 1-10 mg/ml would be required to mimic the local concentration of DNA at the forks of replicating SV40 DNA (118). Clearly, these requirements are difficult to fulfill in in vitro assay systems. In the presence of a relatively low amount of protein-free competitor DNA (molar ratio of 0.3-1 DNA/chromatin), competing DNA failed to induce the dissociation of parental nucleosomes from replicating parental SV40 minichromosomes (46, 116, 117). However, a 5- to 10-fold excess of competitor DNA over replicating minichromosomes induced a decrease in the number of nucleosomes on newly replicated DNA, as demonstrated by micrococcal nuclease digestion and by electron microscopy of psoralen cross-linked molecules (109). These observations indicate that the parental histones are only in loose contact or may even transiently dissociate during the passage of the replication fork, to be trapped by the excess of competing protein-free DNA. An interesting possibility to explain existing inconsistencies between the data from various laboratories is that minichromosomes or replication functions may be differently prepared, and that some preparations may contain factors that enhance the displacement of histones during replication (119). How might the transfer of the parental nucleosomes be accomplished? One model suggests that advancing replication forks cause a displacement of DNA from the histone octamer (117). Positively charged arms of histones would then be available to contact protein-free DNA segments on progeny strands behind the fork. (This step could clearly be sensitive to high concentrations of competing DNA.) A breakage of contacts between histones and DNA in the prefork region would be accompanied by the formation of new contacts behind the fork. This transfer reaction may require the establishment of DNA loops linking prefork and postfork DNA segments. The size of the loops can be estimated to be about 300 bp separating the prefork nucleosome and the first nucleosomes on replicated DNA as determined by the psoralen cross-linking studies of replicating SV40 minichromosomes in vivo (102) (Fig. 7). A transient nucleosome-free region of this kind may also provide an opportunity for specific transcription factors to bind to their recognition sites before the DNA is sequestered into nucleosomes.

FIG. 7. Electron microscopy of replicating SV40 minichromosomes. (A) Psoralen-crosslinked replicating minichromosome. Note that the molecule in this particular example was not nicked with DNase I (as in the example, shown in Fig. 6B) and therefore remained supercoiled in the unreplicated section. Arrows denote single-stranded DNA on the lagging strand; arrowheads denote double-stranded DNA on the leading strand. (B) Location of nucleosomes: replication forks move up to the next prefork nucleosome; nucleosomes appear at some distance behind the advancing replication fork. From Ref. 102, with permission.

B. The Transferred Unit Is a Histone H3/H4 Tetramer

Early experiments with density-labeled amino acids indicated that old and new histones do not intermix, and that old nucleosomes are transferred as intact units to the daughter strands (120, 121). However, this conclusion was later questioned, because it could be shown that old nucleosomes dissociate during replication to give histone H3/H4 tetramers and histone H2A/H2B dimers. Old histone H3/H4 tetramers remain on DNA and combine on replicated DNA with either old or new histone H2A/H2B dimers to form postreplicative nucleosomes (85). This conclusion is entirely consistent with the results of electron microscopy of replicating minichromosomal templates in vitro. Psoralen crosslinking revealed single-stranded postfork bubbles of 80-100 bases, characteristic for DNA stretches protected by bound histone H3/H4 tetramers (109). In addition, micrococcal nuclease digestions showed that the immediate replication products contain protected DNA segments of 80-100 bp that were later converted to the normal fragment size of 160-200 bp. Thus, the transfer of old nucleosomes appears to be a two-step process with the transmission of histone H3/H4 tetramers followed by the addition of histone H2A/H2B dimers. The histone H2A/H2B dimers are derived from a common pool containing newly synthesized histones as well as histones H2A and H2B released during replication and during transcription (85, 122, 123).

C. Parental Nucleosome Cores Segregate in a Nonconservative Manner

A fundamental question that has attracted considerable attention is whether parental nucleosomes (or H3/H4 tetramers) are conservatively transferred to only one of the two emerging DNA strands, or whether they segregate dispersively to both strands. The behavior of parental nucleosomes during replication was studied with cells treated with cycloheximide to prevent the synthesis of new nucleosomes. Alternatively, isolated nuclei were used that continue the replication of in vivo initiated replicons when incubated under proper in vitro conditions (88, 124, 125). Chromatin, replicated under these conditions, is about twice as sensitive to nuclease digestion as control chromatin, due to the lack of half the normal amount of nucleosomes. The nuclease-resistant chromatin fractions from cycloheximide-treated cells contain true nucleosomes (including histone H1) that are aligned in clusters on replicated DNA. In addition, long stretches of nucleosome-free DNA have been observed in the electron microscope with chromatin of cycloheximide-treated cells (126). These experimental results were interpreted to indicate that parental nucleosomes are transmitted conservatively to only one of the two emerging strands of replicating DNA. This interpretation has been challenged by others, doing methodologically similar experiments that appear to support a dispersive mode of nucleosome segregation. As an example, we consider an experiment with cells grown in the absence of cycloheximide. Cellular chromatin, carrying density-labeled proteins, was digested to give fragments composed of 5-9 monomeric nucleosomes. These chromatin fragments were of hybrid density, and nucleosomes labeled in one cell cycle are distributed equally to both nascent DNA strands during the succeeding cell cycle (127, 128). This result is compatible with a dispersive mode of nucleosome segregation.
In vivo replicating SV40 minichromosomes have also been examined to obtain more information about the mode of nucleosome segregation. Biochemical analysis clearly shows that parental nucleosomes are randomly distributed to both the leading and the lagging strand of replicating SV40 DNA (129). These results are supported by electron microscopy using the psoralen cross-linking technique (102). In these experiments, infected cells were briefly treated with cycloheximide to prevent the synthesis of new histones. Minichromosomes, replicating under these conditions, contain a dense array of nucleosomes in their unreplicated section, but half the normal number of nucleosomes on the replicated branches, frequently aligned in clusters. Importantly, transmitted old nucleosomes appear on both DNA branches without preference for one or the other replicated DNA strand. The random transfer of parental nucleosomes leads to a broad distribution of nucleosome numbers on progeny minichromosomes: some replication products have only very few and others almost 28 nucleosomes, but most minichromosomes, replicated in the presence of cycloheximide, carried 12-15 nucleosomes, about half the maximal number. This experiment was repeated using emetine, another protein-synthesis inhibitor (130), and the results also showed that the nucleosomes are randomly distributed on both arms of the replication fork. More recently, in vitro systems were used to reinvestigate the mode of nucleosome distribution. In an artificial system, composed of phage T4 replication proteins and a reconstituted chromatin template, parental histone octamers were transferred to either arm of replication forks (115). More histone octamers remained with the forward arm, but this bias probably resulted from the presence of large single-stranded regions on the retrograde arm. More pertinent are experiments with in vitro replicating SV40 minichromosomes using either purified replication proteins (116) or unfractionated cytosolic extracts (46, 109).
The experimental results obtained in different laboratories by various procedures are entirely consistent and show that parental nucleosomes (or rather their histone H3/H4 tetramer units) are randomly distributed to the emerging daughter DNA strands.
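
The expectation that random segregation leaves most progeny minichromosomes with about half the parental nucleosomes can be illustrated with a simple binomial model. The Python sketch below is our own simplification and not a model taken from the cited studies: it assumes that each of 28 parental nucleosomes is passed to one given daughter molecule independently and with probability 0.5. The function names are hypothetical.

```python
from math import comb


def daughter_count_pmf(parental: int, p: float = 0.5) -> list[float]:
    """Binomial probabilities that k parental nucleosomes (k = 0..parental)
    end up on one given daughter molecule.

    Assumption (ours): every nucleosome segregates independently.
    """
    return [comb(parental, k) * p**k * (1 - p) ** (parental - k)
            for k in range(parental + 1)]


def prob_between(pmf: list[float], low: int, high: int) -> float:
    """Total probability of counts from low to high, inclusive."""
    return sum(pmf[low:high + 1])


if __name__ == "__main__":
    pmf = daughter_count_pmf(28)
    mean = sum(k * q for k, q in enumerate(pmf))
    print(f"mean nucleosomes per daughter: {mean:.1f}")
    print(f"P(12 to 15 nucleosomes): {prob_between(pmf, 12, 15):.2f}")
```

Under this assumption the distribution is centered on 14 nucleosomes, in line with the 12-15 reported for most molecules. Strictly independent segregation makes nearly empty or nearly full daughters very rare, so the broader distribution and the clusters described above may indicate that neighboring nucleosomes tend to segregate together; this is a possible reading, not a result of the sketch.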

D. New Nucleosomes Are Assembled in a Replication-dependent Reaction

Viral minichromosomes, treated with 0.5 M NaCl to release histone H1 and associated nonhistone chromatin proteins, replicate in vitro to give replication products with an average of about half the maximal number of nucleosomes, indicating that parental nucleosomes are randomly distributed to progeny DNA without concomitant assembly of new nucleosomes (46, 109, 116, 117). However, in the presence of a nuclear protein extract, replication products are formed that carry the maximal amounts of about 26-28 nucleosomes, including parental and newly assembled nucleosomes (131-133).


The nuclear factor responsible for the assembly reaction is the mammalian chromatin assembly factor, CAF-1 (134). This factor is a heterotrimeric protein composed of subunits of 150, 60, and 50 kDa. The two largest protein subunits are subject to modifications by phosphorylation with as yet unknown physiological consequences (135). Purified CAF-1 binds to histones in vitro, but not to DNA (134). In in vitro replication experiments with unfractionated cytosolic extracts, CAF-1 uses soluble and presumably newly synthesized histones for the assembly of new nucleosomes (134). The assembly reaction proceeds in two steps: CAF-1 first promotes the formation of a stable complex of histones H3 and H4 with DNA; the bound factor is then exchanged against histones H2A and H2B, thereby completing the nucleosome structure (136). A remarkable feature of CAF-1 is that its function is strictly coupled to DNA synthesis (98, 134-136): it promotes the assembly of nucleosomes only as long as single-stranded DNA template regions are available. Because purified CAF-1 binds poorly to DNA, it remains an interesting question how its interaction with single-stranded DNA at replication forks is mediated. In Section III,C we described a mammalian assembly function, NAP-1, which participates in a nucleosome assembly reaction on mature double-stranded DNA. It would be interesting to know whether NAP-1 cooperates with CAF-1 in vivo, or whether it performs unrelated functions affecting the structure of chromatin for different purposes. In any case, all known properties of CAF-1 strongly suggest that it may be a prime candidate for promoting replicative chromatin assembly in vivo. To summarize this section, we note that the two assembly reactions at replication forks appear to be fundamentally similar: parental as well as new nucleosomes are built in two steps, and the deposition of histone H3/H4 tetramers on DNA is followed by the addition of two histone H2A/H2B dimers.

VIII. Histone H1 and the Folding of Replicating Chromatin

Many earlier experiments clearly show that the structure of newly replicated chromatin differs from that of bulk chromatin. This conclusion is based on the fact that newly replicated chromatin is highly sensitive to micrococcal nuclease attack. Under conditions of mild nuclease treatment, when bulk chromatin is degraded to give relatively large supranucleosomal structures with groups of six or more nucleosomes, newly synthesized chromatin is digested to monomeric nucleosomes. After more extensive nuclease treatment, when much of the bulk chromatin is converted to nucleosome cores with DNA fragments of about 145 bp, newly assembled chromatin fragments are cut at internal sites to give subnucleosomal DNA of lengths 60-80 bp. These shorter DNA fragments are only loosely bound to histones and dissociate from histone octamers in 0.5 M NaCl when typical nucleosome cores remain intact (88, 92, 93, 137-141). These changes are reversible within 10-20 minutes (137), but, importantly, the structure of newly formed chromatin persists when DNA replication is inhibited. Thus, maturation of chromatin depends on ongoing DNA replication (142, 143). An interpretation of these early observations is that the production of subnucleosomal DNA fragments after extensive nuclease treatment may indicate an early deposition of histone H3/H4 tetramers on nascent DNA, whereas the overall nuclease sensitivity of replicating chromatin could be due to an extended conformation with exposed linker DNA. A reason for this could be a lack of histone H1; at least one report describes the absence of histone H1 in nascent chromatin (139). However, more recent data indicate that histone H1 is deposited briefly after or even simultaneously with the assembly of nucleosome cores on newly replicated DNA (144). It is not known whether histone H1, associated with nascent chromatin, differs in its binding properties from histone H1 in bulk chromatin. The behavior of histone H1 during chromatin replication has yet to be investigated in more detail. Recent results suggest that the extensive literature on the effect of histone H1 on transcription (145-149) is an inappropriate basis for speculations on its role in replication. In contrast to its inhibitory effect on the initiation of transcription, histone H1 does not affect the in vitro replication of viral minichromosomes.
Moreover, histone H1/DNA complexes replicate as efficiently in vitro as protein-free SV40 DNA, suggesting that bound histone H1 does not interfere with the establishment and progression of replication forks (150).

IX. Replication-dependent Histone Modifications

Newly synthesized soluble histone H4 has acetyl groups at specific lysine residues in the amino-terminal arm (see Fig. 1). This modification seems to facilitate the formation of histone octamers because new histones H2A, H2B, and H3 preferentially assemble with diacetylated histone H4 to form new nucleosome cores (151). Assembled histone H4 is deacetylated within minutes of core particle formation (152). Interestingly, the process of deacetylation correlates with the stable binding of histone H1 and the formation of more compact chromatin structures. In fact, when replication occurs in the presence of sodium butyrate (which inhibits histone deacetylation), nascent chromatin fails to mature fully and is depleted of histone H1 relative to control chromatin (153). Biochemical experiments with isolated chromatin show directly that histone acetylation alters the capacity of histone H1 to induce a higher order chromatin structure (154, 155). It has long been established that histone H1 undergoes reversible phosphorylations of serine residues in the carboxy-terminal arm during the S phase of the cell cycle (Fig. 2) (156, 157). However, it is still unknown whether and how this modification affects the structure of replicating chromatin.

X. Conclusions

It now appears that, just as in transcriptionally active chromatin, where sites in the promoter region are often free of nucleosomes, replication origins are normally cleared of nucleosomes to be accessible for replication functions. However, once initiated, replication continues undisturbed by nucleosomes on the parental DNA stem, and, unlike in transcribed chromatin, histone H1 probably has no negative effect on the progression of replication forks. It is likely, but not proved, that histone H1 dissociates from chromatin in the vicinity of replication forks, or that, if it remains bound to chromatin, its binding differs from that in bulk chromatin. This can be concluded because chromatin on nascent DNA is less compact than mature bulk chromatin, and because H1 binds less well to chromatin with the acetylated histones present in newly assembled chromatin. Chromatin assembly on nascent DNA consists of two fundamental processes, namely, transmission of old parental histones and the assembly of new histones. The structure of parental nucleosome core particles most probably changes when invaded by the approaching replication fork, because experiments reveal the presence of old histone H3/H4 tetramers on nascent DNA closest to the replication fork. This implies that histone H2A/H2B dimers dissociate when replication points reach the next prefork nucleosome. Thus, the unit directly transferred to either one of the emerging replicated DNA strands is a parental histone H3/H4 tetramer (Fig. 8). Similarly, assembly of nucleosomes from newly synthesized histones begins with the initial deposition of new histone H3/H4 tetramers, most likely mediated by protein CAF-1. It follows that nascent DNA close to replication forks transiently carries intermingled old and new histone H3/H4 tetramers, which are soon completed by an association of histone H2A/H2B dimers. The dimers are taken from a common pool of mixed old and new dimers. Possibly, new dimers preferentially bind to DNA-bound new tetramers directed by newly assembled acetylated histone H4, but, at least in principle, both old and new dimers are used for an association with either old or new tetramers (Fig. 8).

FIG. 8. Current model of the processes occurring at the replication fork. The replication machinery moves into the prefork nucleosome, which is believed to dissociate into H2A/H2B dimers and H3/H4 tetramers. The H3/H4 tetramers are transferred to the daughter strands by an unknown mechanism. The assembly of nucleosomes occurs at some distance behind the fork in two consecutive steps, starting with the deposition of H3/H4 tetramers, followed by an association of H2A/H2B dimers. Newly formed chromatin is composed of a random mixture of parental and new nucleosomes. A nucleosome contains an H3/H4 tetramer of either old or newly synthesized histones plus old or new H2A/H2B dimers. The chromatin structure in front of the replication fork is still hypothetical.

Thus, the process of replicative chromatin assembly can now be described in outline, but a number of important questions remain unanswered:

1. Does histone H1 dissociate at some distance ahead of the replication fork or when the nucleosome is invaded by the advancing replication apparatus? Which role, if any, does the well documented S-phase-linked phosphorylation of histone H1 play in the regulation of chromatin replication?
2. How are DNA contacts opened and closed when prefork histone H3/H4 tetramers are transmitted as units to replicated DNA?
3. How do assembly functions and replication proteins interact at replication forks, or more specifically, how is CAF-1 tethered to replicating DNA? It should be recalled that the components of the replication and the assembly machineries are both located on a stretch of DNA probably not larger than about 300 bp.
4. How do HMG proteins (37), normal constituents of chromatin, behave at replication forks?
5. Which mechanisms regulate the complex topological changes accompanying replicative DNA unwinding and the concomitant wrapping of DNA around nucleosomes?

ACKNOWLEDGMENTS

We thank Eric Carstens for critically reading the manuscript. The work done in our laboratory was supported by Deutsche Forschungsgemeinschaft through SFB 156.

REFERENCES

1. M. Grunstein, Annu. Rev. Cell Biol. 6, 643 (1990).
2. K. E. van Holde, D. E. Lohr and C. Robert, JBC 267, 2837 (1992).
3. G. Felsenfeld, Nature 355, 219 (1992).
4. J. J. Hayes and A. P. Wolffe, BioEssays 14, 597 (1992).
5. J. Svaren and W. Hörz, Curr. Opin. Gen. Dev. 3, 219 (1993).
6. A. P. Wolffe, Curr. Biol. 4, 245 (1994).
7. S. M. Paranjape, R. T. Kamakaka and J. T. Kadonaga, ARB 63, 265 (1994).
8. S. Dimitrov and A. P. Wolffe, BBA 1260, 1 (1995).
9. M. D. Challberg and T. Kelly, ARB 58, 671 (1989).
10. B. Stillman, Annu. Rev. Cell Biol. 5, 197 (1989).
11. M. L. DePamphilis, ARB 62, 29 (1993).
12. D. Coverly and R. A. Laskey, ARB 63, 745 (1994).
13. K. E. van Holde, “Chromatin” (Alexander Rich, ed.). Springer Verlag, New York, 1988.
14. A. Wolffe, “Chromatin, Structure and Function.” Academic Press, London, 1992.
15. D. Pruss, J. J. Hayes and A. P. Wolffe, BioEssays 17, 161 (1995).
16. M. Bradbury, BioEssays 14, 9 (1992).
17. D. R. Burton, M. J. Butler, J. E. Hyde, D. Philips, C. J. Skidmore and I. O. Walker, NARes 5, 3643 (1978).
18. T. J. Richmond, J. T. Finch, B. Rushton, D. Rhodes and A. Klug, Nature 311, 532 (1984).
19. J. J. Hayes, T. D. Tullius and A. P. Wolffe, PNAS 87, 7405 (1990).
20. J. J. Hayes, D. J. Clark and A. P. Wolffe, PNAS 88, 6829 (1991).
21. C. S. Hill and J. O. Thomas, EJB 187, 145 (1990).
22. F. Thoma, T. Koller and A. Klug, JCB 83, 402 (1979).
23. J. D. McGhee, J. M. Nickol, G. Felsenfeld and D. C. Rau, Cell 33, 831 (1983).
24. J. C. Hansen and J. Ausio, Trends Biochem. Sci. 17, 187 (1992).

25. U. Muller, H. Zentgraf, I. Eicken and W. Keller, Science 201, 406 (1978).
26. D. Doenecke, Histones, histone variants and postsynthetic histone modifications, in “Architecture of Eucaryotic Genes” (G. Kahl, ed.), p. 123. Weinheim, VCH Verlagsgemeinschaft, 1988.
27. M. A. Billet and J. Hindley, EJB 28, 451 (1972).
28. R. Appels and J. R. E. Wells, JMB 70, 425 (1972).
29. A. Ruiz-Carrillo, L. J. Wangh, V. C. Littau and V. G. Allfrey, JBC 249, 7358 (1974).
30. M. Molter, J. Cote, J. Renaud and A. Ruiz-Carrillo, MCB 7, 3663 (1987).
31. T. Boulikas, J. M. Wiseman and W. T. Garrard, PNAS 77, 127 (1980).
32. J. Allan, P. G. Hartman, C. Crane-Robinson and F. X. Aviles, Nature 288, 675 (1980).
33. V. Graziano, S. E. Gerchman, D. K. Schneider and V. Ramakrishnan, Nature 368, 351 (1994).
34. J. T. Finch and A. Klug, PNAS 73, 1897 (1976).
35. J. Zlatanova, S. H. Leuba, G. Yang, C. Bustamante and K. van Holde, PNAS 91, 5277 (1994).
36. P. R. Cook, Cell 66, 627 (1991).
37. M. Bustin, D. A. Lehn and D. Landsman, BBA 1049, 231 (1990).
38. P. J. Alfonso, M. P. Crippa, J. J. Hayes and M. Bustin, JMB 236, 189 (1994).
39. Y. V. Postnikow, D. A. Lehn, R. C. Robinson, F. K. Friedman and J. B. Shiloch M., NARes 22, 4520 (1994).
40. H.-F. Ding, S. Rimsky, S. C. Batson, M. Bustin and U. Hansen, Science 265, 796 (1994).
41. M. E. Bianchi, M. Beltrame and G. Paonessa, Science 243, 1056 (1989).
42. H. Hamada and M. Bustin, Bchem 24, 1428 (1985).
43. C. Bonne-Andrea, F. Harper, J. Sobczak and A. M. DeRecondo, EMBO J. 3, 1193 (1984).
44. K. Zhao, E. Kas, E. Gonzalez and U. K. Laemmli, EMBO J. 12, 3237 (1993).
45. J. E. Germond, B. E. Hirt, P. Oudet, M. Gross-Bellard and P. Chambon, PNAS 72, 1843 (1975).
46. T. Krude and R. Knippers, MCB 11, 6257 (1991).
47. C. Gruss and R. Knippers, The SV40 minichromosome, in “Methods in Molecular Genetics” (K. W. Adolph, ed.), Vol. 7, p. 101. Academic Press, Orlando, 1995.
48. C. V. Hanson, C.-K. J. Shen and J. E. Hearst, Science 193, 62 (1976).
49. T. R. Cech, D. Potter and M. L. Pardue, CSHSQB 42, 191 (1977).
50. J. M. Sogo, P. J. Ness, R. M. Widmer, R. W. Parish and T. Koller, JMB 178, 897 (1984).
51. D. Rhodes and R. A. Laskey, Assembly of nucleosomes and chromatin in vitro, in “Methods in Enzymology” (P. M. Wassarman and R. D. Kornberg, eds.), Vol. 170, p. 575. Academic Press, San Diego, 1989.
52. A. P. Wolffe and C. Schild, Methods Cell Biol. 36, 541 (1991).
53. K. Tatchell and K. E. van Holde, Bchem 16, 5295 (1977).
54. A. W. Fulmer and G. D. Fasman, Bchem 18, 659 (1979).
55. J.-R. Daban and C. R. Cantor, JMB 156, 771 (1982).
56. L. A. Burgoyne, D. R. Hewish and J. Mobbs, BJ 143, 67 (1974).
57. M. Noll, Nature 251, 249 (1974).
58. R. D. Kornberg, Science 184, 868 (1974).
59. T. Nelson, R. Wiegand and D. Brutlag, Bchem 20, 2594 (1981).
60. A. Stein and M. Bina, JMB 178, 341 (1984).
61. H. R. Woodland and E. D. Adamson, Dev. Biol. 57, 118 (1979).
62. J. A. Kleinschmidt and W. W. Franke, Cell 29, 799 (1982).
63. J. A. Kleinschmidt, S. Fortkamp, G. Krohne, H. Zentgraf and W. W. Franke, JBC 260, 1166 (1985).
64. K. Zucker and A. Worcel, JBC 266, 14487 (1990).

65. R. A. Laskey, B. M. Honda, A. D. Mills and J. T. Finch, Nature 275, 416 (1978).
66. W. C. Earnshaw, B. M. Honda, R. A. Laskey and J. O. Thomas, Cell 21, 373 (1980).
67. S. M. Dilworth and C. Dingwall, BioEssays 9, 44 (1988).
68. R. A. Laskey, A. D. Mills, A. Phillpott, G. H. Leno, S. M. Dilworth and C. Dingwall, The role of nucleoplasmin in chromatin assembly and disassembly, in “Molecular Chaperones” (R. J. Ellis, R. A. Laskey and G. H. Lorimer, eds.), p. 7. Chapman & Hall, London, 1994.
69. J. A. Kleinschmidt, A. Seiter and H. Zentgraf, EMBO J. 9, 1309 (1990).
70. W. Sapp and A. Worcel, JBC 265, 9357 (1990).
71. G. Almouzni and M. Mechali, EMBO J. 7, 4355 (1988).
72. G. Sessa and I. Ruberti, NARes 18, 5449 (1990).
73. J. A. Kleinschmidt and H. Steinbeisser, EMBO J. 10, 3043 (1991).
74. A. Shimamura and A. Worcel, JBC 264, 14524 (1989).
75. R. S. Wu and W. M. Bonner, Cell 27, 321 (1981).
76. M. A. Osley, ARB 60, 827 (1991).
77. S. Banerjee and C. R. Cantor, MCB 10, 2863 (1990).
78. C. Gruss, C. Gutierrez, W. C. Burhans, M. L. DePamphilis, T. Koller and J. M. Sogo, EMBO J. 9, 2911 (1990).
79. M. Lässle, A. Richter and R. Knippers, BBA 1132, 1 (1992).
80. S. Banerjee, G. R. Bennion, M. W. Goldberg and T. D. Allen, NARes 19, 5999 (1991).
81. Y. Ishimi, H. Yasuda, J. Hirosumi, F. Hanaoka and M. Yamada, JB 94, 735 (1983).
82. Y. Ishimi, J. Hirosumi, W. Sato, K. Sugasawa, S. Yokota, F. Hanaoka and M. Yamada, EJB 142, 431 (1984).
83. Y. Ishimi and A. Kikuchi, JBC 266, 7025 (1991).
84. Y. Ishimi, M. Kojima, M. Yamada and F. Hanaoka, EJB 162, 19 (1987).
85. V. Jackson, Bchem 29, 719 (1990).
86. H.-U. Simon, G. B. Mills, M. Kozlowski, D. Hogg, D. Branch, Y. Ishimi and K. A. Siminovitch, BJ 297, 389 (1994).
87. D. Hewish, NARes 4, 1881 (1977).
88. E.-J. Schlaeger and K.-H. Klempnauer, EJB 89, 567 (1978).
89. T. Senshu, M. Fukuda and M. Ohashi, JB 84, 985 (1978).
90. A. Worcel, S. Han and M. L. Wong, Cell 15, 969 (1978).
91. A. Levy and K. M. Jakob, Cell 14, 259 (1978).
92. M. L. DePamphilis and P. M. Wassarman, ARB 49, 627 (1980).
93. A. T. Annunziato and R. L. Seale, Bchem 21, 5431 (1982).
94. H. König, H. D. Riedel and R. Knippers, EJB 135, 435 (1983).
95. M. Mechali and R. Harland, Cell 30, 93 (1982).
96. G. Almouzni and M. Mechali, EMBO J. 7, 665 (1988).
97. G. Almouzni, D. J. Clark, M. Mechali and A. P. Wolffe, NARes 18, 5767 (1990).
98. T. Krude and R. Knippers, JBC 268, 14432 (1993).
99. M. L. DePamphilis and M. K. Bradley, Replication of SV40 and polyoma virus chromosomes, in “The Papovaviridae” (N. P. Salzman, ed.), p. 99. Plenum, New York, 1986.
100. S. Saragosti, G. Moyne and M. Yaniv, Cell 20, 65 (1980).
101. E. B. Jakobovits, E. Bratosin and J. Aloni, Nature 285, 263 (1980).
102. J. M. Sogo, H. Stahl, T. Koller and R. Knippers, JMB 189, 189 (1986).
103. S. L. McKnight and O. L. Miller, Cell 12, 795 (1977).
103a. U. Ramsperger and H. Stahl, EMBO J. 14, 3215 (1995).
104. A. K. Eggleston, T. E. O’Neill, E. M. Bradbury and S. T. Kowalczykowski, JBC 270, 2024 (1995).
105. R. T. Simpson, Nature 343, 387 (1990).

106. L. Cheng and T. J. Kelly, Cell 59, 541 (1989).
107. L. T. Cheng and T. Kelly, Cell 59, 541 (1989).
108. E. R. Bennett-Cook and J. A. Hassell, EMBO J. 10, 959 (1991).
109. C. Gruss, J. Wu, T. Koller and J. M. Sogo, EMBO J. 12, 4533 (1993).
110. Y. Ishimi, JBC 267, 10910 (1992).
110a. M. L. DePamphilis, Trends Cell Biol. 3, 161 (1993).
111. Z. He, B. T. Brinton, J. Greenblatt, J. A. Hassell and C. J. Ingles, Cell 73, 1223 (1993).
112. R. L. Li and M. R. Botchan, Cell 73, 1207 (1993).
113. J. F. X. Diffley and J. H. Cocker, Nature 357, 169 (1992).
114. R. Lucchini and J. M. Sogo, Nature 374, 276 (1995).
115. C. Bonne-Andrea, M. L. Wong and B. M. Alberts, Nature 343, 719 (1990).
116. K. Sugasawa, Y. Ishimi, T. Eki, J. Hurwitz, A. Kikuchi and F. Hanaoka, PNAS 89, 1055 (1992).
117. S. K. Randall and T. J. Kelly, JBC 267, 14259 (1992).
118. R. D. Kornberg and Y. Lorch, Cell 67, 833 (1991).
119. P. D. Kaufman and M. R. Botchan, Curr. Biol. 4, 229 (1994).
120. I. M. Leffak, R. Grainger and H. Weintraub, Cell 12, 837 (1977).
121. I. M. Leffak, Nature 307, 82 (1984).
122. V. Jackson, Bchem 26, 2315 (1987).
123. V. Jackson, Bchem 27, 2109 (1988).
124. H. Weintraub, Cell 9, 419 (1976).
125. R. L. Seale, Cell 9, 423 (1976).
126. D. Riley and H. Weintraub, PNAS 76, 328 (1979).
127. G. Russev and R. Hancock, PNAS 79, 3143 (1982).
128. V. Jackson, Bchem 24, 6930 (1985).
129. M. E. Cusick, M. L. DePamphilis and P. M. Wassarman, JMB 178, 249 (1984).
130. W. C. Burhans, L. T. Vassilev, J. Wu, J. M. Sogo, F. S. Nallaseth and M. L. DePamphilis, EMBO J. 10, 4351 (1991).
131. B. W. Stillman and Y. Gluzman, MCB 5, 2051 (1985).
132. B. Stillman, Cell 45, 555 (1986).
133. T. Krude, C. DeMaddalena and R. Knippers, MCB 13, 1059 (1993).
134. S. Smith and B. Stillman, Cell 58, 15 (1989).
135. S. Smith and B. Stillman, JBC 266, 12041 (1991).
136. S. Smith and B. Stillman, EMBO J. 10, 971 (1991).
137. K.-H. Klempnauer, E. Fanning, B. Otto and R. Knippers, JMB 136, 359 (1980).
138. G. Galili, A. Levy and K. M. Jakob, NARes 9, 3991 (1981).
139. E. J. Schlaeger, Bchem 21, 3167 (1982).
140. R. Fotedar and J. M. Roberts, PNAS 86, 6459 (1989).
141. E.-J. Schlaeger and R. Knippers, NARes 6, 645 (1979).
142. E. J. Schlaeger, W. Pülm and R. Knippers, FEBS Lett. 158, 281 (1983).
143. W. Pülm and R. Knippers, Chromatin structure and DNA replication, in “Proteins Involved in DNA Replication” (U. Hübscher and S. Spadari, eds.), p. 127. Plenum, New York, 1984.
144. S. Bavykin, L. Srebreva, T. Banchev, R. Tsanev, J. Zlatanova and A. Mirzabekov, PNAS 90, 3918 (1993).
145. M. S. Schlissel and D. D. Brown, Cell 37, 903 (1984).
146. A. Wolffe, EMBO J. 8, 527 (1989).
147. A. Shimamura, M. Sapp, A. Rodriguez-Campos and A. Worcel, MCB 12, 5573 (1989).
148. G. E. Croston, L. A. Kerrigan, L. L. Lira, D. R. Marshak and J. T. Kadonaga, Science 251, 643 (1991).

149. P. J. Laybourn and J. T. Kadonaga, Science 254, 238 (1991).
150. L. Halmer and C. Gruss, NARes 23, 773 (1995).
151. C. A. Perry, C. A. Dadd, D. Allis and A. T. Annunziato, Bchem 32, 13605 (1993).
152. V. Jackson, A. Shires, N. Tanphaichitr and R. Chalkley, JMB 104, 471 (1976).
153. C. A. Perry and A. T. Annunziato, NARes 17, 4275 (1989).
154. J. A. Ridsdale, M. J. Hendzel, G. P. Delcuve and J. R. Davie, JBC 265, 5150 (1990).
155. C. A. Perry and A. T. Annunziato, Exp. Cell. Res. 196, 337 (1991).
156. P. Hohmann, MCBchem 57, 81 (1983).
157. R. W. Lennox and L. H. Cohen, Histone phosphorylation, in “Chromosomes and Chromatin” (K. W. Adolph, ed.), p. 33. CRC Press, Boca Raton, FL, 1988.
